Closed
Labels: performance (Performance-related issues)
Description
Proposal to improve performance
No response
Report of performance regression
I tested the accept length (number of tokens per step) with typical acceptance sampling. The accept length is even smaller than with the default rejection sampling method.
Here are my experimental details:
1. The dataset I used was mt_bench.
2. Speculative decoding model setups:
   - Llama 3.1 8B as the target model and Qwama-0.5B-Instruct as the draft model (number of speculative tokens: 2)
   - Llama 3.1 8B as the target model with an MLP speculator
3. Temperature was set to 0.9.
4. posterior_threshold and posterior_alpha were set to their default values.
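For reference, a minimal config sketch of the first setup above. The keyword-argument names are assumed from vLLM's speculative decoding options and the Hugging Face model IDs are assumptions; both may differ across vLLM versions:

```python
from vllm import LLM, SamplingParams

# Sketch only: kwarg names and model IDs are assumptions and may
# differ across vLLM versions.
llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",           # target model
    speculative_model="turboderp/Qwama-0.5B-instruct",  # draft model (assumed HF id)
    num_speculative_tokens=2,
    spec_decoding_acceptance_method="typical_acceptance_sampler",
)

params = SamplingParams(temperature=0.9)
outputs = llm.generate(["Explain speculative decoding briefly."], params)
```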
Do you have some experimental results on this? Or do I need to tune some parameters for typical acceptance sampling? Thanks a lot!
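For context, typical acceptance (the Medusa-style criterion) accepts a draft token when its probability under the target model exceeds an entropy-dependent threshold, so high-temperature (flatter) target distributions lower the bar while peaked ones raise it toward the fixed floor. A minimal sketch of the rule, assuming the defaults posterior_threshold=0.09 and posterior_alpha=0.3 (values taken from common vLLM defaults; verify against your version):

```python
import numpy as np

def typical_accept(target_probs, draft_token,
                   posterior_threshold=0.09, posterior_alpha=0.3):
    """Sketch of the typical acceptance criterion.

    Accept the draft token if its probability under the target model
    exceeds min(posterior_threshold, posterior_alpha * exp(-H)),
    where H is the entropy of the target distribution.
    """
    eps = 1e-9  # numerical floor to avoid log(0)
    entropy = -np.sum(target_probs * np.log(target_probs + eps))
    threshold = min(posterior_threshold, posterior_alpha * np.exp(-entropy))
    return bool(target_probs[draft_token] > threshold)
```

Under this rule, lowering posterior_threshold or posterior_alpha makes acceptance more permissive, which is the usual knob to turn when the measured accept length drops below the rejection-sampling baseline.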
Misc discussion on performance
No response
Your current environment (if you think it is necessary)
The output of `python collect_env.py`