Upgrade to 0.11.1 newest vllm commit #3762
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Signed-off-by: Icey <[email protected]>
Here we first fix spec decoding; returning logprobs for spec decoding can be future work.
return max(layer_counts)
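For context on the hunk above, a minimal sketch of what taking the maximum over per-group layer counts does (the function name and the KV-cache-group framing are assumptions for illustration, not the PR's actual code):

```python
def max_layer_count(layer_counts: list[int]) -> int:
    """Pick the largest attention-layer count across groups.

    `layer_counts` is assumed to hold one entry per KV-cache group;
    when the groups disagree, the maximum is the conservative choice.
    """
    if not layer_counts:
        raise ValueError("expected at least one layer count")
    return max(layer_counts)

print(max_layer_count([24, 28, 24]))  # 28
```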
# Update cudagraph capture sizes for vllm config
This may not be correct. I'll look into it more.
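For readers unfamiliar with the capture sizes being updated here: a hedged sketch of producing cudagraph capture sizes up to a maximum batch size. The progression (1, 2, 4, then multiples of 8) mirrors a common vLLM convention but is an assumption in this sketch, not the PR's code:

```python
def capture_sizes(max_size: int) -> list[int]:
    """Assumed cudagraph capture-size progression: 1, 2, 4, then 8, 16, ...

    Each size is a batch size for which a CUDA/NPU graph would be captured.
    """
    sizes = [s for s in (1, 2, 4) if s <= max_size]
    sizes += list(range(8, max_size + 1, 8))
    return sizes

print(capture_sizes(16))  # [1, 2, 4, 8, 16]
```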
if vllm_version_is("0.11.0"):
    if not model_config.is_multimodal_model and \
        structured_outputs_config.backend == "auto" and \
        not scheduler_config.send_delta_data and \
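The hunk above gates on the installed vLLM version. A minimal sketch of that pattern, where `vllm_version_is` is re-implemented as a stand-in for the helper in vllm-ascend's utils and the hard-coded installed version is an assumption:

```python
INSTALLED_VLLM_VERSION = "0.11.1"  # assumption for illustration

def vllm_version_is(target: str) -> bool:
    # Stand-in for vllm-ascend's helper: exact-match against the
    # installed vLLM version string.
    return INSTALLED_VLLM_VERSION == target

if vllm_version_is("0.11.0"):
    print("take the 0.11.0 code path")
else:
    print("take the newest-commit code path")
```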
getattr(scheduler_config, "send_delta_data", False)
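The suggested `getattr` replaces the direct attribute access so the check also works on vLLM versions where `SchedulerConfig` no longer defines `send_delta_data`. A minimal illustration using stand-in config objects, not vLLM's real classes:

```python
from types import SimpleNamespace

# Stand-ins: an older config that still carries the flag, and a newer
# one where the attribute was removed upstream.
old_cfg = SimpleNamespace(send_delta_data=True)
new_cfg = SimpleNamespace()

# Direct access would raise AttributeError on new_cfg; getattr with a
# default degrades gracefully on both versions.
print(getattr(old_cfg, "send_delta_data", False))  # True
print(getattr(new_cfg, "send_delta_data", False))  # False
```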
What this PR does / why we need it?
vllm-project/vllm@83f478b
- Fix the spec decode rejection sampler, caused by vllm-project/vllm#26060
- Fix some imports, caused by vllm-project/vllm#27374
- Fix `scheduler_config.send_delta_data`, caused by #3719
- Fix `init_with_cudagraph_sizes`, caused by vllm-project/vllm#26016
- Fix the VL model's replacement of PatchEmbed's conv3d with a linear layer, caused by vllm-project/vllm#27418

Does this PR introduce any user-facing change?
N/A
How was this patch tested?
CI passed with newly added and existing tests.