Conversation

yewentao256 (Member) commented on Sep 9, 2025

Purpose

Fixes #24513.

FlashMLA is poorly supported on Blackwell; see deepseek-ai/FlashMLA#83 for more details.

gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request correctly disallows the use of the FlashMLA attention backend on NVIDIA Blackwell GPUs by adding a check for SM 10.0+ device capability. This is a necessary fix due to known support issues and is applied to both v0 and v1 engine implementations. My review suggests changing the exception type from AssertionError to NotImplementedError for better consistency with surrounding code and Python best practices.
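For illustration only (this is not the PR's actual diff; the function name and message below are placeholders), the gate described in the review could look roughly like the following sketch, raising NotImplementedError instead of relying on an assert:

```python
import torch


def check_flashmla_supported() -> None:
    """Hypothetical sketch of the capability gate, not vLLM's real code."""
    major, minor = torch.cuda.get_device_capability()
    if (major, minor) >= (10, 0):
        # NotImplementedError is not stripped under `python -O` (asserts are)
        # and more clearly signals an unsupported configuration.
        raise NotImplementedError(
            "FlashMLA is not supported on Blackwell (SM 10.0+) GPUs; "
            "see deepseek-ai/FlashMLA#83 for details.")
```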

yewentao256 and others added 2 commits September 9, 2025 12:04
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Wentao Ye <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Wentao Ye <[email protected]>
yewentao256 added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) on Sep 9, 2025
yewentao256 merged commit 15de5ff into vllm-project:main on Sep 9, 2025 (47 checks passed)
yewentao256 deleted the wye-ban-flashmla-on-blackwell branch on September 9, 2025 at 18:59
gx16377 commented on Sep 12, 2025

Hi @yewentao256, is it possible to use current_platform.is_cuda() and has_device_capability() instead of CudaPlatform? That would be friendlier for other platforms, thank you.

yewentao256 (Member, Author) commented

> Hi @yewentao256, is it possible to use current_platform.is_cuda() and has_device_capability() instead of CudaPlatform? That would be friendlier for other platforms, thank you.

@gx16377 Thanks for letting me know, #24774
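For reference, a platform-agnostic version of the check might look roughly like the sketch below. It assumes current_platform.has_device_capability accepts an integer such as 100 to mean compute capability 10.0; verify this against the actual vllm.platforms API before relying on it.

```python
from vllm.platforms import current_platform


def flashmla_usable() -> bool:
    """Sketch only: allow FlashMLA on CUDA devices below SM 10.0 (Blackwell)."""
    if not current_platform.is_cuda():
        # Non-CUDA platforms never select FlashMLA here.
        return False
    # Assumed semantics: capability >= 10.0 means Blackwell,
    # which FlashMLA does not support.
    return not current_platform.has_device_capability(100)
```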

skyloevil pushed a commit to skyloevil/vllm that referenced this pull request Sep 13, 2025
Signed-off-by: yewentao256 <[email protected]>
Signed-off-by: Wentao Ye <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: yewentao256 <[email protected]>
Signed-off-by: Wentao Ye <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
Signed-off-by: yewentao256 <[email protected]>
Signed-off-by: Wentao Ye <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: xuebwang-amd <[email protected]>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: yewentao256 <[email protected]>
Signed-off-by: Wentao Ye <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: xuebwang-amd <[email protected]>

Labels

ready (ONLY add when PR is ready to merge/full CI is needed), v1
