[Feature] Disallow FlashMLA on Blackwell #24521
Conversation
Signed-off-by: yewentao256 <[email protected]>
Code Review
This pull request correctly disallows the use of the FlashMLA attention backend on NVIDIA Blackwell GPUs by adding a check for SM 10.0+ device capability. This is a necessary fix due to known support issues and is applied to both v0 and v1 engine implementations. My review suggests changing the exception type from AssertionError to NotImplementedError for better consistency with surrounding code and Python best practices.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Wentao Ye <[email protected]>
Hi @yewentao256, is it possible to use current_platform.is_cuda() and has_device_capability() instead of CudaPlatform? That would be friendlier to other platforms, thank you.
Signed-off-by: yewentao256 <[email protected]> Signed-off-by: Wentao Ye <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: yewentao256 <[email protected]> Signed-off-by: Wentao Ye <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: xuebwang-amd <[email protected]>
Purpose
Fixes #24513
FlashMLA is poorly supported on Blackwell; see deepseek-ai/FlashMLA#83 for more details.