[Feature] Disallow FlashMLA on Blackwell #24521
Conversation
Signed-off-by: yewentao256 <[email protected]>
Code Review
This pull request correctly disallows the use of the FlashMLA attention backend on NVIDIA Blackwell GPUs by adding a check for SM 10.0+ device capability. This is a necessary fix due to known support issues and is applied to both v0 and v1 engine implementations. My review suggests changing the exception type from AssertionError to NotImplementedError for better consistency with surrounding code and Python best practices.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Wentao Ye <[email protected]>
Hi @yewentao256, is it possible to use current_platform.is_cuda() and has_device_capability() instead of CudaPlatform? That would be friendlier to other platforms, thank you.
Signed-off-by: yewentao256 <[email protected]> Signed-off-by: Wentao Ye <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: yewentao256 <[email protected]> Signed-off-by: Wentao Ye <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: xuebwang-amd <[email protected]>
Purpose
Fixes #24513
FlashMLA is poorly supported on Blackwell; see deepseek-ai/FlashMLA#83 for more details.