-
-
Notifications
You must be signed in to change notification settings - Fork 11.2k
[CI/Test Fix] Fix CP tests on Blackwell #28404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CI/Test Fix] Fix CP tests on Blackwell #28404
Conversation
Signed-off-by: Lucas Wilkinson <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds conditional skips to context parallelism tests based on GPU compute capability, which is a good approach to handle hardware-specific features in CI. The logic for deepseek-ai/DeepSeek-V2-Lite-Chat seems correct. However, the condition for bigcode/gpt_bigcode-santacoder is overly restrictive and could prevent tests from running on future GPU architectures. I've provided a suggestion to make this more forward-compatible.
|
Can you create a "good first issue" asking for someone to contribute a better error message if the attn backend is incompatible? I know we have ongoing work to add more backends |
Signed-off-by: Lucas Wilkinson <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Lucas Wilkinson <[email protected]>
I made an issue (#28407) but Im scared we will only get a minor improvement in the error message for alot of complexity (basically maintaining an independent list backends that support DCP); i think it may be best to wait for #24794 which should be landing shortly |
Signed-off-by: Lucas Wilkinson <[email protected]>
|
shoot, didnt mean to hit that button 🤦 |
FIX: #28400