Add runai model streamer e2e test for GCS #28079
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Instead, only a small, essential subset of tests runs automatically; your reviewers can trigger select CI tests on top of that. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either add the `ready` label to the PR or enable auto-merge. If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀
Code Review
This pull request adds an end-to-end test for the RunAI model streamer using a model from a public GCS bucket. The change is straightforward and helps prevent regressions. I've suggested strengthening the test's assertion to make it more robust in catching potential issues.
tests/model_executor/model_loader/runai_model_streamer/test_runai_model_streamer_loader.py
💡 Codex Review
Here are some automated review suggestions for this pull request.
Hi @22quinn @rahul-tuli @DarkLight1337, would one of you be able to review this PR?
@DarkLight1337 I have another question. Say there is a refactor of the RunAI model streamer, so logic that could break the streamer now lives in another file (file X), and file X isn't listed in the test's trigger paths. When a test fails in torch nightly, what happens? Who is responsible for fixing that failure? Does the failure block the nightly build and/or the next vLLM release? Or does vLLM just proceed with the release/build with that feature broken?
Nightly failures are not blocking, but we try to fix as many as possible before releasing.
The new test, `test_runai_model_loader_download_files_gcs`, fails in CI. To fix this, I tried using anonymous credentials and tested this out locally, but it fails in a weird place. Maybe this is because the C++ code doesn't support anonymous GCS access? I see there is a TODO. I also tried setting up a fake set of credentials, without success.
Signed-off-by: Alexis MacAskill <[email protected]>
After much trial and error, I fixed the test to correctly use anonymous credentials, and now the test is passing.
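For context on why anonymous access works at all: objects in a public GCS bucket can be fetched without any credentials over the `storage.googleapis.com` HTTPS endpoint, which is the behavior anonymous credentials ultimately rely on. A minimal sketch (the bucket and object names below are hypothetical, not the ones used by this test):

```python
from urllib.parse import quote


def public_gcs_url(bucket: str, obj: str) -> str:
    """Build the unauthenticated HTTPS download URL for a public GCS object."""
    # Spaces and other special characters in the object name must be
    # percent-encoded; slashes are kept so "directory"-style names work.
    return f"https://storage.googleapis.com/{bucket}/{quote(obj)}"


# Hypothetical example: a safetensors shard in a public bucket.
print(public_gcs_url("example-public-bucket", "codegemma-2b/model.safetensors"))
# → https://storage.googleapis.com/example-public-bucket/codegemma-2b/model.safetensors
```

If the streamer's C++ layer rejected anonymous credentials, a plain HTTPS GET against such a URL would still succeed for a public object, which is a useful way to distinguish a bucket-permissions problem from a client-side credentials problem.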
Purpose
The purpose of this PR is to add a basic RunAI model streamer e2e test that pulls from a public GCS bucket, and to add it to the CI pipeline to prevent regressions in the RunAI model streamer. Examples of past RunAI model streamer regressions that this e2e test would have caught include:
We also need code coverage for `vllm/config/model.py`.
This test uses a small model, codegemma-2b, which is around 5.6 GiB.
Test Plan
We plan to enable all of the RunAI model streamer tests in the CI pipeline. In particular, we want to run the tests on changes to any of the following:
We also want to run the tests on the `torch_nightly` build.
We didn't add this to `Model Executor Test` because we would like to run the RunAI model streamer tests on changes to `vllm/engine` and `tests/model_executor/runai_model_streamer`, and on the nightly build (which `Model Executor Test` doesn't currently do).
Test Result
Essential Elements of an Effective PR Description Checklist
Update `supported_models.md` and `examples` for a new model.