-
-
Notifications
You must be signed in to change notification settings - Fork 11.3k
FIX: NixlConnector: do not skip short do_remote_prefill requests #18590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ck_size) fix vllm-project#18429 Signed-off-by: Juncheng Gu <[email protected]>
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
wwl2755
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for pointing out! Sorry for the oversight on existing logic of handling 0 block situation.
Signed-off-by: Juncheng Gu <[email protected]>
|
This overlaps with #18632 but it would be good to include your test changes. |
|
@juncgu actually I reverted the test change because it was failing and I don't think it was correct. When the number of tokens is < block_size, so that 0 blocks are to be sent, they are immediately freed and so the P worker doesn't require the D worker to make a request to free them. For this reason, the D worker side of the connector (correctly) does nothing the metadata contains no blocks. |
Yes, you are right, the current scheduler / connector_scheduler do not need the changes in the unit test. |
The original issue is #18490, and #18429 wanted to fix it.
But skipping short
do_remote_prefillrequest in connector's metadata will also skip sending the notification to the prefill worker for releasing remote kv pages.The modifications here are
test_nixl_connector.py::test_prompt_less_than_block_sizeaccordinglyFIX #18591