[P/D] Add readme for PD separation #4182
Signed-off-by: wangxiaoteng <[email protected]>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request adds a comprehensive README for prefill/decode (PD) separation. The documentation is detailed, but there are a few areas that need improvement for clarity and correctness. I've identified a duplicated command-line argument, some incorrect parameters in a table, a potentially broken link, and inconsistencies in configuration values. Additionally, the document structure could be improved to better explain the different configuration options presented. Addressing these points will significantly enhance the usability of this guide.
--decoder-hosts 192.0.0.3\
--decoder-ports 8004
--port 1999 \
--host 192.0.0.1 \
| --prefiller-hosts-num | Number of repetitions for prefiller node hosts |
| --prefiller-ports | Ports of prefiller nodes |
| --prefiller-ports-inc | Number of increments for prefiller node ports |
| --decoder-hosts | Hosts of decoder nodes |
| --decoder-hosts-num | Number of repetitions for decoder node hosts |
| --decoder-ports | Ports of decoder nodes |
| --decoder-ports-inc | Number of increments for decoder node ports |
The parameters --prefiller-hosts-num, --prefiller-ports-inc, --decoder-hosts-num, and --decoder-ports-inc described in this table do not appear to be supported by the load_balance_proxy_server_example.py script referenced later. This can be misleading for users. Please ensure the documentation accurately reflects the script's arguments.
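A mismatch like this can be caught mechanically. The following is a minimal sketch, not part of the PR: it extracts every `--flag` token from a Markdown table and reports those missing from a set of supported flags. The `SUPPORTED` set is an assumption made for illustration (it is not read from `load_balance_proxy_server_example.py`), and the helper names are hypothetical.

```python
import re


def documented_flags(markdown_table: str) -> set[str]:
    """Collect every `--flag` token that appears in a Markdown table."""
    return set(re.findall(r"--[a-z][a-z0-9-]*", markdown_table))


def unsupported_flags(markdown_table: str, supported: set[str]) -> list[str]:
    """Return documented flags that are absent from `supported`, sorted."""
    return sorted(documented_flags(markdown_table) - supported)


# Rows copied from the table under review.
TABLE = """
| --prefiller-hosts-num | Number of repetitions for prefiller node hosts |
| --prefiller-ports | Ports of prefiller nodes |
| --prefiller-ports-inc | Number of increments for prefiller node ports |
| --decoder-hosts | Hosts of decoder nodes |
| --decoder-hosts-num | Number of repetitions for decoder node hosts |
| --decoder-ports | Ports of decoder nodes |
| --decoder-ports-inc | Number of increments for decoder node ports |
"""

# ASSUMPTION: an illustrative stand-in for the script's real argument list.
SUPPORTED = {
    "--host", "--port",
    "--prefiller-hosts", "--prefiller-ports",
    "--decoder-hosts", "--decoder-ports",
}

print(unsupported_flags(TABLE, SUPPORTED))
```

Under that assumed set, the four flags named in the comment above are the ones reported. In practice the supported set would have to come from the script's own argument parser rather than a hand-written list.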
| --decoder-hosts-num | Number of repetitions for decoder node hosts |
| --decoder-ports | Ports of decoder nodes |
| --decoder-ports-inc | Number of increments for decoder node ports |
You can get the proxy program in the repository's examples, [load\_balance\_proxy\_server\_example.py](https://github.com/vllm-project/vllm-ascend/blob/v0.9.1-dev/examples/disaggregate_prefill_v1/load_balance_proxy_server_example.py)
This link points to a specific development branch (v0.9.1-dev) and an incorrect directory name (disaggregate_prefill_v1). This makes the link fragile and likely to break. It should be updated to point to the main branch and use the correct path for consistency with other links in this file.
- You can get the proxy program in the repository's examples, [load\_balance\_proxy\_server\_example.py](https://github.com/vllm-project/vllm-ascend/blob/v0.9.1-dev/examples/disaggregate_prefill_v1/load_balance_proxy_server_example.py)
+ You can get the proxy program in the repository's examples, [load\_balance\_proxy\_server\_example.py](https://github.com/vllm-project/vllm-ascend/blob/main/examples/disaggregated_prefill_v1/load_balance_proxy_server_example.py)
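Links pinned to a development branch can also be flagged automatically. The sketch below is an illustration only, with a hypothetical helper name; it scans Markdown for GitHub `blob` links and returns those whose ref is not the expected branch. It assumes the ref contains no `/`, which holds for the two links in this suggestion but not for every branch name.

```python
import re

# Matches https://github.com/<owner>/<repo>/blob/<ref>/<path>.
BLOB_LINK = re.compile(
    r"https://github\.com/(?P<repo>[\w.-]+/[\w.-]+)"
    r"/blob/(?P<ref>[^/\s)]+)/(?P<path>[^\s)]+)"
)


def pinned_links(markdown: str, expected_ref: str = "main") -> list[tuple[str, str]]:
    """Return (ref, path) for each blob link whose ref differs from expected_ref."""
    return [
        (m["ref"], m["path"])
        for m in BLOB_LINK.finditer(markdown)
        if m["ref"] != expected_ref
    ]


# The original and the suggested link from this review thread.
OLD = (
    "[proxy](https://github.com/vllm-project/vllm-ascend/blob/v0.9.1-dev/"
    "examples/disaggregate_prefill_v1/load_balance_proxy_server_example.py)"
)
NEW = (
    "[proxy](https://github.com/vllm-project/vllm-ascend/blob/main/"
    "examples/disaggregated_prefill_v1/load_balance_proxy_server_example.py)"
)

print(pinned_links(OLD))
print(pinned_links(NEW))
```

This check only covers the branch pinning. Whether the path itself exists (the `disaggregate_prefill_v1` versus `disaggregated_prefill_v1` question) still needs to be verified against the repository.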
## Prefill & Decode Configuration Details
In the PD separation scenario, we provide a optimized configuration.
- **decoder node**
1. set HCCL_BUFFSIZE=1024
Signed-off-by: wangxiaoteng <[email protected]>
556e025 to b61baf2
leo-pony left a comment
Please don't remove the llmdatadist guide, as it is also needed by A5 before 2026.
What this PR does / why we need it?
Add readme for PD separation
Does this PR introduce any user-facing change?
No
How was this patch tested?
By CI.