
Conversation

@wangxiaoteng888 (Contributor) commented Nov 13, 2025

What this PR does / why we need it?

Add a README for PD (prefill/decode) separation.

Does this PR introduce any user-facing change?

No

How was this patch tested?

By CI.

Signed-off-by: wangxiaoteng <[email protected]>
@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

@github-actions github-actions bot added the documentation label (Improvements or additions to documentation) on Nov 13, 2025
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request adds a comprehensive README for prefill/decode (PD) separation. The documentation is detailed, but there are a few areas that need improvement for clarity and correctness. I've identified a duplicated command-line argument, some incorrect parameters in a table, a potentially broken link, and inconsistencies in configuration values. Additionally, the document structure could be improved to better explain the different configuration options presented. Addressing these points will significantly enhance the usability of this guide.

--decoder-hosts 192.0.0.3\
--decoder-ports 8004
--port 1999 \
--host 192.0.0.1 \

high

The --host argument is duplicated. It's already specified on line 972. Please remove this redundant line to avoid confusion.
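
For reference, a corrected launch could look like the sketch below. Only one `--host` flag is passed to the proxy; the prefiller host/port values are placeholders and are not taken from the PR.

```bash
# Hypothetical corrected proxy launch (prefiller values are placeholders);
# --host appears exactly once so the proxy bind address is unambiguous.
python load_balance_proxy_server_example.py \
    --host 192.0.0.1 \
    --port 1999 \
    --prefiller-hosts 192.0.0.2 \
    --prefiller-ports 8002 \
    --decoder-hosts 192.0.0.3 \
    --decoder-ports 8004
```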

Comment on lines +1086 to +1092
| --prefiller-hosts-num | Number of repetitions for prefiller node hosts |
| --prefiller-ports | Ports of prefiller nodes |
| --prefiller-ports-inc | Number of increments for prefiller node ports |
| --decoder-hosts | Hosts of decoder nodes |
| --decoder-hosts-num | Number of repetitions for decoder node hosts |
| --decoder-ports | Ports of decoder nodes |
| --decoder-ports-inc | Number of increments for decoder node ports |

high

The parameters --prefiller-hosts-num, --prefiller-ports-inc, --decoder-hosts-num, and --decoder-ports-inc described in this table do not appear to be supported by the load_balance_proxy_server_example.py script referenced later. This can be misleading for users. Please ensure the documentation accurately reflects the script's arguments.
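
One quick way to confirm which flags the example script actually accepts (assuming it parses arguments with argparse) is to print its help text and document only those options:

```bash
# List the proxy script's supported arguments before documenting them
python load_balance_proxy_server_example.py --help
```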

You can get the proxy program in the repository's examples, [load\_balance\_proxy\_server\_example.py](https://github.com/vllm-project/vllm-ascend/blob/v0.9.1-dev/examples/disaggregate_prefill_v1/load_balance_proxy_server_example.py)

high

This link points to a specific development branch (v0.9.1-dev) and an incorrect directory name (disaggregate_prefill_v1). This makes the link fragile and likely to break. It should be updated to point to the main branch and use the correct path for consistency with other links in this file.

Suggested change
You can get the proxy program in the repository's examples, [load\_balance\_proxy\_server\_example.py](https://github.com/vllm-project/vllm-ascend/blob/v0.9.1-dev/examples/disaggregate_prefill_v1/load_balance_proxy_server_example.py)
You can get the proxy program in the repository's examples, [load\_balance\_proxy\_server\_example.py](https://github.com/vllm-project/vllm-ascend/blob/main/examples/disaggregated_prefill_v1/load_balance_proxy_server_example.py)

Comment on lines 1149 to 1151
## Prefill & Decode Configuration Details
In the PD separation scenario, we provide a optimized configuration.

high

This section introduces a new configuration using LLMDataDistCMgrConnector without explaining how it relates to the Mooncake...Connector configurations detailed earlier in the document. This makes the documentation confusing. Please add context to clarify when each configuration should be used.
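
As a rough sketch (not the README's exact command), assuming the serve commands select a connector through vLLM's `--kv-transfer-config` flag, the two setups would differ mainly in the `kv_connector` value plus whatever connector-specific fields the corresponding section documents:

```bash
# Sketch only: a prefiller node selecting LLMDataDistCMgrConnector.
# The Mooncake-based setup described earlier in the README would use its
# own kv_connector value and extra fields instead.
vllm serve /path/to/model \
    --kv-transfer-config '{"kv_connector": "LLMDataDistCMgrConnector", "kv_role": "kv_producer"}'
```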

- **decoder node**
1. set HCCL_BUFFSIZE=1024

high

There's an inconsistency in the recommended HCCL_BUFFSIZE for the decoder node. Here it is recommended to be 1024, but the example script on line 418 sets it to 600. Please resolve this discrepancy to avoid confusion.
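
Whichever value is chosen, it should be exported on every decoder node before launching the server so the prose and the example script agree, e.g.:

```bash
# Set HCCL_BUFFSIZE on each decoder node before starting vllm serve;
# use the single value the README finally recommends (1024 vs. 600 above).
export HCCL_BUFFSIZE=1024
```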

Signed-off-by: wangxiaoteng <[email protected]>
@leo-pony (Collaborator) left a comment

Please don't remove the llmdatadist guide, as it is also needed by A5 before 2026.
