
Conversation

@wangxiaoteng888 (Contributor) commented Nov 13, 2025

What this PR does / why we need it?

Add a README for PD (prefill/decode) separation.

Does this PR introduce any user-facing change?

No

How was this patch tested?

By CI.

Signed-off-by: wangxiaoteng <[email protected]>
@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

@github-actions github-actions bot added the documentation label (Improvements or additions to documentation) on Nov 13, 2025
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request adds a comprehensive README for prefill/decode (PD) separation. The documentation is detailed, but there are a few areas that need improvement for clarity and correctness. I've identified a duplicated command-line argument, some incorrect parameters in a table, a potentially broken link, and inconsistencies in configuration values. Additionally, the document structure could be improved to better explain the different configuration options presented. Addressing these points will significantly enhance the usability of this guide.

--decoder-hosts 192.0.0.3\
--decoder-ports 8004
--port 1999 \
--host 192.0.0.1 \

high

The --host argument is duplicated. It's already specified on line 972. Please remove this redundant line to avoid confusion.
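
For reference, a corrected launch could look like the sketch below. Only one `--host` flag is passed to the proxy; the prefiller host/port values are placeholders and are not taken from the PR.

```bash
# Hypothetical corrected proxy launch (prefiller values are placeholders);
# --host appears exactly once so the proxy bind address is unambiguous.
python load_balance_proxy_server_example.py \
    --host 192.0.0.1 \
    --port 1999 \
    --prefiller-hosts 192.0.0.2 \
    --prefiller-ports 8002 \
    --decoder-hosts 192.0.0.3 \
    --decoder-ports 8004
```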

Comment on lines +1086 to +1092
| --prefiller-hosts-num | Number of repetitions for prefiller node hosts |
| --prefiller-ports | Ports of prefiller nodes |
| --prefiller-ports-inc | Number of increments for prefiller node ports |
| --decoder-hosts | Hosts of decoder nodes |
| --decoder-hosts-num | Number of repetitions for decoder node hosts |
| --decoder-ports | Ports of decoder nodes |
| --decoder-ports-inc | Number of increments for decoder node ports |

high

The parameters --prefiller-hosts-num, --prefiller-ports-inc, --decoder-hosts-num, and --decoder-ports-inc described in this table do not appear to be supported by the load_balance_proxy_server_example.py script referenced later. This can be misleading for users. Please ensure the documentation accurately reflects the script's arguments.
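
One quick way to confirm which flags the example script actually accepts (assuming it parses arguments with argparse) is to print its help text and document only those options:

```bash
# List the proxy script's supported arguments before documenting them
python load_balance_proxy_server_example.py --help
```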

You can get the proxy program in the repository's examples, [load\_balance\_proxy\_server\_example.py](https://github.com/vllm-project/vllm-ascend/blob/v0.9.1-dev/examples/disaggregate_prefill_v1/load_balance_proxy_server_example.py)

high

This link points to a specific development branch (v0.9.1-dev) and an incorrect directory name (disaggregate_prefill_v1). This makes the link fragile and likely to break. It should be updated to point to the main branch and use the correct path for consistency with other links in this file.

Suggested change
You can get the proxy program in the repository's examples, [load\_balance\_proxy\_server\_example.py](https://github.com/vllm-project/vllm-ascend/blob/v0.9.1-dev/examples/disaggregate_prefill_v1/load_balance_proxy_server_example.py)
You can get the proxy program in the repository's examples, [load\_balance\_proxy\_server\_example.py](https://github.com/vllm-project/vllm-ascend/blob/main/examples/disaggregated_prefill_v1/load_balance_proxy_server_example.py)

Comment on lines 1149 to 1151
## Prefill & Decode Configuration Details
In the PD separation scenario, we provide a optimized configuration.

high

This section introduces a new configuration using LLMDataDistCMgrConnector without explaining how it relates to the Mooncake...Connector configurations detailed earlier in the document. This makes the documentation confusing. Please add context to clarify when each configuration should be used.
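
As a rough sketch (not the README's exact command), assuming the serve commands select a connector through vLLM's `--kv-transfer-config` flag, the two setups would differ mainly in the `kv_connector` value plus whatever connector-specific fields the corresponding section documents:

```bash
# Sketch only: a prefiller node selecting LLMDataDistCMgrConnector.
# The Mooncake-based setup described earlier in the README would use its
# own kv_connector value and extra fields instead.
vllm serve /path/to/model \
    --kv-transfer-config '{"kv_connector": "LLMDataDistCMgrConnector", "kv_role": "kv_producer"}'
```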

- **decoder node**
1. set HCCL_BUFFSIZE=1024

high

There's an inconsistency in the recommended HCCL_BUFFSIZE for the decoder node. Here it is recommended to be 1024, but the example script on line 418 sets it to 600. Please resolve this discrepancy to avoid confusion.
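
Whichever value is chosen, it should be exported on every decoder node before launching the server so the prose and the example script agree, e.g.:

```bash
# Set HCCL_BUFFSIZE on each decoder node before starting vllm serve;
# use the single value the README finally recommends (1024 vs. 600 above).
export HCCL_BUFFSIZE=1024
```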

Signed-off-by: wangxiaoteng <[email protected]>
@leo-pony (Collaborator) left a comment

Please don't remove the llmdatadist guide, as it is also needed by A5 before 2026.
