
Conversation

@wxsIcey (Collaborator) commented Nov 14, 2025

What this PR does / why we need it?

Fixes a compatibility bug with torch_npu.npu_fused_infer_attention_score, which is described in #4020.
@momo609 suggested this solution.
cherry-pick: #4025

Does this PR introduce any user-facing change?

N/A

How was this patch tested?

CI passed with newly added and existing tests.

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
  • Write the commit message by filling out the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

@wxsIcey added the ready (read for review) and ready-for-test (start test by label for PR) labels on Nov 14, 2025
@gemini-code-assist bot (Contributor) left a comment


Code Review

The pull request updates a block size parameter from 64 to 128 in two distinct files. While the change itself appears to be a coordinated update, the direct hardcoding of this magic number in multiple locations introduces a significant maintainability risk. It is highly recommended to centralize such common configuration values into a single named constant, ideally in a shared utility module, to ensure consistency and simplify future updates.

 @staticmethod
 def get_supported_block_size() -> list[int]:
-    return [64]
+    return [128]

Severity: high

The block size 128 is a magic number that appears in multiple files. Consider defining this value as a named constant in a shared utility module (e.g., vllm_ascend/utils.py) to improve maintainability and ensure consistency across the codebase. Duplicating such values can lead to errors if the value needs to be updated in the future.
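A minimal sketch of the consolidation suggested above, assuming a hypothetical constant name ASCEND_ATTENTION_BLOCK_SIZE; neither the constant nor its placement in vllm_ascend/utils.py is part of this PR:

# vllm_ascend/utils.py -- hypothetical addition, not part of this PR
# Single source of truth for the attention block size. This PR changes the
# value from 64 to 128 to stay compatible with
# torch_npu.npu_fused_infer_attention_score (see #4020).
ASCEND_ATTENTION_BLOCK_SIZE: int = 128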

 ).page_size_bytes

-block_alignment_bytes = 64
+block_alignment_bytes = 128

Severity: high

This block_alignment_bytes value of 128 duplicates the value returned by get_supported_block_size in vllm_ascend/attention/attention_v1.py. To enhance maintainability and prevent inconsistencies, it is recommended to define this as a single, shared constant in a central utility file.
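A sketch of how both call sites could consume that shared value, reusing the hypothetical ASCEND_ATTENTION_BLOCK_SIZE from the sketch above; the class name and import path are illustrative, not taken from this PR:

# vllm_ascend/attention/attention_v1.py (sketch)
from vllm_ascend.utils import ASCEND_ATTENTION_BLOCK_SIZE  # hypothetical constant


class AscendAttentionBackend:  # illustrative class name
    @staticmethod
    def get_supported_block_size() -> list[int]:
        return [ASCEND_ATTENTION_BLOCK_SIZE]


# In the file that computes block_alignment_bytes (sketch)
block_alignment_bytes = ASCEND_ATTENTION_BLOCK_SIZE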
