Conversation
@yiliu30 yiliu30 commented Aug 15, 2025

@yiliu30 yiliu30 requested a review from Copilot August 16, 2025 05:55
@Copilot Copilot AI left a comment


Pull Request Overview

This PR modifies the FP8-quantized MoE (Mixture of Experts) forward pass to support passing chunk-size information to the underlying MoE operation. The change enables dynamic chunk-size configuration by extracting the token count (tokens_num) from hidden_states and delegating to the original module for additional kwargs.

  • Adds a helper method to extract extra kwargs from the original module
  • Modifies forward_quant to pass chunk size information via extra_kwargs
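The two bullets above can be sketched roughly as follows. This is a hypothetical illustration of the described pattern, not the actual source: the class names, the `moe_chunk_size` attribute, and the `_get_extra_kwargs` helper signature are all assumptions.

```python
# Hedged sketch of the PR's pattern: a patched FP8 MoE module derives extra
# kwargs (here, a chunk size) from the original module and the incoming
# token count, then forwards them to the underlying MoE op.
# All names below are illustrative assumptions, not the real codebase.
class OriginalMoE:
    def __init__(self, chunk_size=None):
        # Optional statically configured chunk size on the original module.
        self.moe_chunk_size = chunk_size


class PatchedMoE:
    def __init__(self, orig_mod):
        self.orig_mod = orig_mod

    def _get_extra_kwargs(self, tokens_num):
        # Helper: delegate to the original module for additional kwargs.
        chunk_size = getattr(self.orig_mod, "moe_chunk_size", None)
        if chunk_size is None:
            # Fall back to processing all tokens in a single chunk.
            chunk_size = tokens_num
        return {"chunk_size": chunk_size}

    def forward_quant(self, hidden_states):
        # Derive the token count from the input, then pass chunk-size
        # information to the MoE op via extra_kwargs.
        tokens_num = hidden_states.shape[0]
        extra_kwargs = self._get_extra_kwargs(tokens_num)
        return self._moe_op(hidden_states, **extra_kwargs)

    def _moe_op(self, hidden_states, chunk_size):
        # Stand-in for the fused FP8 MoE kernel call.
        return hidden_states, chunk_size
```

With no configured chunk size, the op receives the full token count; a configured `moe_chunk_size` takes precedence.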


@yiliu30 yiliu30 merged commit 7b7e01e into aice/v122 Aug 22, 2025
2 checks passed
@yiliu30 yiliu30 deleted the moe-chunk-size branch August 22, 2025 01:29