Conversation

@bkryu
Collaborator

@bkryu bkryu commented Sep 17, 2025

📌 Description

Current PR:

  • flashinfer_benchmark.py enhancements:
    • Adds --autotune support for mm_fp4 and bmm_fp8 benchmarking (trtllm and cutlass backends)
    • Adds mxfp4 support for mm_fp4 benchmarking.
  • Restores mm_fp4's default behavior that was changed in PR1688:
    • PR1688 added the use_nvfp4 input argument with a default of False. Before PR1688, mm_fp4 always used nvfp4, so the new default broke existing mm_fp4 usages. The current PR sets the default to True.
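
Why the default matters can be shown with a minimal sketch. This is not FlashInfer's code: `mm_fp4_stub_pr1688` and `mm_fp4_stub_restored` are hypothetical stand-ins that only report which FP4 format a call would select, and the mapping of `use_nvfp4=False` to mxfp4 is an assumption based on the mxfp4 support mentioned above. Nothing here reflects the real `mm_fp4` signature beyond the `use_nvfp4` keyword.

```python
# Hypothetical stand-ins for mm_fp4; they perform no matrix multiplication
# and only return the name of the FP4 format the call would select.


def mm_fp4_stub_pr1688(a, b, use_nvfp4=False):
    """Models the default described for PR1688 (use_nvfp4=False)."""
    # Assumption: use_nvfp4=False selects mxfp4.
    return "nvfp4" if use_nvfp4 else "mxfp4"


def mm_fp4_stub_restored(a, b, use_nvfp4=True):
    """Models the default restored by this PR (use_nvfp4=True)."""
    return "nvfp4" if use_nvfp4 else "mxfp4"


# A caller written before the flag existed passes no keyword at all.
legacy_call_args = (None, None)

# With the PR1688 default, that caller silently switches format.
format_after_pr1688 = mm_fp4_stub_pr1688(*legacy_call_args)

# With the restored default, the same call uses nvfp4 again.
format_after_this_pr = mm_fp4_stub_restored(*legacy_call_args)

# Callers that want mxfp4 opt in explicitly.
format_explicit_mxfp4 = mm_fp4_stub_restored(None, None, use_nvfp4=False)
```

Under these assumptions, a call site that never mentions `use_nvfp4` is the one affected by the default, which is why restoring `True` repairs existing usages without requiring any change from callers.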

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

@bkryu bkryu requested a review from yzh119 September 17, 2025 19:43
@nvmbreughe
Contributor

LGTM!

@bkryu bkryu marked this pull request as ready for review September 17, 2025 22:30
@yzh119 yzh119 merged commit e8f5460 into flashinfer-ai:main Sep 17, 2025
2 checks passed
fzyzcjy added a commit to fzyzcjy/flashinfer that referenced this pull request Sep 20, 2025
@bkryu bkryu deleted the benchmark_mxfp4_mm_fp4 branch October 2, 2025 17:08

3 participants