feat: Benchmark mm_fp4 mxfp4 support and gemm autotune support. Restore mm_fp4 API behavior #1706

bkryu · 2025-09-17T19:39:32Z

📌 Description

Current PR:

flashinfer_benchmark.py enhancements:
- Adds --autotune support for mm_fp4 and bmm_fp8 benchmarking (trtllm and cutlass backends)
- Adds mxfp4 support for mm_fp4 benchmarking.
Restores the default behavior of mm_fp4 that was changed in PR1688:
- PR1688 added the use_nvfp4 input argument and set it to False by default. Before PR1688, mm_fp4 always used nvfp4, so the new default broke existing mm_fp4 usages. This PR sets the default to True.
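
To illustrate why the default matters, here is a minimal sketch and not the actual mm_fp4 implementation. `resolve_fp4_format` is a hypothetical stand-in that only models the flag. It also assumes that `use_nvfp4=False` selects mxfp4, which this description does not state.

```python
from typing import Literal


def resolve_fp4_format(use_nvfp4: bool = True) -> Literal["nvfp4", "mxfp4"]:
    """Hypothetical stand-in for the format selection inside mm_fp4.

    The default of True mirrors the default this PR restores; the mapping of
    False to mxfp4 is an assumption made for illustration.
    """
    return "nvfp4" if use_nvfp4 else "mxfp4"


# Call sites written before PR1688 pass no flag and expect nvfp4.
legacy_format = resolve_fp4_format()

# Call sites that want mxfp4 opt in explicitly.
mxfp4_format = resolve_fp4_format(use_nvfp4=False)

print(legacy_format, mxfp4_format)
```

With a default of False, the first call would silently switch formats for every caller that never passed the flag, which is the breakage described above.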

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

I have installed pre-commit by running pip install pre-commit (or used your preferred method).
I have installed the hooks with pre-commit install.
I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

Tests have been added or updated as needed.
All tests are passing (unittest, etc.).

Reviewer Notes

nvmbreughe · 2025-09-17T20:19:56Z

LGTM!

Add mm_fp4 mxfp4 support. Restore mm_fp4 API behavior to default. Fix…

a57c1ee

… typos. Add autotuning to GEMM benchmarking

bkryu requested a review from yzh119 September 17, 2025 19:43

bkryu marked this pull request as ready for review September 17, 2025 22:30

yzh119 approved these changes Sep 17, 2025

yzh119 merged commit e8f5460 into flashinfer-ai:main Sep 17, 2025
2 checks passed

fzyzcjy added a commit to fzyzcjy/flashinfer that referenced this pull request Sep 20, 2025

Revert "feat: Benchmark mm_fp4 mxfp4 support and gemm autotune suppor…

d504c61

…t. Restore mm_fp4 API behavior (flashinfer-ai#1706)" This reverts commit e8f5460.

fzyzcjy mentioned this pull request Sep 20, 2025

mm_fp4 regression (need 0.2s cpu time per run) #1741

Open

bkryu deleted the benchmark_mxfp4_mm_fp4 branch October 2, 2025 17:08

3 participants