Pull requests: flashinfer-ai/flashinfer

perf: add 1x4x1 cluster shape for fp8 bmm M<16 cases
#1473 opened Aug 12, 2025 by ttyio · 5 tasks
feat: Enable multiple fused-moe backends
#1472 opened Aug 12, 2025 by amirkl94
Remove __restrict__ extension to fix compilation error on GB200
#1470 opened Aug 12, 2025 by 842974287 · 5 tasks
Fix TRTLLM NVFP4-out attention kernel scale factor dim issue
#1460 opened Aug 11, 2025 by elvischenv · 4 of 5 tasks
feat(attention): add RoPE offset support for batch prefill
#1457 opened Aug 11, 2025 by MengAiDev · 3 tasks done
Fix cuda-python v13.0 import compatibility
#1455 opened Aug 11, 2025 by yongwww · 3 of 5 tasks
benchmark: add allreduce_fusion benchmark
#1450 opened Aug 10, 2025 by yyihuang · Draft · 5 tasks
refactor: unify autotuner for fp4 gemm backends
#1439 opened Aug 8, 2025 by ttyio · 3 of 5 tasks
misc: Fix persistent kernel compilation
#1430 opened Aug 8, 2025 by Edenzzzz · 5 tasks
Sink attention AoT
#1427 opened Aug 8, 2025 by nandor · 5 tasks done
Restore llama4 fc2 required kernels
#1417 opened Aug 8, 2025 by aleozlx · 5 tasks done
Removes MPI dependency from MNNVL AllReduce
#1379 opened Aug 4, 2025 by pranavm-nvidia · 5 tasks
Unify and modularize decode and prefill test.
#1375 opened Aug 4, 2025 by weireweire · 5 tasks done
feat: Support sliding window for persistent kernel
#1368 opened Aug 3, 2025 by Edenzzzz · 5 tasks
[WIP]: Masked layout fp4 gemm using cute-dsl
#1331 opened Jul 25, 2025 by yzh119 · Draft · 5 tasks
refactor: Improved metainfo for trtllm-gen kernels
#1328 opened Jul 25, 2025 by cyx-6 · 5 tasks
Add moe benchmark routine
#1327 opened Jul 25, 2025 by aleozlx · Draft · 3 of 5 tasks
Add k_scale and v_scale to persistent attention
#1322 opened Jul 24, 2025 by Edenzzzz · 5 tasks