Pull requests: flashinfer-ai/flashinfer
Open pull requests:
perf: add 1x4x1 cluster shape for fp8 bmm M<16 cases (#1473, opened Aug 12, 2025 by ttyio)
Remove __restrict__ extension to fix compilation error on GB200 (#1470, opened Aug 12, 2025 by 842974287)
Fix TRTLLM NVFP4-out attention kernel scale factor dim issue (#1460, opened Aug 11, 2025 by elvischenv)
feat(attention): add RoPE offset support for batch prefill (#1457, opened Aug 11, 2025 by MengAiDev)
Fix cuda-python v13.0 import compatibility (#1455, opened Aug 11, 2025 by yongwww)
refactor: unify autotuner for fp4 gemm backends (#1439, opened Aug 8, 2025 by ttyio)
gpt-oss: Add MXFP8 x MXFP4 CUTLASS MOE for SM100 and BF16 x MXFP4 CUTLASS for SM90 + SwigluBias Activation (#1396, opened Aug 6, 2025 by djmmoss)
misc: Customize kv lens buffer size for sparse attention (#1383, opened Aug 5, 2025 by Edenzzzz)
Remove MPI dependency from MNNVL AllReduce (#1379, opened Aug 4, 2025 by pranavm-nvidia)
Unify and modularize decode and prefill tests (#1375, opened Aug 4, 2025 by weireweire)
feat: Support sliding window for persistent kernel (#1368, opened Aug 3, 2025 by Edenzzzz)
refactor: Improved metainfo for trtllm-gen kernels (#1328, opened Jul 25, 2025 by cyx-6)
Add k_scale and v_scale to persistent attention (#1322, opened Jul 24, 2025 by Edenzzzz)