[Neuron][Kernel] Vectorize KV cache load in FlashPagedAttention to maximize DMA bandwidth #13245

Merged: simon-mo merged 12 commits into vllm-project:main from lingfanyu:fast_vectorized_dma on Feb 21, 2025
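For context, the PR title describes vectorizing the KV cache load in the paged-attention kernel so that DMA transfers move whole cache blocks instead of per-token slices. Below is a minimal NumPy sketch of that general idea, assuming a (num_blocks, block_size, head_dim) cache layout and a per-sequence block table; the names, shapes, and helpers are illustrative assumptions, not the kernel's actual NKI code.

```python
# Illustrative sketch only (not the PR's NKI kernel): contrasts per-token KV gathers
# with whole-block contiguous copies, so each DMA-style transfer moves a large
# vectorized chunk. All names and shapes here are hypothetical.
import numpy as np

BLOCK_SIZE = 128   # tokens per KV cache block (assumed)
HEAD_DIM = 64      # per-head dimension (assumed)

def load_kv_per_token(kv_cache, block_table, num_tokens):
    """Scalar-style load: one small copy per token -> many tiny transfers."""
    out = np.empty((num_tokens, HEAD_DIM), dtype=kv_cache.dtype)
    for i in range(num_tokens):
        block_id = block_table[i // BLOCK_SIZE]
        offset = i % BLOCK_SIZE
        out[i] = kv_cache[block_id, offset]   # tiny, bandwidth-inefficient copy
    return out

def load_kv_vectorized(kv_cache, block_table, num_tokens):
    """Vectorized load: one contiguous copy per block -> few large transfers."""
    num_blocks = (num_tokens + BLOCK_SIZE - 1) // BLOCK_SIZE
    out = np.empty((num_blocks * BLOCK_SIZE, HEAD_DIM), dtype=kv_cache.dtype)
    for b in range(num_blocks):
        block_id = block_table[b]
        # Copy the whole block at once; on hardware this corresponds to one wide
        # transfer descriptor instead of BLOCK_SIZE small ones.
        out[b * BLOCK_SIZE:(b + 1) * BLOCK_SIZE] = kv_cache[block_id]
    return out[:num_tokens]
```

The design point is that DMA engines reach peak bandwidth only with large contiguous transfers, so batching the load at block granularity (rather than token granularity) is what "vectorize" refers to in the title.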

[Neuron][Kernel] Vectorize KV cache load in FlashPagedAttention to maximize DMA bandwidth#13245
simon-mo merged 12 commits intovllm-project:mainfrom
lingfanyu:fast_vectorized_dma

Commits

Commits on Feb 13, 2025
Commits on Feb 14, 2025
Commits on Feb 18, 2025
Commits on Feb 19, 2025