Pull requests: ggml-org/llama.cpp
Pull requests list
server : export max observed n_past value
examples
server
#15361
opened Aug 16, 2025 by
okuvshynov
Draft
Add option to disable MMA support on Turing
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#15360
opened Aug 16, 2025 by
pt13762104
Fix broken build: require updated pip to support --break-system-packages
devops
improvements to build systems and github actions
#15357
opened Aug 16, 2025 by
danchev
vulkan: Use larger workgroups for mul_mat_vec when M is small
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#15355
opened Aug 15, 2025 by
jeffbolznv
vulkan: Optimize argsort
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#15354
opened Aug 15, 2025 by
jeffbolznv
vulkan: disable spirv-opt for bfloat16 shaders
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#15352
opened Aug 15, 2025 by
jeffbolznv
model : support vision LiquidAI LFM2-VL family
examples
python
python script changes
#15347
opened Aug 15, 2025 by
tdakhran
sched : copy only the used experts when offloading prompt processing
ggml
changes relating to the ggml tensor library for machine learning
#15346
opened Aug 15, 2025 by
slaren
CANN: optimize the rope ops
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#15335
opened Aug 15, 2025 by
YangShuai52
CANN: fix ggml_cann_rms_norm
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#15331
opened Aug 14, 2025 by
yuchuan-cao
aLoRA Support
examples
python
python script changes
server
#15327
opened Aug 14, 2025 by
gabe-l-hart
Draft
1 task
OpenCL: add fused group_norm/norm, mul, add
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
testing
Everything test related
#15314
opened Aug 14, 2025 by
rmatif
Add OpenVINO backend
devops
improvements to build systems and github actions
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
64 bit CUDA copy routines via GGML_CUDA_ALLOW_LARGE_TENSORS
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#15298
opened Aug 13, 2025 by
createthis
ggml: riscv: add riscv spacemit backend
build
Compilation issues
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
#15288
opened Aug 13, 2025 by
alex-spacemit
Add comprehensive Copilot instructions with Python environment, server testing, and git clang-format
devops
improvements to build systems and github actions
vulkan.Dockerfile: install vulkan SDK using tarball
devops
improvements to build systems and github actions
#15282
opened Aug 13, 2025 by
yeahdongcn
vulkan: optimize rms_norm, and allow the work to spread across multiple SMs
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#15281
opened Aug 13, 2025 by
jeffbolznv
Draft
arm64: add i8mm route with SVE ggml_vec_dot_q4_K_q8_K and ggml_vec_dot_q6_K_…
ggml
changes relating to the ggml tensor library for machine learning
#15277
opened Aug 13, 2025 by
fj-y-saito
Q6_K - Block Interleaving Implementation for x86 SIMD (AVX512/AVX2)
ggml
changes relating to the ggml tensor library for machine learning
#15275
opened Aug 12, 2025 by
Srihari-mcw
Apple NPU acceleration integrated into llama.cpp, using MiniCPM-V 4.0 as an example.
examples
python
python script changes
#15262
opened Aug 12, 2025 by
tc-mb
WIP: ggml-cuda: Add bf16 cuda support to fattn (Flash Attention)
examples
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
#15261
opened Aug 12, 2025 by
eous
musa: fix build warnings
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#15258
opened Aug 12, 2025 by
yeahdongcn
ci : Enable pre-built cuda releases on ubuntu (#5106)
devops
improvements to build systems and github actions
#15249
opened Aug 11, 2025 by
michaelgiba