Releases · ngxson/llama.cpp
b5177
arg : clean up handling --mmproj with -hf (#13082)
* arg : clean up handling --mmproj with -hf
* rm change about no_mmproj
* Revert "rm change about no_mmproj"
  This reverts commit 2cac8e0efb629d66c612f137e75d562f94bb9e6c.
* handle no_mmproj explicitly
* skip download mmproj on examples not using it
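For context, a minimal sketch of the flag interaction described above, assuming the behavior stated in the bullets; the type, field, and function names here are illustrative stand-ins, not the actual common_params API:

    #include <string>

    // Hedged sketch, not the real llama.cpp arg-parsing code: when a model is
    // fetched from Hugging Face via -hf, the matching --mmproj projector is
    // also downloaded, unless --no-mmproj is set or the example never uses
    // multimodal input.
    struct cli_params {
        std::string hf_repo;           // set by -hf
        bool        no_mmproj = false; // set by --no-mmproj
    };

    bool should_download_mmproj(const cli_params & p, bool example_uses_mmproj) {
        return !p.hf_repo.empty() && example_uses_mmproj && !p.no_mmproj;
    }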
b5176
metal : fix floating-point range of attention scores in FA kernels (#…
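As general background for the fix above: attention scores feed a softmax, and exponentiating a large score can overflow low-precision floats. A generic C++ sketch of the standard max-subtraction trick (illustrative only; the actual change is inside the Metal flash-attention kernels and may differ):

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Subtracting the row maximum before exponentiating keeps every exponent
    // at or below zero, so exp() stays in the representable range.
    std::vector<float> stable_softmax(const std::vector<float> & scores) {
        // assumes scores is non-empty
        const float max_score = *std::max_element(scores.begin(), scores.end());
        std::vector<float> out(scores.size());
        float sum = 0.0f;
        for (size_t i = 0; i < scores.size(); ++i) {
            out[i] = std::exp(scores[i] - max_score); // exponent <= 0
            sum += out[i];
        }
        for (float & v : out) {
            v /= sum;
        }
        return out;
    }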
b5175
vulkan: matmul gcn tuning (#13016)
* tune matmul for gcn
* this one is more power efficient
* Update ggml/src/ggml-vulkan/ggml-vulkan.cpp
  Co-authored-by: 0cc4m <[email protected]>
* disable this tune for the proprietary driver
---------
Co-authored-by: 0cc4m <[email protected]>
b5174
llama-mtmd-cli: Sigint rework in mtmd vision example (#13080)
* Sigint rework in mtmd vision example
* Applied suggestions on mtmd-cli PR
* Forgot to invert one of the conditions
* Update examples/llava/mtmd-cli.cpp
* Removed redundant exit check
---------
Co-authored-by: pl752 <[email protected]>
Co-authored-by: Xuan-Son Nguyen <[email protected]>
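A minimal sketch of the common CLI pattern this rework follows (first Ctrl+C interrupts the current generation, a second one exits); this is an assumption about the shape of the change, not the exact mtmd-cli code:

    #include <atomic>
    #include <csignal>
    #include <cstdlib>

    // First SIGINT sets a flag the generation loop polls; a second SIGINT
    // terminates immediately. std::atomic<bool> is lock-free on mainstream
    // targets, so touching it in a signal handler is safe there.
    static std::atomic<bool> g_interrupted{false};

    static void sigint_handler(int) {
        if (g_interrupted.exchange(true)) {
            std::_Exit(130); // second Ctrl+C: exit right away
        }
    }

    int main() {
        std::signal(SIGINT, sigint_handler);
        while (!g_interrupted.load()) {
            // ... decode one token, print it ...
        }
        // first Ctrl+C lands here: stop generation cleanly and return
        // to the prompt instead of killing the process.
        return 0;
    }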
b5170
CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (#13014)
* CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID
* fix logic for RoPE support, CUDA graphs
b5169
mtmd : support SmolVLM (version 1 and 2) (#13050)
* mtmd : support SmolVLM (version 1 and 2)
* correct chat template
* fix n_patches
* scale_factor is an int
* add more models to test
b5168
security : add note about RPC and server functionality (#13061)
* security : add note about RPC functionality
* security : add note about llama-server
b5166
llava : update documentations (#13055)
* llava : update documentations
* fix typo
b5165
ggml : add SSE 4.2 and x64 base variant for CPUs without AVX (#12871)
* ggml : add SSE 4.2 variant for CPUs without AVX
* ggml : add x64 base ABI variant
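To illustrate why these extra build variants matter, here is a generic runtime feature-detection sketch using GCC/Clang x86 builtins; ggml's own variant selection works differently, so treat this purely as background:

    #include <cstdio>

    // Picks the best available code path on x86-64. CPUs without AVX can
    // still use the SSE 4.2 variant added here; anything older falls back
    // to the plain x86-64 ABI baseline (SSE2).
    int main() {
        __builtin_cpu_init();
        if (__builtin_cpu_supports("avx")) {
            std::puts("AVX (or better) variant");
        } else if (__builtin_cpu_supports("sse4.2")) {
            std::puts("SSE 4.2 variant");
        } else {
            std::puts("x64 base ABI variant");
        }
    }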
b5164
SYCL: Add non-contiguous support in ROPE (#12993)
ggml-ci
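Both this entry and b5170's "noncont MMVQ" concern non-contiguous tensors: views whose elements are not packed with unit stride in memory, such as a transposed view. A plain-C++ illustration of the concept (generic background, unrelated to the SYCL or CUDA code):

    #include <cstdio>

    // A transposed view reuses the same buffer but walks it with a non-unit
    // stride, so consecutive elements of a "row" are not adjacent in memory.
    int main() {
        const int rows = 2, cols = 3;
        float data[rows * cols] = {0, 1, 2, 3, 4, 5}; // contiguous row-major

        // 3x2 transposed view: step between row elements is `cols`, not 1.
        for (int i = 0; i < cols; ++i) {
            for (int j = 0; j < rows; ++j) {
                std::printf("%g ", data[j * cols + i]); // strided access
            }
            std::printf("\n");
        }
    }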