Releases: l3utterfly/llama.cpp
b6368
b6123
cuda: refactored ssm_scan and use CUB (#13291)
* cuda: refactored ssm_scan to use CUB
* fixed compilation error when not using CUB
* assign L to constant and use size_t instead of int
* deduplicated functions
* change min blocks per mp to 1
* use CUB load and store warp transpose
* suppress clang warning
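For context, the recurrence that `ssm_scan` evaluates can be sketched as a sequential reference in plain C++. This is a minimal scalar version (single state dimension, hypothetical function name, not the llama.cpp kernel signature); the CUDA kernel parallelizes the same math across state dimensions and uses CUB block load/store with warp transpose to coalesce memory traffic:

```cpp
#include <cmath>
#include <vector>

// Sequential reference for a selective state-space scan over sequence
// length L (the constant named in the commit):
//   h[t] = exp(dt[t] * A) * h[t-1] + dt[t] * B[t] * x[t]
//   y[t] = C[t] * h[t]
// Scalar sketch only; the real kernel operates on full state tensors.
std::vector<float> ssm_scan_ref(const std::vector<float> &x,
                                const std::vector<float> &dt,
                                const std::vector<float> &B,
                                const std::vector<float> &C,
                                float A) {
    const size_t L = x.size();   // sequence length
    std::vector<float> y(L);
    float h = 0.0f;              // hidden state, zero-initialized
    for (size_t t = 0; t < L; ++t) {
        const float dA = std::exp(dt[t] * A);
        h = dA * h + dt[t] * B[t] * x[t];
        y[t] = C[t] * h;
    }
    return y;
}
```

With `A = 0` the decay factor is 1 and the scan degenerates to a cumulative sum, which makes the recurrence easy to check by hand.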
b6029
embeddings: fix extraction of CLS pooling results (#14927)
* embeddings: fix extraction of CLS pooling results
* merge RANK pooling into CLS case for inputs
b5891
llama : add jinja template for rwkv-world (#14665)
* llama : add jinja template for rwkv-world
* Update convert_hf_to_gguf.py
Signed-off-by: Molly Sophia <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>
b5871
readme : add hot PRs (#14636)
* readme : add hot PRs
* readme : update title
* readme : hot PRs links
b5835
vulkan: increase LOAD_VEC_A to 8 (IQ1/IQ2) or 4 (IQ3) (#14485) Commit taken from remyoudompheng's PR https://github.com/ggml-org/llama.cpp/pull/12260 Co-authored-by: Rémy Oudompheng <[email protected]>
b5581
opencl: add `backend_synchronize` (#13939)
This is not needed for normal use, where the result is read via `tensor_get`, but it allows the perf mode of `test-backend-ops` to measure performance properly.
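The reason perf measurement needs a synchronize: an OpenCL enqueue returns before the kernel finishes, so stopping the clock right after enqueue measures submission cost rather than execution time. A sketch of the timing pattern, with `std::async` standing in for the command queue (hypothetical, not the ggml API):

```cpp
#include <chrono>
#include <future>

// Time an asynchronously launched job. Without the wait() (the
// backend_synchronize step), t1 would be taken while the "kernel"
// is still running, and the measurement would be meaningless.
double timed_run_ms(int work_items) {
    auto t0 = std::chrono::steady_clock::now();
    auto job = std::async(std::launch::async, [work_items] {
        long acc = 0;
        for (int i = 0; i < work_items; ++i) acc += i; // stand-in kernel
        return acc;
    });
    job.wait(); // synchronize: block until the work has completed
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```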
b5416
CANN: Support MOE Model MUL_MAT_ID (#13042) Signed-off-by: noemotiovon <[email protected]>
b5158
Disable CI cross-compile builds (#13022)
b5061
musa: fix compilation warnings in mp_22/31 (#12780) Signed-off-by: Xiaodong Ye <[email protected]>