Releases: ngxson/llama.cpp
b5212
llama-bench: add `-d` depth arg (#13096)

* add depth param
* update llama-bench README and add depth param
* llama-bench: default params for depth arg for faster execution
* Update examples/llama-bench/README.md
* fix buffer print ub
* use user provided args
* remove extra whitespace

Co-authored-by: Johannes Gäßler <[email protected]>
b5211
mtmd : fix glm-edge redundant token count (#13139)

* mtmd : fix glm-edge redundant token count
* fix chat template
* temporarily disable the GLMEdge chat template test
b5210
context : do not clear output buffer on reserve (#13152)

Co-authored-by: pockers21 <[email protected]>
b5209
llama : (mrope) allow using normal 1D position for text token (#13138)

* llama : (mrope) use normal position for text token
* rm n_pos_per_embd from llm_graph_input_attn_temp
b5208
clip : refactor set input for cgraph + fix qwen2.5vl input (#13136)

* clip : refactor set input for cgraph
* more strict assert
* minicpmv : use clip_n_mmproj_embd instead of copying the same code everywhere
* split qwen2 and qwen2.5 code blocks
* minor style fix
b5205
common : fix noreturn compile warning (#13151) ggml-ci
b5204
llama-chat : fix typo GML --> GLM (#13143)
b5203
musa: fix typo in cc control (#13144)

Signed-off-by: Xiaodong Ye <[email protected]>
b5202
CUDA: fix q_nope_absorbed prec for DS 2 Lite f16 (#13137)
b5201
arg : fix unused variable (#13142)