Name and Version
version: 6692 (ca71fb9)
built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu
Operating systems
Linux
GGML backends
CUDA
Hardware
3x Nvidia P40
Models
GLM-4.5-Air, GLM-4.5, GLM-4.6
Problem description & steps to reproduce
Reasoning content isn't being extracted to the "reasoning_content" field when using the GLM 4.5 or 4.6 models. It appears that the chat format is being auto detected as "Hermes 2 Pro" from server logs.
First Bad Commit
No response
Relevant log output
srv params_from_: Chat format: Hermes 2 Pro