Deepseek V3.1 native tool calling support (OpenAI Style) #15533
- Added COMMON_CHAT_FORMAT_DEEPSEEK_V3_1 enum value
- Created common_chat_params_init_deepseek_v3_1() function (currently uses R1 implementation)
- Created common_chat_parse_deepseek_v3_1() function that handles V3.1 thinking format:
  - Extracts reasoning content before '</think>' tag into reasoning_content
  - Extracts regular content after '</think>' tag into content
  - No opening '<think>' tag in V3.1 format
- Added detection logic for V3.1 templates based on pattern: 'message['prefix'] is defined and message['prefix'] and thinking'
- Added V3.1 case to parsing switch statement

This addresses the issue where V3.1 outputs reasoning content followed by '</think>' and then regular content without the opening '<think>' tag.
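The split described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual parser: `ParsedMessage` and `split_v3_1_output` are hypothetical names, and the fallback for output with no '</think>' tag is an assumption of this sketch.

```cpp
#include <string>

// Hypothetical result type for illustration; the field names mirror the
// reasoning_content / content fields named in the commit message.
struct ParsedMessage {
    std::string reasoning_content;
    std::string content;
};

// Hypothetical helper: splits raw model output around the first "</think>".
// V3.1 emits no opening "<think>" tag, so everything before the closing tag
// is treated as reasoning and everything after it as regular content.
ParsedMessage split_v3_1_output(const std::string & raw) {
    static const std::string end_think = "</think>";
    ParsedMessage msg;
    const auto pos = raw.find(end_think);
    if (pos == std::string::npos) {
        // Assumption of this sketch: without a closing tag, the whole
        // output is regular content.
        msg.content = raw;
        return msg;
    }
    msg.reasoning_content = raw.substr(0, pos);
    msg.content           = raw.substr(pos + end_think.size());
    return msg;
}
```

For example, passing "I should greet.</think>Hello!" would put "I should greet." in reasoning_content and "Hello!" in content.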
This reverts commit c50d887.
Co-authored-by: Sigbjørn Skjæret <[email protected]>
@CISC who approves the workflows?
I think all collaborators can approve them? Just approved this one.
I still don’t see a merge button. Do we need @ngxson to review too?
@createthis nope, only collaborators with write access can merge, so you need either @CISC or @ggerganov to merge it :>
tool calling in the reasoning content, but then the model just stops the output without closing the </think> tag, so it's not a partial. In this case, use the tool call in the reasoning content.
I added an edge case where thinking is forced open, there is tool calling in the reasoning content, but then the model just stops the output without closing the </think> tag.
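A rough sketch of the decision this edge case implies is shown here. It is not the PR's code: `ToolCallSource`, `pick_tool_call_source`, and the `tool_marker` parameter are hypothetical, and every branch other than the forced-open, unclosed, non-partial case is an assumption made to keep the example complete.

```cpp
#include <string>

// Hypothetical enum for illustration: where a tool call should be taken from.
enum class ToolCallSource { None, Content, Reasoning };

// Hypothetical helper. tool_marker stands in for whatever token sequence
// opens a tool call; the real marker is not specified in this sketch.
ToolCallSource pick_tool_call_source(const std::string & raw,
                                     const std::string & tool_marker,
                                     bool thinking_forced_open,
                                     bool is_partial) {
    static const std::string end_think = "</think>";
    const auto npos      = std::string::npos;
    const auto close_pos = raw.find(end_think);

    if (close_pos != npos) {
        // Assumption: once reasoning is closed, only tool calls that appear
        // after the closing tag count.
        const auto after = close_pos + end_think.size();
        return raw.find(tool_marker, after) != npos ? ToolCallSource::Content
                                                    : ToolCallSource::None;
    }
    if (raw.find(tool_marker) == npos) {
        return ToolCallSource::None;
    }
    if (!thinking_forced_open) {
        // Assumption: without forced thinking, unclosed output is content.
        return ToolCallSource::Content;
    }
    // The edge case from the comment above: thinking is forced open, the
    // output is finished (not a partial), and "</think>" never arrived, so
    // the tool call found in the reasoning content is used.
    return is_partial ? ToolCallSource::None : ToolCallSource::Reasoning;
}
```

Taking the marker as a parameter keeps the sketch independent of the model's actual tool-call tokens.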
@CISC @ggerganov Let me know if you want any more changes, otherwise please merge. This is working well on my end.
@CISC @ggerganov I simplified TL;DR: After the second |
🎉🎉🎉
Is there a way to turn this on and off? I'm working on a tool calling project and would like to have just the model's original output for testing.
@fernandaspets Hey neolithic. You can see the model's original output by starting llama.cpp with There is an open source CLI tool called Once I have identified an issue, then I usually write a unit test. You can see my unit test for multiple tool calls here: llama.cpp/tests/test-chat-parser.cpp Line 303 in 7ea15bb
This PR enables DeepSeek V3.1 thinking mode as the default. Disable with --reasoning-budget 0. It also implements tool calling support.
Addresses #15496
My understanding is that this is a continuation of #9639 for DeepSeek V3.1 specifically.