common : add GLM-4.5 tool calling support #15186


Open
wants to merge 1 commit into base: master

Conversation

dhandhalyabhavik

@dhandhalyabhavik dhandhalyabhavik commented Aug 8, 2025

  • Add COMMON_CHAT_FORMAT_GLM_4_5 format enum
  • Implement GLM-4.5 tool call parser for <tool_call><arg_key><arg_value> format
  • Add template detection based on <arg_key> and <arg_value> tags
  • Fix null content handling in message parsing and serialization
  • Ensure GLM-4.5 detection runs before Hermes to avoid misidentification

This enables tool calling functionality for GLM-4.5 models when using --jinja flag. The parser handles GLM-4.5's XML-like tool call format with key-value argument pairs.
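For illustration, here is a minimal standalone Python sketch of how a parser might extract the function name and key/value argument pairs from this format. This is an approximation for readers, not the actual C++ implementation in the PR:

```python
import re

def parse_glm45_tool_call(text: str):
    """Extract function name and argument pairs from a GLM-4.5 style
    <tool_call>...</tool_call> block (illustrative approximation)."""
    m = re.search(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL)
    if not m:
        return None
    body = m.group(1)
    # The function name is everything before the first <arg_key> tag.
    name = body.split("<arg_key>", 1)[0].strip()
    # Collect the alternating <arg_key>/<arg_value> pairs.
    keys = re.findall(r"<arg_key>(.*?)</arg_key>", body, re.DOTALL)
    values = re.findall(r"<arg_value>(.*?)</arg_value>", body, re.DOTALL)
    args = {k.strip(): v.strip() for k, v in zip(keys, values)}
    return {"name": name, "arguments": args}

example = (
    "<tool_call>get_weather\n"
    "<arg_key>city</arg_key>\n"
    "<arg_value>Paris</arg_value>\n"
    "</tool_call>"
)
```

The real parser also has to handle streaming and partial output, which this sketch ignores.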

Personally verified working in the Cherry Studio Windows app with functions enabled as an option.

Initially this was not working with the OpenAI SDK, because the Jinja template expected the tool arguments as a dict while the OpenAI SDK sends them as JSON.

Update: it now works with the OpenAI SDK too. The issue above is fixed by the corrected Jinja template. The template also works great with Cline; I tested it extensively.

Corrected Jinja template:

[gMASK]<sop>
{%- if tools -%}
<|system|>
# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{% for tool in tools %}
{{ tool | tojson }}
{% endfor %}
</tools>

For each function call, output the function name and arguments within the following XML format:
<tool_call>{function-name}
<arg_key>{arg-key-1}</arg_key>
<arg_value>{arg-value-1}</arg_value>
<arg_key>{arg-key-2}</arg_key>
<arg_value>{arg-value-2}</arg_value>
...
</tool_call>{%- endif -%}
{%- macro visible_text(content) -%}
    {%- if content is string -%}
        {{- content }}
    {%- elif content is iterable and content is not mapping -%}
        {%- for item in content -%}
            {%- if item is mapping and item.type == 'text' -%}
                {{- item.text }}
            {%- elif item is string -%}
                {{- item }}
            {%- endif -%}
        {%- endfor -%}
    {%- else -%}
        {{- content }}
    {%- endif -%}
{%- endmacro -%}
{%- set ns = namespace(last_user_index=-1) %}
{%- for m in messages %}
    {%- if m.role == 'user' %}
        {% set ns.last_user_index = loop.index0 -%}
    {%- endif %}
{%- endfor %}
{% for m in messages %}
{%- if m.role == 'user' -%}<|user|>
{%- set user_content = visible_text(m.content) -%}
{{ user_content }}
{%- if enable_thinking is defined and not enable_thinking -%}
{%- if not user_content.endswith("/nothink") -%}
{{- '/nothink' -}}
{%- endif -%}
{%- endif -%}
{%- elif m.role == 'assistant' -%}
<|assistant|>
{%- set reasoning_content = '' %}
{%- set content = visible_text(m.content) %}
{%- if m.reasoning_content is string %}
    {%- set reasoning_content = m.reasoning_content %}
{%- else %}
    {%- if '</think>' in content %}
        {%- set think_parts = content.split('</think>') %}
        {%- if think_parts|length > 1 %}
            {%- set before_end_think = think_parts[0] %}
            {%- set after_end_think = think_parts[1] %}
            {%- set think_start_parts = before_end_think.split('<think>') %}
            {%- if think_start_parts|length > 1 %}
                {%- set reasoning_content = think_start_parts[-1].lstrip('\n') %}
            {%- endif %}
            {%- set content = after_end_think.lstrip('\n') %}
        {%- endif %}
    {%- endif %}
{%- endif %}
{%- if loop.index0 > ns.last_user_index and reasoning_content -%}
{{ '\n<think>' + reasoning_content.strip() +  '</think>'}}
{%- else -%}
{{ '\n<think></think>' }}
{%- endif -%}
{%- if content.strip() -%}
{{ '\n' + content.strip() }}
{%- endif -%}
{% if m.tool_calls %}
{% for tc in m.tool_calls %}
{%- if tc.function %}
    {%- set tc = tc.function %}
{%- endif %}
{{ '\n<tool_call>' + tc.name }}
{% set _args = tc.arguments %}
{% for k, v in _args.items() %}
<arg_key>{{ k }}</arg_key>
<arg_value>{{ v | tojson if v is not string else v }}</arg_value>
{% endfor %}
</tool_call>{% endfor %}
{% endif %}
{%- elif m.role == 'tool' -%}
{%- if m.content is string -%}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
    {{- '<|observation|>' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{{- m.content }}
{{- '\n</tool_response>' }}
{%- else -%}
<|observation|>{% for tr in m.content %}

<tool_response>
{{ tr.output if tr.output is defined else tr }}
</tool_response>{% endfor -%}
{% endif -%}
{%- elif m.role == 'system' -%}
<|system|>
{{ visible_text(m.content) }}
{%- endif -%}
{%- endfor -%}
{%- if add_generation_prompt -%}
    <|assistant|>{{- '\n<think></think>' if (enable_thinking is defined and not enable_thinking) else '' -}}
{%- endif -%}
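Note the serialization rule on the `<arg_value>` line of the template: string arguments are emitted verbatim, while non-string values go through `tojson`. A Python equivalent of that filter expression, shown purely for reference (an approximation of the template's behavior, not llama.cpp code):

```python
import json

def render_arg_value(v):
    # Mirrors the template's `{{ v | tojson if v is not string else v }}`:
    # strings pass through unchanged, everything else is JSON-encoded.
    return v if isinstance(v, str) else json.dumps(v)
```

This matters because the parser on the other side must decide whether an `<arg_value>` payload is a raw string or JSON to decode.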

@ajunca

ajunca commented Aug 9, 2025

I tried the PR, and it fixes tool calling on GLM 4.5 Air (the unsloth version); tools now get called correctly.
However, this other problem, #15046, then arises.

@dhandhalyabhavik
Author

I tried the PR, and it fixes tool calling on GLM 4.5 Air (the unsloth version); tools now get called correctly. However, this other problem, #15046, then arises.

But that's a Qwen tool calling issue, right? I think once the other pending PRs are merged you should not see it.

@ajunca

ajunca commented Aug 9, 2025

Yeah, I don't think it's related to this specific PR, but the problem is shared with that Qwen tool calling issue.

@dhandhalyabhavik
Author

Cline

Works great now with Cline 💪.


Cherry studio with MCP

Works great with MCP settings too 🔥.


@TNohSam

TNohSam commented Aug 10, 2025

Hey, quick thought — I might be misunderstanding this, but it looks like this PR will parse GLM’s XML-style tool calls and turn them into JSON tool_calls before they reach the client.

If that’s the case, projects like Roo Code (which currently only know how to handle XML tool calls) might suddenly stop recognizing the output from GLM models when running through llama.cpp.

Am I right about this?
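For context on what this conversion means for clients: after parsing, the server returns tool calls in the OpenAI-style `tool_calls` shape rather than the raw XML. A schematic example of such an assistant message (all field values here are made up for illustration):

```python
# Schematic OpenAI-style assistant message, as produced after the server
# parses GLM-4.5's XML tool-call format (illustrative values only).
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_0",
            "type": "function",
            "function": {
                "name": "get_weather",
                # In the OpenAI schema, arguments are a JSON-encoded string.
                "arguments": '{"city": "Paris"}',
            },
        }
    ],
}
```

So a client expecting the model's raw XML in `content` would indeed see something different once the server parses it.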

@jfgonsalves

Does this template parse the thinking tags correctly? I'm getting my responses inline instead of in the reasoning_content field.

@bfroemel

Very nice!

#15162 aims to achieve the same for Qwen3 Coder, and it seems more mature/higher quality (it uses minja and lets it handle quoting/escaping of argument strings, stores the Jinja template in ./models/templates, and has test cases in ./tests/test-chat.cpp). Maybe @ochafik and @dhandhalyabhavik can sync up/collaborate and bring both PRs forward in a consistent way?

@dhandhalyabhavik
Author

dhandhalyabhavik commented Aug 11, 2025

Hello everyone, thanks for the insightful comments. Let me answer each of you.

@TNohSam There are two ways to implement tool calling:
(1) use an instruction-following template, write parsing code, and parse the output manually;
(2) OpenAI-compatible tool calling, where functions or tools are part of the chat request object <--- this is what people mean when they say a model supports tool calling.

I have just tested Roo Code and it works fine. Both types of tool calling work with the current PR.
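As a concrete example of option (2), an OpenAI-compatible request carries the tool schemas in the `tools` field of the chat payload. A minimal sketch (the tool name and parameters here are hypothetical):

```python
# Minimal OpenAI-compatible chat payload with one tool definition.
# The tool (get_weather) is hypothetical; the server renders the schemas
# into the template's <tools> section and parses any <tool_call> the
# model emits back into structured tool_calls.
payload = {
    "model": "glm-4.5",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}
```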

@jfgonsalves you can enable reasoning_content via a llama-server flag; check the available flags.

@bfroemel sure. @ochafik, can you please review my changes and help me merge this PR? I would really appreciate it. Thank you.

@dhandhalyabhavik
Author

@jfgonsalves

You can enable reasoning_content via a server flag.

There is parser logic common for all models that will do this job. Check out the code here

This PR has nothing to do with it. Thank you for pointing it out though.

