Qwen3 vs Qwen3.5 Tool Call Template Differences: A Hands-On Look with ms-swift


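I recently used ms-swift to train on a batch of data containing tool calls, and while debugging I noticed that the format produced for Qwen3.5 differs from what earlier models used.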

ms-swift makes it easy to debug the encode step. Once the data is ready, the following code shows the encoded result:

import json
from swift import get_processor, get_template

data = {"tools": "[{\"type\": \"function\", \"function\": {\"name\": \"realtime_aqi\", \"description\": \"天气预报。获取实时空气质量。当前空气质量，PM2.5，PM10信息\", \"parameters\": {\"type\": \"object\", \"properties\": {\"city\": {\"type\": \"string\", \"description\": \"城市名，例如：上海\"}}, \"required\": [\"city\"]}}}]", "messages": [{"role": "user", "content": "北京和上海今天的天气情况"}, {"role": "tool_call", "content": "{\"name\": \"realtime_aqi\", \"arguments\": {\"city\": \"北京\"}}"}, {"role": "tool_call", "content": "{\"name\": \"realtime_aqi\", \"arguments\": {\"city\": \"上海\"}}"}, {"role": "tool_response", "content": "{\"city\": \"北京\", \"aqi\": \"10\", \"unit\": \"celsius\"}"}, {"role": "tool_response", "content": "{\"city\": \"上海\", \"aqi\": \"72\", \"unit\": \"fahrenheit\"}"}, {"role": "assistant", "content": "根据天气预报工具，北京今天的空气质量指数为10，属于良好水平；上海今天的空气质量指数为72，属于轻度污染水平。"}]}

tokenizer = get_processor('Qwen/Qwen3.5-2B')
# template = get_template(tokenizer)  # use the default agent template
template = get_template(tokenizer, agent_template='qwen3_5')
# print(f'agent_template: {template._agent_template}')
template.set_mode('train')
encoded = template.encode(data)
print(f'[INPUT_IDS] {template.safe_decode(encoded["input_ids"])}\n')
print(f'[LABELS] {template.safe_decode(encoded["labels"])}')

The printed result looks like this:

<|im_start|>user
北京和上海今天的天气情况<|im_end|>
<|im_start|>assistant

<tool_call>
<function=realtime_aqi>
<parameter=city>
北京
</parameter>
</function>
</tool_call>
<tool_call>
<function=realtime_aqi>
<parameter=city>
上海
</parameter>
</function>
</tool_call><|im_end|>

This was not the format I expected. What I remembered looked like this:

# In ms-swift you can specify which agent_template is used to process your data
template = get_template(tokenizer, agent_template='hermes')
template.set_mode('train')
encoded = template.encode(data)
print(f'[INPUT_IDS] {template.safe_decode(encoded["input_ids"])}\n')
print(f'[LABELS] {template.safe_decode(encoded["labels"])}')
-------
<|im_start|>user
北京和上海今天的天气情况<|im_end|>
<|im_start|>assistant

<tool_call>
{"name": "realtime_aqi", "arguments": {"city": "北京"}}
</tool_call>
<tool_call>
{"name": "realtime_aqi", "arguments": {"city": "上海"}}
</tool_call><|im_end|>
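
The Hermes layout printed above is simple enough to sketch by hand: each call becomes one JSON object wrapped in <tool_call> tags, and consecutive calls are separated by a newline. The snippet below is only an illustration of that layout, not ms-swift's implementation; the function name render_hermes_tool_calls is made up for this example, and ensure_ascii=False is an assumption chosen so that Chinese argument values stay readable.

```python
import json


def render_hermes_tool_calls(tool_calls):
    """Render each call as a JSON object wrapped in <tool_call> tags (illustrative only)."""
    blocks = []
    for call in tool_calls:
        payload = {"name": call["name"], "arguments": call.get("arguments", {})}
        # Assumption: keep non-ASCII characters unescaped, as in the printed output above.
        body = json.dumps(payload, ensure_ascii=False)
        blocks.append("<tool_call>\n" + body + "\n</tool_call>")
    # Consecutive calls are separated by a single newline.
    return "\n".join(blocks)


calls = [
    {"name": "realtime_aqi", "arguments": {"city": "北京"}},
    {"name": "realtime_aqi", "arguments": {"city": "上海"}},
]
print(render_hermes_tool_calls(calls))
```

The point to notice is that the arguments stay inside one JSON object, so any consumer of this format has to parse JSON to get at an individual parameter.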

I then found that in ms-swift, the AgentTemplate assigned to Qwen3.5 inherits from the Qwen3Coder one:

class Qwen3_5AgentTemplate(Qwen3CoderAgentTemplate):
    pass

Something still felt off, so I looked up the chat_template of both Qwen3 and Qwen3.5.

I summarized the Qwen3 template in an earlier post:

https://zhuanlan.zhihu.com/p/1955205700306342098

As that post shows, Qwen3 was still using the Hermes tool call template at the time.

The Qwen3.5 chat template is here:

https://www.modelscope.cn/models/Qwen/Qwen3.5-397B-A17B/file/view/master/chat_template.jinja?status=0

Its tool call section has changed:

        {%- if message.tool_calls and message.tool_calls is iterable and message.tool_calls is not mapping %}
            {%- for tool_call in message.tool_calls %}
                {%- if tool_call.function is defined %}
                    {%- set tool_call = tool_call.function %}
                {%- endif %}
                {%- if loop.first %}
                    {%- if content|trim %}
                        {{- '\n\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
                    {%- else %}
                        {{- '<tool_call>\n<function=' + tool_call.name + '>\n' }}
                    {%- endif %}
                {%- else %}
                    {{- '\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
                {%- endif %}
                {%- if tool_call.arguments is defined %}
                    {%- for args_name, args_value in tool_call.arguments|items %}
                        {{- '<parameter=' + args_name + '>\n' }}
                        {%- set args_value = args_value | tojson | safe if args_value is mapping or (args_value is sequence and args_value is not string) else args_value | string %}
                        {{- args_value }}
                        {{- '\n</parameter>\n' }}
                    {%- endfor %}
                {%- endif %}
                {{- '</function>\n</tool_call>' }}
            {%- endfor %}
        {%- endif %}
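
To make the branch above easier to follow, here is a rough Python mirror of the same logic. It is a sketch for reading purposes, not the code that transformers or ms-swift actually runs: the function name render_tool_calls is hypothetical, and json.dumps with ensure_ascii=False is an assumption standing in for the template's tojson filter.

```python
import json
from collections.abc import Mapping, Sequence


def render_tool_calls(tool_calls, content=""):
    """Approximate the tool_calls branch of the Jinja template in plain Python."""
    parts = []
    for index, call in enumerate(tool_calls):
        # Unwrap {"function": {...}} if present, otherwise use the dict as-is.
        call = call.get("function", call)
        if index == 0:
            # Only the first call checks for preceding assistant text.
            prefix = "\n\n" if content.strip() else ""
        else:
            prefix = "\n"
        parts.append(f"{prefix}<tool_call>\n<function={call['name']}>\n")
        for name, value in call.get("arguments", {}).items():
            is_container = isinstance(value, Mapping) or (
                isinstance(value, Sequence) and not isinstance(value, str)
            )
            # Containers are JSON-encoded (assumed tojson stand-in); scalars use str().
            text = json.dumps(value, ensure_ascii=False) if is_container else str(value)
            parts.append(f"<parameter={name}>\n{text}\n</parameter>\n")
        parts.append("</function>\n</tool_call>")
    return "".join(parts)


calls = [
    {"name": "realtime_aqi", "arguments": {"city": "北京"}},
    {"function": {"name": "realtime_aqi", "arguments": {"city": "上海"}}},
]
print(render_tool_calls(calls))
```

Each argument gets its own <parameter=...> block, and only dict or list values are JSON-encoded. A plain string such as a city name is written verbatim, without quotes.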

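As you can see, the chat template shipped with Qwen3.5 now renders tool calls in a format similar to Qwen3-Coder's. The change was probably made so that the model integrates more easily with the various code-assistant agents.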

Author: emiya
Source: https://zhuanlan.zhihu.com/p/2022966991753937044
