Qwen 3.5 Chat Template Release with 21 Bug Fixes for Agent Workflows

A developer has released a patched chat template for Qwen 3.5 models, fixing 21 bugs encountered during agentic workflows. This is a drop-in replacement for the official template, requiring only a swap of the chat_template.jinja file.
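As a rough illustration of the swap, the sketch below copies a patched template over chat_template.jinja in a local model directory and keeps a backup of the official file. The function name, the .bak suffix, and the directory layout are assumptions made for this example; runtimes that embed the template in the model file or load it from a separate path will need a different step.

```python
import shutil
from pathlib import Path


def swap_chat_template(patched: Path, model_dir: Path) -> Path:
    """Replace model_dir/chat_template.jinja with the patched file.

    Hypothetical helper: the backup name and layout are assumptions,
    not something the template's author prescribes.
    """
    target = model_dir / "chat_template.jinja"
    backup = model_dir / "chat_template.jinja.bak"
    # Keep the official template recoverable, but never overwrite an older backup.
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)
    shutil.copy2(patched, target)
    return target
```

Keeping the backup means the official template can be restored by copying the .bak file back if the patched one misbehaves on a given runtime.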
Key Fixes
The developer ran Qwen 3.5 35B in agentic workflows and addressed the following major issues:
- Tool Calling Crash: Fixed a crash related to arguments | items (referenced as HF discussion #4).
- Tool/Think Block Leak: <tool_call> content no longer leaks into <think> blocks, with auto-disable thinking when tools are active.
- Parallel Tool Calls: Calls are now properly separated with \n\n delimiters.
- Deep Agent Loops: Prevents crashes after 5+ tool hops.
- Unknown Role Handling: Roles like 'planner' and 'critic' now gracefully fall back instead of causing a crash.
- Streaming Parsers: Provides clean XML boundaries for streaming.
- Configurable Truncation: Allows setting a maximum character limit for large tool arguments and responses.
- Developer Role Support: Adds support for the 'developer' role sent by clients such as Claude Code, Codex, and OpenCode.
A full list of all 21 fixes is available in the project's README.
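Three of the behaviors listed above (role fallback, configurable truncation, and \n\n separation of parallel tool calls) can be sketched in plain Python. This is not the template's Jinja code: the fallback target ('user'), the mapping of 'developer' to 'system', the truncation marker, and the JSON payload inside <tool_call> are all assumptions chosen for illustration.

```python
import json
from typing import Optional

# Assumption: the set of roles a template handles natively.
KNOWN_ROLES = {"system", "user", "assistant", "tool"}


def normalize_role(role: str) -> str:
    """Map a message role to one the template knows instead of raising."""
    if role == "developer":
        return "system"  # assumption: developer messages are treated like system
    if role in KNOWN_ROLES:
        return role
    return "user"  # assumption: unknown roles such as 'planner' fall back to 'user'


def truncate(text: str, max_chars: Optional[int]) -> str:
    """Cap large tool arguments or responses at max_chars characters."""
    if max_chars is None or len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n[truncated]"  # marker text is illustrative


def render_tool_calls(calls: list) -> str:
    """Render parallel tool calls as separate blocks joined by a blank line."""
    blocks = []
    for call in calls:
        # The JSON payload is a placeholder, not the template's actual call format.
        payload = json.dumps({"name": call["name"], "arguments": call["arguments"]})
        blocks.append(f"<tool_call>\n{payload}\n</tool_call>")
    return "\n\n".join(blocks)
```

Separating each call into its own block with a fixed delimiter is what gives a streaming parser an unambiguous boundary to split on.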
Configuration
The template exposes configurable variables, which can be passed as a JSON object on the command line:
--chat-template-kwargs '{"enable_thinking":true,"auto_disable_thinking_with_tools":true,"max_tool_response_chars":8192}'
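Hand-writing JSON inside shell quotes is easy to get wrong, so one option is to generate the argument programmatically. The sketch below builds the same flag from a Python dict; the helper name is hypothetical, and which server binary accepts the flag is left to the runtime's own documentation.

```python
import json
import shlex

# The three variables shown in the article's example.
template_kwargs = {
    "enable_thinking": True,
    "auto_disable_thinking_with_tools": True,
    "max_tool_response_chars": 8192,
}


def chat_template_kwargs_args(options: dict) -> list:
    """Return the flag and its compact JSON value as two argv entries (hypothetical helper)."""
    return ["--chat-template-kwargs", json.dumps(options, separators=(",", ":"))]


argv = chat_template_kwargs_args(template_kwargs)
# shlex.join quotes the JSON so it survives a POSIX shell unchanged.
print(shlex.join(argv))
```

When the arguments are passed as a list to subprocess.run, no shell quoting is needed; shlex.join is only for producing a line that can be pasted into a terminal.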
Compatibility & Testing
The template has been tested on the following platforms with the specified minimum versions:
- llama.cpp (b4242+)
- Open WebUI (v0.4.8+)
- vLLM (v0.6.4+)
- Ollama (v0.5.0+)
- LM Studio (v0.3.5+)
- Text Generation WebUI
It is compatible with all Qwen 3.5 models (35B, 27B, 14B, 9B, 4B, and the Coder series) and is backward-compatible with Qwen3 32B.
Source and License
The template is available for download on HuggingFace at barubary/qwen3.5-barubary-attuned-chat-template. It is released under the Apache 2.0 license, and the developer welcomes feedback and bug reports.
📖 Read the full source: r/LocalLLaMA