Qwen 3.5 Tool Calling Fixes for Agentic Use: Server Status and Client-Side Workarounds

Tool Calling Bugs in Qwen 3.5 Agentic Setups
When running Qwen 3.5 models in agentic environments like coding agents or function calling loops, four specific bugs can cause tool calling to fail completely.
The Four Core Bugs
- XML tool calls leak as plain text: Qwen 3.5 emits tool calls in an XML format (e.g., <function=bash><parameter=command>ls</parameter></function>). When a server fails to parse this (most often when text precedes the XML or thinking is enabled), the tool call arrives as raw text with finish_reason: stop, so your agent never executes it.
- <think> tags leak into text and poison context: llama.cpp forces thinking=1 internally regardless of enable_thinking: false, causing tags to accumulate across turns and destroy multi-turn sessions.
- Wrong finish_reason: Servers send "stop" when tool calls are present, causing agents to treat it as a final answer.
- Non-standard finish_reason: Some servers return "eos_token", "", or null, causing most frameworks to crash on the unknown value before checking if tool calls exist.
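The two finish_reason bugs can be papered over on the client by normalizing the value before the agent loop looks at it. The following is a minimal sketch, assuming an OpenAI-style response that has already been decoded into plain dicts; normalize_finish_reason and KNOWN_FINISH_REASONS are hypothetical names used only for illustration and are not taken from the source.

```python
# Hypothetical helper: names and response shape are assumptions for illustration.
KNOWN_FINISH_REASONS = {"stop", "length", "tool_calls", "content_filter", "function_call"}


def normalize_finish_reason(choice):
    """Return a finish_reason that an agent loop can rely on.

    `choice` is assumed to be one entry of an OpenAI-style `choices` list,
    decoded into a plain dict.
    """
    message = choice.get("message") or {}
    reason = choice.get("finish_reason")

    # Wrong finish_reason: tool calls are present, so report "tool_calls"
    # no matter what the server sent.
    if message.get("tool_calls"):
        return "tool_calls"

    # Non-standard finish_reason: "eos_token", "", or None fall back to "stop".
    if reason not in KNOWN_FINISH_REASONS:
        return "stop"

    return reason
```

Checking for tool calls first means a response that carries both a tool call and a bogus finish_reason is still routed to tool execution rather than treated as a final answer.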
Server Status (April 2026)
The source provides a detailed status table for major inference servers:
- LM Studio 0.4.9: Best local option. XML parsing fixed in v0.4.7, think leak handling improved, finish_reason usually correct.
- vLLM 0.19.0: XML parsing works with the --tool-call-parser qwen3_coder flag, though streaming bugs remain. Think leak fixed, finish_reason usually correct.
- Ollama 0.20.2: Improved since the unclosed </think> bug was fixed. XML parsing is still flaky and finish_reason is sometimes wrong.
- llama.cpp b8664: A parser exists but fails when thinking is enabled. Think tags still leak, and finish_reason is wrong whenever the parser fails.
Recommended Solutions
Use Unsloth GGUFs instead of the stock Qwen 3.5 Jinja templates, which have known issues such as the |items filter failing on tool arguments. The Unsloth builds ship with 21 template fixes.
Add a client-side safety net with three small functions that catch what servers miss. The source provides the first function:
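    import json
    import re
    import uuid

    # 1. Parse Qwen XML tool calls from text content
    def parse_qwen_xml_tools(text):
        results = []
        for m in re.finditer(r'<function=([\w.-]+)>([\s\S]*?)</function>', text):
            args = {}
            for p in re.finditer(r'<parameter=([\w.-]+)>([\s\S]*?)</parameter>', m.group(2)):
                k, v = p.group(1).strip(), p.group(2).strip()
                try:
                    v = json.loads(v)  # keep typed values (numbers, booleans, objects)
                except ValueError:
                    pass  # not valid JSON: keep the raw string
                args[k] = v
            # The excerpt ends here. The lines below are an assumed completion that
            # builds an OpenAI-style tool call entry; the source's version may differ.
            results.append({
                "id": f"call_{uuid.uuid4().hex[:24]}",
                "type": "function",
                "function": {"name": m.group(1), "arguments": json.dumps(args)},
            })
        return results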
This function extracts tool calls from text content when servers fail to parse the XML properly, providing a fallback mechanism for agentic workflows.
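The think-tag leak can be handled on the client in a similar spirit. The excerpt does not include the remaining helpers, so the following is only a sketch of what such a function might look like; strip_think_tags is a hypothetical name, and the handling of unmatched tags is an assumption about how leaked reasoning typically appears, not something the source specifies.

```python
import re

# Hypothetical helper: the name and the unmatched-tag rules are assumptions.
_THINK_BLOCK = re.compile(r"<think>[\s\S]*?</think>")


def strip_think_tags(text):
    """Remove leaked reasoning from assistant text before it re-enters the context."""
    # Drop every complete <think>...</think> block.
    text = _THINK_BLOCK.sub("", text)

    # Assumption: a stray closing tag means the opening tag was consumed by the
    # chat template, so everything before it is reasoning.
    if "</think>" in text:
        text = text.rsplit("</think>", 1)[1]

    # Assumption: a stray opening tag means the block was never closed, so
    # everything after it is reasoning.
    if "<think>" in text:
        text = text.split("<think>", 1)[0]

    return text.strip()
```

Applying this to each assistant message before appending it to the conversation history keeps earlier reasoning from accumulating across turns.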
📖 Read the full source: r/LocalLLaMA