Claude Code Sub-Agents: Save Tokens With 90% Cache Hit

Anthropic numbers often ignored in the "use sub-agents!" hype: multi-agent systems consume about 15× more tokens than a single chat, and they are "less effective for tightly interdependent tasks such as coding" (source). However, cached tokens cost only 10% of normal (90% discount) — but only if the content flagged for caching is identical across requests (source).

Multi-agent multiplies token use by 15. The cache divides it by 10. Whether sub-agents save or burn comes down to one thing: do all sub-agents share the same prefix?

Three Ways to Delegate, Ordered by Cost

1. Sub-agent with subagent_type set. Custom system prompt, custom tools, custom permissions (Anthropic). Different prompt = different cache. No sharing with the parent. Full price every spawn. Use when you actually need isolation.
2. Clone that inherits the parent. No subagent_type. Inherits the parent's prompt, tools, and history exactly. Children 2..N hit the cache at 10% price. Five clones reading files in parallel ≈ 5× speed at ~1.5× cost.
3. No sub-agent. Stay in the main agent. Cheapest per turn. Right answer when the work depends on itself — refactors where step 2 needs step 1's result.

When NOT to Delegate (Anthropic's Own Line)

"Best for tasks that can be divided into parallel strands of research." Translation:

Good: read 7 files in parallel, audit folders for a pattern, gather info from many sources.
Bad: refactor a module, fix a bug where each step depends on the previous. Main agent only.

If you slice tightly coupled work into sub-agents, you pay 15× and gain nothing.

What Breaks the Cache

Anthropic: editing tool definitions, switching models, adding or removing images, or changing the earlier prompt structure breaks the cached prefix (source). So:

Install your MCPs at session start, not mid-session.
Pick the model up front.
Don't edit CLAUDE.md or auto-memory mid-session — they live inside the cached prefix.

📖 Read the full source: r/ClaudeAI

Parallel Sub-Agents in Claude Code: When They Save vs. Burn Tokens

Three Ways to Delegate, Ordered by Cost

When NOT to Delegate (Anthropic's Own Line)

What Breaks the Cache

👀 See Also

Claude Code Adds Remote Control Feature for Mobile Session Management

mindpm: A Free MCP Server for Persistent Project Memory with Claude

T9OS: An AI Orchestration System Built Entirely with Claude Code

Puzzle Game for Bots with Prizes: A New Challenge for AI Coders