Parallel Sub-Agents in Claude Code: When They Save vs. Burn Tokens

Anthropic numbers often ignored in the "use sub-agents!" hype: multi-agent systems consume about 15× more tokens than a single chat, and they are "less effective for tightly interdependent tasks such as coding" (source). However, cached tokens cost only 10% of normal (90% discount) — but only if the content flagged for caching is identical across requests (source).
Multi-agent multiplies token use by 15. The cache divides it by 10. Whether sub-agents save or burn comes down to one thing: do all sub-agents share the same prefix?
Three Ways to Delegate, Ordered by Cost
- 1. Sub-agent with
subagent_typeset. Custom system prompt, custom tools, custom permissions (Anthropic). Different prompt = different cache. No sharing with the parent. Full price every spawn. Use when you actually need isolation. - 2. Clone that inherits the parent. No
subagent_type. Inherits the parent's prompt, tools, and history exactly. Children 2..N hit the cache at 10% price. Five clones reading files in parallel ≈ 5× speed at ~1.5× cost. - 3. No sub-agent. Stay in the main agent. Cheapest per turn. Right answer when the work depends on itself — refactors where step 2 needs step 1's result.
When NOT to Delegate (Anthropic's Own Line)
"Best for tasks that can be divided into parallel strands of research." Translation:
- Good: read 7 files in parallel, audit folders for a pattern, gather info from many sources.
- Bad: refactor a module, fix a bug where each step depends on the previous. Main agent only.
If you slice tightly coupled work into sub-agents, you pay 15× and gain nothing.
What Breaks the Cache
Anthropic: editing tool definitions, switching models, adding or removing images, or changing the earlier prompt structure breaks the cached prefix (source). So:
- Install your MCPs at session start, not mid-session.
- Pick the model up front.
- Don't edit
CLAUDE.mdor auto-memory mid-session — they live inside the cached prefix.
📖 Read the full source: r/ClaudeAI
👀 See Also

Claude Code Adds Remote Control Feature for Mobile Session Management
Claude Code now allows developers to start tasks in their terminal and continue controlling sessions from mobile devices via the Claude app or claude.ai/code while Claude runs locally on their machine.

mindpm: A Free MCP Server for Persistent Project Memory with Claude
mindpm is a free, open-source MCP server that provides Claude with a local SQLite database to track tasks, decisions, notes, and session summaries across conversations. Setup takes 30 seconds with the command: claude mcp add mindpm -e MINDPM_DB_PATH=~/.mindpm/memory.db -- npx -y mindpm.

T9OS: An AI Orchestration System Built Entirely with Claude Code
An economics student built T9OS, a complete AI orchestration layer using Claude Code as the only programming tool. The system includes 18 production pipelines, a 12-state lifecycle engine, and 7 AI 'Guardians' that review every output.

Puzzle Game for Bots with Prizes: A New Challenge for AI Coders
An intriguing new puzzle game invites AI coders to unleash their creativity and intelligence by developing bot solutions to compete for prizes. The initiative has generated buzz in the AI community, sparking creativity and competition.