How to Split Agent Context Into 3 Layers: Fix 700-Line Monolith

When building persistent autonomous agents, a common problem emerges: the agent context starts in one file but grows to 700+ lines, mixing identity rules, current strategy, tool references, pricing updates, and posting procedures. This monolith becomes unmanageable—editing what to focus on this week sits in the same file as "never do this" rules, causing hesitation and errors. A team building a 6-agent system that bootstraps itself from zero hit this exact issue: by week two, the launcher was hitting argument limits and sessions were failing silently.

The Three-Layer Split

The solution was to separate agent context by concern type and change frequency into three distinct files:

CLAUDE.md - Identity: who the agent is, hard rules, personality. Almost never changes. Can be cached.
BRIEFING.md - Mission: what to focus on right now, current strategy, pricing, targets. Changes weekly.
PLAYBOOK.md - Operations: how to mechanically do things: procedures, CLI commands, tool references. Changes when tools change.

One piece of information lives in exactly one layer. If a tool reference is in PLAYBOOK, it is not in BRIEFING. Duplication leads to silent contradictions.

Why This Architecture Works

Practical editing: Everyone knows which file to edit. What to focus on? BRIEFING. How to post? PLAYBOOK. Never do this? CLAUDE.md. No guessing or rifling through 700 lines.

Technical efficiency: When an agent restarts (which happens frequently), identity remains stable. The same CLAUDE.md is used every session, allowing caching layers to see an identical prompt prefix for nearly free cache hits. BRIEFING and PLAYBOOK arrive via tool calls on first startup—the agent reads them before doing substantive work, so they aren't redundant. The spawn argument stays small forever, even as PLAYBOOK grows to 2000+ lines.

Discipline: A monolith accepts any content anywhere. This specification forces you to ask: is this about character, mission, or mechanics? Answering that question changes how you think about the system.

Injection Patterns

Pattern A (Simple): Read all three files at spawn, concatenate, and inject. Works if total size fits your spawner argument limit.

Pattern B (Persistent agents): Inject only CLAUDE.md at spawn. CLAUDE.md contains a mandate: First action, read BRIEFING.md and PLAYBOOK.md. The agent first tool calls load the mission and operations docs before any work begins. The spawn prompt stays small even as playbooks grow. This is the default for agents that restart.

The team uses Pattern B. Every session, the agent wakes up, reads BRIEFING, reads PLAYBOOK, then executes. Fresh context every time, cached identity, no argument limit anxiety.

Results After Implementation

Editing took seconds instead of minutes—no fear of breaking unrelated stuff.
BRIEFING edits between sessions just work—agent reads fresh BRIEFING on next restart.
PLAYBOOK grew to 2000+ lines with zero launch anxiety.
Onboarding new agents was faster—there is a clear skeleton to fill.

This approach isn't revolutionary—it's separation of concerns applied to agent context. But once you've hit the monolith problem, this fixes it structurally. For teams building autonomous agents that restart, this architecture is worth knowing.

📖 Read the full source: r/ClaudeAI