Cursor & Claude Code: Bloated Context Kills AI Reasoning

A developer on r/LocalLLaMA audited their API logs and prompt payloads after noticing token usage spiking and agent output degrading into slop after ~20 turns. Their conclusion: the models aren't getting lobotomized; they're suffocating on their own bloated context windows.

The Four Structural Blunders

After inspecting what Cursor and Claude Code actually do on a 10k+ line repo, the author identified four patterns:

Blind exploration: The agent recursively greps and dumps ~40 different files into context just to find one utility function. Often it misses an existing component and hallucinates a duplicate from scratch.
Raw ingestion: The agent dumps a 2,000-line file into the prompt to update a 5-line interface, so nearly all of those tokens go to code unrelated to the change.
Tool diarrhea: Verbose test logs and massive MCP tool definitions consume ~30k tokens before the model generates a single token of code.
Goldfish memory: Every session starts fresh with zero project awareness, so the same files get re-read repeatedly.
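
To make the raw-ingestion pattern concrete, here is a minimal Python sketch of the alternative: pulling one definition out of a file instead of sending the whole file. It is not taken from the original post or from either tool; `extract_definition` and the `SAMPLE` module are hypothetical names used only for illustration, and the sketch handles top-level Python definitions only.

```python
import ast
from typing import Optional


def extract_definition(source: str, name: str) -> Optional[str]:
    """Return the source text of the top-level function or class `name`, or None."""
    tree = ast.parse(source)
    for node in tree.body:
        is_def = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        if is_def and node.name == name:
            # Slice the original text using the node's recorded positions.
            return ast.get_source_segment(source, node)
    return None


# Hypothetical stand-in for a large module; a real file would be far longer.
SAMPLE = '''\
import json

def load_settings(path):
    with open(path) as fh:
        return json.load(fh)

def parse_port(value):
    port = int(value)
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return port

def render_banner(name):
    return "=== " + name + " ==="
'''

snippet = extract_definition(SAMPLE, "parse_port")
print(snippet)
print(f"{len(snippet)} of {len(SAMPLE)} characters would go into the prompt")
```

Only the requested definition reaches the prompt; the rest of the file stays on disk until something actually needs it.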

Tipping Point at 80% Context

Once roughly 80% of the context window is filled with noise, the model's attention degrades sharply, by the author's account: IQ visibly drops to room temperature, and the agent starts destroying the architecture. Standard chunking RAG doesn't fix this, the author argues, because retrieved text chunks are a poor fit for program logic; the agent stays blind to the structure of the codebase until it burns tokens reading raw text.
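
As a rough illustration of how that threshold could be watched, the following sketch estimates how full the context window is. The 4-characters-per-token ratio is a crude assumption rather than a real tokenizer, the 0.80 threshold is simply the figure reported in the post, and all function names are hypothetical.

```python
from typing import Sequence

CHARS_PER_TOKEN = 4    # crude assumption; real tokenizers vary by model and content
TIPPING_POINT = 0.80   # the ~80% figure reported in the post


def estimate_tokens(text: str) -> int:
    """Rough token estimate: ceiling of len(text) / CHARS_PER_TOKEN."""
    return -(-len(text) // CHARS_PER_TOKEN)


def context_utilization(messages: Sequence[str], window_tokens: int) -> float:
    """Fraction of the window consumed by everything sent so far."""
    used = sum(estimate_tokens(m) for m in messages)
    return used / window_tokens


def past_tipping_point(messages: Sequence[str], window_tokens: int,
                       threshold: float = TIPPING_POINT) -> bool:
    return context_utilization(messages, window_tokens) >= threshold


# Hypothetical session: a 2,000-token window holding two payloads.
history = ["x" * 4000, "y" * 2800]  # about 1,000 + 700 estimated tokens
print(context_utilization(history, window_tokens=2000))
print(past_tipping_point(history, window_tokens=2000))
```

A check like this says nothing about how much of the context is noise; it only flags when the window is nearly full, which is the point at which the post says quality collapses.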

Proposed Fix: AST or Graph DB

The author calls for an open-source agent that parses the codebase into an AST or a graph database before anything is loaded into context, so that it understands structure without spending tokens on raw text. The goal is to prevent the architectural spaghetti that, by the author's estimate, costs 5 hours to fix for every hour saved on typing.
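
The post does not specify a design, so the following is only a sketch of the general idea, using Python's standard `ast` module on Python source. It builds a small symbol index (signature, line, and direct calls per function) that an agent could consult before reading any file in full. The names `index_module` and `outline` and the `utils.py` sample are hypothetical; a real tool would also need classes, methods, imports, and other languages.

```python
import ast


def index_module(path: str, source: str) -> dict:
    """Map each top-level function to its file, line, signature, and direct calls."""
    tree = ast.parse(source)
    index = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Collect calls made by plain name, e.g. slugify(x); method calls
            # such as text.strip() are skipped in this simplified sketch.
            calls = sorted({
                n.func.id
                for n in ast.walk(node)
                if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
            })
            params = ", ".join(a.arg for a in node.args.args)
            index[node.name] = {
                "file": path,
                "line": node.lineno,
                "signature": f"{node.name}({params})",
                "calls": calls,
            }
    return index


def outline(index: dict) -> str:
    """Render the index as a compact, prompt-friendly summary."""
    return "\n".join(
        f"{e['file']}:{e['line']} {e['signature']} -> {', '.join(e['calls']) or '-'}"
        for e in index.values()
    )


# Hypothetical module contents.
SOURCE = '''\
def normalize(text):
    return text.strip().lower()

def slugify(text):
    return normalize(text).replace(" ", "-")

def make_url(base, title):
    return base + "/" + slugify(title)
'''

index = index_module("utils.py", SOURCE)
print(outline(index))
```

The `calls` lists form the edges of a small call graph, so the same data could be loaded into a graph database. Given this outline, an agent looking for a slug helper could see that `slugify` already exists at `utils.py:4` without reading the file body.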