ETH Zurich Study: Excessive Context Reduces AI Coding Agent Performance

A recent study from ETH Zurich provides concrete evidence that more context doesn't necessarily mean better performance for AI coding agents. The researchers tested four coding agents on 138 real GitHub tasks and measured both task success and inference cost.
Key Findings
The study found that LLM-generated context files reduced task success rates by 2-3% and increased inference costs by 20%. Human-written context files improved success by only about 4%, and they still raised costs significantly.
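To make the trade-off concrete, the figures above can be turned into a rough "cost per solved task" estimate. The sketch below is illustrative only: the 50% baseline success rate and the unit cost are hypothetical values chosen for the arithmetic, not numbers from the study, and it assumes the 2-3% drop is measured in percentage points.

```python
def cost_per_success(success_rate: float, cost_per_task: float) -> float:
    """Expected spend per solved task: cost of one attempt divided by success rate."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_task / success_rate


# Hypothetical baseline (NOT from the study): 50% success at 1.00 cost unit per task.
BASE_SUCCESS = 0.50
BASE_COST = 1.00

baseline = cost_per_success(BASE_SUCCESS, BASE_COST)

# With an LLM-generated context file: success down 3 points (assumed reading
# of "2-3%"), inference cost up 20%, as reported in the summary above.
with_generated = cost_per_success(BASE_SUCCESS - 0.03, BASE_COST * 1.20)

ratio = with_generated / baseline
print(f"baseline cost per solved task:  {baseline:.2f}")
print(f"with generated context file:    {with_generated:.2f}")
print(f"relative increase:              {ratio:.2f}x")
```

Under these assumptions, a small drop in success combined with a 20% cost increase compounds into a noticeably higher price per solved task. The exact figure depends entirely on the assumed baseline.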
The Core Problem
The researchers found that agents treated every instruction in a context file as something that must be executed. In one experiment, when they stripped repositories down so that only the generated context file remained, performance improved again. This indicates that agents struggle to distinguish between essential instructions and irrelevant historical information.
Practical Recommendations
The study recommends including only information that the agent genuinely cannot discover on its own and keeping context minimal. This is particularly relevant for communication data such as email threads, which look like context but are mostly historical noise that an agent may nonetheless interpret as instructions.
Context API Solution
To address this issue, researchers developed a context API (iGPT) that focuses on email processing. The API:
- Reconstructs email threads into conversation graphs before context reaches the model
- Deduplicates quoted text
- Detects who said what and when
- Returns structured JSON instead of raw text
This approach ensures agents receive filtered context rather than entire conversation histories, improving their ability to focus on relevant information.
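The preprocessing steps listed above can be sketched in a few lines of Python. This is not the iGPT API: the `Message` class, the `strip_quoted` and `build_thread` helpers, and the output field names are all hypothetical, and the quote detection is a deliberately simple heuristic (lines starting with `>` and "On ... wrote:" attribution lines).

```python
import json
import re
from dataclasses import dataclass
from typing import Dict, List, Optional

# Heuristic for reply attribution lines such as "On Mon, Jan 6 Alice wrote:".
ATTRIBUTION = re.compile(r"^On .+ wrote:$")


@dataclass
class Message:  # hypothetical input shape, not a real API type
    msg_id: str
    sender: str
    sent_at: str  # ISO 8601 in one timezone, so string order equals time order
    body: str
    in_reply_to: Optional[str] = None


def strip_quoted(body: str) -> str:
    """Drop quoted lines and attribution lines, keeping only the new text."""
    kept = []
    for line in body.splitlines():
        stripped = line.strip()
        if stripped.startswith(">") or ATTRIBUTION.match(stripped):
            continue
        kept.append(stripped)
    return "\n".join(kept).strip()


def build_thread(messages: List[Message]) -> Dict:
    """Turn a flat list of messages into a reply graph of JSON-ready dicts."""
    ordered = sorted(messages, key=lambda m: m.sent_at)
    nodes = {
        m.msg_id: {
            "id": m.msg_id,
            "sender": m.sender,    # who said it
            "sent_at": m.sent_at,  # when they said it
            "text": strip_quoted(m.body),
            "replies": [],
        }
        for m in ordered
    }
    roots = []
    for m in ordered:
        parent = nodes.get(m.in_reply_to) if m.in_reply_to else None
        if parent is not None:
            parent["replies"].append(m.msg_id)
        else:
            roots.append(m.msg_id)  # no known parent: start of a thread
    return {"roots": roots, "messages": list(nodes.values())}


thread = [
    Message("m1", "alice@example.com", "2025-01-06T09:00:00Z",
            "Can we ship the fix on Friday?"),
    Message("m2", "bob@example.com", "2025-01-06T09:30:00Z",
            "On Mon, Jan 6 Alice wrote:\n> Can we ship the fix on Friday?\n"
            "Yes, if QA signs off.", "m1"),
]
graph = build_thread(thread)
print(json.dumps(graph, indent=2))
```

In this sketch the agent would receive each statement once, attributed to its sender and timestamp, instead of the same quoted text repeated in every reply.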
📖 Read the full source: r/LocalLLaMA