3,177 API Calls: AI Coding Tools Context Window Analysis

A recent analysis of four AI coding tools (Claude Code Opus 4.6, Claude Code Sonnet 4.5, Codex GPT-5.3, and Gemini 2.5 Pro) found substantial differences in how each one manages its context window. Using the Context Lens tracer, the study intercepted 3,177 API calls to compare how the tools fill and use the context window while fixing a bug in an Express.js codebase.

Each tool was given the same bug: a null check in res.send() that had been reordered incorrectly. Opus, Sonnet, Codex, and Gemini each had to identify and fix the bug, then run the test suite to verify the fix. All four succeeded, but with different approaches and very different resource usage.

Claude Code Opus 4.6 stayed within a narrow range of about 23K to 27K tokens, most of it tool definitions (69% of the context). This suggests the architecture re-sends those definitions on each call, which creates significant overhead for caching to absorb. Codex (GPT-5.3) ranged more widely, from 29.3K to 47.2K tokens, most of it tool results (72%); the variation depended on how specific its test command was. Sonnet showed similar variance, with a more even mix of tool definitions and tool results.
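The composition percentages above are shares of one request's total tokens, grouped by category. As a rough sketch of that arithmetic (not the Context Lens implementation), the snippet below sums tokens per category and converts them to percentages. The function name, the category labels, and the token counts are all hypothetical; the counts were picked only so the total lands in the reported 23K to 27K range with a 69% tool-definition share.

```python
from collections import defaultdict


def context_composition(segments):
    """Return each category's share of the total tokens, as a percentage.

    `segments` is an iterable of (category, token_count) pairs.
    """
    totals = defaultdict(int)
    for category, tokens in segments:
        totals[category] += tokens

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {cat: 100.0 * n / grand_total for cat, n in totals.items()}


# Hypothetical request: illustrative token counts, NOT measurements from the study.
example_request = [
    ("tool_definitions", 17_250),
    ("system_prompt", 2_500),
    ("tool_results", 3_750),
    ("messages", 1_500),
]

shares = context_composition(example_request)
for category, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{category:18s} {pct:5.1f}%")
```

Running the same calculation over every intercepted call, rather than a single request, is what would show whether a tool's context is dominated by fixed overhead (definitions) or by content that grows with the task (results).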

Gemini stands out for its far larger token use, peaking at 350.5K tokens made up almost entirely of tool results (96%), which its 1M-token context window makes possible. Despite a lower cost per token, Gemini’s usage was inconsistent and expansive and did not converge across runs, which points to a distinct but less efficient strategy.
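To see why a lower per-token price does not guarantee a lower bill, here is a small sketch of the arithmetic. The two token counts are the peak figures quoted above (27K for Opus, 350.5K for Gemini); the prices are hypothetical placeholders, not real vendor pricing, and the model ignores caching discounts and output tokens.

```python
def request_cost(input_tokens, price_per_million):
    """Cost of one request's input tokens at a flat per-million-token price."""
    return input_tokens / 1_000_000 * price_per_million


# Peak context sizes quoted in the article.
small_context_tokens = 27_000
large_context_tokens = 350_500

# HYPOTHETICAL prices in dollars per million input tokens (assumptions only).
small_context_price = 5.00
large_context_price = 1.25

small_cost = request_cost(small_context_tokens, small_context_price)
large_cost = request_cost(large_context_tokens, large_context_price)

# How many times cheaper per token the large-context tool would need to be
# for the two requests to cost the same.
break_even_ratio = large_context_tokens / small_context_tokens

print(f"small context: ${small_cost:.4f} per request")
print(f"large context: ${large_cost:.4f} per request")
print(f"break-even price ratio: {break_even_ratio:.1f}x")
```

Under these assumed prices the large-context request still costs more, because the token count grows by roughly 13x while the price drops by only 4x.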

These findings show that AI coding tools differ considerably in how they manage context windows, which affects both performance and cost. Developers should weigh each tool's token usage strategy when choosing one, particularly for tasks involving iterative changes or extensive project histories.

📖 Read the full source: HN LLM Tools

Analyzing AI Coding Tools: Dissecting 3,177 API Calls
