Claude Code PTC Cuts Analysis Tokens 40-65%

A developer has built a local Programmatic Tool Calling (PTC) implementation for Claude Code and analyzed 79 real usage sessions to measure actual benefits. PTC differs from normal tool calling by having the agent write code that runs in an isolated environment, with only final results entering the context window instead of every intermediate step.
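The core idea can be sketched in a few lines of Python. This is only an illustration of the mechanism, not the developer's code: the helper name run_isolated and its parameters are hypothetical, and a plain subprocess is a weaker boundary than a real sandbox.

```python
import subprocess
import sys
import tempfile


def run_isolated(code: str, timeout: float = 10.0, max_chars: int = 4000) -> str:
    """Run `code` in a separate Python process and return only its stdout.

    Hypothetical helper for illustration; not part of Thalamus or Claude Code.
    """
    with tempfile.TemporaryDirectory() as workdir:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated interpreter mode
            capture_output=True,
            text=True,
            timeout=timeout,
            cwd=workdir,
        )
    if proc.returncode != 0:
        # Return only the tail of the error, capped to max_chars.
        return "error: " + proc.stderr.strip()[-max_chars:]
    # Cap the result so a chatty script cannot flood the caller.
    return proc.stdout.strip()[:max_chars]


snippet = """
total = 0
for i in range(1000):   # intermediate steps stay in the child process
    total += i * i
print(total)            # only the printed result reaches the caller
"""
result = run_isolated(snippet)
print(result)
```

The thousand loop iterations never appear in the caller's view; only the single printed value does, which is the property PTC relies on to keep the context window small.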

What was built

The developer created Thalamus, a local MCP server that provides PTC-like capability to Claude Code. It includes four tools: execute() (runs Python with primitives), search, remember, and context. The implementation has 143 tests, uses Python stdlib only, and runs fully locally. The developer emphasizes this is their own implementation, not Anthropic's official PTC.

Measured results from 79 sessions

Footprint per call: execute() averaged ~2,600 characters vs. ~4,400 characters for Read
Session JSONL size: sessions using PTC were 15.6% smaller
Savings on analysis/research tasks: 40-65%
Savings on code-writing tasks: ~0%
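As a quick check on the per-call figures, the relative difference between the two averages can be worked out directly. The sketch below uses only the rounded numbers quoted above; it compares characters rather than tokens and says nothing about how many Read calls a single execute() call replaces.

```python
# Rounded per-call averages quoted in the post, in characters.
execute_chars = 2600
read_chars = 4400

# Relative reduction of an average execute() result vs. an average Read result.
reduction = 1 - execute_chars / read_chars
print(f"{reduction:.1%}")
```

Because the inputs are approximate, the result is best read as "roughly 40% smaller per call", which sits at the low end of the 40-65% range reported for analysis tasks.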

The developer notes these real-world numbers are "far from 98%", the savings figure reported for optimal scenarios by Anthropic and Cloudflare.

How the agent actually uses execute()

Content analysis of 112 execute() calls revealed:

64% used standard Python (os.walk, open, sqlite3, subprocess) rather than the PTC primitives
30% used a single primitive (one fs.read or fs.grep)
5% did true batching (2+ primitives combined)
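The post does not say how the calls were classified. One plausible approach, sketched below, is to count primitive invocations in each code string. The assumption that primitives are always called as fs.<name>(...) is ours (only fs.read and fs.grep are named in the post), and the sample snippets are invented for illustration.

```python
import re
from collections import Counter

# Assumption: PTC primitives appear in the code as fs.<name>( ... ).
PRIMITIVE_CALL = re.compile(r"\bfs\.\w+\s*\(")


def classify(code: str) -> str:
    """Bucket one execute() payload by how many primitive calls it contains."""
    n = len(PRIMITIVE_CALL.findall(code))
    if n == 0:
        return "standard_python"
    if n == 1:
        return "single_primitive"
    return "batched"


# Invented sample payloads, one per bucket; not taken from the real sessions.
samples = [
    "import os\nfor root, dirs, files in os.walk('.'):\n    print(root)",
    "print(fs.read('README.md'))",
    "hits = fs.grep('TODO', 'src')\nprint(len(hits), fs.read('notes.txt')[:200])",
]
counts = Counter(classify(s) for s in samples)
print(counts)
```

A regex count like this is crude (it would miscount primitives called inside a loop, for example), so it should be treated as a first-pass heuristic rather than the method behind the reported percentages.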

The "replace 5 Reads with 1 execute" pattern occurred in only 5% of actual usage. The agent mostly used execute() as a general-purpose compute environment for accessing files outside the project, running aggregations, and querying databases.
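For reference, the batching pattern that turned out to be rare looks roughly like the sketch below. The FS class is a hypothetical stand-in for a file primitive, not Thalamus's actual API, and the five files are synthetic, so the character counts it produces are illustrative only.

```python
import tempfile
from pathlib import Path


class FS:
    """Hypothetical stand-in for a file-read primitive; not Thalamus's API."""

    def read(self, path) -> str:
        return Path(path).read_text(encoding="utf-8")


fs = FS()

with tempfile.TemporaryDirectory() as d:
    # Create five synthetic source files, each with one TODO line.
    paths = []
    for i in range(5):
        p = Path(d) / f"module_{i}.py"
        p.write_text(f"# TODO: refactor {i}\n" + "x = 1\n" * 200, encoding="utf-8")
        paths.append(p)

    # Five separate reads: the full text of every file would enter the context.
    separate_chars = sum(len(fs.read(p)) for p in paths)

    # One batched call: scan all five files, return only the matching lines.
    summary = "\n".join(
        f"{p.name}: {line}"
        for p in paths
        for line in fs.read(p).splitlines()
        if "TODO" in line
    )
    batched_chars = len(summary)

print(separate_chars, batched_chars)
```

The saving comes entirely from filtering before returning, which is why the pattern helps on research-style tasks and does little when the agent needs the full file contents in order to edit them.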

Adoption patterns

Initial measurement showed that only 25% of sessions used PTC, with the agent defaulting to Read/Grep/Glob. After the developer added a ~1,100-token operational manual to CLAUDE.md, adoption rose to 42.9%. Sessions focused on writing code (Edit + Bash dominant) showed zero PTC usage.

The developer concludes PTC shines in analysis, debugging, and cross-file research tasks, but not in edit-heavy development workflows.

📖 Read the full source: r/ClaudeAI