User-built PTC for Claude Code shows 40-65% token savings on analysis tasks, not code writing

A developer has built a local Programmatic Tool Calling (PTC) implementation for Claude Code and analyzed 79 real usage sessions to measure actual benefits. PTC differs from normal tool calling by having the agent write code that runs in an isolated environment, with only final results entering the context window instead of every intermediate step.
What was built
The developer created Thalamus, a local MCP server that provides PTC-like capability to Claude Code. It includes four tools: execute() (runs Python with primitives), search, remember, and context. The implementation has 143 tests, uses Python stdlib only, and runs fully locally. The developer emphasizes this is their own implementation, not Anthropic's official PTC.
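The core idea behind an execute()-style tool can be sketched in a few lines. This is a minimal illustration, not Thalamus's actual code: the function name `run_isolated`, the `max_chars` cap, and the timeout are all hypothetical, and a child interpreter is only process-level separation, not a security sandbox. It shows the property that matters for token use, namely that intermediate values stay inside the child process and only the printed result comes back.

```python
import subprocess
import sys


def run_isolated(code: str, max_chars: int = 2000, timeout_s: float = 10.0) -> str:
    """Run Python source in a separate interpreter and return only what it printed.

    Hypothetical helper for illustration; raises subprocess.TimeoutExpired
    if the script runs longer than timeout_s.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    if proc.returncode != 0:
        # Return only the tail of the traceback, which usually names the error.
        return "error: " + proc.stderr.strip()[-max_chars:]
    out = proc.stdout.strip()
    if len(out) > max_chars:
        # Cap what would enter the context window.
        out = out[:max_chars] + "... [truncated]"
    return out


script = """
total = 0
for n in range(1000):
    total += n * n  # intermediate values never leave the child process
print(total)
"""
print(run_isolated(script))
```

The loop above runs a thousand steps, but the caller sees a single short line of output. With normal tool calling, each intermediate step would have been a separate call whose result lands in the context.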
Measured results from 79 sessions
- Token footprint per call: execute() averaged ~2,600 characters vs. Read averaging ~4,400 characters
- JSONL size reduction: Sessions using PTC showed a 15.6% reduction in size
- Savings on analysis/research tasks: 40-65%
- Savings on code-writing tasks: ~0%
The developer notes these real-world numbers are "far from 98%", the savings figure reported for optimal scenarios by Anthropic and Cloudflare.
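For a sense of scale, the per-call averages above can be turned into a relative reduction. This is back-of-the-envelope arithmetic on the rounded figures from the post; the helper name `relative_reduction` is illustrative. The inputs are character counts, not tokens, and this per-call figure is a different measurement from the session-level 15.6% and the task-level 40-65%, so the three numbers should not be read as the same quantity.

```python
def relative_reduction(baseline: float, value: float) -> float:
    """Fraction by which value is smaller than baseline."""
    return (baseline - value) / baseline


read_chars = 4400     # approximate average per Read call, from the post
execute_chars = 2600  # approximate average per execute() call, from the post

print(f"{relative_reduction(read_chars, execute_chars):.1%}")  # prints 40.9%
```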
How the agent actually uses execute()
Content analysis of 112 execute() calls revealed:
- 64% used standard Python (os.walk, open, sqlite3, subprocess) — not the PTC primitives
- 30% used a single primitive (one fs.read or fs.grep)
- 5% did true batching (2+ primitives combined)
The "replace 5 Reads with 1 execute" pattern occurred in only 5% of actual usage. The agent mostly used execute() as a general-purpose compute environment for accessing files outside the project, running aggregations, and querying databases.
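The dominant pattern described above, plain Python used as a compute environment, might look something like the following sketch. It is not taken from the analyzed sessions: the function `summarize_tree` and its parameters are made up for illustration, and it deliberately uses only os.walk and open rather than the fs.read or fs.grep primitives, whose signatures the post does not give. The script touches every file under a directory but returns only a few summary lines.

```python
import os
from collections import Counter


def summarize_tree(root: str, needle: str, top_n: int = 5) -> str:
    """Count occurrences of needle in every readable file under root.

    Returns a compact text summary instead of the file contents.
    """
    hits = Counter()
    files_scanned = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    count = sum(line.count(needle) for line in fh)
            except OSError:
                continue  # unreadable file: skip it rather than fail the scan
            files_scanned += 1
            if count:
                hits[os.path.relpath(path, root)] = count
    lines = [f"scanned {files_scanned} files, {sum(hits.values())} matches for {needle!r}"]
    lines += [f"{count:>5}  {path}" for path, count in hits.most_common(top_n)]
    return "\n".join(lines)
```

A call such as `print(summarize_tree(".", "TODO"))` would put at most six short lines into the context regardless of how many files were scanned, which is consistent with the finding that savings show up in analysis and research tasks.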
Adoption patterns
Initial measurement showed only 25% of sessions used PTC, with the agent defaulting to Read/Grep/Glob. After the developer added a ~1,100-token operational manual to CLAUDE.md, adoption rose to 42.9%. Sessions focused on writing code (Edit + Bash dominant) showed zero PTC usage.
The developer concludes PTC shines in analysis, debugging, and cross-file research tasks, but not in edit-heavy development workflows.
📖 Read the full source: r/ClaudeAI