Routing Claude API traffic to control costs after Max subscription change

OpenClawRadar | Published: April 13, 2026
API billing migration and cost implications

As of noon PT, Anthropic's Max subscription no longer covers usage from third-party tools like OpenClaw. All OpenClaw users are now on API billing with these rates:

  • Claude Opus 4.6: $5 per million input tokens, $25 per million output tokens
  • Claude Sonnet 4.6: $3 per million input tokens, $15 per million output tokens
  • Claude Haiku 4.5: $1 per million input tokens, $5 per million output tokens

A heavy OpenClaw session on Opus can cost $1-4, while the same session on Sonnet costs $0.20-0.80 with similar results for most tasks.
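To make the rates above concrete, here is a minimal sketch of the per-session arithmetic. The token counts are hypothetical, chosen only for illustration, and the dictionary keys are plain labels rather than official API model identifiers.

```python
# Prices in USD per million tokens, copied from the list above.
# The keys are illustrative labels, not official API model IDs.
PRICES = {
    "opus-4.6": {"input": 5.00, "output": 25.00},
    "sonnet-4.6": {"input": 3.00, "output": 15.00},
    "haiku-4.5": {"input": 1.00, "output": 5.00},
}


def session_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a session with the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical session size (an assumption, not a measured figure).
INPUT_TOKENS = 400_000
OUTPUT_TOKENS = 40_000

for model in PRICES:
    cost = session_cost(model, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{model}: ${cost:.2f}")
```

For identical token counts, the listed rates put Sonnet at 60% of the Opus cost and Haiku at 20%. Real per-session figures also depend on how many tokens each model actually consumes for the same task.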

The routing solution

Most OpenClaw operations don't require Opus: heartbeat checks, file reads, summaries, routing decisions, and short tool calls can all be handled by Sonnet. Without a routing layer, every request hits your default model, potentially wasting Opus budget on simple tasks.

A local proxy can route Claude requests by complexity: simple tasks go to Sonnet automatically, and complex ones escalate to Opus. According to the original post, this approach significantly reduced costs without a loss of quality on important tasks.
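The idea of routing by complexity can be sketched as a small classifier. This is not the proxy's actual logic, which the article does not describe. The `Request` type, the `SIMPLE_KINDS` set, the length threshold, and the `"sonnet"`/`"opus"` labels are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Request kinds the article lists as not needing Opus.
# The exact strings are hypothetical labels for this sketch.
SIMPLE_KINDS = {"heartbeat", "file_read", "summary", "routing", "tool_call"}


@dataclass
class Request:
    kind: str    # hypothetical tag describing what the request does
    prompt: str  # the text that would be sent to the model


def choose_model(request: Request, max_simple_chars: int = 2000) -> str:
    """Pick a model tier using a crude complexity heuristic.

    A request counts as simple when its kind is in SIMPLE_KINDS and its
    prompt is short; everything else escalates to the larger model.
    """
    is_simple_kind = request.kind in SIMPLE_KINDS
    is_short = len(request.prompt) <= max_simple_chars
    if is_simple_kind and is_short:
        return "sonnet"
    return "opus"


print(choose_model(Request("heartbeat", "ping")))
print(choose_model(Request("refactor", "Redesign the auth module")))
```

A real router would likely use richer signals than request kind and prompt length, but the same shape applies: classify first, then forward to the cheapest model that can handle the task.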

The proxy is open source and can be installed via npm:

    npm install -g @relayplane/proxy

Detailed documentation and discussion are available on r/ClaudeCode, where the post has received 52K views.

📖 Read the full source: r/openclaw
