How I reduced OpenClaw costs by 60% through model routing

Cost breakdown and analysis
An OpenClaw user running four agents (website data analytics, blog content, code review, and customer support) found they had spent $420 over 20 days, or $21/day. Every agent was configured to use Claude Opus exclusively, at $5 per 1M input tokens and $25 per 1M output tokens.
After logging 13,500 calls across all agents for 20 days, they categorized tasks by complexity (the reported shares sum to 101%, presumably because of rounding):
- 70% were simple tasks: FAQ answers, basic formatting, one-line summaries, summarizing minor PRs
- 16% were standard tasks: longer email drafts, moderate code reviews, multi-paragraph summaries
- 9% were complex tasks: deep code analysis, long-form content, multi-file context
- 6% needed real reasoning: architecture decisions, complex debugging, multi-step logic
The analysis showed they were paying premium Opus prices for the 70% of tasks that were simple, which cheaper models could handle without quality loss.
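The kind of tally described above can be sketched in a few lines of Python. The article does not show the log format, so the `(agent, tier)` records and the tier labels below are hypothetical; only the totals ($420, 13,500 calls, 20 days) come from the post.

```python
from collections import Counter

# Hypothetical log records: (agent, complexity tier).
# The real log format is not shown in the article.
calls = [
    ("support", "simple"),
    ("support", "simple"),
    ("blog", "standard"),
    ("code_review", "complex"),
    ("analytics", "reasoning"),
]


def tier_shares(records):
    """Return the fraction of calls that fall into each complexity tier."""
    counts = Counter(tier for _, tier in records)
    total = sum(counts.values())
    return {tier: n / total for tier, n in counts.items()}


shares = tier_shares(calls)

# Totals reported in the article.
TOTAL_SPEND_USD = 420.0
TOTAL_CALLS = 13_500
DAYS = 20

avg_cost_per_call = TOTAL_SPEND_USD / TOTAL_CALLS  # about $0.031 per call
calls_per_day = TOTAL_CALLS / DAYS                 # 675 calls per day
```

At roughly three cents per call on average, a FAQ answer was costing about the same as an architecture decision, which is the imbalance the routing change targets.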

Model pricing comparison
The user researched current model pricing:
- Claude Opus 4.6: $5.00 input/$25.00 output per 1M tokens (premium)
- Claude Sonnet 4.6: $3.00 input/$15.00 output per 1M tokens (mid-tier)
- Claude Haiku 4.5: $1.00 input/$5.00 output per 1M tokens, 200K context (budget)
- GPT-5.4: $2.50 input/$15.00 output per 1M tokens, 1.05M context (premium)
- Gemini 3.1 Pro: $2.00 input/$12.00 output per 1M tokens (mid-tier)
- Gemini 3 Flash: $0.50 input/$3.00 output per 1M tokens (budget)
- GLM-5: $0.72–1.00 input/$2.30–3.20 output per 1M tokens, 200K context (budget)
- Kimi K2.5: $0.60 input/$3.00 output per 1M tokens, 256K context (budget)
- MiniMax M2.5: $0.30 input/$1.20 output per 1M tokens (ultra-budget)
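To see what these price gaps mean per call, here is a minimal sketch of a cost calculator. It assumes every listed price is per 1M tokens. The dictionary keys are illustrative labels rather than real API model IDs, and the 1,000-input/300-output token counts are made-up example values, not figures from the post.

```python
# (input USD, output USD) per 1M tokens, taken from the list above.
# Keys are hypothetical labels, not provider model identifiers.
PRICES_PER_1M = {
    "claude-opus": (5.00, 25.00),
    "claude-sonnet": (3.00, 15.00),
    "claude-haiku": (1.00, 5.00),
    "gemini-flash": (0.50, 3.00),
    "minimax-m2.5": (0.30, 1.20),
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one call, given its token counts."""
    input_price, output_price = PRICES_PER_1M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example call with assumed token counts: 1,000 in, 300 out.
opus = call_cost("claude-opus", 1_000, 300)    # 0.0125 USD
haiku = call_cost("claude-haiku", 1_000, 300)  # 0.0025 USD
```

Because Haiku's input and output prices are both one fifth of Opus's, the same call costs one fifth as much regardless of the input/output mix.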

Implementation and results
They now run Opus only on genuinely complex tasks. Everything else is routed to Sonnet, Haiku, Kimi K2.5, or Qwen. It took about a week of testing to find the right model for each task type.
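A routing table of this kind might look like the sketch below. The article only states that Opus handles the genuinely complex work and that cheaper models take the rest, so the exact tier-to-model mapping, the tier names, and the model labels here are assumptions for illustration. How each request gets its tier (manual tagging, heuristics, or a classifier) is not described in the post and is left out.

```python
# Assumed mapping from complexity tier to model label.
# Not the author's actual configuration.
ROUTES = {
    "simple": "claude-haiku",
    "standard": "claude-sonnet",
    "complex": "claude-opus",
    "reasoning": "claude-opus",
}

# Unknown tiers fall back to the strongest model, so a
# misclassified task costs more but does not lose quality.
FALLBACK_MODEL = "claude-opus"


def route(tier: str) -> str:
    """Pick a model label for a task's complexity tier."""
    return ROUTES.get(tier, FALLBACK_MODEL)


print(route("simple"))     # claude-haiku
print(route("reasoning"))  # claude-opus
print(route("unlabeled"))  # claude-opus (fallback)
```

Falling back to the expensive model is a deliberate choice in this sketch: routing errors then show up as extra cost rather than as worse answers.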
Key findings from testing:
- Claude Haiku was most reliable for customer support: fast responses, followed formatting instructions well, kept answers concise
- Haiku requires explicit prompts: unlike Opus, it won't infer tone or style from vague instructions
- Rewriting system prompts to spell out exactly how replies should be structured made Haiku solid for support
- Kimi K2.5 is cheaper than Haiku and handles long context well in multi-turn conversations
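To illustrate what "spelling out" a system prompt can mean, the sketch below contrasts a vague prompt with one assembled from explicit, numbered rules. The rule wording, the `build_support_prompt` helper, and the product name are all hypothetical; the post does not quote the actual prompts.

```python
# A vague prompt of the kind that relies on the model to infer tone and structure.
VAGUE_PROMPT = "You are a helpful support agent. Be friendly."


def build_support_prompt(product: str, max_sentences: int = 3) -> str:
    """Assemble an explicit system prompt as a numbered list of rules."""
    rules = [
        f"You answer customer questions about {product}.",
        f"Reply in at most {max_sentences} sentences.",
        "Start with a direct answer, then give one next step.",
        "Use a polite, neutral tone with no emojis or exclamation marks.",
        "If you do not know the answer, say so and offer to escalate to a human.",
    ]
    return "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, start=1))


explicit_prompt = build_support_prompt("ExampleApp", max_sentences=2)
print(explicit_prompt)
```

Each rule pins down something the vague prompt leaves open: length, ordering, tone, and what to do when the answer is unknown.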
Users haven't noticed any difference on simple tasks, and costs dropped from $420 to $168 over 20 days, a 60% reduction.
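The headline figure can be checked directly from the two totals. The snippet below uses only the numbers reported in the post; the variable names are illustrative.

```python
# Totals reported in the article for a 20-day window.
before_usd = 420.0
after_usd = 168.0
days = 20

savings_usd = before_usd - after_usd  # 252.0
reduction = savings_usd / before_usd  # 0.6, i.e. 60%
daily_before = before_usd / days      # 21.0 USD/day
daily_after = after_usd / days        # 8.4 USD/day

print(f"Saved ${savings_usd:.0f} ({reduction:.0%}); "
      f"${daily_before:.2f}/day -> ${daily_after:.2f}/day")
```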
📖 Read the full source: r/openclaw