Model Routing: Cut OpenClaw Costs 60%

Cost breakdown and analysis

An OpenClaw user running four agents for website data analytics, blog content, code review, and customer support discovered they were spending $420 over 20 days ($21/day). All agents were configured to use Claude Opus exclusively at $5/1M input tokens and $25/1M output tokens.

After logging 13,500 calls across all agents for 20 days, they categorized tasks by complexity:

70% were simple tasks: FAQ answers, basic formatting, one-line summaries, summarizing minor PRs
16% were standard tasks: longer email drafts, moderate code reviews, multi-paragraph summaries
9% were complex tasks: deep code analysis, long-form content, multi-file context
6% needed real reasoning: architecture decisions, complex debugging, multi-step logic

The analysis revealed they were paying premium Opus prices for the 70% of tasks that were simple enough for cheaper models to handle without quality loss.
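The tally behind a breakdown like this can be sketched in a few lines. This is a minimal illustration, not the user's actual tooling: the log record shape (`agent`, `tier`) and the function name `tier_shares` are assumptions, and the sample data is made up.

```python
from collections import Counter


def tier_shares(call_log):
    """Return each tier's share of logged calls as a percentage (1 decimal)."""
    counts = Counter(record["tier"] for record in call_log)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tier: round(100 * count / total, 1) for tier, count in counts.items()}


# Hypothetical sample log: 10 calls, already labeled by complexity tier.
sample_log = (
    [{"agent": "support", "tier": "simple"}] * 7
    + [{"agent": "blog", "tier": "standard"}] * 2
    + [{"agent": "code-review", "tier": "complex"}]
)

print(tier_shares(sample_log))
```

Labeling each call with a tier is the hard part and is left out here; the post does not say how the user assigned the categories.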

Model pricing comparison

The user researched current model pricing:

Claude Opus 4.6: $5.00 input/$25.00 output per 1M tokens (premium)
Claude Sonnet 4.6: $3.00 input/$15.00 output per 1M tokens (mid-tier)
Claude Haiku 4.5: $1.00 input/$5.00 output per 1M tokens, 200K context (budget)
GPT-5.4: $2.50 input/$15.00 output per 1M tokens, 1.05M context (premium)
Gemini 3.1 Pro: $2.00 input/$12.00 output per 1M tokens (mid-tier)
Gemini 3 Flash: $0.50 input/$3.00 output per 1M tokens (budget)
GLM-5: $0.72–1.00 input/$2.30–3.20 output per 1M tokens, 200K context (budget)
Kimi K2.5: $0.60 input/$3.00 output per 1M tokens, 256K context (budget)
MiniMax M2.5: $0.30 input/$1.20 output per 1M tokens (ultra-budget)
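To see what these rates mean per call, here is a rough sketch that prices a single request. The prices come from the list above (only entries quoted per 1M tokens are included). The short model keys, the function name `call_cost`, and the 2,000-input/500-output token counts are illustrative assumptions.

```python
# (input $, output $) per 1M tokens, taken from the list above.
# The short keys are illustrative labels, not real API model identifiers.
PRICES_PER_MILLION = {
    "opus": (5.00, 25.00),
    "sonnet": (3.00, 15.00),
    "gemini-flash": (0.50, 3.00),
    "minimax": (0.30, 1.20),
}


def call_cost(model, input_tokens, output_tokens):
    """Estimate the dollar cost of one call from its token counts."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Assumed call size for illustration: 2,000 input tokens, 500 output tokens.
for model in PRICES_PER_MILLION:
    print(f"{model}: ${call_cost(model, 2_000, 500):.4f}")
```

At that assumed size, an Opus call costs about $0.0225 and a Sonnet call about $0.0135, so the gap compounds quickly across thousands of calls.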

Implementation and results

They now run Opus only on genuinely complex tasks. Everything else gets routed to Sonnet, Haiku, Kimi K2.5, or Qwen. Finding the right model for each task type took about a week of testing.
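The routing idea can be sketched as a simple lookup from task tier to model. This is an assumption-heavy illustration: the post does not give the exact tier-to-model mapping, the model strings are placeholders, and none of this is OpenClaw configuration syntax.

```python
# Hypothetical tier-to-model mapping. Only "Opus for complex work" and
# "Haiku for support-style simple tasks" are suggested by the post; the
# rest is a guess for illustration.
ROUTES = {
    "simple": "claude-haiku",
    "standard": "claude-sonnet",
    "complex": "claude-opus",
    "reasoning": "claude-opus",
}

# Unknown tiers fall back to the most capable model, so a labeling
# mistake costs money rather than quality.
DEFAULT_MODEL = "claude-opus"


def pick_model(tier: str) -> str:
    """Return the model name to use for a task of the given tier."""
    return ROUTES.get(tier, DEFAULT_MODEL)


print(pick_model("simple"))
print(pick_model("unlabeled"))
```

Defaulting to the expensive model is a deliberate choice in this sketch: it keeps quality safe while the tier labels are still being tuned.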

Key findings from testing:

Claude Haiku was most reliable for customer support: fast responses, followed formatting instructions well, kept answers concise
Haiku requires explicit prompts: it won't infer tone or style from vague instructions the way Opus does
Rewriting system prompts to spell out exactly how replies should be structured made Haiku solid for support
Kimi K2.5 is cheaper than Haiku and handles longer context well in multi-turn conversations
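As an illustration of what "spelling out the structure" might look like, the sketch below assembles an explicit support prompt from individual rules. The wording of every rule, the function name `build_support_prompt`, and the `max_sentences` parameter are hypothetical; the post does not quote the user's actual prompts.

```python
def build_support_prompt(max_sentences: int = 3) -> str:
    """Assemble a system prompt that states tone, length, and layout explicitly."""
    rules = [
        "You answer customer support questions.",
        f"Reply in at most {max_sentences} sentences.",
        "Use a friendly, professional tone.",
        "Start with the direct answer, then add one follow-up step if needed.",
        "Use plain text with no headings or bullet lists.",
    ]
    return "\n".join(rules)


print(build_support_prompt())
```

Compare this with a vague instruction such as "be helpful and concise", which leaves tone, length, and layout for the model to guess.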

Users haven't noticed any difference on simple tasks, and costs dropped from $420 to $168 over 20 days, a 60% reduction (from $21/day to $8.40/day).
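The headline figure can be checked with a few lines of arithmetic. The dollar amounts, the 20-day window, and the 13,500-call count are from the post; the variable names are just for illustration.

```python
cost_before = 420.0   # Opus-only spend over 20 days
cost_after = 168.0    # spend over 20 days with routing
days = 20
calls_logged = 13_500  # calls logged during the Opus-only period

saved = cost_before - cost_after
saved_pct = 100 * saved / cost_before
daily_before = cost_before / days
daily_after = cost_after / days
avg_cost_per_call_before = cost_before / calls_logged

print(f"Saved ${saved:.2f} ({saved_pct:.0f}%)")
print(f"Daily spend: ${daily_before:.2f} -> ${daily_after:.2f}")
print(f"Average cost per call before routing: ${avg_cost_per_call_before:.4f}")
```

The average Opus-only call works out to roughly three cents, which is why moving the bulk of simple calls to cheaper models moves the total so much.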

📖 Read the full source: r/openclaw