How I reduced OpenClaw costs by 60% through model routing

✍️ OpenClawRadar · 📅 Published: March 16, 2026

Cost breakdown and analysis

An OpenClaw user running four agents for website data analytics, blog content, code review, and customer support discovered they were spending $420 over 20 days ($21/day). All agents were configured to use Claude Opus exclusively at $5/1M input tokens and $25/1M output tokens.

After logging 13,500 calls across all agents over those 20 days, they categorized the tasks by complexity. The reported shares are approximate (they sum to 101%):

  • 70% were simple tasks: FAQ answers, basic formatting, one-line summaries, minor PR summaries
  • 16% were standard tasks: longer email drafts, moderate code reviews, multi-paragraph summaries
  • 9% were complex tasks: deep code analysis, long-form content, multi-file context
  • 6% needed real reasoning: architecture decisions, complex debugging, multi-step logic

The analysis revealed they were paying premium Opus prices for the 70% of calls that were simple enough for cheaper models to handle without quality loss.
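To make the breakdown concrete, the reported figures can be turned into call counts and an average cost per call. This is a minimal sketch using only the numbers quoted above; the per-call average assumes nothing about how tokens were actually distributed across task types, which the source does not report.

```python
# Figures reported in the article.
TOTAL_SPEND_USD = 420.00
TOTAL_CALLS = 13_500
DAYS = 20

# Reported task mix in whole percent. The shares are rounded and sum to 101.
TASK_SHARE_PCT = {"simple": 70, "standard": 16, "complex": 9, "reasoning": 6}

# Blended averages across all task types (not a per-tier cost).
avg_cost_per_call = TOTAL_SPEND_USD / TOTAL_CALLS
calls_per_day = TOTAL_CALLS / DAYS

# Integer arithmetic avoids floating-point rounding in the counts.
calls_by_tier = {
    tier: TOTAL_CALLS * pct // 100 for tier, pct in TASK_SHARE_PCT.items()
}

if __name__ == "__main__":
    print(f"average cost per call: ${avg_cost_per_call:.4f}")
    print(f"calls per day: {calls_per_day:.0f}")
    for tier, count in calls_by_tier.items():
        print(f"{tier}: {count} calls")
```

The blended average comes to roughly three cents per call, which is the number a router has to beat on the simple tier.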

Model pricing comparison

The user researched current model pricing:

  • Claude Opus 4.6: $5.00 input/$25.00 output per 1M tokens (premium)
  • Claude Sonnet 4.6: $3.00 input/$15.00 output per 1M tokens (mid-tier)
  • Claude Haiku 4.5: $1.00 input/$5.00 output per 1M tokens (200K context window, budget)
  • GPT-5.4: $2.50 input/$15.00 output per 1M tokens (1.05M context window, premium)
  • Gemini 3.1 Pro: $2.00 input/$12.00 output per 1M tokens (mid-tier)
  • Gemini 3 Flash: $0.50 input/$3.00 output per 1M tokens (budget)
  • GLM-5: $0.72–1.00 input/$2.30–3.20 output per 1M tokens (200K context window, budget)
  • Kimi K2.5: $0.60 input/$3.00 output per 1M tokens (256K context window, budget)
  • MiniMax M2.5: $0.30 input/$1.20 output per 1M tokens (ultra-budget)
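A rough way to compare these prices is to cost out the same call on several models. The sketch below assumes every price is quoted per 1M tokens, uses made-up dictionary keys rather than real API model identifiers, and picks a hypothetical call size of 2,000 input and 500 output tokens; none of those details come from the source.

```python
# USD per 1M tokens as (input, output), copied from the list above.
# The keys are illustrative labels, not real API model identifiers.
PRICES_PER_MILLION = {
    "claude-opus-4.6": (5.00, 25.00),
    "claude-sonnet-4.6": (3.00, 15.00),
    "claude-haiku-4.5": (1.00, 5.00),
    "gemini-3-flash": (0.50, 3.00),
    "minimax-m2.5": (0.30, 1.20),
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call at the listed per-1M-token prices."""
    in_price, out_price = PRICES_PER_MILLION[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical call size, chosen only for illustration.
    for name in PRICES_PER_MILLION:
        print(f"{name}: ${call_cost(name, 2_000, 500):.4f}")
```

Because Opus and Haiku differ by the same factor on input and output, any call costs five times as much on Opus as on Haiku at these prices, regardless of its input/output mix.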

Implementation and results

They now run Opus only on genuinely complex tasks. Everything else gets routed to Sonnet, Haiku, Kimi K2.5, or Qwen. It took about a week of testing to find the right model for each task type.
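The routing idea can be sketched as a lookup from task tier to model. The source does not publish the actual mapping or how tasks are classified, so the tier names, the `ROUTES` table, and `pick_model` below are all hypothetical; the caller is assumed to supply the tier already.

```python
from enum import Enum


class Tier(Enum):
    SIMPLE = "simple"
    STANDARD = "standard"
    COMPLEX = "complex"
    REASONING = "reasoning"


# Hypothetical mapping: the article only says Opus is kept for genuinely
# complex work and that cheaper models take everything else.
ROUTES = {
    Tier.SIMPLE: "haiku",
    Tier.STANDARD: "sonnet",
    Tier.COMPLEX: "opus",
    Tier.REASONING: "opus",
}

# Unclassified work falls back to the strongest model rather than risk quality.
DEFAULT_MODEL = "opus"


def pick_model(tier):
    """Return the model label for a task tier, or the default if unknown."""
    return ROUTES.get(tier, DEFAULT_MODEL)


if __name__ == "__main__":
    for tier in Tier:
        print(f"{tier.value} -> {pick_model(tier)}")
```

Falling back to the premium model for unclassified tasks trades some savings for safety; a stricter setup could reject unknown tiers instead.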

Key findings from testing:

  • Claude Haiku was most reliable for customer support: fast responses, followed formatting instructions well, kept answers concise
  • Haiku requires explicit prompts: it won't infer tone or style from vague instructions the way Opus does
  • Rewriting system prompts to spell out exactly how replies should be structured made Haiku solid for support
  • Kimi K2.5 is cheaper than Haiku ($0.60/$3.00 vs. $1.00/$5.00) and handles longer context well in multi-turn conversations

Users haven't noticed any difference on simple tasks, and costs dropped from $420 to $168 over 20 days, a 60% reduction (from $21.00 to $8.40 per day).
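The headline figure is easy to check from the two totals. This short sketch uses only the numbers reported above; the variable names are illustrative.

```python
# Reported spend over the same 20-day window, before and after routing.
BEFORE_USD = 420.00
AFTER_USD = 168.00
DAYS = 20

saved_usd = BEFORE_USD - AFTER_USD
reduction_pct = saved_usd / BEFORE_USD * 100
daily_before = BEFORE_USD / DAYS
daily_after = AFTER_USD / DAYS

if __name__ == "__main__":
    print(f"saved: ${saved_usd:.2f} ({reduction_pct:.0f}%)")
    print(f"daily spend: ${daily_before:.2f} -> ${daily_after:.2f}")
```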

📖 Read the full source: r/openclaw
