How to Cut OpenClaw Agent Costs by 80% with Model Switching

A Reddit user spent two weeks manually logging every OpenClaw agent interaction to figure out where their money was going. The results are a clear blueprint for optimizing spend on AI agents.
The Breakdown
Over 14 days on a Telegram + Discord agent, token usage broke down as follows:
- Heartbeats (30-min polls) — 38% of usage. Running on Opus at ~$6.75/M tokens. Complete waste for a status ping.
- File reads and summaries — 29% of usage. Also on Opus. Flash handles these identically.
- Actual conversations — 22% of usage. Here model quality matters.
- Complex tasks — 11% of usage. Where Opus genuinely outperforms Flash.
In total, 67% of usage (38% heartbeats plus 29% file reads) went to tasks where DeepSeek V4 Flash ($0.14/M tokens) would deliver the same quality as Opus (~$6.75/M tokens effective, once tokenizer differences are factored in).
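To see what that split means in dollars, here is a rough sketch of the blended per-token rate under two routing scenarios. It uses only the rates and percentages quoted above, and it assumes a single flat rate per model (no input/output price split, no caching), which is a simplification rather than how any provider actually bills.

```python
# Rates quoted in the post, in USD per million tokens (flat-rate assumption).
OPUS_RATE = 6.75
FLASH_RATE = 0.14

# Share of total token usage per category, from the 14-day log.
USAGE_SHARE = {
    "heartbeats": 0.38,
    "file_reads": 0.29,
    "conversations": 0.22,
    "complex_tasks": 0.11,
}


def blended_rate(flash_categories):
    """Average USD per million tokens when the given categories run on Flash."""
    total = 0.0
    for category, share in USAGE_SHARE.items():
        rate = FLASH_RATE if category in flash_categories else OPUS_RATE
        total += share * rate
    return total


def reduction(flash_categories):
    """Fractional cost reduction versus running everything on Opus."""
    return 1.0 - blended_rate(flash_categories) / OPUS_RATE


if __name__ == "__main__":
    scenarios = {
        "heartbeats + file reads on Flash": {"heartbeats", "file_reads"},
        "everything except complex tasks on Flash": {
            "heartbeats", "file_reads", "conversations",
        },
    }
    for label, categories in scenarios.items():
        print(f"{label}: ${blended_rate(categories):.2f}/M, "
              f"{reduction(categories):.1%} cheaper than all-Opus")
```

Under these assumptions, moving only heartbeats and file reads to Flash cuts cost by about 66%, and moving conversations as well cuts it by about 87%. The savings the user reports fall between the two, which would fit a setup where most conversations stay on Flash and only some escalate to Opus. That reading is an inference, not something the post states.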
The Fix: Default to Flash, Escalate Only When Needed
Set your primary model to deepseek/deepseek-v4-flash in openclaw.json:
"agents": {
  "defaults": {
    "model": {
      "primary": "deepseek/deepseek-v4-flash"
    }
  }
}
Then use /model anthropic/claude-opus-4-7 mid-session when you hit something truly hard. The switch is instant: no restart, same session. Type /model deepseek/deepseek-v4-flash when you're done to drop back to the cheap model.
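If you would rather script the config change than edit the file by hand, the following is a minimal sketch. It assumes openclaw.json is plain JSON (no comments or trailing commas) and that you pass the path to your own copy of the file. The helper name set_primary_model is made up for this example and is not part of OpenClaw.

```python
import json
import sys
from pathlib import Path


def set_primary_model(config_path, model_id):
    """Set agents.defaults.model.primary in a JSON config, keeping other keys.

    Hypothetical helper: assumes the file is strict JSON.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    # Create the nested objects only if they are missing.
    model = (
        config.setdefault("agents", {})
        .setdefault("defaults", {})
        .setdefault("model", {})
    )
    model["primary"] = model_id
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Usage: python set_model.py /path/to/openclaw.json
    set_primary_model(sys.argv[1], "deepseek/deepseek-v4-flash")
```

Back up the file first: rewriting it through json.dumps normalizes the formatting, and parsing will fail if your config uses a looser JSON dialect.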
Results
Costs dropped from ~$170/month to ~$35/month. The user reported no noticeable quality difference on heartbeats, file reads, or simple questions.
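As a quick check on the headline number, the reduction implied by those two monthly figures can be computed directly. Both figures are the approximate ones quoted above, so the result is approximate too.

```python
BEFORE_USD = 170.0  # approximate monthly cost before the switch
AFTER_USD = 35.0    # approximate monthly cost after the switch


def percent_reduction(before, after):
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    print(f"{percent_reduction(BEFORE_USD, AFTER_USD):.1f}% lower")   # 79.4% lower
    print(f"${BEFORE_USD - AFTER_USD:.0f}/month saved")               # $135/month saved
```

That works out to roughly 79%, which the title rounds to 80%.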
The user notes that BetterClaw's free tier (with BYOK) now shows per-task API spend, which would have caught the heartbeat waste immediately. But the core move — switching primary to Flash and /model-ing up to Opus only when needed — is the real takeaway.
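Without a dashboard, the same visibility can come from a manual log like the one the user kept. The sketch below tallies tokens per task category and ranks them by share. The log format (category, token count) and the sample entries are made up for illustration, with the sample numbers chosen only to mirror the percentages in the breakdown.

```python
from collections import defaultdict


def usage_share(entries):
    """Return each category's share of total tokens, largest first."""
    totals = defaultdict(int)
    for category, tokens in entries:
        totals[category] += tokens
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {category: tokens / grand_total for category, tokens in ranked}


if __name__ == "__main__":
    # Hypothetical log entries: (task category, tokens used).
    sample_log = [
        ("heartbeat", 1900),
        ("file_read", 2900),
        ("conversation", 2200),
        ("heartbeat", 1900),
        ("complex_task", 1100),
    ]
    for category, share in usage_share(sample_log).items():
        print(f"{category:>13}: {share:.0%}")
```

Sorting by share puts the biggest consumer at the top, which is how a background task like the heartbeat poll stands out.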
📖 Read the full source: r/openclaw