Slash Claude costs 60x by offloading mechanical tasks to DeepSeek V4 Flash via MCP

A Reddit user analyzed their Claude usage and found the bulk of it went to mechanical tasks: classifying files, reformatting JSON, pulling fields from text, and summarizing docs they'd skim anyway. None of that needed Sonnet. The fix: a small cheap model running as a side worker via MCP, plus a single rule in CLAUDE.md telling Claude not to do those tasks.
Setup: an MCP tool + CLAUDE.md deny-list
The setup uses a single MCP tool that sends text and gets text back. Default model is DeepSeek V4 Flash (cheap, 1M context). The endpoint is one config line and works with any OpenAI-compatible provider (local ollama, vllm, lm studio). The repo is github.com/arizen-dev/deepseek-mcp (MIT, Python 3.10+).
The critical piece: the CLAUDE.md rule uses negative framing — a deny list, not a permission list. The user reports positive framing ("use DeepSeek for X") got ignored ~30% of the time. The deny list approach catches it reliably.
# In CLAUDE.md:
# do NOT use Claude for:
# - json formatting
# - field extraction
# - file classification
# - summarization you will review anyway
Results: 60x cost reduction
Over 3 weeks of real usage: 217 mechanical calls offloaded to DeepSeek V4 Flash, total spend $0.41. Same workload on Sonnet would have been roughly $7. That's a ~17x multiplier on just those tasks, and the user says overall bill dropped 60x when factoring in heavier tasks still on Sonnet.
How the side worker operates
The side worker is a supervised tool, not an agent — no tool calls, no file access, no chains. Latency is 3–25 seconds. You review the output. The whole shape is: send text, get text back, review, move on.
Who it's for
Developers using Claude API or Claude Code who want to cut spend on high-volume mechanical tasks without losing Sonnet's reasoning for complex work.
📖 Read the full source: r/ClaudeAI
👀 See Also

OpenClaw Community Thread: Share Your AI Coding Setup and Monthly Costs
A Reddit thread in r/openclaw collects practical setups for AI coding agents, focusing on model routing strategies, cost-saving rules, and community-sourced hardware-to-model mappings with monthly cost ranges.

Five Common OpenClaw Configuration Issues That Inflate API Costs
A Reddit post identifies five configuration problems in OpenClaw setups that lead to excessive API credit consumption, including using expensive models for routine tasks, missing budget limits, open gateways, unmanaged memory, and unaudited skills.

OpenClaw Pre-Launch Checklist for Security and Reliability
A Reddit user shares a practical six-point checklist for OpenClaw setup before going live, covering access control, safety rules, memory management, automation testing, delivery validation, and failure handling.

How to Secure Claude Cowork with a Proxy Layer: Practical Guide
A walkthrough on setting up a proxy layer to observe and secure Claude Cowork's behavior, published by General Analysis team.