Claude Code Skill Delegates Coding to Mistral/DeepSeek: 57M Tokens Saved, 90-100% Cost Reduction

Developer pcx_wave posted a detailed breakdown of vibe-skill, a Claude Code skill that delegates coding tasks to cheaper models (Mistral or DeepSeek) while using Claude for planning and review. After 10 days and 254 runs, they saved 57 million tokens and cut costs by 90-100% while maintaining Claude-quality output.
How It Works
Vibe-skill runs inside Claude Code. You type /vibeon <whatever>; Claude decomposes the task and delegates the actual coding to a lightweight model (via the open-source Vibe tool). Claude then reviews the diff and corrects any failures. The cheap model absorbs the token-heavy code generation, so Claude only spends tokens on planning and review.
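The plan, delegate, review loop described above can be sketched roughly as follows. This is a minimal illustration, not vibe-skill's actual code: `run_delegated`, `StepResult`, and the four callables are hypothetical names, and the demo functions are stand-ins so the sketch runs without any API access.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    subtask: str
    diff: str
    corrected: bool  # True when the reviewer had to fix the cheap model's diff


def run_delegated(
    task: str,
    plan: Callable[[str], List[str]],      # expensive model: split the task
    cheap_code: Callable[[str], str],      # cheap model: produce a diff
    review: Callable[[str, str], bool],    # expensive model: accept or reject
    fix: Callable[[str, str], str],        # expensive model: repair a bad diff
) -> List[StepResult]:
    """Hypothetical orchestration loop: plan, delegate, review, correct."""
    results = []
    for subtask in plan(task):
        diff = cheap_code(subtask)
        accepted = review(subtask, diff)
        if not accepted:
            diff = fix(subtask, diff)
        results.append(StepResult(subtask, diff, corrected=not accepted))
    return results


# Stand-in callables (assumptions for the demo, not real model calls).
def demo_plan(task: str) -> List[str]:
    return [f"{task}: step {i}" for i in (1, 2)]


def demo_cheap_code(subtask: str) -> str:
    # Simulate one failed delegation by returning an empty diff for step 2.
    return "" if subtask.endswith("step 2") else f"diff for {subtask}"


def demo_review(subtask: str, diff: str) -> bool:
    return bool(diff)


def demo_fix(subtask: str, diff: str) -> str:
    return f"corrected diff for {subtask}"


demo = run_delegated("add logging", demo_plan, demo_cheap_code, demo_review, demo_fix)
for step in demo:
    print(step)
```

The point of the structure is that only `plan`, `review`, and `fix` would consume the expensive model's tokens; the bulk generation happens in `cheap_code`.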
Results by Model
| Model | Tokens Delegated | Actual Cost | Claude Equivalent | Savings |
|---|---|---|---|---|
| DeepSeek V4 Flash | 29M | $4.13 | $92.16 | 95% |
| Mistral Medium 3.5 | 28M | $0 (Pro sub) | $84.77 | 100% |
Overall success rate: 98% across 254 runs. When delegation fails, Claude catches and corrects the output.
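As a sanity check, the savings column and the implied per-million-token rates can be recomputed from the table. The sketch below uses only the figures shown above; the helper names are illustrative, and the token counts are the rounded values from the table, so the derived rates are approximate.

```python
# Figures copied from the results table (token counts are rounded, in millions).
RESULTS = {
    "DeepSeek V4 Flash": {"tokens_m": 29, "actual_usd": 4.13, "claude_usd": 92.16},
    "Mistral Medium 3.5": {"tokens_m": 28, "actual_usd": 0.0, "claude_usd": 84.77},
}


def savings_fraction(actual_usd: float, claude_usd: float) -> float:
    """Share of the Claude-equivalent cost that was avoided."""
    return 1.0 - actual_usd / claude_usd


def usd_per_million(cost_usd: float, tokens_m: float) -> float:
    """Implied price per million tokens."""
    return cost_usd / tokens_m


for name, row in RESULTS.items():
    saved = savings_fraction(row["actual_usd"], row["claude_usd"])
    claude_rate = usd_per_million(row["claude_usd"], row["tokens_m"])
    print(f"{name}: {saved:.1%} saved, implied Claude rate ${claude_rate:.2f}/M")

total_tokens_m = sum(row["tokens_m"] for row in RESULTS.values())
print(f"Total delegated: {total_tokens_m}M tokens")
```

The DeepSeek row works out to roughly 95.5%, which the table reports as 95%, and the two token counts sum to the 57M headline figure.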
Token Economics
Mistral tokens are roughly 50% cheaper than Claude's; DeepSeek tokens are 95% cheaper. The author uses a Mistral Pro subscription ($18.36/mo), which includes about 1 billion tokens at no extra charge. For Mistral Pro subscribers, delegation costs $0 until the quota is exhausted, after which the skill automatically falls back to DeepSeek (since Mistral PAYG at $1.52/M tokens is about 10× more expensive than DeepSeek).
The break-even point: DeepSeek alone is cheaper than the Mistral Pro subscription if you delegate fewer than 131M tokens/month ($18.36 / $0.14 per M). Above that volume, Mistral Pro wins, with headroom up to the roughly 1 billion-token quota (about 7.6× the break-even volume; the post rounds this to ~10×).
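The break-even arithmetic can be written out as a small calculation. This is a sketch under stated assumptions: the prices are the ones quoted above, the quota is taken as exactly 1,000M tokens even though the source only says "about 1 billion", and the function names are illustrative.

```python
MISTRAL_PRO_MONTHLY_USD = 18.36   # subscription price quoted in the post
DEEPSEEK_USD_PER_M = 0.14         # DeepSeek price per million tokens
MISTRAL_PRO_QUOTA_M = 1000.0      # assumption: "about 1 billion" taken as 1,000M


def break_even_m_tokens() -> float:
    """Monthly volume (in millions) where DeepSeek-only spend equals the subscription."""
    return MISTRAL_PRO_MONTHLY_USD / DEEPSEEK_USD_PER_M


def monthly_cost(tokens_m: float) -> dict:
    """Compare DeepSeek-only against Mistral Pro with DeepSeek fallback past the quota."""
    deepseek_only = tokens_m * DEEPSEEK_USD_PER_M
    overflow_m = max(0.0, tokens_m - MISTRAL_PRO_QUOTA_M)
    mistral_pro = MISTRAL_PRO_MONTHLY_USD + overflow_m * DEEPSEEK_USD_PER_M
    return {"deepseek_only": deepseek_only, "mistral_pro_with_fallback": mistral_pro}


print(f"Break-even: {break_even_m_tokens():.0f}M tokens/month")
for volume_m in (57, 131, 500):
    costs = monthly_cost(volume_m)
    cheaper = min(costs, key=costs.get)
    print(f"{volume_m}M tokens: {costs} -> cheaper: {cheaper}")
```

At the 57M tokens from this 10-day run, DeepSeek alone stays under the subscription price; the subscription only pays for itself at sustained volumes above the break-even.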
Setup
The skill is open source at github.com/pcx-wave/vibe-skill. A similar Gemini skill is also available, but it is less configurable and flaky. To use vibe-skill, clone the repo and load the skill into Claude Code, then run /vibeon followed by your task.
📖 Read the full source: r/ClaudeAI