Open-source local hook automatically switches Claude models to cut AI costs

✍️ OpenClawRadar | 📅 Published: March 7, 2026

A developer has open-sourced a local hook that automatically selects the most cost-effective Claude model for each type of coding task. By the developer's own estimate, it could cut AI costs by 50-70% without a drop in quality.

How it works

The tool runs as a local hook in Cursor and Claude Code (both use the same hook system) and fires before each prompt is sent. It sits next to Opus/plan and acts as a lightweight front-end filter that catches obviously poor model matches, in either direction, before the prompt reaches a model.

Key functionality

Reads the prompt and current model selection
Uses simple keyword rules to classify tasks (git operations, feature work, architecture/deep analysis)
Blocks if you're overpaying (e.g., Opus for git commit) and suggests Haiku or Sonnet
Blocks if you're underpowered (Sonnet/Haiku for architecture) and suggests Opus
Lets everything else through unchanged
A leading ! bypasses the filter completely if you disagree with its suggestion
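
The behavior listed above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the keyword lists, the function names classify and check, the model name strings, and the shape of the returned dictionary are all assumptions made for the example.

```python
# Hypothetical keyword rules; the real project keeps its own rules in a JSON file.
RULES = {
    "git": ("git commit", "git push", "rename", "format"),
    "architecture": ("architecture", "design", "trade-off", "deep analysis"),
}
CHEAP_MODELS = ("haiku", "sonnet")


def classify(prompt: str) -> str:
    """Return the first task category whose keywords appear in the prompt."""
    text = prompt.lower()
    for task, keywords in RULES.items():
        if any(keyword in text for keyword in keywords):
            return task
    return "other"


def check(prompt: str, model: str) -> dict:
    """Decide whether to let the prompt through or block with a suggestion."""
    # A leading "!" bypasses the filter entirely.
    if prompt.startswith("!"):
        return {"action": "allow", "reason": "bypass"}

    task = classify(prompt)
    model = model.lower()

    # Overpaying: an expensive model selected for a trivial task.
    if task == "git" and "opus" in model:
        return {"action": "block", "suggest": "haiku or sonnet"}

    # Underpowered: a cheap model selected for deep analysis.
    if task == "architecture" and any(m in model for m in CHEAP_MODELS):
        return {"action": "block", "suggest": "opus"}

    # Everything else passes through unchanged.
    return {"action": "allow", "reason": "no rule matched"}
```

For example, check("git commit these changes", "claude-opus") would block and suggest a cheaper model, while the same prompt with a leading ! would pass straight through. Plain substring matching is crude (the keyword "format" also matches "information"), which is presumably why the bypass prefix matters in practice.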

Technical details

3 files: bash + python3 + JSON
No proxy, no API calls, no external services
Fail-open design: if it hangs, Claude Code proceeds normally
Open-sourced at: https://github.com/coyvalyss1/model-matchmaker
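
To make the fail-open idea concrete, here is one way it could look in Python. This is a sketch under stated assumptions, not the repository's implementation: the helper name fail_open, the decide callback, the one-second default timeout, and the dictionary format are all hypothetical.

```python
import threading


def fail_open(decide, prompt, model, timeout_s=1.0):
    """Run decide(prompt, model), but allow the prompt on any error or hang."""
    result = {}

    def worker():
        try:
            result["decision"] = decide(prompt, model)
        except Exception:
            # A crash inside the filter must never block the user.
            result["decision"] = {"action": "allow", "reason": "error"}

    # Daemon thread: a hung filter cannot keep the process alive.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)

    # If the worker has not produced a decision in time, treat it as hung
    # and let the prompt through.
    return result.get("decision", {"action": "allow", "reason": "timeout"})
```

The design choice is that the filter is advisory: a wrong "allow" costs some money, while a wrong "block" or a frozen editor costs the developer's flow, so every failure path resolves to "allow".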

Performance and testing

The developer analyzed several weeks of their own prompts and found:

60-70% were standard feature work Sonnet could handle
5-20% were debugging/troubleshooting
A significant portion were pure git/rename/formatting tasks that Haiku handles identically at 90% less cost

Retroactive analysis showed the tool would have cut 50-70% of AI spend with no quality drop. After tuning, it correctly handled 12/12 real test prompts.
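
The arithmetic behind an estimate like this is a weighted average. The sketch below shows the calculation only; the function name blended_savings is hypothetical, and the shares and relative costs in the example are illustrative placeholders, not the developer's measurements or actual model prices.

```python
def blended_savings(shares, relative_cost):
    """Fraction of spend saved versus sending every prompt to the baseline model.

    shares: fraction of prompts routed to each model (must sum to 1).
    relative_cost: cost of each model relative to the baseline (baseline = 1.0).
    """
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    routed_cost = sum(shares[model] * relative_cost[model] for model in shares)
    return 1.0 - routed_cost


# Placeholder inputs, assumed for illustration only.
example_shares = {"opus": 0.20, "sonnet": 0.65, "haiku": 0.15}
example_costs = {"opus": 1.0, "sonnet": 0.4, "haiku": 0.1}
example_result = blended_savings(example_shares, example_costs)
```

With these placeholder inputs the routed cost is 0.20 + 0.26 + 0.015 = 0.475 of the all-Opus baseline, a saving of about 52%. The real figure depends on each developer's prompt mix and on current model pricing.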

Problem it solves

The issue isn't knowledge (developers know they should switch models) but friction. In a flow state, nobody wants to stop and think about a model dropdown. The tool makes that decision automatically.

📖 Read the full source: r/ClaudeAI
