Route Claude Code through Ollama: Slash Costs 90%

This repo by Coherence Daddy provides a complete setup to route Claude Code terminal sessions through a local Ollama instance while keeping Claude Desktop on Anthropic's paid Pro tier. The result: a claimed ~90% reduction in Claude Code API costs.

How It Works

You run two engines side by side:

Claude Desktop (Anthropic) – used for strategy, architecture, code review, and tricky bugs.
Claude Code → Ollama – used for lints, refactors, repetitive edits, batch file ops, and grep-and-replace tasks. It runs on a free open-source model of your choice, such as Gemma, Qwen, or DeepSeek.
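
The split above can be pictured as a simple lookup from task type to engine. The following is a minimal sketch only: the function name pick_engine and the task labels are hypothetical, invented here for illustration, and are not part of the repo. In practice the choice is made by you, by deciding which tool to open for a given task.

```shell
#!/bin/sh
# Hypothetical helper: map a task label to the engine described above.
# The function name and labels are illustrative, not taken from the repo.
pick_engine() {
  case "$1" in
    lint|refactor|batch-edit|grep-replace)
      # Mechanical, repetitive work goes to Claude Code backed by Ollama.
      echo "ollama" ;;
    strategy|architecture|review|tricky-bug)
      # Reasoning-heavy work stays on Claude Desktop.
      echo "desktop" ;;
    *)
      # Unknown tasks default to the paid engine in this sketch.
      echo "desktop" ;;
  esac
}

pick_engine lint          # prints: ollama
pick_engine architecture  # prints: desktop
```

Defaulting unknown tasks to Claude Desktop is a design choice of this sketch, on the assumption that an unclassified task is more likely to need careful reasoning than bulk editing.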

Setup Process

The repo includes a self-contained, 21-slide HTML presentation with a copy-paste prompt that handles ~98% of the setup automatically. The prompt detects your OS (macOS, Windows with WSL2, or Linux), installs the required components, configures the router, and verifies both engines at the end.
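
How the prompt detects the OS is not shown here. One common approach, sketched below under that caveat, is to read the kernel name from uname -s and look for "microsoft" in /proc/version to recognize WSL. The function name classify_os and its return labels are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical sketch of OS detection; not the repo's actual logic.
# $1: kernel name (as printed by `uname -s`)
# $2: kernel version string (contents of /proc/version, may be empty)
classify_os() {
  kernel_name="$1"
  kernel_version="$2"
  case "$kernel_name" in
    Darwin)
      echo "macos" ;;
    Linux)
      # WSL kernels identify themselves with "Microsoft" or "microsoft".
      case "$kernel_version" in
        *[Mm]icrosoft*) echo "wsl" ;;
        *)              echo "linux" ;;
      esac ;;
    *)
      echo "unsupported" ;;
  esac
}

# Classify the current machine; /proc/version does not exist on macOS,
# so errors from cat are discarded and an empty string is passed instead.
classify_os "$(uname -s)" "$(cat /proc/version 2>/dev/null)"
```

Passing the two strings in as arguments, rather than reading them inside the function, keeps the classification easy to check on any machine.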

To run locally:

git clone https://github.com/Coherence-Daddy/use-ollama-to-enhance-claude.git
cd use-ollama-to-enhance-claude/presentation
open index.html  # macOS; on Linux use xdg-open index.html, or drag the file into a browser

Alternatively, skip the deck and use the copy-paste prompt in prompts/copy-paste-prompt.md directly.
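
Once setup has finished, you may want to confirm by hand that the Ollama side is reachable. The sketch below assumes Ollama is listening on its default address, http://localhost:11434, and queries the /api/tags endpoint, which lists locally available models. The function name ollama_status is hypothetical, and this check covers only Ollama, not Claude Desktop or the router configuration.

```shell
#!/bin/sh
# Hypothetical health check for a local Ollama server.
# $1: base URL of the server (defaults to Ollama's standard local address)
ollama_status() {
  base_url="${1:-http://localhost:11434}"
  # -f makes curl fail on HTTP errors; --max-time avoids hanging forever.
  if curl -fsS --max-time 3 "$base_url/api/tags" >/dev/null 2>&1; then
    echo "up"
  else
    echo "down"
  fi
}

ollama_status
```

If this prints "down", check that the Ollama service has been started before looking at the Claude Code configuration.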

Repository Structure

prompts/copy-paste-prompt.md – the setup prompt.
presentation/index.html – full visual deck (no build step required).
The deck is also hosted at coherencedaddy.com/tutorials/use-ollama-to-enhance-claude.

Why This Exists

Claude Pro on desktop is great for thinking and architecture, but Claude Code in the terminal burns through quota quickly on context-heavy tasks. Routing those tasks through Ollama (using local or cloud-hosted free models) keeps the same UX at a fraction of the cost.