Giving Claude a Local LLM as an Assistant via MCP on Mac

A Reddit user described how they gave Claude access to a local LLM running on a Mac Mini M4 (24GB RAM) through an MCP connection to Ollama. Ollama serves Qwen 2.5 Coder (14B) as an assistant named 'Frank'. Claude can delegate tasks to Frank under three rules: delegating must cost Claude fewer tokens than doing the task itself, it must not reduce output quality, and Claude must do a final review of the result.
Setup Details
- Hardware: Mac Mini M4 with 24GB RAM.
- Local LLM: Qwen 2.5 Coder (14B) running via Ollama (also tested with LM Studio).
- Connection: MCP (Model Context Protocol) to link Claude (CLI or Desktop App) with the local model.
- Instructions: Claude was given a Markdown memory file (memory.md) with guidelines for when and how to use Frank: for example, delegate text processing and large CSS/HTML file handling, and do so only when it saves tokens without degrading output quality.
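To make the connection step more concrete, here is a minimal Python sketch of the call an MCP tool would ultimately make to the local model. It is not the original poster's code. It assumes Ollama is listening on its default local port and that the model is installed under the tag `qwen2.5-coder:14b`; the function names `build_payload` and `ask_frank` are hypothetical. Registering `ask_frank` as an MCP tool is a separate step that is not shown here.

```python
import json
import urllib.request

# Assumptions: Ollama's default local endpoint and the model tag below.
# Adjust both to match your own installation.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "qwen2.5-coder:14b"


def build_payload(prompt: str, model: str = MODEL) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate."""
    return {"model": model, "prompt": prompt, "stream": False}


def ask_frank(prompt: str, timeout: float = 300.0) -> str:
    """Send a prompt to the local model and return its text reply.

    Hypothetical helper: an MCP server would wrap a function like this
    as a tool so that Claude can call it.
    """
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False, Ollama returns one JSON object whose
    # "response" field holds the generated text.
    return body["response"]
```

With Ollama running, `ask_frank("Minify this CSS: body { margin: 0; }")` would return the model's reply as a string. The generous timeout is a guess that allows for slow generation on a 14B model with limited RAM.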
Practical Use Cases
- Text processing and transformation — offloaded to Frank to reduce Claude's token usage.
- Handling large CSS/HTML files that would be expensive for Claude to process directly.
- Running performance, coding, and logic tests — Claude evaluated local models via Frank rather than manually.
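In the original setup the delegation rules are plain-language instructions in memory.md, not code. Purely to illustrate the logic, the sketch below expresses "delegate only when it saves tokens" as a small Python check. The task categories, the overhead figure, and the names `Task` and `should_delegate` are all hypothetical, and character counts stand in as a rough proxy for tokens.

```python
from dataclasses import dataclass


@dataclass
class Task:
    kind: str         # hypothetical label, e.g. "text_transform" or "css_html"
    input_chars: int  # size of the material Claude would otherwise read itself


# Hypothetical values, chosen only for illustration.
DELEGABLE_KINDS = {"text_transform", "css_html"}
# Rough cost for Claude to write the instruction and review Frank's output.
DELEGATION_OVERHEAD_CHARS = 1500


def should_delegate(task: Task) -> bool:
    """Return True when handing the task to Frank is likely to save tokens."""
    # Quality rule: only task types judged safe for the local model qualify.
    if task.kind not in DELEGABLE_KINDS:
        return False
    # Token rule: the input Claude avoids reading must exceed the overhead.
    estimated_saving = task.input_chars - DELEGATION_OVERHEAD_CHARS
    return estimated_saving > 0
```

Under these assumed numbers, a 20,000-character stylesheet would be delegated, while a 200-character snippet or any task outside the listed kinds would stay with Claude. The final review rule is not modelled here beyond the overhead estimate.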
The user noted they are operating at the limits of their RAM/GPU and cannot test larger models (30B+). They invited others with more powerful hardware to try similar setups and share results.
This approach gives Claude an assistant with no per-token API cost. Token-heavy tasks are offloaded to the local model, and Claude's final review is what maintains quality.
📖 Read the full source: r/ClaudeAI