ClawCut: A Python Proxy That Makes Small Local LLMs Usable with OpenClaw

What ClawCut Does
ClawCut is a Python Flask application that acts as a proxy between local LLM servers (like MLX or Ollama) and the OpenClaw framework. It was created to solve specific technical problems that make small local models (7B/14B) difficult to use as practical assistants with OpenClaw.
Key Problems Solved
- Context poisoning: Small models lose track of tool usage when they see their own old tool calls in chat history
- Infinite loops: Models get stuck repeating patterns instead of executing commands
- Output issues: Models output bash code as plain text in chat or choke on their own history after multiple commands
- Cron job failures: Scheduled background jobs generate responses that disappear because no active chat window is open
- LLM artifacts: Empty markdown blocks, internal XML tags, and dangling backticks clutter outputs
- Media upload refusal: Models sometimes refuse to upload generated files
How It Works
Dynamic amnesia for tool calls: During normal chat, history is preserved. When the proxy detects the model trying to use a system tool, it temporarily cuts off old chat history, giving the model "tunnel vision" to execute shell commands cleanly without loops or hallucinations.
Universal auto-delivery for cron jobs: The proxy monitors the model's stream and intercepts clean text responses at the end of thought processes. It then forces delivery via automatic tool calls to WhatsApp, Telegram, or Signal, making cron jobs proactively report to your phone.
Artifact filtering: Empty markdown blocks, internal XML tags, and dangling backticks are filtered out before reaching the frontend.
Tool-name manipulation: Simple stream manipulations bypass models' refusal to upload generated media files.
Tested Setup
- Raspberry Pi 5 (8GB) with OpenClaw 3.8
- Mac mini M4 Pro 24GB with MLX-LLM running Qwen2.5-Coder-7B-Instruct-4bit
- Windows machine with Ollama and Qwen 2.5 Coder 14B model (planned for ClawCut integration)
Limitations
ClawCut doesn't turn 7B models into GPT-4. Highly complex, multi-step logic chains remain challenging for small models. The proxy specifically addresses technical stumbling blocks that previously made them nearly unusable as everyday assistants.
📖 Read the full source: r/openclaw
👀 See Also

WebClaw: Open-Source MCP Server for Web Extraction with Claude
WebClaw is an open-source MCP server built with Claude Code that provides web extraction tools for Claude Desktop and Claude Code, solving Claude's built-in web_fetch limitations with TLS fingerprinting and content optimization.

Skynet: Multi-Agent Collaboration Network for Claude Code Agents
Skynet is an open-source network that enables role-based collaboration between multiple Claude Code agents and humans. It's installed as a skill using npx and managed through natural language commands.

Open Source Agent Skill for TypeScript, React, and Next.js Patterns
A developer has released a 4,000-line, 17-file structured markdown reference designed for AI agents like Claude Code to follow when generating or reviewing TypeScript, React, and Next.js code. It addresses common issues like improper API response validation and misuse of 'use client' directives.

Maestro v1.5.0 adds Claude Code support for multi-agent orchestration
Maestro v1.5.0, an open-source multi-agent orchestration platform, now runs as a native plugin on Claude Code in addition to Gemini CLI. The update includes deeper design planning, a 42-step orchestration backbone, agent capability enforcement, and security hardening.