ClawCut Proxy Released on GitHub to Optimize OpenClaw for Small LLMs

ClawCut Proxy is now available on GitHub as an experimental tool designed to optimize OpenClaw's interaction with local LLMs, particularly smaller models that struggle with OpenClaw's default large system prompts and complex tool definitions.
What ClawCut Solves
OpenClaw sends massive system prompts (often >28,000 characters) and complex JSON tool definitions to LLMs. While large cloud models or high-end local models (14B+) handle this well, small models (7B, 8B) running on limited hardware (Mac/MLX or Raspberry Pi) suffer from "Cognitive Overload," leading to:
- Extreme processing latency (slow Time To First Token)
- Models forgetting their identity or available tools
- Hallucinating text answers instead of executing local scripts
- Connection timeouts or malformed JSON responses
- Huge RAM consumption
How ClawCut Works
ClawCut acts as a "Man-in-the-Middle" between OpenClaw and your local LLM server with these optimization features:
- PROMPT TRIMMING: Automatically removes unused default skills from the system prompt to keep the context window small and focused
- SMART AMNESIA: Intelligently truncates chat history after successful tool executions to free up "mental space" for the model
- ATTENTION FORCER: Injects a reminder at the very end of the user query to ensure the model prioritizes tool usage
- TOOL FORCER: Injects tool-calling keywords into the request and points the model to the available commands
- INPUT RESCUE: Short-circuits known incoming requests (like Cron-Jobs) to bypass LLM latency and ensure 100% reliability for automated tasks
- BASH-RESCUE: Detects poorly formatted script calls (e.g., naked code blocks) and converts them into valid OpenClaw tool calls on the fly
- Automatically filters dynamic timestamps from system prompts to enable near-instant responses via hardware caching
- Translates between OpenAI-compatible streams (MLX) and the Ollama/NDJSON format expected by OpenClaw
- Prints prefill duration and token count to the console in real time
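
To make three of the features above more concrete (timestamp filtering, SMART AMNESIA, and BASH-RESCUE), here is a minimal Python sketch. It is not ClawCut's actual code. The function names, the timestamp pattern, the "trim once a tool result is present" rule, and the `exec` tool name are all assumptions made for illustration.

```python
import re
from typing import Optional

# Assumed pattern: ISO-like timestamps such as 2026-03-22T14:05:09Z or 2026-03-22 14:05.
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}(?::\d{2})?(?:\.\d+)?Z?")


def strip_timestamps(system_prompt: str) -> str:
    """Replace volatile timestamps with a fixed token so the prompt prefix
    stays byte-identical between requests and a prefix cache can reuse it."""
    return TIMESTAMP_RE.sub("<TIME>", system_prompt)


def trim_history(messages: list, keep_last: int = 2) -> list:
    """Hypothetical 'amnesia' rule: once the history contains a tool result,
    keep only the system messages plus the last `keep_last` other messages."""
    if not any(m.get("role") == "tool" for m in messages):
        return messages
    system = [m for m in messages if m.get("role") == "system"]
    rest = [m for m in messages if m.get("role") != "system"]
    return system + rest[-keep_last:]


# Build the fence marker programmatically to keep this example easy to embed.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"(?:bash|sh)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def rescue_bash(reply: str, tool_name: str = "exec") -> Optional[dict]:
    """If the model answered with a naked shell code block instead of a tool
    call, wrap the command in a tool-call-shaped dict (the shape is assumed)."""
    match = FENCE_RE.search(reply)
    if match is None:
        return None
    command = match.group(1).strip()
    return {"function": {"name": tool_name, "arguments": {"command": command}}}
```

A real proxy would need to match the exact tool-call schema OpenClaw expects, which the article does not specify.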
Performance and Debugging
ClawCut provides significantly faster response times (TTFT) as the model has less text to process upfront, improved reliability when calling scripts, and robust error handling for stream interruptions or formatting errors. With DEBUG_MODE enabled, you can inspect the full "JSON Clutter" sent by OpenClaw to understand exactly what the model is processing.
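
As a rough illustration of how a proxy could report time to first token, the sketch below wraps any iterable of streamed chunks and prints timing information as the chunks pass through. The helper name `timed_stream` is hypothetical, and it counts chunks rather than tokens, since the article does not say how ClawCut measures either value.

```python
import time
from typing import Iterable, Iterator


def timed_stream(chunks: Iterable[str], label: str = "llm") -> Iterator[str]:
    """Yield chunks unchanged while printing the delay before the first chunk
    and the total number of chunks once the stream ends."""
    start = time.perf_counter()
    count = 0
    for chunk in chunks:
        if count == 0:
            ttft = time.perf_counter() - start
            print(f"[{label}] first chunk after {ttft:.3f}s")
        count += 1
        yield chunk
    print(f"[{label}] stream finished, {count} chunks")
```

Because the wrapper is a generator, it adds no buffering: each chunk is forwarded as soon as it arrives.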
When to Use
Ideal for small models (7B-8B) running on hardware like Mac (MLX), Windows, or Linux, especially if your model "chats" too much instead of executing commands. Use with caution with large, highly capable models (14B+) that can handle complex prompts natively. In that case, setting PASS_THROUGH_MODE = True makes the proxy act purely as a logger and format translator without manipulating content.
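
The relationship between pass-through mode and content manipulation can be sketched as follows. Only the `PASS_THROUGH_MODE` name comes from the article. The `prepare_request` function, the reminder wording, and the OpenAI-style `messages` payload shape are assumptions for illustration, with the reminder standing in for the ATTENTION FORCER feature.

```python
import copy

PASS_THROUGH_MODE = False  # name from the article; True forwards requests untouched
REMINDER = "\n\n(Reminder: use a tool call if a tool can do this.)"  # hypothetical wording


def prepare_request(payload: dict, pass_through: bool = PASS_THROUGH_MODE) -> dict:
    """Return the payload to forward to the LLM server.

    In pass-through mode the payload is returned as-is. Otherwise a copy is
    made and a reminder is appended to the most recent user message.
    """
    if pass_through:
        return payload
    modified = copy.deepcopy(payload)
    for message in reversed(modified.get("messages", [])):
        if message.get("role") == "user" and isinstance(message.get("content"), str):
            message["content"] += REMINDER
            break
    return modified
```

Working on a deep copy keeps the original request intact, which is convenient if the unmodified payload should still be logged.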
📖 Read the full source: r/openclaw