Semble: A Local MCP Server for Claude Code with 98% Token Reduction

By OpenClawRadar | Published: April 30, 2026

Semble is an MCP server that lets Claude Code search local codebases efficiently, returning only relevant code chunks instead of full files. It uses a hybrid of static embeddings, BM25, and a code-optimized reranking stack, all running locally on CPU — no API keys, no GPU, no heavy dependencies.
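To make the idea of hybrid retrieval concrete, here is a minimal sketch of combining a lexical BM25 ranking with a similarity ranking. It is not Semble's implementation: the source does not say how the signals are fused, so reciprocal rank fusion is an assumption, and the character-trigram vectors are only a toy stand-in for real static embeddings. All function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Naive tokenizer: lowercase, split on anything that is not a letter or digit.
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def bm25_scores(query, docs, k1=1.5, b=0.75):
    # Standard Okapi BM25 with the "+1 inside the log" IDF variant.
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    avgdl = (sum(len(t) for t in tokenized) / n) or 1.0
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def trigram_vector(text):
    # Toy stand-in for a static embedding (assumption): character trigram counts.
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_by(scores):
    # Chunk indices ordered from highest to lowest score.
    return sorted(range(len(scores)), key=lambda i: -scores[i])

def reciprocal_rank_fusion(rankings, k=60):
    # Each ranking contributes 1 / (k + rank) per chunk; ties break on chunk index.
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda d: (-fused[d], d))

def hybrid_search(query, chunks, top_k=3):
    lexical = bm25_scores(query, chunks)
    q_vec = trigram_vector(query)
    dense = [cosine(q_vec, trigram_vector(c)) for c in chunks]
    return reciprocal_rank_fusion([rank_by(lexical), rank_by(dense)])[:top_k]
```

Calling `hybrid_search("parse config file", chunks)` returns the indices of the best-matching chunks, so only those chunks, rather than whole files, would need to be passed to the model. A production system would add a reranking stage on top of the fused list.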

Installation

Install via uvx:

claude mcp add semble -s user -- uvx --from "semble[mcp]" semble

Once installed, Claude Code can search any repo — local or remote — directly.

Key Details

Token reduction: Uses ~98% fewer tokens than the typical grep+read approach.
Performance: Indexes any repo in ~250ms, answers queries in ~1.5ms (all on CPU).
Quality: Reaches an NDCG@10 (a ranking-quality score between 0 and 1) of 0.854, which is 99% of the best transformer hybrid tested, while being ~200x faster.
Benchmarked against: grepai, probe, colgrep, and other existing methods.
Open source: Available on GitHub under the MinishLab organization.
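For readers unfamiliar with the quality metric above, the following sketch shows how NDCG@10 is typically computed from a ranked list of relevance labels. It uses the common linear-gain formulation; the source does not state which variant or which relevance labels the benchmark used, so treat the function names and defaults as assumptions.

```python
import math

def dcg_at_k(relevances, k=10):
    # Discounted cumulative gain: relevance at rank r is divided by log2(r + 1).
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))

def ndcg_at_k(ranked_relevances, all_relevances=None, k=10):
    # Normalize by the DCG of the ideal ordering. If the labels of every
    # relevant item are not supplied, fall back to the retrieved list itself
    # (a simplification that can overstate the score).
    pool = ranked_relevances if all_relevances is None else all_relevances
    ideal = dcg_at_k(sorted(pool, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0
```

A score of 1.0 means the most relevant results appear in the best possible order within the top 10; placing a relevant chunk lower in the list reduces the score.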

Who It's For

Developers using Claude Code on large codebases who want to reduce token burn and latency while getting high-quality code search results without external API calls.

📖 Read the full source: r/ClaudeAI
