Semble: Code Search for AI Agents Using 98% Fewer Tokens

Semble is a fast, token-efficient code search library built specifically for AI coding agents like Claude Code, Cursor, Codex, and OpenCode. It returns relevant code snippets from natural language or code queries, using ~98% fewer tokens than the typical grep+read fallback approach.

How It Works

Semble combines static Model2Vec embeddings (from the project's own potion-code-16M model) with BM25 lexical search. The two result lists are fused with Reciprocal Rank Fusion (RRF) and then reranked with code-aware signals. All computation runs on CPU: no GPU, no API keys, no external services. Indexing an average repo takes ~250ms, and queries complete in ~1.5ms.
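
The fusion step can be illustrated with a minimal sketch of Reciprocal Rank Fusion. This is not Semble's actual code: the function name, the chunk IDs, and the constant k=60 (a common default for RRF) are assumptions made for illustration, and the code-aware reranking stage is not shown.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into a single ranked list.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a value
    taken from Semble.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical chunk IDs as ranked by each retriever.
embedding_ranking = ["auth.py:10", "login.py:42", "utils.py:5"]
bm25_ranking = ["login.py:42", "session.py:7", "auth.py:10"]

fused = reciprocal_rank_fusion([embedding_ranking, bm25_ranking])
print(fused)
```

Chunks that rank well in both lists rise to the top, which is why rank fusion tends to be robust when the embedding scores and BM25 scores are on different scales.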

Key Features

Token-efficient: ~98% fewer tokens than grep+read, because only the relevant chunks are returned.
Fast: ~250ms to index a typical repo, ~1.5ms per query (very large repos may take longer).
Accurate: 0.854 NDCG@10 on the project's benchmark of ~1250 query/document pairs across 63 repos and 19 languages. That is 99% of the score of the best transformer setup (137M parameters), with ~200x faster indexing and ~10x faster queries.
Zero config: No API keys, GPU, or external services required.
MCP server: Drop-in for Claude Code, Cursor, Codex, OpenCode, and any MCP-compatible agent.
Local and remote: Pass a local path or a git URL. Indexes are cached per session and auto-updated on file changes.
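
For readers unfamiliar with the accuracy metric, here is a minimal sketch of how NDCG@10 is commonly computed for a single query. It is not the project's benchmark code. It assumes linear gain and that the relevance list covers every judged document for the query; the function names are illustrative.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG of the actual ranking divided by DCG of the ideal ranking.

    `relevances` lists the relevance of each returned result in rank order.
    Assumption: it contains all judged documents for the query, so sorting
    it gives the ideal ordering.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# One relevant chunk, returned at rank 2 instead of rank 1.
score = ndcg_at_k([0, 1, 0])
print(round(score, 3))
```

A benchmark score such as the one quoted above would typically be the mean of this per-query value across all queries.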

Installation and Setup

MCP server (recommended for agents)

Requires uv to be installed. For Claude Code:

claude mcp add semble -s user -- uvx --from "semble[mcp]" semble

For Codex, add to ~/.codex/config.toml:

[mcp_servers.semble]
command = "uvx"
args = ["--from", "semble[mcp]", "semble"]

For OpenCode, add to ~/.opencode/config.json:

{
  "mcp": {
    "semble": {
      "type": "local",
      "command": ["uvx", "--from", "semble[mcp]", "semble"]
    }
  }
}

For Cursor, add to ~/.cursor/mcp.json or .cursor/mcp.json:

{
  "mcpServers": {
    "semble": {
      "command": "uvx",
      "args": ["--from", "semble[mcp]", "semble"]
    }
  }
}

Bash integration (alternative)

Install with either pip or uv (one of the two commands below is enough), then add the code search snippet to AGENTS.md or CLAUDE.md:

pip install semble
uv tool install semble

Then in AGENTS.md:

## Code Search
Use `semble search` to find code by describing what it does or naming a symbol/identifier, instead of grep:
```bash
semble search "authentication flow" ./my-project
```

MCP Tools

The MCP server exposes two tools:

search: Search a codebase with a natural-language or code query. Pass repo as a local directory path or an https:// git URL.
find_related: Given a file path and line number, return chunks semantically similar to the code at that location.
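
As a rough illustration of what an MCP client sends when invoking these tools, the sketch below builds JSON-RPC 2.0 `tools/call` requests. Only the tool names and the `repo` argument come from the descriptions above. The argument names `query`, `file_path`, and `line`, and the example paths, are assumptions; the real names are defined by the server's tool schema.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# `repo` is documented above; `query` is an assumed argument name.
search_request = make_tool_call(
    1, "search", {"query": "authentication flow", "repo": "./my-project"}
)

# `file_path` and `line` are assumed argument names; the path is hypothetical.
related_request = make_tool_call(
    2, "find_related", {"repo": "./my-project", "file_path": "src/auth.py", "line": 42}
)

print(json.dumps(search_request, indent=2))
```

In practice the agent (Claude Code, Cursor, and so on) constructs these messages itself, so this is mainly useful for understanding or debugging the traffic.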
