Context Mode MCP Server Compresses Tool Outputs for Claude Code

What Context Mode Does

Context Mode addresses a problem in Claude Code: every MCP tool call dumps its raw output into the 200K-token context window. The source gives examples: a Playwright snapshot costs 56 KB, twenty GitHub issues cost 59 KB, and one access log costs 45 KB. After 30 minutes of use, 40% of your context can be gone.

The MCP server sits between Claude Code and these outputs, processing them in sandboxes so only summaries reach the model. It achieves a 98% reduction in context usage (315 KB becomes 5.4 KB).
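The 98% figure follows directly from the two sizes quoted above. A minimal check of the arithmetic (the function name is illustrative, not part of Context Mode):

```python
def reduction_percent(raw_kb: float, compressed_kb: float) -> float:
    """Share of the raw payload that never reaches the context window."""
    return (1 - compressed_kb / raw_kb) * 100


# 315 KB of raw tool output shrinking to 5.4 KB of summaries.
print(f"{reduction_percent(315, 5.4):.1f}%")  # prints 98.3%
```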

Installation and Setup

Install as a plugin from inside Claude Code:

/plugin marketplace add mksglu/claude-context-mode
/plugin install context-mode@claude-context-mode

Or via CLI:

claude mcp add context-mode -- npx -y context-mode

The installation includes a skill that automatically routes large outputs through Context Mode, plus a PreToolUse hook that injects context-mode routing into subagent prompts. No manual prompting is needed.

Available Tools

batch_execute: Run multiple commands and search multiple queries in one call (986 KB → 62 KB)
execute: Run code in 10 languages. Only stdout enters context (56 KB → 299 B)
execute_file: Process files in sandbox. Raw content never leaves (45 KB → 155 B)
index: Chunk markdown into FTS5 with BM25 ranking (60 KB → 40 B)
search: Query indexed content with multiple queries in one call (on-demand retrieval)
fetch_and_index: Fetch URL, convert to markdown, index (60 KB → 40 B)
stats: Session token tracking with per-tool breakdown

Technical Implementation

Each execute call spawns an isolated subprocess, so scripts can't access each other's memory or state. The subprocess runs your code and captures stdout, and only that stdout enters the conversation context. The raw data (log files, API responses, snapshots) never leaves the sandbox.
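The idea of "only stdout enters the context" can be sketched in a few lines. This is an illustration under assumptions, not Context Mode's actual implementation: the helper name run_sandboxed, the temporary working directory, and the timeout are all hypothetical choices, and a plain subprocess gives process isolation only, not a hardened security sandbox.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_sandboxed(code: str, timeout_s: float = 10.0) -> str:
    """Run `code` in a separate Python process and return only its stdout."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "script.py"
        script.write_text(code, encoding="utf-8")
        result = subprocess.run(
            [sys.executable, str(script)],
            cwd=workdir,            # scratch directory, removed afterwards
            capture_output=True,    # stdout and stderr are captured separately
            text=True,
            timeout=timeout_s,
        )
    return result.stdout            # stderr and in-memory data are dropped


# The child builds a large "log" in its own memory but prints one line.
summary = run_sandboxed(
    "lines = ['GET /health 200'] * 50000 + ['GET /api 500'] * 7\n"
    "errors = sum(1 for line in lines if line.endswith(' 500'))\n"
    "print(f'{len(lines)} requests, {errors} server errors')\n"
)
print(summary.strip())  # prints: 50007 requests, 7 server errors
```

The caller sees a one-line summary, while the 50,007-line list lives and dies inside the child process.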

Ten language runtimes are available: JavaScript, TypeScript, Python, Shell, Ruby, Go, Rust, PHP, Perl, R. Bun is auto-detected for 3-5x faster JS/TS execution.

Authenticated CLIs work through credential passthrough — gh, aws, gcloud, kubectl, docker inherit environment variables and config paths without exposing them to the conversation.
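Credential passthrough can be pictured as the child process inheriting the parent's environment while only its printed result is returned. The sketch below assumes that model. DEMO_TOKEN and run_with_inherited_env are hypothetical names, and the real server may handle config paths and variable filtering differently.

```python
import os
import subprocess
import sys


def run_with_inherited_env(code: str) -> str:
    """Run `code` with a copy of the parent's environment; return stdout only."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=os.environ.copy(),   # credentials travel via the environment
        capture_output=True,
        text=True,
        timeout=10,
    )
    return result.stdout


os.environ["DEMO_TOKEN"] = "s3cr3t-value"  # hypothetical credential

out = run_with_inherited_env(
    "import os\n"
    "token = os.environ.get('DEMO_TOKEN', '')\n"
    "print('authenticated' if token else 'no credentials')\n"
)
print(out.strip())  # prints: authenticated
```

The child can use the token, but the secret appears in the returned text only if the script itself prints it.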

When output exceeds 5 KB and an intent is provided, Context Mode switches to intent-driven filtering: it indexes the full output into the knowledge base, searches for sections matching your intent, and returns only the relevant matches, along with a vocabulary of searchable terms for follow-up queries.
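The decision flow described above can be sketched as follows. This is a simplified stand-in: plain keyword overlap replaces the real indexed search, sections are split on blank lines, and the names filter_by_intent, tokenize, and the returned dictionary shape are assumptions made for illustration. Only the 5 KB threshold comes from the text.

```python
import re
from typing import Dict, List, Optional

INTENT_THRESHOLD_BYTES = 5 * 1024  # the 5 KB cutoff described above


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9_]+", text.lower())


def filter_by_intent(output: str, intent: Optional[str], top_k: int = 3) -> Dict:
    """Return raw output when small or intent-less, else only matching sections."""
    if intent is None or len(output.encode("utf-8")) <= INTENT_THRESHOLD_BYTES:
        return {"mode": "raw", "text": output}

    sections = [s for s in output.split("\n\n") if s.strip()]
    wanted = set(tokenize(intent))
    # Score each section by how many intent terms it contains (assumption:
    # the real tool ranks with its search index rather than set overlap).
    scored = [(len(wanted & set(tokenize(s))), i, s) for i, s in enumerate(sections)]
    ranked = sorted(scored, key=lambda t: (-t[0], t[1]))
    matches = [s for score, _, s in ranked if score > 0][:top_k]
    # Offer a small vocabulary of terms the caller could search for next.
    vocabulary = sorted({t for s in sections for t in tokenize(s) if len(t) > 3})[:20]
    return {"mode": "filtered", "matches": matches, "vocabulary": vocabulary}


noise = "\n\n".join(f"section {i}: request served normally" for i in range(300))
output = noise + "\n\nsection 300: database connection timeout after 30s"
result = filter_by_intent(output, intent="database timeout")
print(result["matches"])
```

Here roughly 11 KB of output collapses to the single section that mentions the database timeout, plus a short list of terms for follow-up queries.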

The knowledge base uses SQLite FTS5 (Full-Text Search 5) virtual tables. The index tool chunks markdown content by headings while keeping code blocks intact, then stores them. Search uses BM25 ranking — a probabilistic relevance algorithm that scores documents based on term frequency, inverse document frequency, and document length.
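To make the two steps concrete, here is a small pure-Python sketch of heading-based chunking that keeps fenced code blocks intact, followed by a textbook BM25 scorer. It does not use SQLite, so it is not the tool's implementation. The function names and the parameters k1 = 1.2 and b = 0.75 are common defaults chosen here as assumptions. Note also that SQLite's own bm25() function reports better matches as smaller (more negative) numbers, whereas this sketch uses the conventional higher-is-better form.

```python
import math
import re
from collections import Counter
from typing import List

FENCE = "`" * 3  # triple backtick, built indirectly to keep this listing simple


def chunk_markdown(markdown: str) -> List[str]:
    """Split on headings, but never inside a fenced code block."""
    chunks, current, in_fence = [], [], False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        if not in_fence and re.match(r"#{1,6} ", line) and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9_]+", text.lower())


def bm25_scores(query: str, docs: List[str], k1: float = 1.2, b: float = 0.75) -> List[float]:
    """Score each document against the query; higher means more relevant."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if tf[term] == 0:
                continue
            idf = math.log((n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


doc = "\n".join([
    "# Install",
    "Run the installer.",
    FENCE + "bash",
    "# this comment is not a heading",
    "npm install",
    FENCE,
    "# Search",
    "Search ranks results by relevance.",
])
chunks = chunk_markdown(doc)
scores = bm25_scores("search relevance", chunks)
best = chunks[scores.index(max(scores))]
print(len(chunks), best.splitlines()[0])  # prints: 2 # Search
```

The shell comment inside the fence starts with "#" but does not start a new chunk, so the code block stays attached to the "Install" section.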

📖 Read the full source: HN AI Agents