Prompt-caching MCP plugin automatically reduces Claude API costs by identifying stable context

Prompt-caching is an MCP plugin that automatically reduces Claude API costs by using Anthropic's prompt caching feature. When you use Claude Code, Cursor, Windsurf, or Zed with the Anthropic API, each turn typically re-sends the entire context, so the same thousands of tokens are billed at the full input rate again and again during a long debugging session.
How it works
Anthropic's prompt caching bills repeated reads of a cached prefix at 0.1× the normal input rate instead of 1×, but the caller has to mark explicitly what gets cached. The prompt-caching plugin runs in the background, identifies the stable parts of your context (system prompts, tool definitions, large file reads), and marks them automatically before each API call.
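To make "marking" concrete, here is a minimal sketch of what a hand-marked request could look like. It only builds the request fields and makes no API call. The `cache_control` field with type `ephemeral` is how the Anthropic Messages API marks a cache breakpoint. The helper name `mark_stable_blocks` and the sample tool are hypothetical, and this is not the plugin's actual implementation.

```python
import json


def mark_stable_blocks(system_prompt, tools):
    """Return Messages API request fields with cache breakpoints on stable parts.

    Hypothetical helper for illustration; not taken from the plugin's source.
    """
    # Copy each tool so the caller's definitions are left untouched.
    cached_tools = [dict(tool) for tool in tools]
    if cached_tools:
        # A breakpoint on the last tool caches the whole tool list as one prefix.
        cached_tools[-1]["cache_control"] = {"type": "ephemeral"}
    return {
        "tools": cached_tools,
        # The system prompt is sent as a list of text blocks so it can carry
        # its own cache breakpoint.
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                "cache_control": {"type": "ephemeral"},
            }
        ],
    }


# Hypothetical tool definition, used only to show the shape of the output.
example_tools = [
    {
        "name": "read_file",
        "description": "Read a file from the workspace.",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    }
]

fields = mark_stable_blocks("You are a careful coding assistant.", example_tools)
print(json.dumps(fields, indent=2))
```

The returned fields would be merged into a normal Messages API request alongside the model name and the conversation messages. Doing this by hand on every call is the step the plugin is described as automating.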
Performance results
- 20-turn bug fix: 85% cheaper
- 15-turn refactor: 80% cheaper
- 40-turn coding session: 92% cheaper
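As a rough illustration of where savings of this size can come from, the sketch below compares input-token cost with and without caching under a deliberately simplified model. It assumes a fixed stable prefix re-sent every turn, a fixed number of fresh tokens per turn, a cache that never expires between turns, a 1.25× cache-write multiplier (Anthropic's published rate for the default cache lifetime), and it ignores output tokens and growing conversation history. The token counts are made up and the function does not reproduce the plugin's benchmarks.

```python
CACHE_WRITE = 1.25  # first turn: writing the prefix to the cache costs extra
CACHE_READ = 0.1    # later turns: reading the cached prefix is heavily discounted


def session_cost(turns, stable_tokens, fresh_tokens_per_turn, cached):
    """Input cost of a session, in units of one token at the base input rate."""
    total = 0.0
    for turn in range(turns):
        if not cached:
            total += stable_tokens                # full price on every turn
        elif turn == 0:
            total += stable_tokens * CACHE_WRITE  # cache is populated once
        else:
            total += stable_tokens * CACHE_READ   # cache hit on later turns
        total += fresh_tokens_per_turn            # new content is never cached here
    return total


# Illustrative numbers only: 20 turns, 10,000 stable tokens, 500 new tokens per turn.
uncached = session_cost(20, 10_000, 500, cached=False)
with_cache = session_cost(20, 10_000, 500, cached=True)
saving = 1 - with_cache / uncached
print(f"uncached: {uncached:.0f}, cached: {with_cache:.0f}, saving: {saving:.1%}")
```

With these made-up inputs the saving comes out at roughly 80%. The larger the stable prefix is relative to the fresh tokens, and the more turns a session has, the closer the saving gets to the 90% ceiling implied by the 0.1× read rate.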
Installation
For Claude Code users:
/plugin marketplace add https://github.com/flightlesstux/prompt-caching
/plugin install prompt-caching@ercan-ermis
For Cursor/Windsurf/Zed:
npm install -g prompt-caching-mcp
Then point your MCP configuration at it.
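The source does not show the configuration itself, so the following is only a sketch of what such an entry might look like. It assumes the `mcpServers` JSON layout used by several MCP clients such as Cursor, and it assumes the global npm install exposes an executable named `prompt-caching-mcp`; neither the server label nor the executable name is confirmed by the source, so check the repository's README and your editor's MCP documentation for the real values.

```python
import json


def mcp_server_entry(name, command, args=None):
    """Build an `mcpServers`-style config entry for a locally installed MCP server."""
    return {
        "mcpServers": {
            name: {
                "command": command,  # executable the editor launches
                "args": args or [],  # extra command-line arguments, if any
            }
        }
    }


# ASSUMPTION: both the label and the executable name are guesses for illustration.
config = mcp_server_entry("prompt-caching", "prompt-caching-mcp")
print(json.dumps(config, indent=2))
```

The printed JSON would go into the editor's MCP configuration file, merged with any servers already listed there. Some editors use a different top-level key or file location, so the exact shape may need adjusting.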
The tool is free and open source under the MIT license. The repository is at https://github.com/flightlesstux/prompt-caching.
📖 Read the full source: r/ClaudeAI