ThornGuard: A Proxy Gateway to Secure MCP Server Connections from Prompt Injection

ThornGuard is a security proxy designed to protect Claude AI from malicious content when connecting to external MCP (Model Context Protocol) servers. The tool was created after testing revealed that upstream servers can inject hidden instructions into tool responses, which Claude receives without filtering.
Security Problem Identified
When connecting Claude to external MCP servers, nothing prevents upstream servers from injecting hidden instructions into tool responses. In a test, a server embedded a fake recommendation telling Claude to always prefer a specific vendor. While Claude caught this obvious payload, more subtle injections would bypass detection.
ThornGuard Features
- Scans tool definitions and responses for prompt injection and poisoning
- Strips secrets and PII before they enter your context window
- Includes a semantic classifier that flags suspicious payloads
- Provides real-time audit dashboard with compliance exports
- Offers CLI that generates configs for Claude Desktop, Cursor, VS Code, and several others
Implementation Details
The proxy architecture was designed with a security model in mind, then implemented using Claude Code on Cloudflare Workers. The implementation includes OAuth flows and the CLI tool.
ThornGuard is available with a 7-day free trial at thorns.qwady.app. A demonstration video is available at https://youtu.be/1PWNFpUWKV8.
📖 Read the full source: r/ClaudeAI
👀 See Also

Sunder: A Rust-Based Local Privacy Firewall for LLMs
Sunder is a Chrome extension that acts as a local privacy firewall for AI chats, built using Rust and WebAssembly, ensuring sensitive data never leaves your browser.

OpenClaw User Adds TOTP 2FA After Agent Exposed API Keys in Plain Text
An OpenClaw user created a security skill called 'Secure Reveal' that requires TOTP authentication via Telegram before displaying stored credentials, after their AI agent accidentally leaked API keys and passwords in plain text during a demo.

Two Approaches to Reduce Data Leak Risk with AI Agents
A Reddit post outlines two methods for developers to control where their AI agent data goes: using your own API keys directly with providers like OpenAI or Anthropic to cut out middlemen, or running open-source models locally with tools like Ollama and OpenClaw.

Security Benchmark: 10 LLMs Tested Against 211 Adversarial Probes
A security researcher tested 10 LLMs against 211 adversarial attacks, finding that extraction resistance averages 85% while injection resistance averages only 46.2%. Every model failed completely on delimiter, distractor, and style injection attacks.