Open-source RAG attack and defense lab for local ChromaDB + LM Studio stacks

What this is
Aminrj Labs released an open-source RAG attack and defense lab that runs fully locally on consumer hardware. It targets ChromaDB + LM Studio stacks with standard LangChain-style chunking. No cloud services or API keys are required; it runs on hardware such as a MacBook Pro.
Key findings from the lab
The lab measures how effective knowledge base poisoning is against default local RAG setups. On an undefended ChromaDB system, poisoning attacks succeed 95% of the time. The attack operates at the retrieval layer: no jailbreak, model access, or prompt manipulation is needed. The model performs exactly as intended, just with poisoned context.
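To make the retrieval-layer point concrete, here is a toy sketch of similarity-based retrieval. It is not the lab's code: the vectors are hand-picked three-dimensional stand-ins for real embeddings, and `top_k` and `build_prompt` are hypothetical helper names. It only illustrates that a document sitting close to the query in embedding space ends up in the prompt without any change to the prompt template or the model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query_vec, store, k=2):
    """Return the k stored documents most similar to the query vector."""
    ranked = sorted(store, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return ranked[:k]

def build_prompt(question, docs):
    """Assemble an ordinary RAG prompt; the template itself is never tampered with."""
    context = "\n".join(f"- {d['text']}" for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Toy knowledge base (assumption: made-up vectors, not real embeddings).
store = [
    {"id": "legit-a", "text": "Official policy text.", "vec": [0.8, 0.6, 0.0]},
    {"id": "legit-b", "text": "Unrelated release notes.", "vec": [0.0, 1.0, 0.0]},
    {"id": "poisoned", "text": "Planted claim.", "vec": [0.99, 0.1, 0.0]},
]

query_vec = [1.0, 0.0, 0.0]
retrieved = top_k(query_vec, store, k=2)
prompt = build_prompt("What is the policy?", retrieved)
print([d["id"] for d in retrieved])
```

Because the planted document ranks first on similarity alone, filtering the prompt or the model output afterwards does not address where the compromise happened.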
One notable observation about default chunking: with 512-token chunks and 200-token overlap, a document that falls on a chunk boundary is embedded twice, as two independent chunks. This doubles its retrieval probability with no additional sophistication, a side effect of settings most local setups inherit without consideration.
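The overlap effect can be seen with a minimal sliding-window sketch. This is a simplification under stated assumptions: real LangChain-style splitters work on text and separators rather than exact token indices, and `chunk_tokens` and `chunks_containing` are hypothetical names used only for illustration.

```python
def chunk_tokens(tokens, chunk_size=512, overlap=200):
    """Split a token list into fixed-size windows that share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # 312 with the defaults above
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
        start += step
    return chunks

def chunks_containing(chunks, token):
    """Indices of the chunks in which a given token appears."""
    return [i for i, chunk in enumerate(chunks) if token in chunk]

# Stand-in for a tokenized document: token i is just the integer i.
tokens = list(range(1000))
chunks = chunk_tokens(tokens)

print(chunks_containing(chunks, 100))  # [0]    -> outside any overlap region
print(chunks_containing(chunks, 400))  # [0, 1] -> inside the 200-token overlap
```

With these settings, tokens 312 through 511 belong to both the first and second window, so any text placed there is embedded and stored twice.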
The most common defense approach, output filtering, targets the wrong layer, since the compromise occurs before generation. Embedding anomaly detection at ingestion proves effective: scoring incoming documents against the existing collection before writing them reduces poisoning success from 95% to 20%.
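The summary does not say how the lab scores incoming documents, so the following is only one plausible shape for an ingestion-time check, not the repository's method. It assumes embeddings are available as plain lists of floats, compares a candidate's similarity to the collection centroid against the spread seen among existing documents, and uses a hypothetical threshold of 3 standard deviations. All function names are illustrative.

```python
import math
from statistics import mean, pstdev

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def centroid(vectors):
    """Component-wise mean of the stored embeddings."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def anomaly_score(candidate, collection):
    """How many standard deviations the candidate sits from the collection's norm."""
    center = centroid(collection)
    baseline = [cosine(vec, center) for vec in collection]
    mu = mean(baseline)
    sigma = pstdev(baseline) or 1e-9  # avoid division by zero on uniform data
    return abs(cosine(candidate, center) - mu) / sigma

def should_ingest(candidate, collection, threshold=3.0):
    """Gate a write: accept only candidates that look like the existing collection."""
    return anomaly_score(candidate, collection) <= threshold

# Toy 2-D embeddings (assumption: made-up values for illustration).
existing = [[1.0, 0.1], [1.0, 0.0], [0.9, 0.1], [1.0, 0.2]]
typical = [1.0, 0.1]
outlier = [0.0, 1.0]

print(should_ingest(typical, existing))
print(should_ingest(outlier, existing))
```

A check of this kind runs before the write to the vector store, which is the layer where the poisoning takes effect. It also suggests why some attacks survive: a planted document that embeds close to the existing baseline scores as normal.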
With all five defenses active, residual poisoning success is 10%. These cases are semantically close enough to the baseline that no layer catches them cleanly, representing the practical ceiling for defense.
Technical details
- Stack: ChromaDB + LM Studio with Qwen2.5-7B
- Chunking: Standard LangChain-style with 512-token chunks and 200-token overlap
- Attack success on undefended system: 95%
- Attack success with embedding anomaly detection at ingestion: 20%
- Residual attack success with all five defenses active: 10%
The repository contains the attack implementation, hardened version, and measurements for each defense layer.
📖 Read the full source: r/LocalLLaMA