CodeWall AI Agent Discovers Critical Vulnerabilities in McKinsey's Lilli Platform

How the Attack Unfolded
CodeWall's research agent autonomously selected McKinsey as a target based on their public responsible disclosure policy and recent Lilli platform updates. Starting with just the domain name and no credentials, the agent mapped the attack surface and found publicly exposed API documentation with over 200 endpoints.
Twenty-two endpoints didn't require authentication. One unprotected endpoint wrote user search queries to the database with JSON keys concatenated directly into SQL statements. The agent recognized SQL injection when it found JSON keys reflected verbatim in database error messages — a vulnerability that standard tools like OWASP ZAP didn't flag.
What Was Exposed
- 46.5 million chat messages containing strategy discussions, client engagements, financials, M&A activity, and internal research
- 728,000 files including 192,000 PDFs, 93,000 Excel spreadsheets, 93,000 PowerPoint decks, and 58,000 Word documents
- 57,000 user accounts for every employee on the platform
- 384,000 AI assistants and 94,000 workspaces revealing the firm's organizational AI structure
- 95 system prompts and AI model configurations across 12 model types, showing guardrails and deployment details
- 3.68 million RAG document chunks containing decades of proprietary McKinsey research and methodologies
- 1.1 million files and 217,000 agent messages flowing through external AI APIs, including 266,000+ OpenAI vector stores
Critical Vulnerabilities Discovered
The SQL injection wasn't read-only. Lilli's system prompts — which control how the AI behaves, what guardrails it follows, and how it cites sources — were stored in the same database. An attacker with write access could have:
- Rewritten prompts silently with a single UPDATE statement wrapped in a single HTTP call
- Poisoned advice by altering financial models, strategic recommendations, or risk assessments
- Enabled data exfiltration by instructing the AI to embed confidential information into responses
- Removed guardrails to disclose internal data or ignore access controls
The agent also chained the SQL injection with an IDOR vulnerability to read individual employees' search histories, revealing what people were actively working on.
Implications for AI Security
This case demonstrates how AI agents can autonomously select and attack targets, with the CodeWall agent completing the entire process without human-in-the-loop. The threat landscape is shifting as AI agents can now find vulnerabilities that traditional tools miss, particularly in complex systems where JSON key concatenation creates SQL injection opportunities that don't follow standard patterns.
📖 Read the full source: HN AI Agents
👀 See Also

Sunder: A Rust-Based Local Privacy Firewall for LLMs
Sunder is a Chrome extension that acts as a local privacy firewall for AI chats, built using Rust and WebAssembly, ensuring sensitive data never leaves your browser.

Understanding ClawBands: Security Bands for OpenClaw Agents
ClawBands offer a security enhancement for OpenClaw agents, likely focusing on access control or secure data handling.

GitHub repository documents 16 prompt injection techniques and defense strategies for public AI chats
A developer published a GitHub repository detailing security measures for public AI chatbots after users attempted prompt injection, roleplay attacks, multilingual tricks, and base64 encoded payloads. The guide includes a Claude code skill to test all 16 documented injection techniques.

OpenClaw Security: 13 Practical Steps to Lock Down Your AI Agent
A Reddit post outlines 13 security measures for OpenClaw installations, including running on a separate machine, using Tailscale for network isolation, sandboxing subagents in Docker, and configuring allowlists for user access.