Domain-Camouflaged Injection Attacks Evade Detectors in Multi-Agent LLM Systems

✍️ OpenClawRadar · 📅 Published: May 23, 2026

A new paper from Aaditya Pai identifies a critical blind spot in LLM injection detectors: domain-camouflaged injection attacks, whose payloads are generated to mimic the vocabulary and authority structures of the target document, systematically evade detection. Standard detectors flag static payloads at high rates but catch far fewer camouflaged ones.

Key Findings

  • Detection rate on Llama 3.1 8B: dropped from 93.8% (static) to 9.7% (camouflaged).
  • Detection rate on Gemini 2.0 Flash: dropped from 100% to 55.6%.
  • Llama Guard 3, a production safety classifier, detected zero camouflaged payloads (IDR = 0.000).
  • The Camouflage Detection Gap (CDG) is statistically significant across 45 tasks and three domains (Llama: χ² = 38.03, p < 0.001; Gemini: χ² = 17.05, p < 0.001).
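The significance figures above can be illustrated with a small sketch. It assumes the gap is simply the static detection rate minus the camouflaged detection rate, and that significance is checked with a Pearson chi-square test on a 2x2 table (payload type by detected/missed) without continuity correction. Neither assumption is confirmed by the article. The function names and the counts are hypothetical and are not the paper's data.

```python
import math


def detection_rate(detected: int, total: int) -> float:
    """Fraction of payloads the detector flagged."""
    return detected / total


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square test (df = 1, no continuity correction).

    Table layout (an assumption for this sketch):
                     detected  missed
        static          a        b
        camouflaged     c        d
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table sums to zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, NOT taken from the paper:
# 45 of 50 static payloads detected, 5 of 50 camouflaged payloads detected.
static_detected, static_total = 45, 50
camo_detected, camo_total = 5, 50

gap = (detection_rate(static_detected, static_total)
       - detection_rate(camo_detected, camo_total))
chi2, p_value = chi_square_2x2(
    static_detected, static_total - static_detected,
    camo_detected, camo_total - camo_detected,
)
print(f"gap = {gap:.3f}, chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

A large gap with a small p-value indicates that the drop in detection is unlikely to be explained by sampling noise alone, which is the kind of claim the reported χ² values support.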

Multi-Agent Debate Amplifies Attacks

Multi-agent debate architectures amplify static injection attacks by up to 9.9x on smaller models, while stronger models show collective resistance. Targeted detector augmentation only partially closes the gap: a 10.2% improvement on Llama versus 78.7% on Gemini, which suggests the vulnerability is architectural for weaker models.

Framework Released

The framework, task bank, and payload generator are publicly released. The blind spot extends beyond few-shot detectors to dedicated safety classifiers, suggesting fundamental weaknesses in current detection approaches.

📖 Read the full source: HN LLM Tools
