79-96% Success: Audio Attack Hijacks Voice AI in 13 Models

New research presented at the IEEE Symposium on Security and Privacy describes a practical attack on Large Audio-Language Models (LALMs). Attackers can embed imperceptible signals in audio clips to hijack model behavior, achieving a 79-96% average success rate across 13 leading open models. The affected systems reportedly include commercial services from Microsoft and Mistral.

How the Attack Works

The attacker adds a perturbation to an audio clip that human listeners cannot perceive, yet it triggers the model to execute hidden commands. Crucially, the attack works regardless of the user's accompanying instructions, so the same clip can be reused against the same model many times. Training the adversarial signal takes approximately 30 minutes.
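The summary does not say how the researchers measure imperceptibility. As a rough illustration only, the sketch below uses two common ways to quantify how small an audio perturbation is: a per-sample amplitude bound (an L-infinity clip) and the signal-to-noise ratio (SNR) between the clean and perturbed clips. The function names, the 440 Hz test tone, and the `epsilon` value are all assumptions made for this example, and the perturbation is a plain sine wave, not an optimized adversarial signal.

```python
import math

def clip_perturbation(delta, epsilon):
    """Limit every sample of a perturbation to the range [-epsilon, epsilon]."""
    return [max(-epsilon, min(epsilon, d)) for d in delta]

def snr_db(clean, perturbed):
    """Signal-to-noise ratio in dB between a clean clip and its perturbed copy."""
    if len(clean) != len(perturbed):
        raise ValueError("clips must have the same number of samples")
    signal_power = sum(s * s for s in clean)
    noise_power = sum((p - s) ** 2 for s, p in zip(clean, perturbed))
    if noise_power == 0:
        return math.inf   # identical clips: no noise at all
    if signal_power == 0:
        return -math.inf  # silent clean clip: any noise dominates
    return 10 * math.log10(signal_power / noise_power)

# Toy data (assumed values): one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
clean = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

# Placeholder perturbation: a 3 kHz sine, clipped to a small amplitude budget.
raw_delta = [0.05 * math.sin(2 * math.pi * 3000 * n / sample_rate) for n in range(sample_rate)]
delta = clip_perturbation(raw_delta, epsilon=0.002)
perturbed = [s + d for s, d in zip(clean, delta)]

print(f"Peak perturbation: {max(abs(d) for d in delta):.4f}")
print(f"SNR: {snr_db(clean, perturbed):.1f} dB")
```

A higher SNR means the perturbation is quieter relative to the original clip. Whether a given SNR is actually inaudible depends on the audio content and the listener, so this metric is only a first approximation.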

Exploited Capabilities

Researchers demonstrated that compromised models could be coerced into:

Conducting sensitive web searches without user knowledge
Downloading files from attacker-controlled sources
Sending emails containing user data to external addresses

Affected Models

The attack was validated against 13 popular open-weight LALMs, and the affected systems include commercial voice AI APIs. The results suggest that current voice AI systems lack robust safeguards against adversarial audio perturbations.
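To make the missing safeguard concrete, here is a minimal sketch of one possible mitigation at the agent layer: holding sensitive tool calls for explicit user approval whenever the request was derived from audio input. This is not a defense proposed or evaluated in the research. The tool names, the `ToolCall` type, and the `confirm` callback are hypothetical and simply mirror the three actions listed under Exploited Capabilities.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical tool names, mirroring the three demonstrated actions.
SENSITIVE_TOOLS = {"web_search", "download_file", "send_email"}

@dataclass
class ToolCall:
    name: str
    arguments: dict = field(default_factory=dict)
    from_audio: bool = False  # True if the request was derived from audio input

def requires_confirmation(call: ToolCall) -> bool:
    """Return True for sensitive tool calls that originate from audio."""
    return call.from_audio and call.name in SENSITIVE_TOOLS

def dispatch(call: ToolCall, confirm: Callable[[ToolCall], bool]) -> str:
    """Run a tool call, asking the user first when the policy requires it."""
    if requires_confirmation(call) and not confirm(call):
        return "blocked"
    # A real agent would invoke the tool here; this sketch only reports the decision.
    return "executed"

# Usage: a confirm callback that always declines stands in for a user prompt.
email = ToolCall("send_email", {"to": "someone@example.com"}, from_audio=True)
print(dispatch(email, confirm=lambda call: False))
```

A gate like this does not detect adversarial audio. It only limits what a hijacked model can do without the user noticing, and it costs an extra confirmation step for legitimate voice requests.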

📖 Read the full source: HN AI Agents

