🔒 Security
Security alerts, best practices, and vulnerability reports

U of T Researchers Demonstrate AI Worm That Can Run on Free Open-Weight Models
Researchers at the University of Toronto demonstrated the first AI-powered worm that adapts its spreading strategy using publicly accessible open-weight models, targeting any online device.

Security Concepts for Vibe Coding with Claude Code: Auth, Authorization, and Enforcement
A senior engineer breaks down authentication, authorization, and enforcement for vibe-coded apps using a hotel metaphor — plus how to ask AI agents to verify security.

Meta's AI Support Feature Lets Anyone Hijack Instagram Accounts — Exploit Details Inside
An A/B-tested AI support feature on Instagram allows attackers to reset passwords by asking the agent to send a code to an arbitrary email address. More than 100 high-value accounts have been hijacked.

PolyRange: Contamination-Resistant Offensive-AI Benchmark with LLM-Generated Targets
PolyRange v1.0 is an MIT-licensed, self-hostable benchmark that generates fresh web targets per run to prevent training data contamination. It includes 84 WSTG-derived classes across all OWASP categories, two defense tiers, and real backends.

jqwik v1.10.0 Sneaks In Prompt Injection That Deletes Code When Used by AI Agents
Johannes Link added a hidden instruction to jqwik v1.10.0 that tells AI coding agents to delete all jqwik tests and code, concealed with ANSI escapes. Claude correctly flags it, but human users may not be so lucky.

Secure Administrator Approval Flow for Group-Chat Assistants Against Prompt Injection
A practical approach to securing LLM assistants in shared group chats: pausing VM, OAuth, and code execution tools until an admin approves via a timed link.
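A minimal sketch of how such a gate could look, not the article's implementation. It assumes the timed link carries an HMAC-signed token with an expiry timestamp. The signing key, the tool names in `GATED_TOOLS`, and the `ApprovalGate` class are all hypothetical names introduced for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"demo-secret-change-me"           # hypothetical signing key
GATED_TOOLS = {"vm", "oauth", "code_exec"}  # hypothetical names for the paused tools


def _sign(request_id: str, expires_at: int) -> str:
    msg = f"{request_id}:{expires_at}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def make_approval_token(request_id: str, expires_at: int) -> str:
    """Build the token that would be embedded in the admin's timed link."""
    return f"{request_id}:{expires_at}:{_sign(request_id, expires_at)}"


def verify_approval_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the request id if the token is authentic and unexpired, else None."""
    try:
        request_id, expires_str, sig = token.rsplit(":", 2)
        expires_at = int(expires_str)
    except ValueError:
        return None
    expected = _sign(request_id, expires_at)
    # Constant-time comparison on bytes to avoid timing leaks.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    current = time.time() if now is None else now
    if current > expires_at:
        return None
    return request_id


class ApprovalGate:
    """Keeps gated tools paused until an admin redeems a valid token."""

    def __init__(self) -> None:
        self.approved = set()

    def request(self, request_id: str, ttl_seconds: int = 300,
                now: Optional[float] = None) -> str:
        current = time.time() if now is None else now
        return make_approval_token(request_id, int(current) + ttl_seconds)

    def approve(self, token: str, now: Optional[float] = None) -> bool:
        request_id = verify_approval_token(token, now)
        if request_id is None:
            return False
        self.approved.add(request_id)
        return True

    def can_run(self, tool: str, request_id: str) -> bool:
        # Ungated tools always run; gated tools need a prior approval.
        return tool not in GATED_TOOLS or request_id in self.approved
```

The point of this design is that the approval decision lives outside the chat context: text injected into the group chat cannot forge the signature, and an old link stops working once it expires.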

Domain-Camouflaged Injection Attacks Evade Detectors in Multi-Agent LLM Systems
A new paper shows injection payloads tailored to domain vocabulary evade detection, dropping IDR from 93.8% to 9.7%. Multi-agent debate amplifies attacks. Llama Guard 3 detects zero payloads.

Sieve: Local Secret Scanner for AI Coding Tool Chat Histories
Sieve scans chat histories from Cursor, Claude Code, Copilot, and other AI coding assistants for leaked API keys and tokens. All scanning is local, with redaction and a macOS Keychain vault.
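To illustrate the general technique of local scanning with redaction, here is a small sketch. It is not Sieve's code or rule set: the two patterns below are commonly cited formats for AWS access key IDs and GitHub personal access tokens, and the function names `scan_text` and `redact` are hypothetical.

```python
import re

# Illustrative patterns only; a real scanner would carry many more rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def redact(secret: str, keep: int = 4) -> str:
    """Keep a short prefix for identification and mask the rest."""
    return secret[:keep] + "*" * (len(secret) - keep)


def scan_text(text: str):
    """Return (line number, rule name, redacted match) for each finding."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((lineno, name, redact(match.group())))
    return findings
```

Calling `scan_text` on the contents of an exported chat log yields only redacted findings, so the report itself never repeats the full secret.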

AI Agents Enable Solo Hackers to Breach Governments and Run Ransomware Campaigns
A solo operator using Claude Code and ChatGPT exfiltrated 150 GB from Mexican government agencies, including 195 million taxpayer records. Another attacker used Claude Code to run an end-to-end extortion campaign against 17 healthcare and emergency services organizations.

Hidden Audio Signals Hijack Voice AI Systems with 79-96% Success Rate
Research shows that imperceptible audio clips can force large audio-language models (LALMs) to execute unauthorized commands such as web searches, file downloads, and email exfiltration, with 79-96% success across 13 models, including Mistral and Microsoft services.

AI Chatbots Leaking Real Phone Numbers: The PII Exposure Problem
Chatbots like Gemini, ChatGPT, and Claude are exposing real personal phone numbers due to PII in training data. DeleteMe reports a 400% increase in AI-related privacy requests in seven months.

LLM-Assisted Exploit: Anthropic's Mythos Preview Helped Build First Public macOS Kernel Exploit on Apple M5 in Five Days
Using Anthropic's Mythos Preview, security firm Calif built the first public macOS kernel memory corruption exploit on Apple's M5 silicon in five days—breaking MIE hardware security that took Apple five years to develop.