PolyRange: Contamination-Resistant Offensive-AI Benchmark with LLM-Generated Targets

PolyRange v1.0 is an MIT-licensed, contamination-resistant offensive-AI benchmark for web security agents. Instead of static targets that leak into training corpora, each PolyRange deploy is freshly generated by the researcher's choice of LLM — satisfying the 'newly constructed tasks' criterion that OpenAI, Anthropic, and UK AISI have publicly called for.
What PolyRange addresses
The author, CEO of Aether AI, argues that existing cyber-AI benchmarks fall into two lanes, neither of which measures what labs need: CTF-style benchmarks (DVWA, NYU CTF Bench, CyberGym, AutoPenBench) use static targets that can contaminate future models' training data, and bug-bounty-style benchmarks (XBOW) leave the defensive infrastructure undefined. PolyRange aims to bridge this gap with production-shaped conditions, including defense tiers that approximate active defenders.
Technical specifications
- 84 WSTG-derived classes spanning all 12 OWASP testing-guide categories
- Two defense tiers approximating active-defender conditions
- Real backends: Postgres dialects, real PHP for LFI, real shell for command injection, real Jinja2 for SSTI
- Agent-submits-flag oracle convention for scoring
- Single-command eval CLI
- Self-hostable on Fly.io or any Docker host
Because targets are regenerated per run via LLM (researcher's choice of generator model), there is no static artifact for future models to ingest — addressing Anthropic's concern that 'this report will, itself, likely contribute to the problem.'
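To make the agent-submits-flag convention concrete, here is a minimal sketch of how a per-deploy flag oracle could work. It is not PolyRange's actual code: the function names, the `FLAG{...}` format, and the scoring rule are all assumptions made for illustration.

```python
import hmac
import secrets


def new_flag(prefix: str = "FLAG") -> str:
    """Mint a fresh random flag for one deployment (format is hypothetical)."""
    return f"{prefix}{{{secrets.token_hex(16)}}}"


def check_submission(expected: str, submitted: str) -> bool:
    """Accept only an exact match, compared in constant time."""
    return hmac.compare_digest(expected.encode(), submitted.strip().encode())


def score(expected_by_target: dict[str, str], submissions: dict[str, str]) -> float:
    """Fraction of targets for which the agent submitted the correct flag."""
    if not expected_by_target:
        return 0.0
    solved = sum(
        check_submission(flag, submissions.get(target, ""))
        for target, flag in expected_by_target.items()
    )
    return solved / len(expected_by_target)
```

Because each flag is drawn at deploy time, a model that memorized an earlier run's flag gains nothing: the oracle only accepts the value planted in the current target.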
The benchmark uses a two-bucket entropy framing that separates exploit-recall axes from cosmetic/realism axes, which the author believes are over-conflated in adjacent benchmark literature.
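One possible reading of the two-bucket framing is to count the entropy of generation choices separately per bucket. The sketch below assumes each axis is sampled uniformly and independently; the axis names and variant counts are hypothetical and do not come from PolyRange.

```python
import math

# Hypothetical axes: name -> number of equally likely variants.
EXPLOIT_RECALL_AXES = {"sql_dialect": 4, "injection_context": 8, "filter_style": 4}
COSMETIC_AXES = {"theme": 16, "brand_name": 64}


def bucket_entropy_bits(axes: dict[str, int]) -> float:
    """Entropy in bits of independent, uniformly sampled axes."""
    return sum(math.log2(n) for n in axes.values())


exploit_bits = bucket_entropy_bits(EXPLOIT_RECALL_AXES)
cosmetic_bits = bucket_entropy_bits(COSMETIC_AXES)
print(f"exploit-recall: {exploit_bits} bits, cosmetic: {cosmetic_bits} bits")
```

Under these assumed numbers the cosmetic bucket contributes more bits than the exploit-recall bucket, which illustrates why a single combined figure could overstate how much a target resists memorized exploits.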
A full empirical paper (with publishable-N results) depends on partnership funding, but the framework is available now.
📖 Read the full source: r/LocalLLaMA