PolyRange: Contamination-Resistant Offensive-AI Benchmark with LLM-Generated Targets

✍️ OpenClawRadar | 📅 Published: May 31, 2026

PolyRange v1.0 is an MIT-licensed, contamination-resistant offensive-AI benchmark for web security agents. Instead of relying on static targets that leak into training corpora, each PolyRange deployment is freshly generated by an LLM of the researcher's choosing, which satisfies the 'newly constructed tasks' criterion that OpenAI, Anthropic, and UK AISI have publicly called for.

What PolyRange addresses

The author, CEO of Aether AI, notes that existing cyber-AI benchmarks fall into two lanes, neither of which measures what labs need: CTF-style benchmarks (DVWA, NYU CTF Bench, CyberGym, AutoPenBench) use static targets that leak into the training data of future models, and bug-bounty-style benchmarks (XBOW) leave the defensive infrastructure undefined. PolyRange bridges this gap with production-shaped conditions, including active defenders.

Technical specifications

  • 84 WSTG-derived classes spanning all 12 OWASP testing-guide categories
  • Two defense tiers approximating active-defender conditions
  • Real backends: Postgres dialects, real PHP for LFI, real shell for command injection, real Jinja2 for SSTI
  • Agent-submits-flag oracle convention for scoring
  • Single-command eval CLI
  • Self-hostable on Fly.io or any Docker host
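
The agent-submits-flag convention in the list above can be sketched roughly as follows. This is a minimal illustration, not PolyRange's actual code: the `FlagOracle` class, the `FLAG{...}` format, and the challenge ids are all hypothetical names chosen for the example. It only shows the general idea of planting a fresh random flag per deployment and scoring an agent by the flags it submits back.

```python
import hmac
import secrets
from dataclasses import dataclass, field


def new_flag(prefix: str = "FLAG") -> str:
    """Return a fresh random flag; the FLAG{...} format is an assumption."""
    return f"{prefix}{{{secrets.token_hex(16)}}}"


@dataclass
class FlagOracle:
    """Hypothetical scoring oracle for one freshly generated deployment."""

    # challenge id -> flag planted in this deployment
    flags: dict[str, str] = field(default_factory=dict)
    # challenge ids the agent has solved so far
    solved: set[str] = field(default_factory=set)

    def plant(self, challenge_id: str) -> str:
        """Create a new flag for a challenge and remember it."""
        flag = new_flag()
        self.flags[challenge_id] = flag
        return flag

    def submit(self, challenge_id: str, candidate: str) -> bool:
        """Check a submitted flag; unknown challenges never score."""
        expected = self.flags.get(challenge_id)
        if expected is None:
            return False
        # Constant-time comparison avoids leaking the flag via timing.
        ok = hmac.compare_digest(expected.encode(), candidate.strip().encode())
        if ok:
            self.solved.add(challenge_id)
        return ok

    def score(self) -> float:
        """Fraction of planted flags the agent recovered."""
        return len(self.solved) / len(self.flags) if self.flags else 0.0
```

Because every flag comes from `secrets.token_hex`, a flag memorized from an earlier run is useless against a new deployment, which is the property the per-run regeneration is meant to provide.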

Because an LLM regenerates the targets on every run (the researcher chooses the generator model), there is no static artifact for future models to ingest. This addresses Anthropic's concern that 'this report will, itself, likely contribute to the problem.'

The benchmark uses a two-bucket entropy framing that separates exploit-recall axes from cosmetic/realism axes, two things the author believes are conflated in adjacent benchmark literature.
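
One way to read the two-bucket framing is as a per-bucket count of entropy bits. The sketch below rests on assumptions the source does not confirm: that each axis is drawn uniformly and independently, so it contributes log2 of its number of variants. The axis names and most counts are hypothetical; only the 84 classes and the two defense tiers come from the specifications above, and placing them in the exploit-recall bucket is itself a guess.

```python
import math


def bucket_bits(axes: dict[str, int]) -> float:
    """Entropy in bits, assuming uniform and independent axes."""
    return sum(math.log2(n) for n in axes.values())


# Axes that change which exploit the agent must recall (illustrative).
exploit_recall = {
    "vuln_class": 84,      # from the spec list above
    "defense_tier": 2,     # from the spec list above
    "backend_variant": 4,  # hypothetical count
}

# Axes that only change how the target looks (all hypothetical).
cosmetic = {
    "site_theme": 16,
    "field_naming": 8,
}

exploit_bits = bucket_bits(exploit_recall)
cosmetic_bits = bucket_bits(cosmetic)
```

Reporting the two totals separately keeps variation that only changes a target's appearance from being counted as variation in what the agent has to exploit.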

A full empirical paper (with publishable-N results) depends on partnership funding, but the framework is available now.

📖 Read the full source: r/LocalLLaMA
