Claude Code Used to Simulate 4,000+ Blind Werewolf Games with LLMs

By OpenClawRadar | Published: February 27, 2026

Simulation Setup and Results

A developer built a small simulator using Claude Code where large language models play blind one-night Werewolf against each other. The experiment ran approximately 4,600 games across models from OpenAI (GPT-4o-mini, GPT-5-mini) and xAI (Grok-3-fast, Grok-4-1-fast).

The game variant gives players very little to go on: 7 players, 1 wolf, no roles, one short discussion, then a simultaneous vote. The only thing that distinguishes one player from another is their name. Even in this stripped-down setup, the simulation showed consistent patterns: some names were voted out substantially more often than others across every model tested, while other names were almost never voted out.
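To make the setup concrete, here is a minimal Python sketch of a chance baseline for this game, not the developer's actual simulator. It assumes seven hypothetical names, replaces the LLM players with voters who pick a random other player, breaks ties at random, and leaves out the wolf assignment because random voters would ignore it. All names and function names are illustrative.

```python
import random
from collections import Counter

# Hypothetical player names; the real experiment's name list is not given here.
NAMES = ["Alice", "Bob", "Chen", "Dara", "Emeka", "Farah", "Goran"]

def play_round(names, rng):
    """One blind round: every player votes for a random other player.

    The player with the most votes is voted out; ties are broken at
    random (an assumption, the article does not describe tie handling).
    """
    votes = Counter()
    for voter in names:
        target = rng.choice([n for n in names if n != voter])
        votes[target] += 1
    top = max(votes.values())
    tied = sorted(n for n, c in votes.items() if c == top)
    return rng.choice(tied)

def vote_out_rates(names, n_games, seed=0):
    """Fraction of games in which each name is voted out."""
    rng = random.Random(seed)
    eliminated = Counter(play_round(names, rng) for _ in range(n_games))
    return {n: eliminated[n] / n_games for n in names}

if __name__ == "__main__":
    for name, rate in vote_out_rates(NAMES, 4600).items():
        print(f"{name:6s} {rate:.3f}")
```

With name-blind random voters, every name should be voted out in roughly 1/7 of games. That is the reference point against which a name that is voted out much more or much less often would stand out.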

Important Caveats and Access

The developer explicitly states that this is not a causal claim, only an outcome pattern from a toy setup. The name groups are broad, some names appear less frequently than others, and the pattern could be an artifact of the setup in several ways rather than evidence of anything fundamental about the models. Even so, the consistency of these patterns across runs and models was noted as surprising.
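Because some names appear in fewer games than others, raw vote-out counts are hard to compare directly. One simple way to account for that, sketched below as an illustration rather than as the developer's analysis, is a normal-approximation z-score of a name's vote-out count against the 1-in-7 chance rate. The function name and the example counts are hypothetical and are not taken from the experiment's data.

```python
import math

def vote_out_z_score(times_voted_out, games_played, n_players=7):
    """Z-score of a name's vote-out count against the chance rate.

    Uses the normal approximation to a binomial with success
    probability 1 / n_players, so it is only reasonable when
    games_played is fairly large.
    """
    p0 = 1.0 / n_players
    expected = games_played * p0
    std = math.sqrt(games_played * p0 * (1.0 - p0))
    return (times_voted_out - expected) / std

if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(round(vote_out_z_score(150, 700), 2))
    print(round(vote_out_z_score(20, 90), 2))
```

A large absolute z-score would only say that a name's outcome is unlikely under name-blind random voting. It would not say why, which is in line with the developer's caution about causal claims.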

For those interested in exploring further:

Dashboard: https://huggingface.co/spaces/Queue-Bit-1/llm-bias-dashboard
Code + raw logs: https://github.com/Queue-Bit-1/wolf

The developer is curious whether others have observed similar name effects in multi-agent simulations.

📖 Read the full source: r/ClaudeAI
