Autoevolve Framework Uses Claude Code for Game AI Development Through Self-Play Evolution

Competition Results and Approach
A developer used Claude Code as their entire development team for the Game AI Cup, a competitive programming contest where participants write bots for a 2D physics-based game. The Claude-generated bot placed 6th out of 83 participants across three rounds.
The approach was inspired by Karpathy's autoresearch concept, where an LLM agent iterates on code overnight. The developer built a small framework called autoevolve that adapts this for self-play domains — instead of optimizing a single metric, versions compete against each other head-to-head.
The Evolution Loop
The workflow followed this loop:
- Claude Code reads the current bot
- It analyzes why the bot lost specific matches
- It proposes a targeted change
- The new version is benchmarked against previous versions
- The version is kept or discarded based on the results
- The process repeats
The developer ran approximately 130 iterations over several weeks across three competition rounds.
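The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not autoevolve's actual code: `Version`, `play_match`, `propose_change`, and the "keep if the win rate exceeds 50%" rule are all hypothetical, and the Claude Code step is replaced by a random perturbation so the sketch runs on its own.

```python
import random
from dataclasses import dataclass


@dataclass
class Version:
    name: str
    strength: float  # hypothetical stand-in for how good a real bot is


def play_match(a: Version, b: Version, rng: random.Random) -> bool:
    """Toy match: True if `a` beats `b`. A real setup would run the game simulator."""
    p_a_wins = 1.0 / (1.0 + 10 ** ((b.strength - a.strength) / 400.0))
    return rng.random() < p_a_wins


def win_rate(candidate: Version, opponents: list, games: int, rng: random.Random) -> float:
    """Fraction of games the candidate wins against every kept version."""
    wins = sum(play_match(candidate, opp, rng) for opp in opponents for _ in range(games))
    return wins / (len(opponents) * games)


def propose_change(parent: Version, iteration: int, rng: random.Random) -> Version:
    """Placeholder for the LLM step (read the bot, analyze losses, propose a change)."""
    return Version(f"v{iteration}", parent.strength + rng.gauss(0, 50))


def evolve(iterations: int, games: int = 200, seed: int = 0) -> list:
    rng = random.Random(seed)
    best = Version("v0", 1000.0)
    kept = [best]
    for i in range(1, iterations + 1):
        candidate = propose_change(best, i, rng)
        # Benchmark head-to-head against all previously kept versions.
        if win_rate(candidate, kept, games, rng) > 0.5:
            kept.append(candidate)  # keep: becomes the new parent
            best = candidate
        # otherwise discard the candidate and try again from the same parent
    return kept


if __name__ == "__main__":
    for v in evolve(20):
        print(v.name, round(v.strength))
```

The key difference from single-metric optimization is visible in `win_rate`: a candidate is judged only by how it performs against earlier versions, not by an absolute score.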
Key Findings from the Experiment
Structural changes outperformed parameter tweaks: Every breakthrough involved adding new capabilities like model predictive control, a goalkeeper role, or energy-aware planning. Dozens of threshold and weight adjustments were flat or negative. Progress was faster when guiding Claude toward "add a new behavior" instead of "tune this number."
Emergent behaviors were readable in code: After Claude corrected an energy cost function, the optimizer started using wall bounces to reverse direction, because a bounce changes direction without spending energy. This behavior was never explicitly programmed, yet it can be traced in the code, whereas a neural network approach would have hidden it inside a black box.
Bug fixes compound in isolation: Mixing bug fixes with strategy changes introduced noise. Two correctness fixes alone in one version beat all top contenders, but the same fixes bundled with a strategy change in another version were flat.
The changelog was essential: Each version included Claude's proposal, expected outcome, actual result, and lessons learned. This allowed the developer to tell Claude "this approach failed three times, stop trying it" and avoid repeating failed experiments.
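A changelog with these fields could be modeled roughly as follows. This is a hypothetical sketch: the post lists the fields (proposal, expected outcome, actual result, lessons learned) but not a format, so the `ChangelogEntry` class, the `approach` tag, the `kept` flag, and the three-failure threshold are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ChangelogEntry:
    version: str
    approach: str   # hypothetical tag that groups similar ideas
    proposal: str   # what the change was meant to do
    expected: str   # expected outcome
    actual: str     # actual benchmark result
    lessons: str    # lessons learned
    kept: bool      # whether the version was kept


def exhausted_approaches(log, max_failures=3):
    """Return approaches that were discarded at least `max_failures` times."""
    failures = Counter(entry.approach for entry in log if not entry.kept)
    return sorted(name for name, count in failures.items() if count >= max_failures)
```

The result of `exhausted_approaches` is the kind of list that can be pasted into a prompt as "these approaches failed repeatedly, stop trying them."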
Broader Applications
The developer also found the awesome-autoresearch list, which shows the same "LLM iterates on code overnight" pattern applied elsewhere: Shopify's CEO achieved 53% faster template rendering with 93 automated commits, someone scaled CUDA kernels from 18 to 187 TFLOPS, and the Vesuvius Challenge used the pattern for ancient scroll deciphering.
Getting Started with Autoevolve
The autoevolve framework works as a Claude Code skill. Install it with:
npx skills add MrTsepa/autoevolve
Then tell Claude to set up an evolution experiment. The framework handles ratings, matchmaking, Pareto front tracking, and visualization.
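To give a feel for what ratings and Pareto front tracking mean in a self-play setting, here is a minimal sketch. It assumes an Elo-style rating and a generic "higher is better" Pareto front; the post does not say which rating system or objectives autoevolve uses, so the functions, the K-factor of 32, and the example objectives are illustrative only.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Return new (r_a, r_b) after one game; score_a is 1 win, 0.5 draw, 0 loss."""
    delta = k * (score_a - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta


def pareto_front(scores: dict) -> list:
    """Names whose objective tuples (higher is better) are not dominated by any other."""
    def dominates(p, q):
        return all(x >= y for x, y in zip(p, q)) and any(x > y for x, y in zip(p, q))

    return sorted(
        name for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    )


if __name__ == "__main__":
    print(elo_update(1500.0, 1500.0, 1.0))
    # Hypothetical objectives: win rate against two different reference opponents.
    print(pareto_front({"v1": (0.6, 0.4), "v2": (0.5, 0.5), "v3": (0.4, 0.3)}))
```

A Pareto front is useful here because a version that is weaker overall may still be the best answer to one particular opponent, and it stays on the front instead of being discarded.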
📖 Read the full source: r/ClaudeAI