Codev: AI agent workflow for 106 PRs in 14 days

Codev is an open-source AI agent coordination system that enforces a structured development workflow. Its practices were distilled from shipping 106 pull requests in 14 days, and they show how to move AI from prototyping to production work.

Six core practices
- Specs and plans are source code: Specifications and plans live in git alongside source code, not in chat history. A new agent reads arch.md for the big picture, then its specific spec. This ensures you always know why something was built.
- Three models review every phase: Claude, Gemini, and Codex catch almost entirely different bugs. No single model found more than 55% of issues. In testing, 20 bugs were caught before shipping: Claude Code found 5 bugs, while Gemini and Codex caught another 15, including a severe security issue Claude missed.
- Enforce the process, don't suggest it: A state machine forces Spec → Plan → Implement → Review → PR. The AI can't skip steps, and tests must pass before advancing. The system provides rails because AIs don't stick to a plan on their own.
- Annotate, don't edit: Most work involves writing specs and reviews that guide the code, rather than hacking at files in an open-ended chat.
- Agents coordinate agents: An architect agent spawns builder agents into isolated git worktrees. You direct the architect; it directs the builders. They message each other asynchronously.
- Manage the whole lifecycle: Most AI tools help write code faster, which is about 30% of the job. The other 70% is planning, reviewing, integrating, writing deployment scripts, and managing staging versus production. Codev has AI run the entire pipeline from spec to PR and beyond.
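The enforced Spec → Plan → Implement → Review → PR sequence can be pictured as a small state machine. The following is a minimal sketch of that idea, not Codev's actual implementation: the `Phase` enum, the `PhaseGate` class, and the `tests_passed` flag are hypothetical names chosen for illustration.

```python
from enum import Enum


class Phase(Enum):
    SPEC = "Spec"
    PLAN = "Plan"
    IMPLEMENT = "Implement"
    REVIEW = "Review"
    PR = "PR"


# Enum members iterate in definition order, which is the required phase order.
ORDER = list(Phase)


class PhaseGate:
    """Hypothetical gate: moves one phase at a time, and only when tests pass."""

    def __init__(self) -> None:
        self.phase = Phase.SPEC

    def advance(self, target: Phase, tests_passed: bool) -> Phase:
        current = ORDER.index(self.phase)
        if current == len(ORDER) - 1:
            raise ValueError("already at the final phase")
        expected = ORDER[current + 1]
        if target is not expected:
            # Reject skipped or repeated phases.
            raise ValueError(
                f"cannot go {self.phase.value} -> {target.value}; "
                f"next phase is {expected.value}"
            )
        if not tests_passed:
            raise ValueError(f"tests must pass before entering {target.value}")
        self.phase = target
        return self.phase
```

With this shape, an agent that tries `advance(Phase.IMPLEMENT, ...)` straight from Spec gets an error instead of a silent skip, which is the "rails, not suggestions" behavior the list describes.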

Results and costs
The system enabled one engineer to produce the output of a typical team of 3-4. Code quality scored 1.2 points higher on a 10-point scale than with Claude Code alone. The approach takes longer and uses more tokens, but costs remain reasonable at approximately $1.60 per PR.
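The figures quoted in this article can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text. Because the per-PR cost is approximate, the total is a rough estimate rather than a reported figure, and the variable names are illustrative.

```python
from decimal import Decimal

PRS = 106
DAYS = 14
COST_PER_PR = Decimal("1.60")  # approximate per-PR cost quoted in the article

prs_per_day = PRS / DAYS        # average throughput over the 14 days
total_cost = COST_PER_PR * PRS  # rough total spend implied by the per-PR cost

BUGS_TOTAL = 20
BUGS_CLAUDE = 5
claude_share = BUGS_CLAUDE / BUGS_TOTAL  # share of pre-ship bugs Claude Code found

print(f"{prs_per_day:.1f} PRs per day")          # 7.6 PRs per day
print(f"about ${total_cost} in total")           # about $169.60 in total
print(f"{claude_share:.0%} of bugs via Claude")  # 25% of bugs via Claude
```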
According to the developer, the protocol enforcement is the game changer: "I would find the AI just wouldn't stick to specs or plans." The agent coordination also proved effective, with the architect agent managing multiple builder agents fixing different bugs simultaneously.
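To make the worktree isolation concrete, here is a minimal sketch of how an architect process might prepare one `git worktree add` command per builder. It assumes a hypothetical `builder/<task>` branch naming scheme and a `.builders` directory. Neither comes from Codev, and the function name is illustrative.

```python
from pathlib import PurePosixPath


def worktree_commands(task_ids, root=".builders"):
    """Build one `git worktree add` command per task.

    The branch names and the `root` directory are illustrative assumptions,
    not Codev's actual layout.
    """
    commands = []
    for task_id in task_ids:
        branch = f"builder/{task_id}"
        path = PurePosixPath(root) / task_id
        # `git worktree add -b <branch> <path>` creates a new branch and
        # checks it out in its own directory, separate from the main checkout.
        commands.append(["git", "worktree", "add", "-b", branch, str(path)])
    return commands


for cmd in worktree_commands(["bug-101", "bug-102"]):
    print(" ".join(cmd))
```

Each command could then be run with `subprocess.run(cmd, check=True)` inside a repository, giving every builder its own checkout so parallel bug fixes do not overwrite each other's files.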
📖 Read the full source: HN AI Agents