Codev: AI agent workflow for 106 PRs in 14 days

Codev is an open-source AI agent coordination system that enforces a structured development workflow. The project shows how to move AI from prototyping to production work, using practices drawn from handling 106 pull requests in 14 days.
Six core practices
- Specs and plans are source code: Specifications and plans live in git alongside source code, not in chat history. A new agent reads arch.md for the big picture, then its specific spec. This ensures you always know why something was built.
- Three models review every phase: Claude, Gemini, and Codex catch almost entirely different bugs. No single model found more than 55% of issues. In testing, 20 bugs were caught before shipping: Claude Code found 5, while Gemini and Codex caught the other 15, including a severe security issue that Claude missed.
- Enforce the process, don't suggest it: A state machine forces Spec → Plan → Implement → Review → PR. The AI can't skip steps, and tests must pass before advancing. The system provides rails because AIs don't stick to the plan on their own.
- Annotate, don't edit: Most work involves writing specs and reviews that guide the code, rather than hacking at files in an open-ended chat.
- Agents coordinate agents: An architect agent spawns builder agents into isolated git worktrees. You direct the architect; it directs the builders. They message each other asynchronously.
- Manage the whole lifecycle: Most AI tools help write code faster, which is about 30% of the job. The other 70% is planning, reviewing, integrating, writing deployment scripts, and managing staging versus production. Codev has AI run the entire pipeline from spec to PR and beyond.
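The enforced workflow and three-model review described above can be sketched as a small state machine. This is an illustrative sketch, not Codev's actual implementation: the `Workflow` class, the `GateError` exception, the reviewer identifiers, and the rule that every phase needs all three approvals plus passing tests are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    # Phase order follows the article: Spec -> Plan -> Implement -> Review -> PR.
    SPEC = 1
    PLAN = 2
    IMPLEMENT = 3
    REVIEW = 4
    PR = 5


# Hypothetical identifiers for the three reviewing models named in the article.
REVIEWERS = frozenset({"claude", "gemini", "codex"})


class GateError(Exception):
    """Raised when an agent tries to advance without satisfying the gate."""


@dataclass
class Workflow:
    phase: Phase = Phase.SPEC
    approvals: set = field(default_factory=set)

    def approve(self, reviewer: str) -> None:
        """Record that one reviewer has signed off on the current phase."""
        if reviewer not in REVIEWERS:
            raise ValueError(f"unknown reviewer: {reviewer}")
        self.approvals.add(reviewer)

    def advance(self, tests_passed: bool) -> Phase:
        """Move to the next phase only if tests pass and all reviewers approved."""
        if self.phase is Phase.PR:
            raise GateError("already at the final phase")
        if not tests_passed:
            raise GateError("tests must pass before advancing")
        missing = REVIEWERS - self.approvals
        if missing:
            raise GateError(f"missing reviews from: {sorted(missing)}")
        self.phase = Phase(self.phase.value + 1)
        self.approvals = set()  # each phase needs a fresh round of reviews
        return self.phase
```

Because `advance` is the only way to change `phase`, an agent driving this object cannot jump from Spec straight to Implement; it has to collect reviews and a passing test run at every step.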
Results and costs
The system enabled one engineer to produce the output of a typical team of three to four. Code quality scored 1.2 points higher on a 10-point scale than with Claude Code alone. The approach takes longer and uses more tokens, but costs remain reasonable at approximately $1.60 per PR.
According to the developer, the protocol enforcement is the game changer: "I would find the AI just wouldn't stick to specs or plans." The agent coordination also proved effective, with the architect agent managing multiple builder agents fixing different bugs simultaneously.
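To make the worktree isolation concrete, here is a minimal sketch of how an architect process might prepare one git worktree per builder. It only builds the `git worktree add -b <branch> <path>` command rather than running it. The `.builders` directory, the `builder/` branch prefix, and the task identifiers are hypothetical choices for illustration and are not taken from Codev.

```python
from pathlib import PurePosixPath


def worktree_command(repo_root: str, task_id: str) -> list[str]:
    """Build the git command that gives one builder its own worktree and branch.

    The directory layout and branch naming here are assumptions for the example.
    """
    path = PurePosixPath(repo_root) / ".builders" / task_id
    branch = f"builder/{task_id}"
    # `git worktree add -b <branch> <path>` creates a new branch and checks it
    # out in a separate working directory, so builders do not touch each
    # other's files.
    return ["git", "worktree", "add", "-b", branch, str(path)]


def plan_builders(repo_root: str, task_ids: list[str]) -> dict[str, list[str]]:
    """Map each task to the command that would set up its isolated worktree."""
    return {task_id: worktree_command(repo_root, task_id) for task_id in task_ids}
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from inside the repository, which gives every builder a separate checkout to work in while the architect keeps the main working tree.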
📖 Read the full source: HN AI Agents