AI Agents Killing Code Review: Principal-Agent Problem

The industry-standard code review process — review-then-commit, popularized by GitHub PRs — was designed for low-trust collaboration. A human makes a change, another human reviews it, iterations occur, and the change lands. This worked because reviewers could cheaply infer effort and understanding from reading the code. AI agents break that completely.

The agent-in-the-middle disaster

The best case with AI agents: a human prompts a machine to write code, the human reviews it, then sends it to a second human for traditional review. That doubles the review burden. Worse, agents increase total change volume. The result: review bandwidth is exhausted before even a fraction of the agent productivity gains materialize.

But reality is worse. The actual pattern is: human types a short prompt, lightly QAs the output, packages it as a PR, and then funnel reviewer comments back to the agent for fixes. This is a textbook principal-agent problem: the reviewer (principal) can no longer infer effort or understanding from the code, because the code was generated by a machine. The human driving the agent has no incentive to actually read the code or think critically about reviewer feedback. They spend 5 minutes and generate serious review load for another engineer.

This is what's killing open source — "slop PRs" from people who have no understanding of the project, its constraints, or its tools.

A way forward for small teams

For small high-trust teams, there is a simpler process: human prompts agent → human reviews the code → human deploys directly (no second reviewer). The human driving the machine takes full responsibility by owning the deployment. The principal-agent problem disappears because the human is both the driver and the deployer.

At exe.dev, a team of nine uses this approach successfully. Key practices: write far more integration and e2e tests, build agent-based workflows to analyze commits for safety/performance/usability bugs, and ensure a human is always accountable for the final deploy.

The traditional code review model is not salvageable with agents. Small teams can adapt; large organizations and open source projects face a harder structural problem.

📖 Read the full source: HN AI Agents

AI Agents Are Killing Code Review — The Principal-Agent Problem Explained

The agent-in-the-middle disaster

A way forward for small teams

👀 See Also

STAR Reasoning Framework Accuracy Drops from 100% to 0% in Production Prompts

Anthropic's Natural Language Autoencoders Turn Claude's Activations into Readable English — Here's How

Benchmark Comparison of Qwen 3.5 Models Against Major AI Models

Opus 4.7 Reasoning Effort Benchmark: Medium Beats High and Max on Real Tasks