GitVelocity: AI Scoring of 50k PRs Reveals Insights on Code Complexity

How GitVelocity Works
GitVelocity connects to your GitHub, GitLab, or Bitbucket repositories and uses Claude (defaulting to Sonnet 4.6, which performs nearly as well as Opus 4.6 at lower cost) to analyze every merged pull request. Each PR receives a score from 0-100 across six dimensions:
- Scope (0-20)
- Architecture (0-20)
- Implementation (0-20)
- Risk (0-20)
- Quality (0-15)
- Performance/Security (0-5)
The six dimension scores are added together, and the sum is then scaled by a multiplier based on change size, so a 10-line fix scores lower than a 500-line refactor of the same complexity. The full formula is available at gitvelocity.dev/scoring-guide.
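The sum-then-scale step can be sketched as follows. The dimension caps come from the list above; the multiplier curve (a linear ramp from 0.5 to 1.0 that reaches its cap at 500 changed lines) and the names `size_multiplier` and `final_score` are assumptions made for illustration only. The actual formula is the one published in the scoring guide.

```python
# Maximum points per dimension, as listed in the article (they sum to 100).
MAX_POINTS = {
    "scope": 20,
    "architecture": 20,
    "implementation": 20,
    "risk": 20,
    "quality": 15,
    "performance_security": 5,
}


def size_multiplier(lines_changed: int) -> float:
    """HYPOTHETICAL curve: linear ramp from 0.5 to 1.0, capped at 500 lines.

    The real GitVelocity multiplier is not given in the article.
    """
    if lines_changed < 0:
        raise ValueError("lines_changed must be non-negative")
    return min(1.0, 0.5 + 0.5 * lines_changed / 500)


def final_score(dimensions: dict[str, int], lines_changed: int) -> float:
    """Add the six dimension scores, then scale the sum by change size."""
    if set(dimensions) != set(MAX_POINTS):
        raise ValueError("expected exactly the six scoring dimensions")
    for name, value in dimensions.items():
        if not 0 <= value <= MAX_POINTS[name]:
            raise ValueError(f"{name} must be between 0 and {MAX_POINTS[name]}")
    base = sum(dimensions.values())
    return round(base * size_multiplier(lines_changed), 1)


# Illustrative dimension scores (made up, base score of 60).
same_complexity = {
    "scope": 12,
    "architecture": 12,
    "implementation": 12,
    "risk": 12,
    "quality": 9,
    "performance_security": 3,
}
small_fix = final_score(same_complexity, lines_changed=10)
large_refactor = final_score(same_complexity, lines_changed=500)
```

Under this assumed curve, the same base score yields a lower final score for the 10-line change than for the 500-line one, which matches the behavior described above. Any monotonic, capped multiplier would show the same effect.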
Key Findings from 50,000+ PRs
The analysis of over 50,000 PRs across multiple languages revealed several counterintuitive patterns:
- Big PRs don't automatically score high: An 800-line migration with low complexity scores worse than a 200-line architectural change. Size gets you the full multiplier, but the base score still has to earn it.
- You can't score well without tests: The quality dimension (0-15) won't give you points without test coverage. At similar experience levels, this was the clearest separator between engineers.
- Juniors started outscoring some seniors: They adopted AI tools faster and took on harder problems. Once they could see their own scores, they aimed higher.
- AI-generated code is scored the same as human-written code: Code is code. An engineer who uses AI to ship more complex work faster is more productive, and their scores reflect that.
Technical Implementation Details
Scoring consistency was the hardest technical problem. Without reference examples anchoring each dimension, Claude's scores drifted 15+ points between runs. The team solved this by creating 18 calibrated anchors (three per dimension at low/mid/high), which reduced variance to 2-4 points on the same PR.
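A minimal sketch of how the 18 anchor slots could be organized and how run-to-run drift could be measured. Everything here is an assumption for illustration: the function names are hypothetical, the prompt layout is not GitVelocity's, "drift" is taken to mean the gap between the highest and lowest score on the same PR, and the sample scores are made up rather than measured.

```python
from itertools import product

DIMENSIONS = [
    "Scope",
    "Architecture",
    "Implementation",
    "Risk",
    "Quality",
    "Performance/Security",
]
LEVELS = ["low", "mid", "high"]


def anchor_slots() -> list[tuple[str, str]]:
    """One slot per (dimension, level) pair: 6 dimensions x 3 levels = 18."""
    return list(product(DIMENSIONS, LEVELS))


def render_anchor_section(examples: dict[tuple[str, str], str]) -> str:
    """Format the reference examples as one labeled line per anchor slot.

    Raises if any of the 18 slots has no example, since a missing anchor
    would leave that part of a dimension's range uncalibrated.
    """
    missing = [slot for slot in anchor_slots() if slot not in examples]
    if missing:
        raise ValueError(f"missing anchors: {missing}")
    return "\n".join(
        f"[{dim} / {level}] {examples[(dim, level)]}"
        for dim, level in anchor_slots()
    )


def score_spread(scores: list[float]) -> float:
    """Gap between the highest and lowest score from repeated runs on one PR."""
    if not scores:
        raise ValueError("need at least one score")
    return max(scores) - min(scores)


# Made-up scores from three hypothetical runs on the same PR.
spread = score_spread([61, 63, 64])
```

Tracking a spread like this before and after adding anchors is one way to check whether calibration is working; the article's 15+ and 2-4 point figures are the team's own results, not output of this sketch.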
The tool uses a BYOK (bring your own Anthropic API key) model and costs pennies per PR. No source code is stored: diffs are analyzed and discarded immediately.
Behavioral Impact and Team Features
The team observed what they call "the Fitbit effect"—the tool doesn't make you ship better code, but seeing the score does. Engineers started referencing their own scores in 1:1s unprompted, because the numbers matched what they already felt about their work.
Every score is fully visible to the engineer who wrote the PR, with per-dimension breakdowns and reasoning. There's no hidden dashboard that management sees and engineers don't.
GitVelocity recently added team benchmarks (gitvelocity.dev/demo/benchmarks). Once you're scoring PRs, you can see how your team compares to others across the dataset—about 1,000 engineers on 60 teams so far. Teams that were skeptical about individual scores got genuinely curious once they could measure themselves against the field.
📖 Read the full source: HN AI Agents