F-Bombs Per Thousand Prompts: Developer Tracks Frustration in 44K Logs

A developer posting as u/ChartBuilder created a metric called fpk (f-bombs per thousand prompts) to quantify frustration while using Claude Code. The data spans 5 months, 44,212 prompts, and 6,120 sessions.
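
As the name suggests, fpk is a rate: f-bomb count divided by prompt count, scaled to 1,000. The sketch below is one minimal way to compute it, not the author's implementation. The function name `fpk`, the regex, and the choice to count every occurrence (rather than prompts containing at least one) are assumptions made for illustration.

```python
import re

# Assumption: a simple case-insensitive substring match stands in for
# whatever word list the original tooling uses.
F_BOMB = re.compile(r"fuck", re.IGNORECASE)


def fpk(prompts):
    """Return f-bombs per thousand prompts for a list of prompt strings."""
    if not prompts:
        return 0.0
    # Assumption: every occurrence counts, so one prompt can contribute several.
    hits = sum(len(F_BOMB.findall(text)) for text in prompts)
    return 1000 * hits / len(prompts)


sample = ["fix this", "fuck, gh auth failed again", "ok", "thanks"]
print(fpk(sample))  # 250.0 (1 hit in 4 prompts)
```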

Headline numbers per model

claude-opus-4-5: 38.11 fpk
claude-opus-4-7: 11.11 fpk
claude-haiku-4-5: 0.00 fpk (used as subagent, never orchestrator)

That's a 3.4× drop in fpk between the two Opus versions. It closely tracks Anthropic's official quality recovery from the Feb-Mar regression, but shows it in a way release notes don't capture.
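
The 3.4× figure follows directly from the two Opus numbers above. A quick check, using only the values quoted in this article (the variable names are illustrative):

```python
opus_4_5_fpk = 38.11
opus_4_7_fpk = 11.11

# Ratio of the older model's rate to the newer model's rate.
ratio = opus_4_5_fpk / opus_4_7_fpk
# The same change expressed as a percentage reduction.
reduction_pct = 100 * (1 - opus_4_7_fpk / opus_4_5_fpk)

print(f"{ratio:.1f}x drop, {reduction_pct:.0f}% fewer f-bombs per thousand prompts")
# prints: 3.4x drop, 71% fewer f-bombs per thousand prompts
```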

Fpk by Claude Code CLI version

2.1.30 through 2.1.69 era: 40 fpk
2.1.100+ era: 12 fpk
Worst single version: 2.1.42 at 173.79 fpk
Best: 2.1.110 at 0.00 fpk over 300+ prompts
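
A per-version breakdown like the one above can be sketched by grouping prompts before computing the rate. This is an illustration under stated assumptions, not the author's tooling: the `(cli_version, prompt_text)` record shape, the `fpk_by_version` name, and the `min_prompts` cutoff are all hypothetical. The cutoff is there because a version with few prompts can produce an extreme fpk from a single outburst.

```python
import re
from collections import defaultdict

# Assumption: same simple matcher as a stand-in for the real word list.
F_BOMB = re.compile(r"fuck", re.IGNORECASE)


def fpk_by_version(records, min_prompts=1):
    """Compute fpk per CLI version.

    records: iterable of (cli_version, prompt_text) pairs (hypothetical shape).
    Versions with fewer than min_prompts prompts are left out of the result.
    """
    prompts = defaultdict(int)
    hits = defaultdict(int)
    for version, text in records:
        prompts[version] += 1
        hits[version] += len(F_BOMB.findall(text))
    return {
        version: round(1000 * hits[version] / count, 2)
        for version, count in prompts.items()
        if count >= min_prompts
    }


records = [
    ("2.1.42", "fuck docker"),
    ("2.1.42", "retry the build"),
    ("2.1.110", "looks good"),
]
print(fpk_by_version(records))
# {'2.1.42': 500.0, '2.1.110': 0.0}
```

Raising `min_prompts` filters out versions whose sample is too small to say much, which is worth keeping in mind when reading any single-version extreme.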

Key insight: most frustration is environmental, not model-related

The author notes: "most cursing wasn't at the model. It was at environmental friction like gh auth failures, docker issues, screenshots breaking. The model is mostly the unwitting witness to my frustration with the surrounding tooling, not the cause."

Sometimes the model is the cause too: the full writeup includes a "greatest hits" collection of memorable outbursts.

Reproducible tooling

The developer has published tools to compute fpk on your own Claude Code logs:

Full writeup with methodology: mpiv.ai/blog/fpk-f-bombs-per-thousand-the-dev-experience-metric-you-didnt-know-you-needed
Open-source repo with audit tooling: github.com/MPIsaac-Per/claude-code-ops-audit

If you use Claude Code heavily and want a quantitative signal of how much friction you're actually experiencing, this metric is worth adopting. The drop between models and across CLI versions is a concrete indicator of Anthropic's recovery, and the environmental sources of frustration are something every team can address.

📖 Read the full source: r/ClaudeAI