Multi-Agent Haiku System Matches Claude Opus on Complex Number Theory Problem at 15x Lower Cost

Experimental Setup and Results
A Reddit user conducted a comparative test between two Claude model configurations on a challenging number theory problem. The problem required proving that for an odd prime p, the sum 1^(p-1) + 2^(p-1) + ... + (p-1)^(p-1) is congruent to -1 (mod p), using Fermat's Little Theorem and properties of primitive roots.
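The congruence itself is easy to check numerically for small primes. The sketch below is not part of the original test; the function name `fermat_power_sum` is made up for illustration, and it relies only on Python's built-in three-argument `pow` for modular exponentiation.

```python
def fermat_power_sum(p: int) -> int:
    """Return (1^(p-1) + 2^(p-1) + ... + (p-1)^(p-1)) mod p."""
    return sum(pow(a, p - 1, p) for a in range(1, p)) % p


# For an odd prime p the residue should be p - 1, which is -1 (mod p).
for p in (3, 5, 7, 11, 13, 101):
    assert fermat_power_sum(p) == p - 1
    print(p, fermat_power_sum(p))
```

A numeric check like this confirms the statement for specific primes only; it does not replace the proof the models were asked to produce.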
Two configurations were tested:
- Config X (Opus solo): Claude Opus 4.5 with max_tokens: 2048, no auditor
- Config Y (Haiku multi-agent): a Haiku generator produces the full proof and a second Haiku auditor checks every step, running two passes if the auditor flags anything; max_tokens: 1024 per call
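The control flow of Config Y can be sketched roughly as follows. This is a minimal illustration, not the Reddit user's code: `run_with_auditor`, `generate`, and `audit` are hypothetical names, the model calls are passed in as plain callables so no real API is assumed, and the reading of "two passes" as "at most two generate-then-audit rounds" is an assumption.

```python
from typing import Callable, Tuple


def run_with_auditor(
    problem: str,
    generate: Callable[[str], str],    # hypothetical: prompt -> proof text
    audit: Callable[[str, str], str],  # hypothetical: (problem, proof) -> verdict
    max_passes: int = 2,               # assumed meaning of "two passes"
) -> Tuple[str, str]:
    """Generate a proof, audit it, and retry while the auditor objects."""
    prompt = problem
    proof, verdict = "", ""
    for _ in range(max_passes):
        proof = generate(prompt)
        verdict = audit(problem, proof)
        if verdict.strip().upper().startswith("VERIFIED"):
            break
        # Feed the auditor's objections back to the generator for the next pass.
        prompt = (
            f"{problem}\n\nA reviewer flagged these issues:\n{verdict}\n\n"
            "Revise the proof."
        )
    return proof, verdict


# Stand-in callables so the sketch runs without any API access.
def fake_generate(prompt: str) -> str:
    return "Each a^(p-1) is 1 mod p, so the sum is p-1, which is -1 mod p."


def fake_audit(problem: str, proof: str) -> str:
    return "VERIFIED"


proof, verdict = run_with_auditor(
    "Prove the power-sum congruence.", fake_generate, fake_audit
)
print(verdict)
```

Passing the model calls in as functions keeps the loop independent of any particular client library; in a real setup each callable would wrap a request with the per-call max_tokens limit listed above.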
Scoring and Performance
Both configurations scored 4/4 using this rubric:
- Correctly invokes Fermat's Little Theorem
- Correctly handles primitive root argument
- Summation over complete residue system valid
- Congruence conclusion follows correctly
The Haiku auditor returned VERIFIED with no disagreement. Performance metrics:
- Opus solo: ~8.7 seconds, score 4/4
- Haiku + auditor: ~10.9 seconds, score 4/4
Cost Analysis
Using the per-token rates and approximate token counts quoted in the source, the per-query costs work out as follows:
- Opus solo: $0.075/1000 tokens × ~800 tokens = ~$0.06 per query
- Haiku + Haiku: $0.0025/1000 tokens × ~1600 tokens = ~$0.004 per query
This is roughly 15x lower cost for the same 4/4 score on this problem. The source described the problem as "genuinely hard" and not training-data-obvious in the way simpler proofs are.
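The arithmetic behind the 15x figure can be reproduced in a few lines. The rates and token counts below are the ones quoted above, not independently verified pricing, and `query_cost` is a helper name introduced here for illustration.

```python
def query_cost(price_per_1k_tokens: float, tokens: int) -> float:
    """Cost in dollars for one query at a flat per-1000-token rate."""
    return price_per_1k_tokens * tokens / 1000


opus_cost = query_cost(0.075, 800)     # Opus solo, about $0.06
haiku_cost = query_cost(0.0025, 1600)  # Haiku generator + auditor, about $0.004
ratio = opus_cost / haiku_cost

print(f"Opus solo:  ${opus_cost:.4f}")
print(f"Haiku pair: ${haiku_cost:.4f}")
print(f"Cost ratio: {ratio:.1f}x")
```

Note that the Haiku configuration uses about twice as many tokens in this estimate, so the saving comes entirely from the lower per-token rate.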
The source notes that on clean problems where Fermat's Little Theorem does the heavy lifting (each a^(p-1) ≡ 1, so the sum is p-1 ones, giving p-1 ≡ -1 mod p), the auditor pattern adds about a 17% time tax to confirm correctness; the timings reported above (8.7 s versus 10.9 s) work out closer to 25%. The pattern is most valuable on problems where the generator might stumble, for example through quantization stutter or hallucinated algebra.
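Written out in full, the parenthetical argument above is a three-line derivation. The only ingredient is Fermat's Little Theorem, which applies because every a in 1, ..., p-1 is coprime to p:

```latex
\begin{align*}
\sum_{a=1}^{p-1} a^{p-1}
  &\equiv \sum_{a=1}^{p-1} 1 \pmod{p}
  && \text{(Fermat: } a^{p-1} \equiv 1 \text{ for } p \nmid a\text{)} \\
  &= p - 1 \\
  &\equiv -1 \pmod{p}.
\end{align*}
```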
📖 Read the full source: r/ClaudeAI