Open-weight models under 100GB can't beat Claude Haiku on coding benchmarks

✍️ OpenClawRadar · 📅 Published: February 26, 2026 · 🔗 Source

A recent analysis posted to r/LocalLLaMA finds that no open-weight language model fitting in under 100GB of memory matches Anthropic's Claude Haiku on two coding benchmarks. The comparison fixes the quantization, context length, and KV cache precision for every model, then plots each one by its estimated memory requirement.

Benchmark methodology

The evaluation compared models on two coding benchmarks: LiveBench (January 2026) and Arena Code/WebDev. Claude Haiku 4.5 with thinking enabled served as the reference point, and each open-weight model was plotted against the memory it would need for local deployment.

Technical specifications

  • Quantization: Q4_K_M
  • Context length: 32K
  • KV cache: q8_0
  • VRAM estimation: Calculated using the author's custom calculator
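
The author's calculator is not described in the source, so the following is only a rough sketch of how such an estimate is commonly put together: quantized weights plus a KV cache sized for the 32K context. The function names, the 4.85 bits-per-weight average for Q4_K_M, and the example architecture are all assumptions for illustration. The 8.5 bits per value for q8_0 follows from its block layout (32 int8 values plus one 16-bit scale). Real runtimes add compute buffers and other overhead that this ignores.

```python
# Rough memory estimate for local deployment: weights + KV cache.
# NOT the author's calculator; every constant and name here is an assumption.

Q4_K_M_BITS_PER_WEIGHT = 4.85  # assumed average; varies by model architecture
Q8_0_BITS_PER_VALUE = 8.5      # 34 bytes per block of 32 values
CONTEXT_32K = 32 * 1024        # 32K tokens, as in the article's settings


def weights_gb(n_params_billion, bits_per_weight=Q4_K_M_BITS_PER_WEIGHT):
    """Size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    total_bits = n_params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


def kv_cache_gb(n_layers, n_kv_heads, head_dim,
                context_len=CONTEXT_32K,
                bits_per_value=Q8_0_BITS_PER_VALUE):
    """Size of the KV cache in GB for a standard attention layout."""
    # One key vector and one value vector per layer, per KV head, per token.
    n_values = 2 * n_layers * n_kv_heads * head_dim * context_len
    return n_values * bits_per_value / 8 / 1e9


def estimate_memory_gb(n_params_billion, n_layers, n_kv_heads, head_dim):
    """Weights plus KV cache; ignores activations and runtime overhead."""
    return (weights_gb(n_params_billion)
            + kv_cache_gb(n_layers, n_kv_heads, head_dim))


if __name__ == "__main__":
    # Hypothetical 100B-parameter model; these shapes are made up.
    total = estimate_memory_gb(n_params_billion=100, n_layers=32,
                               n_kv_heads=8, head_dim=128)
    print(f"Estimated memory: {total:.1f} GB")
```

For mixture-of-experts models the total parameter count, not the active count, determines the weight footprint, since all experts have to be resident in memory.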

Key findings

No open-weight model that fits in under 100GB of memory comes close to Claude Haiku's performance on either benchmark. The nearest competitor is MiniMax M2.5, which requires approximately 136GB of memory and roughly matches Haiku on both benchmarks.

The analysis highlights the current gap between proprietary and open-weight models in the under-100GB category for coding tasks. The author expresses frustration with this gap and calls for smaller models that at least match Haiku's capabilities.

📖 Read the full source: r/LocalLLaMA
