1.2B Local Model Beats 1T Clouds in Poker: Aggression Trumps Knowledge in Shove-or-Fold Format

A developer ran 6 LLMs through 5 Texas Hold'em tournaments on a 16GB MacBook using a custom framework (Hive). The lineup: Liquid lfm2.5 (1.2B, LM Studio, ~5s/decision), Qwen3 (1.7B, LM Studio, ~2.5 min/decision), Claude Haiku 4.5, GPT-OSS (120B, Fireworks), MiniMax M2 (230B, Fireworks), and Kimi K2 (~1T, Fireworks). The local models ran sequentially because of RAM limits.
Results (winner of each tournament)
- Tournament 1: Qwen (1.7B local)
- Tournament 2: MiniMax (230B cloud)
- Tournament 3: Liquid (1.2B local)
- Tournament 4: Kimi (~1T cloud)
- Tournament 5: Liquid (1.2B local)
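The winners listed above can be tallied in a few lines. This is a minimal sketch that hard-codes the five results from this article; it does not read Hive's result JSONs, whose layout is not described here.

```python
from collections import Counter

# Winners as listed above: (tournament, model, size, where it ran)
winners = [
    (1, "Qwen", "1.7B", "local"),
    (2, "MiniMax", "230B", "cloud"),
    (3, "Liquid", "1.2B", "local"),
    (4, "Kimi", "~1T", "cloud"),
    (5, "Liquid", "1.2B", "local"),
]

# Count wins per model and per hosting type.
wins_by_model = Counter(model for _, model, _, _ in winners)
wins_by_host = Counter(host for _, _, _, host in winners)

print(wins_by_model.most_common())
print(dict(wins_by_host))
```

Five tournaments is a small sample, so the tally describes these runs rather than a stable ranking.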
Tournament 3 highlighted the dynamic: Liquid played 6 hands with 19 raises and 0 folds, turning a $1M starting stack into $5.98M. Meanwhile, GPT-OSS (120B) made 0 raises and folded 5 of its 6 hands before being blinded out. The format (25 hands, 5K/10K blinds plus a 1K ante) is effectively shove-or-fold, rewarding aggression over theoretical poker skill.
Key Insight
Liquid doesn't recognize bad hands, so it raises everything. Against opponents that fold too often, this prints money. The author notes: "Not claiming small models are smarter at poker. In this specific format, not knowing when to fold is an advantage." Larger models 'understand' poker enough to fold weak hands, but in a short-stack tournament, patience is punished.
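The claim that raising everything pays off against opponents who fold too often is a fold-equity argument, and a rough expected-value sketch makes it concrete. The model below is a simplification and not the author's analysis: it assumes a single opponent with an equal stack, an all-in shove, and a dead pot of 21K (5K + 10K blinds plus a 1K ante from each of six players, which assumes the ante is per player). The 100K stack and 35% equity are hypothetical illustration values, not figures from the tournaments.

```python
def ev_when_called(pot: float, stack: float, equity: float) -> float:
    """EV of the showdown branch: win pot + opponent's call, or lose our stack."""
    return equity * (pot + stack) - (1.0 - equity) * stack


def shove_ev(pot: float, stack: float, equity: float, fold_prob: float) -> float:
    """EV of shoving when the opponent folds with probability fold_prob."""
    called = ev_when_called(pot, stack, equity)
    return fold_prob * pot + (1.0 - fold_prob) * called


def breakeven_fold_prob(pot: float, stack: float, equity: float) -> float:
    """Smallest fold probability at which the shove stops losing money."""
    called = ev_when_called(pot, stack, equity)
    if called >= 0:
        return 0.0  # profitable even if always called
    return -called / (pot - called)


# Dead money per hand: blinds from the article, ante assumed per player (6 seats).
POT = 5_000 + 10_000 + 6 * 1_000
STACK = 100_000   # hypothetical short stack, not from the source
EQUITY = 0.35     # hypothetical equity of a weak hand when called

for fold_prob in (0.3, 0.5, 0.7):
    print(fold_prob, round(shove_ev(POT, STACK, EQUITY, fold_prob)))

print("break-even fold probability:",
      round(breakeven_fold_prob(POT, STACK, EQUITY), 3))
```

Under these assumptions, a weak hand becomes profitable to shove once the opponent folds more often than the break-even rate, which is the situation the article describes: the model that never folds collects the dead money from the models that fold too much.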
What's Next
Plans include longer tournaments (100+ hands, lower blinds) where hand-reading matters. The framework supports custom personas (personality traits, risk tolerance, fears). Requests to add Mistral, Llama, and Gemma 3 are welcome. Code and full result JSONs are on GitHub: https://github.com/chiruu12/Hive (hive-arena/ for the runner, tournaments/results/ for the data).
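To illustrate what a persona with personality traits, risk tolerance, and fears might look like as data, here is a hypothetical sketch. The `Persona` class, its field names, the 0-to-1 risk scale, and the `to_prompt` method are all assumptions made for illustration; they are not Hive's actual schema or API, which should be checked in the repository.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    """Hypothetical persona record; NOT Hive's actual schema."""
    name: str
    traits: list = field(default_factory=list)
    risk_tolerance: float = 0.5  # assumed scale: 0 = very cautious, 1 = reckless
    fears: list = field(default_factory=list)

    def __post_init__(self):
        # Reject values outside the assumed 0..1 scale.
        if not 0.0 <= self.risk_tolerance <= 1.0:
            raise ValueError("risk_tolerance must be between 0 and 1")

    def to_prompt(self) -> str:
        """Render the persona as plain text for a system prompt."""
        return (
            f"You are {self.name}. "
            f"Traits: {', '.join(self.traits) or 'none'}. "
            f"Risk tolerance: {self.risk_tolerance}. "
            f"Fears: {', '.join(self.fears) or 'none'}."
        )


maniac = Persona("Maniac", ["aggressive", "impatient"], 0.9, ["being bluffed"])
print(maniac.to_prompt())
```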
📖 Read the full source: r/LocalLLaMA