Benchmark: Gemma4 12B vs Qwen3 8B quantized on 24GB Mac Mini

Performance comparison of two local models for OpenClaw
A developer ran a head-to-head test comparing Gemma4 12B and Qwen3:8b-q4_K_M on a 24GB Mac Mini. The test used two prompts: "explain how a carburetor works" and "write a Python function to detect memory leaks." Claude helped write a command that greps each run's output for the timing measurements.
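The source does not show the actual grep command or the raw output, so the following is only a minimal sketch of that extraction step in Python. It assumes the runs print Ollama-style verbose timing lines such as `prompt eval rate: ... tokens/s` and `eval rate: ... tokens/s`; that line format, the helper name `extract_rates`, and the sample text are all assumptions made for illustration.

```python
import re

# Assumed line format (Ollama-style verbose stats). The source does not show
# the real output or the grep command that was used.
RATE_LINE = re.compile(
    r"^\s*(prompt eval rate|eval rate):\s*([0-9]+(?:\.[0-9]+)?)\s*tokens/s",
    re.MULTILINE,
)


def extract_rates(output: str) -> dict[str, float]:
    """Return the prompt eval and generation rates (tokens/s) found in output."""
    return {name: float(value) for name, value in RATE_LINE.findall(output)}


# Hypothetical sample text. The two rates are the Qwen3 carburetor figures
# from the results; the token counts are made up to show non-matching lines.
SAMPLE_OUTPUT = """\
prompt eval count:    14 token(s)
prompt eval rate:     89.8 tokens/s
eval count:           512 token(s)
eval rate:            19.6 tokens/s
"""

if __name__ == "__main__":
    print(extract_rates(SAMPLE_OUTPUT))
```

Anchoring the pattern at the start of each line keeps the `eval rate` alternative from matching inside a `prompt eval rate` line, so the two figures stay separate.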
Benchmark results
Carburetor explanation task:
- Qwen3:8b-q4_K_M: Prompt eval: 89.8 t/s, Generation: 19.6 t/s
- Gemma4: Prompt eval: 20.8 t/s, Generation: 27.6 t/s
Python coding task:
- Qwen3:8b-q4_K_M: Prompt eval: 133.8 t/s, Generation: 18.7 t/s
- Gemma4: Prompt eval: 26.1 t/s, Generation: 26.1 t/s
Key findings
Qwen3 processes prompts roughly 4 to 5 times faster than Gemma4 (about 4.3x on the carburetor prompt and 5.1x on the coding prompt), which matters for OpenClaw because of the large context prompts it typically sends. Gemma4 generates output about 40% faster (27.6 vs 19.6 t/s and 26.1 vs 18.7 t/s). For many OpenClaw uses, Qwen3 wins on overall speed. The developer notes that Gemma4, as a 12B model, might produce slightly better output, though this wasn't tested.
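To make the trade-off concrete, here is a rough sketch that estimates end-to-end time as prompt processing time plus generation time, using the carburetor-task figures reported above. The model is a simplification: it ignores model load time and assumes the measured rates hold at larger context sizes, which the source did not test. The names `Rates`, `estimated_seconds`, and `break_even_ratio`, and the token counts in the usage lines, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    prompt_tps: float  # prompt eval speed, tokens per second
    gen_tps: float     # generation speed, tokens per second


# Figures from the carburetor explanation task.
QWEN3 = Rates(prompt_tps=89.8, gen_tps=19.6)
GEMMA4 = Rates(prompt_tps=20.8, gen_tps=27.6)


def estimated_seconds(r: Rates, prompt_tokens: int, output_tokens: int) -> float:
    """Simplified estimate: time to read the prompt plus time to generate."""
    return prompt_tokens / r.prompt_tps + output_tokens / r.gen_tps


def break_even_ratio(a: Rates, b: Rates) -> float:
    """Prompt-to-output token ratio at which a and b take equal time.

    Assumes a has the faster prompt eval and b the faster generation.
    """
    return (1 / a.gen_tps - 1 / b.gen_tps) / (1 / b.prompt_tps - 1 / a.prompt_tps)


if __name__ == "__main__":
    # Hypothetical workload: a 4000-token context with a 300-token reply.
    for name, rates in (("Qwen3", QWEN3), ("Gemma4", GEMMA4)):
        print(f"{name}: {estimated_seconds(rates, 4000, 300):.1f} s")
    print(f"break-even prompt/output ratio: {break_even_ratio(QWEN3, GEMMA4):.2f}")
```

With these figures the break-even ratio comes out near 0.4, so under this simplified model Qwen3 finishes first whenever the prompt is longer than about 40% of the output length. That is consistent with the finding that Qwen3 wins for context-heavy requests.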
The developer runs various tasks on local models including cron jobs, heartbeat monitoring, memory indexing, and often has OpenClaw call subagents running local models. They're testing Gemma4 as the local model for all these background tasks but don't expect to notice performance differences since these run in the background.
📖 Read the full source: r/openclaw