Local LLM Performance Benchmarks on Mac Mini with OpenClaw and LM Studio

OpenClawRadar | Published: April 18, 2026

A Reddit user shared concrete performance benchmarks for running a local large language model on a Mac Mini with 32GB RAM. The post addresses the scarcity of specific performance data for this hardware configuration.

Technical Setup Details

The user reported the following configuration and results:

Software versions: OpenClaw 2026.3.8, LM Studio 0.4.6+1
Model: Unsloth gpt-oss-20b-Q4_K_S.gguf
Context size: 26035 tokens
Performance metrics: 34 tokens/second after the first prompt, 0.7-second time to first token
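
To put the two reported figures in context, a rough response-time estimate can be sketched as the time to first token plus the remaining tokens divided by throughput. This is a simplified model that assumes the reported 34 tokens/second holds for the whole response. The function name and the example token counts are illustrative and do not come from the post.

```python
def estimate_response_seconds(num_tokens: int,
                              tokens_per_second: float = 34.0,
                              time_to_first_token: float = 0.7) -> float:
    """Rough wall-clock estimate for generating num_tokens output tokens.

    Simplifying assumption: generation runs at a constant rate once the
    first token has arrived. Defaults are the figures reported in the post.
    """
    if num_tokens <= 0:
        return 0.0
    # The first token arrives after time_to_first_token; the remaining
    # num_tokens - 1 tokens stream at the steady-state rate.
    return time_to_first_token + (num_tokens - 1) / tokens_per_second


if __name__ == "__main__":
    for n in (100, 500, 1000):
        print(f"{n} tokens: ~{estimate_response_seconds(n):.1f} s")
```

Under these assumptions, a 500-token reply would take roughly 15 seconds. Real timings will vary with prompt length and context fill, which this sketch ignores.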

Model Configuration

The user listed these model settings, all left at their defaults:

GPU offload = 18
CPU thread pool size = 7
Max concurrents = 4
Number of experts = 4
Flash attention = on

The Q4_K_S suffix indicates a 4-bit quantized version of the 20-billion-parameter model, which reduces memory requirements while keeping output quality reasonable. The Mac Mini's 32GB of RAM is sufficient for this model size at the reported context length. The 34 tokens/second throughput gives developers a practical reference point when considering similar local LLM setups on Apple Silicon hardware.
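
The memory reasoning above can be made concrete with back-of-the-envelope arithmetic. The sketch below estimates weight storage only. It assumes roughly 4.5 bits per weight for a 4-bit K-quant, which is an approximation rather than a figure from the post, and it treats "20b" as a nominal 20 billion parameters. The KV cache for the context window and runtime overhead are not included.

```python
def estimate_weight_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 20e9              # nominal, taken from the "20b" in the model name
ASSUMED_BITS_PER_WEIGHT = 4.5  # assumption: rough average for a 4-bit K-quant
RAM_GB = 32                    # Mac Mini memory reported in the post

quantized_gb = estimate_weight_gb(NUM_PARAMS, ASSUMED_BITS_PER_WEIGHT)
fp16_gb = estimate_weight_gb(NUM_PARAMS, 16)

if __name__ == "__main__":
    print(f"4-bit weights (assumed {ASSUMED_BITS_PER_WEIGHT} bpw): ~{quantized_gb:.1f} GB")
    print(f"16-bit weights: ~{fp16_gb:.1f} GB")
    print(f"Quantized weights fit in {RAM_GB} GB: {quantized_gb < RAM_GB}")
```

Under these assumptions the quantized weights come to about 11 GB, against about 40 GB at 16 bits. That is consistent with the model fitting in 32GB only in quantized form, though the actual GGUF file size may differ from this estimate.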

📖 Read the full source: r/openclaw
