Reddit user reports 18.8 tok/s CPU inference with Qwen 3 30B Q4 on Zen 4

✍️ OpenClawRadar · 📅 Published: April 15, 2026

A Reddit user shared their experience testing local LLM inference on CPU instead of investing in expensive GPU hardware.

Key Details

The user was considering purchasing GPU hardware for local LLM inference, including:

P40 GPUs
V100 GPUs (almost bought an SXM2 version that doesn't plug into normal motherboards)
RTX 3090s (priced at $800+ due to AI demand)

After being advised to try CPU inference first, they tested:

Model: Qwen 3 30B Q4
Hardware: Zen 4 processor with DDR5 memory
Performance: 18.8 tokens per second on CPU
Expectation vs Reality: Expected 3-5 tok/s, got nearly 19 tok/s

The user noted that "Zen 4 + DDR5 is cracked for inference."
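A rough way to see why the result beat expectations: CPU token generation is usually limited by how fast weights can be streamed from RAM, so decode speed is bounded by memory bandwidth divided by the bytes read per token. The sketch below is a back-of-envelope estimate, not a measurement. Every number in it is an assumption that the post does not state: dual-channel DDR5-5600, roughly 4.5 bits per weight for a Q4-style quantization, and the guess that "Qwen 3 30B" refers to a mixture-of-experts variant that activates only about 3B parameters per token.

```python
def decode_tokens_per_second(active_params_billion, bits_per_weight, bandwidth_gb_s):
    """Upper bound on decode speed if every active weight is read from RAM once per token."""
    gb_read_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


# ASSUMPTION: dual-channel DDR5-5600 (2 channels * 8 bytes * 5600 MT/s), theoretical peak.
BANDWIDTH_GB_S = 2 * 8 * 5600 / 1000

# ASSUMPTION: ~4.5 bits per weight for a Q4-style quantization, including scale overhead.
BITS_PER_WEIGHT = 4.5

# Dense case: all 30B parameters are read for every generated token.
dense_bound = decode_tokens_per_second(30, BITS_PER_WEIGHT, BANDWIDTH_GB_S)

# ASSUMPTION: mixture-of-experts case with ~3B parameters active per token.
moe_bound = decode_tokens_per_second(3, BITS_PER_WEIGHT, BANDWIDTH_GB_S)

print(f"dense 30B upper bound: {dense_bound:.1f} tok/s")
print(f"~3B-active MoE upper bound: {moe_bound:.1f} tok/s")
```

Under these assumptions the dense bound lands near the 3-5 tok/s the user expected, while the sparse bound sits well above the reported 18.8 tok/s, which is consistent with a real system reaching only part of its theoretical bandwidth.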

Practical Testing Results

The user also compared two model sizes on a real coding task:

An 8B model "confidently wrote completely wrong code"
The 30B model "nailed it first try"
They described the 30B model's performance as "basically GPT-4o level for $0"

This is a single anecdotal test, but it suggests that for some coding tasks a 4-bit quantized 30B model running on a recent desktop CPU can produce results the user judged comparable to a large cloud-hosted model, without buying a dedicated GPU.

📖 Read the full source: r/LocalLLaMA
