Gemma 4 Released: Four Model Sizes for Local AI Hosting

✍️ OpenClawRadar · 📅 Published: April 6, 2026

Gemma 4 Model Specifications

Gemma 4 is now available as a self-hosted AI model in four configurations aimed at different hardware tiers. According to the source, it is not meant to compete with Claude, Codex, or Gemini. Instead, it is positioned as a practical option for multi-model routing setups, where sending some requests to a small, capable self-hosted model saves tokens.
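To make the routing idea concrete, here is a minimal sketch of a router that keeps short, routine prompts on a local model and forwards everything else to a hosted one. The backend labels, the word-count threshold, and the keyword list are all hypothetical choices made for illustration; the source does not describe a specific routing policy.

```python
from dataclasses import dataclass

# Hypothetical backend labels; neither is an identifier from the source.
LOCAL = "local-gemma"
HOSTED = "hosted-frontier"


@dataclass
class Route:
    backend: str
    reason: str


def choose_route(
    prompt: str,
    max_local_words: int = 200,
    hard_markers: tuple = ("refactor", "prove", "architecture"),
) -> Route:
    """Pick a backend with a crude heuristic (an assumption, not a tuned policy)."""
    # Lowercase and strip trailing punctuation so "architecture." still matches.
    words = [w.strip(".,!?;:") for w in prompt.lower().split()]
    if len(words) > max_local_words:
        return Route(HOSTED, "prompt is long")
    if any(marker in words for marker in hard_markers):
        return Route(HOSTED, "prompt looks like a hard task")
    return Route(LOCAL, "short, routine prompt")


if __name__ == "__main__":
    for p in (
        "Summarize this changelog entry in one line.",
        "Refactor the billing module and explain the new architecture.",
    ):
        print(choose_route(p))
```

In practice a router like this would probably rely on something sturdier than keywords, such as a classifier or a confidence check on the local model's answer, but the token-saving logic is the same: only the requests that need a larger model leave the machine.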

Model Variants and Hardware Requirements

  • E2B (2.3B effective parameters): Built for edge devices such as phones and the Raspberry Pi. Requires ~4-8GB RAM and runs well on a CPU. Recommended for hosting on a VPS.
  • E4B (4.5B effective parameters): Built for laptops and low-end hardware. Maintains a low memory footprint.
  • 26B MoE (25B total, 3.8B active): Built for consumer GPUs. Runs at inference speeds similar to a 4B model.
  • 31B Dense: Built for mid-range GPUs and workstations. Requires approximately 16-20GB VRAM when using 4-bit quantization.
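The 16-20GB figure for the 31B Dense model lines up with a back-of-the-envelope estimate of weight memory. The sketch below computes the size of the weights alone from the parameter count and bits per weight. It assumes exactly 4 bits per weight and ignores the KV cache, activations, and runtime overhead, which is presumably where the rest of the quoted range goes; real 4-bit formats also store scaling factors, so they use slightly more than 4 bits per weight.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


if __name__ == "__main__":
    # 31B Dense: weights only, before KV cache and runtime overhead.
    print(f"31B @ 4-bit:  {weight_memory_gb(31, 4):.1f} GB")   # 15.5 GB
    print(f"31B @ 16-bit: {weight_memory_gb(31, 16):.1f} GB")  # 62.0 GB
```

At 4 bits the weights come to about 15.5 GB, just under the low end of the quoted range, while the same model at 16 bits would need roughly 62 GB for weights alone.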

Capabilities and Availability

All Gemma 4 models are multimodal with both text and vision capabilities. The E2B and E4B edge models specifically support real-time audio. The models are built for advanced reasoning and agentic workflows.

Gemma 4 is available on Google AI Studio, Hugging Face, Kaggle, and Ollama.
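For the Ollama route, a local call could look roughly like the sketch below, which posts a prompt to Ollama's `/api/generate` endpoint on its default port using only the Python standard library. The model tag `gemma4:e2b` is an assumption used as a placeholder; the source does not give the published tag, so check the Ollama model library for the actual name before running it.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "gemma4:e2b"  # ASSUMPTION: placeholder tag, not confirmed by the source


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL_TAG, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the Ollama server running and the model pulled, `generate("Reply with one short sentence.")` should return the model's text. Setting `stream` to `False` asks for a single JSON object instead of a stream of chunks, which keeps the parsing to one line.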

📖 Read the full source: r/openclaw
