1-Bit Bonsai Image 4B: On-Device Image Generation via Binary/Ternary FLUX.2

✍️ OpenClawRadar · 📅 Published: June 1, 2026

PrismML has released Bonsai Image 4B, a family of compact image-generation models derived from FLUX.2 Klein 4B using binary and ternary quantization. The diffusion transformer weights are represented as {−1, +1} (1-bit) or {−1, 0, +1} (ternary) with FP16 group-wise scaling factors, yielding 1.125 and 1.71 effective bits per weight respectively.
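The effective bits-per-weight figures can be reproduced with a little arithmetic. The sketch below assumes a group size of 128 weights per FP16 scale, which the article does not state but which makes both quoted numbers come out (1 + 16/128 = 1.125, and log2(3) + 16/128 ≈ 1.71). The quantizer is a generic absolute-mean scheme written for illustration only; the function names, the 0.5 ternary threshold, and the scale rule are assumptions, not PrismML's published method.

```python
import math

GROUP_SIZE = 128  # assumption: not stated in the article
SCALE_BITS = 16   # one FP16 scale per group, as described in the article


def effective_bits(levels, group_size=GROUP_SIZE):
    """Ideal bits per weight: log2(levels) for the code plus the amortized scale."""
    return math.log2(levels) + SCALE_BITS / group_size


def quantize_group(weights, ternary=False):
    """Map one group of weights to codes in {-1, +1} or {-1, 0, +1} plus one scale.

    Illustrative choices: scale = mean absolute value of the group,
    ternary zero threshold = half of that scale.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if ternary:
        threshold = 0.5 * scale
        codes = [0 if abs(w) < threshold else (1 if w > 0 else -1) for w in weights]
    else:
        codes = [1 if w >= 0 else -1 for w in weights]
    return codes, scale


def dequantize_group(codes, scale):
    """Reconstruct approximate weights from codes and the shared scale."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    print(f"1-bit:   {effective_bits(2):.3f} bits/weight")
    print(f"ternary: {effective_bits(3):.3f} bits/weight")
    codes, scale = quantize_group([0.4, -0.2, 0.05, -0.35], ternary=True)
    print(codes, scale, dequantize_group(codes, scale))
```

Real kernels pack the codes into machine words, so on-disk size depends on the packing format as well as the group size.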

Key Specifications

1-bit Bonsai Image 4B: transformer footprint 0.93 GB (8.3× reduction from 7.75 GB FP16 FLUX.2 Klein 4B). Apple Silicon payload (including compressed text encoder + FP16 VAE) is 3.42 GB.
Ternary Bonsai Image 4B: transformer footprint 1.21 GB (6.4× reduction). Apple Silicon payload 3.88 GB.
Mean active memory for 512×512 generation: 1.5 GB (1-bit) / 1.96 GB (ternary) vs 11.74 GB for original FLUX.2 Klein 4B.
Mean active memory for 1024×1024 generation: 1.95 GB (1-bit) / 2.38 GB (ternary) vs 14.39 GB for original FLUX.2 Klein 4B.
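As a quick sanity check, the reduction factors implied by the figures above can be recomputed directly. This is a minimal sketch using only the numbers quoted in this article; the dictionary layout and the helper name are illustrative.

```python
# All sizes in GB, copied from the specifications listed above.
FP16_TRANSFORMER = 7.75
TRANSFORMER = {"1-bit": 0.93, "ternary": 1.21}

ACTIVE_MEMORY = {
    "512x512": {"fp16": 11.74, "1-bit": 1.5, "ternary": 1.96},
    "1024x1024": {"fp16": 14.39, "1-bit": 1.95, "ternary": 2.38},
}


def reduction(baseline, value):
    """How many times smaller `value` is than `baseline`."""
    return baseline / value


if __name__ == "__main__":
    for variant, size in TRANSFORMER.items():
        factor = reduction(FP16_TRANSFORMER, size)
        print(f"{variant} transformer footprint: {factor:.1f}x smaller")
    for resolution, row in ACTIVE_MEMORY.items():
        for variant in ("1-bit", "ternary"):
            factor = reduction(row["fp16"], row[variant])
            print(f"{variant} active memory at {resolution}: {factor:.1f}x lower")
```

The transformer ratios match the quoted 8.3× and 6.4×. Active memory falls by roughly 6× to 8×, less than the raw weight compression would suggest, presumably because activations and the non-quantized components do not shrink with the weights.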

Performance Benchmarks

The model runs on Apple Silicon (iPhones, iPads, Macs) via MLX low-bit paths, and on CUDA GPUs via Gemlite low-bit GEMM kernels. Generation times:

iPhone 17 Pro Max: 9.4 seconds for 512×512 image
Mac M4 Pro: ~6 seconds for 512×512 image (up to 5.6× faster than stock full-precision MFLUX pipeline)

The footprint reduction comes from the binary/ternary layers (~14× / ~10× compression relative to FP16); a small set of precision-sensitive projection layers (~5%) remains in FP16. Quality and prompt fidelity are evaluated on GenEval, HPSv3, and DPG-Bench.
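The gap between the per-layer compression (~14× / ~10×) and the whole-transformer reduction (8.3× / 6.4×) follows from the FP16 layers. A rough estimate, assuming the "~5%" refers to the share of parameters kept in FP16 (the article does not say how the percentage is measured) and using the effective bit widths quoted earlier:

```python
FP16_BITS = 16.0
FP16_FRACTION = 0.05  # assumption: "~5%" read as a share of parameters


def average_bits(low_bits, fp16_fraction=FP16_FRACTION):
    """Parameter-weighted average bit width of a mixed-precision transformer."""
    return (1.0 - fp16_fraction) * low_bits + fp16_fraction * FP16_BITS


def compression(bits):
    """Compression factor relative to an all-FP16 model."""
    return FP16_BITS / bits


if __name__ == "__main__":
    for name, low_bits in (("1-bit", 1.125), ("ternary", 1.71)):
        layer = compression(low_bits)
        whole = compression(average_bits(low_bits))
        print(f"{name}: quantized layers {layer:.1f}x, whole transformer {whole:.1f}x")
```

Under these assumptions the whole-transformer estimates come out near 8.6× and 6.6×, close to the reported 8.3× and 6.4×. The small difference is within what the approximate "~5%" figure and any packing or metadata overhead would explain.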

Who It's For

Developers deploying image generation on-device (laptops, phones, edge devices) who need open weights and practical local inference without cloud dependency.

📖 Read the full source: HN LLM Tools
