1-Bit Bonsai Image 4B: On-Device Image Generation via Binary/Ternary FLUX.2

OpenClawRadar · Published: June 1, 2026

PrismML has released Bonsai Image 4B, a family of compact image-generation models derived from FLUX.2 Klein 4B using binary and ternary quantization. The diffusion transformer weights are represented as {−1, +1} (1-bit) or {−1, 0, +1} (ternary) with FP16 group-wise scaling factors, yielding 1.125 and 1.71 effective bits per weight respectively.
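The per-weight figures above can be reproduced with a little arithmetic. The sketch below assumes one FP16 scale shared by each group of 128 weights; the article does not state the group size, but 128 is the value that matches 1.125 and 1.71. The sign/threshold quantizer (`quantize_group`, with its hypothetical `threshold` parameter) is a generic illustration of group-wise binary and ternary coding, not PrismML's confirmed method.

```python
import math

GROUP_SIZE = 128  # assumption: group size is not stated in the article
SCALE_BITS = 16   # one FP16 scaling factor per group


def effective_bits(levels, group_size=GROUP_SIZE):
    """Ideal bits per weight: log2(levels) for the code plus the amortized scale."""
    return math.log2(levels) + SCALE_BITS / group_size


def quantize_group(weights, ternary=False, threshold=0.5):
    """Generic sketch: map one group to {-1, +1} or {-1, 0, +1} plus one scale."""
    scale = sum(abs(w) for w in weights) / len(weights)  # mean absolute value
    codes = []
    for w in weights:
        if ternary and abs(w) < threshold * scale:
            codes.append(0)  # small weights collapse to zero in the ternary case
        else:
            codes.append(1 if w >= 0 else -1)
    return codes, scale


def dequantize_group(codes, scale):
    """Reconstruct approximate weights from codes and the group scale."""
    return [c * scale for c in codes]


print(round(effective_bits(2), 3))  # 1.125 (binary)
print(round(effective_bits(3), 2))  # 1.71 (ternary)
```

Note that the ternary figure uses the information-theoretic log2(3) ≈ 1.585 bits per code; a real packing format may store ternary codes slightly less densely.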

Key Specifications

  • 1-bit Bonsai Image 4B: transformer footprint 0.93 GB (8.3× reduction from 7.75 GB FP16 FLUX.2 Klein 4B). Apple Silicon payload (including compressed text encoder + FP16 VAE) is 3.42 GB.
  • Ternary Bonsai Image 4B: transformer footprint 1.21 GB (6.4× reduction). Apple Silicon payload 3.88 GB.
  • Mean active memory for 512×512 generation: 1.5 GB (1-bit) / 1.96 GB (ternary) vs 11.74 GB for the original FLUX.2 Klein 4B.
  • Mean active memory for 1024×1024 generation: 1.95 GB (1-bit) / 2.38 GB (ternary) vs 14.39 GB for the original FLUX.2 Klein 4B.
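
As a quick sanity check, the reduction factors can be re-derived from the sizes listed above. This is only arithmetic on the reported figures; the dictionary layout and the `reduction` helper are illustrative names, not part of any released tooling.

```python
FP16_TRANSFORMER_GB = 7.75  # FP16 FLUX.2 Klein 4B transformer, as reported

transformer_gb = {"1-bit": 0.93, "ternary": 1.21}
active_memory_gb = {
    "512x512": {"1-bit": 1.5, "ternary": 1.96, "fp16": 11.74},
    "1024x1024": {"1-bit": 1.95, "ternary": 2.38, "fp16": 14.39},
}


def reduction(baseline_gb, value_gb):
    """How many times smaller value_gb is than baseline_gb."""
    return baseline_gb / value_gb


for name, gb in transformer_gb.items():
    print(f"{name}: {reduction(FP16_TRANSFORMER_GB, gb):.1f}x smaller transformer")

for resolution, sizes in active_memory_gb.items():
    for name in ("1-bit", "ternary"):
        ratio = reduction(sizes["fp16"], sizes[name])
        print(f"{resolution} {name}: {ratio:.1f}x less active memory")
```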

Performance Benchmarks

The models run on Apple Silicon (iPhones, iPads, Macs) via MLX low-bit paths and on CUDA GPUs via Gemlite low-bit GEMM kernels. Reported generation times:

  • iPhone 17 Pro Max: 9.4 seconds for a 512×512 image
  • Mac M4 Pro: ~6 seconds for a 512×512 image (up to 5.6× faster than the stock full-precision MFLUX pipeline)

Most transformer layers are stored as binary or ternary weights (~14× / ~10× compression relative to FP16), while a small set of precision-sensitive projection layers (~5%) remains in FP16. This is why the overall transformer reduction (8.3× / 6.4×) is lower than the per-layer figures. The models are evaluated on GenEval, HPSv3, and DPG-Bench for quality and prompt fidelity.
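
A rough estimate shows how keeping ~5% of layers in FP16 pulls the overall ratio down from the per-layer figure. The sketch assumes the ~5% refers to the fraction of weights (the article does not say whether it counts layers or parameters) and reuses the 1.125 and 1.71 effective bits per weight quoted earlier; `blended_compression` is a hypothetical helper name.

```python
FP16_BITS = 16.0


def per_layer_compression(low_bits):
    """Compression of a fully quantized layer relative to FP16."""
    return FP16_BITS / low_bits


def blended_compression(low_bits, fp16_fraction=0.05):
    """Overall compression when a fraction of weights stays in FP16.

    Assumption: fp16_fraction is a share of weights, not of layer count.
    """
    avg_bits = (1 - fp16_fraction) * low_bits + fp16_fraction * FP16_BITS
    return FP16_BITS / avg_bits


for name, bits in (("1-bit", 1.125), ("ternary", 1.71)):
    print(
        f"{name}: per-layer {per_layer_compression(bits):.1f}x, "
        f"blended {blended_compression(bits):.1f}x"
    )
```

Under these assumptions the blended estimates come out close to, but not exactly at, the reported 8.3× and 6.4×; the gap is expected, since the true FP16 share and packing overhead are not given.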

Who It's For

Developers deploying image generation on-device (laptops, phones, edge devices) who need open weights and practical local inference without cloud dependency.

📖 Read the full source: HN LLM Tools
