Lemonade by AMD: Open Source Local LLM Server for GPU and NPU

✍️ OpenClawRadar📅 Published: April 5, 2026🔗 Source
Lemonade by AMD: Open Source Local LLM Server for GPU and NPU
Ad

What Lemonade Is

Lemonade is a local AI server built by AMD and the local AI community that runs text, image, and speech models on GPUs and NPUs. It's open source, designed to be private, and claims to be ready in minutes on any PC.

Key Features and Specifications

  • Native C++ Backend: Lightweight service that is only 2MB
  • One Minute Install: Simple installer that sets up the stack automatically
  • OpenAI API Compatible: Works with hundreds of apps out-of-box and integrates in minutes
  • Auto-configures for your hardware: Configures dependencies for your GPU and NPU
  • Multi-engine compatibility: Works with llama.cpp, Ryzen AI SW, FastFlowLM, and more
  • Multiple Models at Once: Run more than one model at the same time
  • Cross-platform: A consistent experience across Windows, Linux, and macOS (beta)
  • Built-in app: A GUI that lets you download, try, and switch models quickly
  • Unified API: One local service for every modality including chat, vision, image generation, transcription, and speech generation
Ad

Model Support and Performance

The server can load models like gpt-oss-120b or Qwen-Coder-Next for advanced tool use. For tuning, you can use --no-mmap to speed up load times and increase context size to 64 or more. The source mentions that with 128 GB of unified RAM, you can load larger models.

Ecosystem Integration

Lemonade is integrated in many apps and works out-of-box with hundreds more thanks to the OpenAI API standard. Mentioned integrations include Open WebUI, n8n, Gaia Infinity, Arcade, GitHub Copilot, OpenHands, Dify, Deep Tutor, and Iterate.ai.

Community and Development

The project has 2.1k stars on GitHub and an active Discord community with 117 online at the time of the source. It's described as being built by the local AI community for every PC, with the philosophy that local AI should be free, open, fast, and private.

📖 Read the full source: HN LLM Tools

Ad

👀 See Also