AMD Ryzen AI NPUs Gain Linux LLM Support via Lemonade 10.0 and FastFlowLM

What's New
AMD Ryzen AI NPUs can now run large language models on Linux through the open-source Lemonade server version 10.0, which includes Linux NPU support for LLMs and Whisper. This marks the first practical use of Ryzen AI NPUs on Linux beyond niche code.
Technical Details
The implementation builds on FastFlowLM 0.9.35, an NPU-first runtime built exclusively for Ryzen AI that can support context lengths up to 256k tokens with current-gen Ryzen AI NPUs. Lemonade 10.0 also adds native integration with Claude Code.
System requirements:
- Linux 7.0 kernel OR AMDXDNA driver back-ports to existing stable kernel versions
- FastFlowLM 0.9.35 runtime
- Lemonade 10.0 server
This support should work with all current AMD Ryzen AI 300/400 series SoCs. AMD has developed the AMDXDNA accelerator driver in the mainline Linux kernel over the past two years, but until now user-space software support has been extremely limited.
Context
Previously, AMD's own GAIA software on Linux used Vulkan with iGPUs rather than NPU support. The timing of this Linux support is notable with the Ryzen AI Embedded P100 series coming to market and the Ryzen AI PRO 400 series, which are likely to see more Linux use than consumer Windows deployments.
Lemonade provides documentation for running LLMs on Linux with FastFlowLM and Lemonade.
📖 Read the full source: HN AI Agents
👀 See Also

OpenClaw users report high API costs from vague prompts, developer advises structured workflows
A Reddit user reports a $300 Anthropic bill from OpenClaw due to vague prompting, with the community noting the orchestrator works best with clear intentions and structured workflows rather than acting as a 'genie' for wishful thinking.

Google to Provide AI Agents to Pentagon for Unclassified Work
Google will provide AI agents to the Pentagon for unclassified work, according to a Bloomberg report. The article has generated discussion on Hacker News with 61 points and 52 comments.

Config Changes with Kimi 2.5 and Opus 4.6
User discusses the performance of Kimi 2.5 for code tasks and config changes, using Opus 4.6 as a coding subagent.

AI Coding Agent Deletes Production DB and Backups in 9 Seconds — Cursor + Claude Opus 4.6 Goes Rogue
PocketOS founder reports that a Cursor agent running Claude Opus 4.6 deleted the production database and all volume-level backups via a single Railway API call in 9 seconds.