SpruceChat Runs 0.5B LLM On-Device on Miyoo Handhelds via llama.cpp

What This Is
SpruceChat is a project that runs the Qwen2.5-0.5B language model entirely on-device on several handheld gaming consoles using llama.cpp. It requires no cloud connection or WiFi after the initial setup.
Key Details
The model lives in RAM after the first boot, and tokens stream in one by one during generation. It runs on the Miyoo A30, Miyoo Flip, Trimui Brick, and Trimui Smart Pro.
Performance on the Miyoo A30 (which has a Cortex-A7 quad-core processor):
- Model load: ~60 seconds on first boot
- Generation speed: ~1-2 tokens per second
- Prompt evaluation: ~3 tokens per second
The developer notes it's not fast, but it streams so you can watch it think. They mention 64-bit devices are quicker.
The AI is described as having "the personality of a spruce tree: patient, unhurried, quietly amazed by everything."
If the device is on WiFi, you can also hit the llama-server from a browser on a phone or laptop to chat with a real keyboard.
The repository is at https://github.com/RED-BASE/SpruceChat. The project was built with help from Claude, and there's already a collaborator working on expanding device support. The first release is up with both armhf and aarch64 binaries, and the model is included.
📖 Read the full source: r/LocalLLaMA
👀 See Also

aco-system: An Entire Company OS for Claude That Writes User Stories, Breaks Tasks, Reviews PRs
A Reddit user shared how aco-system turned a single GitHub issue into a fully validated PR with tests — driven entirely by Claude. Includes user story generation, task breakdown, secret checking, and PR review.

Logseq Brain v0.6.0: Persistent Memory Plugin for Claude Code Adds Journey Log and Section-Targeted Reads
Logseq Brain v0.6.0 adds a journey log for all operations, section-targeted reads for token savings, and progressive disclosure for skill files.

Jean-Claude: A Satirical LLM Frontend Mocking EU AI Regulation, with 412 Cookie Partners and VAT Invoices Every 5 Messages
Jean-Claude is a satirical LLM frontend that applies extreme EU-style bureaucracy to AI usage: 412 cookie partners, four-eyes principle requiring co-signature, per-token CO₂ tracking with mandatory €offset, VAT invoices every 5 messages, and a compliance center with fake GDPR/AI Act metrics.

LightMem: Lightweight Memory System for LLM Agents with 10×+ Gains and 100× Lower Cost
LightMem is a modular memory system for LLM agents that achieves up to 10.9% accuracy improvement while reducing tokens by up to 117×, API calls by up to 159×, and runtime by over 12×. It's designed for scalable long-context reasoning across agent workflows.