Skales: Desktop AI Agent with Ollama Support, 300MB Idle RAM

Skales is a desktop AI agent packaged as an Electron app, with an .exe installer for Windows and a .dmg for macOS. The creator, a designer with two years of LLM experience, built it after struggling with Docker and terminal commands while setting up local AI. The goal was a tool that non-technical users, such as family members, could install and run.
Key Features
- Works with Ollama for fully local inference, or any cloud provider including OpenRouter, OpenAI, Claude, Gemini, Grok, Mistral, and DeepSeek (BYOK)
- ReAct autopilot with bi-temporal memory
- Browser automation via Playwright
- Native integrations: Gmail, Telegram, WhatsApp, Discord, Google Calendar
- Multi-agent group chat where different models debate topics
- Desktop buddy (similar to Clippy) that stays on screen when the app is minimized, so tasks can be assigned without switching windows
- Built-in killswitch and website/search blacklists for security
- ~300MB idle RAM usage
- All data stored locally in ~/.skales-data
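The post does not explain how the "bi-temporal memory" works. In general, a bi-temporal store tracks two timestamps per fact: when the fact became true, and when the system learned it. The following is a minimal sketch of that general idea, not Skales' actual implementation; the names `MemoryFact`, `BiTemporalMemory`, `remember`, and `recall` are hypothetical and chosen only for illustration.

```typescript
// Hypothetical sketch of a bi-temporal memory. None of these names come from Skales.
interface MemoryFact {
  subject: string;     // what the fact is about, e.g. "user.city"
  value: string;       // the remembered value
  validFrom: number;   // when the fact became true in the world (any monotonic timestamp)
  recordedAt: number;  // when the agent learned the fact
}

class BiTemporalMemory {
  // Append-only log: facts are never overwritten, so past beliefs stay queryable.
  private readonly facts: MemoryFact[] = [];

  remember(fact: MemoryFact): void {
    this.facts.push({ ...fact });
  }

  // Answers: "as of what the agent knew at `knownAt`, what was true at `validAt`?"
  recall(subject: string, validAt: number, knownAt: number): string | undefined {
    let best: MemoryFact | undefined;
    for (const f of this.facts) {
      if (f.subject !== subject) continue;
      // Skip facts learned after `knownAt` or not yet true at `validAt`.
      if (f.recordedAt > knownAt || f.validFrom > validAt) continue;
      // Prefer the most recent valid fact; break ties by the latest recording.
      if (
        best === undefined ||
        f.validFrom > best.validFrom ||
        (f.validFrom === best.validFrom && f.recordedAt > best.recordedAt)
      ) {
        best = f;
      }
    }
    return best?.value;
  }
}

// Example: the user moved at t=200, but the agent only learned about it at t=300.
const demoMemory = new BiTemporalMemory();
demoMemory.remember({ subject: "user.city", value: "Berlin", validFrom: 100, recordedAt: 100 });
demoMemory.remember({ subject: "user.city", value: "Vienna", validFrom: 200, recordedAt: 300 });
```

Keeping both timestamps lets an agent distinguish "what was true then" from "what I believed then", which is useful when replaying or auditing earlier decisions.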
Technical Details
The app is built with Electron, Next.js, and Node.js. It is source-available under the BSL-1.1 license and free for personal use; the creator notes they "didn't want a big company to fork it and commercially resell it." The GitHub repository is at github.com/skalesapp/skales.
The creator reports that their mother, who is over 60, got it running instantly, and that their 6-year-old used the built-in coding skill to create a retro game (one level of Super Mario).
📖 Read the full source: r/LocalLLaMA