Qwen2-0.5B Fine-Tuned for Local Task Automation with llama.cpp

A developer has fine-tuned Qwen2-0.5B for task automation, creating a model that runs entirely locally on CPU without requiring GPU or cloud APIs. The project, named ACE, is available on GitHub.
What It Does
- Takes natural language tasks (e.g., "copy logs to backup")
- Detects task type: atomic, repetitive, or clarification
- Generates execution plans consisting of CLI commands and hotkeys
- Runs entirely locally on CPU (no GPU, no cloud APIs)
Technical Details
- Base model: Qwen2-0.5B
- Training: LoRA fine-tuning on approximately 1000 custom task examples
- Quantization: GGUF Q4_K_M format (300MB file size)
- Inference: llama.cpp
- Inference time: 3-10 seconds on i3/i5 processors
Main Challenges During Training
- Data quality: Had to regenerate dataset 2-3 times due to garbage examples
- Overfitting: Took multiple iterations to get validation loss stable
- EOS token handling: Model wouldn't stop generating until tokenizer config was fixed
- GGUF conversion: Required BF16 dtype + imatrix quantization to get stable outputs
Limitations (v0.1)
- Requires full file paths (no smart file search yet)
- CPU inference only (slower on older hardware)
- Basic execution (no visual understanding)
Performance Benchmarks
- i5 (2018+) with SSD: 3-5 seconds
- i3 (2015+) with SSD: 5-10 seconds
- Older hardware (Pentium + HDD): 30-90 seconds
The developer is seeking feedback on performance across different hardware, edge cases that break the model, and feature requests for v0.2.
📖 Read the full source: r/LocalLLaMA
👀 See Also

Claude Code Skills for Automated Project Scaffolding
A developer has built Claude Code skills that automate full-stack project setup with commands for React, Next.js, Node.js APIs, and Turborepo monorepos. The skills pull latest dependencies, support 50+ integrations, and are MIT licensed.

Soul MCP Server Adds Persistent Memory and Safety for Local LLMs
Soul is an open-source MCP server that provides persistent memory across sessions for local LLMs with two commands: n2_boot at start and n2_work_end at end. It includes Ark safety features that block dangerous commands like rm -rf and DROP DATABASE at zero token cost, plus cloud storage configuration.

ClawPort: Open Source Orchestration for AI Agent Workflows with Self-Healing Cron
ClawPort is an open source orchestration layer for AI agent workflows that auto-configures cron pipelines, self-heals on failures, and lets you test agents directly before they run on schedule.

TruthGuard: Shell Script Hooks That Catch AI Coding Agent Lies
TruthGuard is an open-source tool that uses shell script hooks to verify what Claude Code and Gemini CLI actually do versus what they claim. It catches phantom edits, exit code lies, dangerous shortcuts, and blocks commits when tests fail.