Steelman R5: Fine-tuned 14B Model Outperforms Claude Opus on Ada Code Generation

Model and Training Details
The Steelman R5 model is a fine-tuned version of Qwen2.5-Coder-14B-Instruct optimized for Ada code generation. It was trained with 4-bit QLoRA via Unsloth and the TRL SFTTrainer on a dataset of 3,430 Ada/SPARK instruction pairs, every one of which compiles with gnatmake -gnat2022 -gnatwa.
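A compile gate of the kind described above could look roughly like the following. This is a minimal sketch, not the author's pipeline: the helper names (build_command, compiles), the default file name, and the choice to treat a zero exit status as a pass are all assumptions. Only the gnatmake flags come from the post.

```python
import subprocess
import tempfile
from pathlib import Path

# Flags quoted in the post: Ada 2022 mode, with the standard set of warnings enabled.
GNAT_FLAGS = ["-gnat2022", "-gnatwa"]


def build_command(source_name: str) -> list[str]:
    """Return the gnatmake invocation for one source file (hypothetical helper)."""
    return ["gnatmake", *GNAT_FLAGS, source_name]


def compiles(source: str, file_name: str = "main.adb", timeout: float = 60.0) -> bool:
    """Write `source` to a scratch directory and report whether gnatmake exits with 0.

    Assumptions: `file_name` matches the compilation unit's name, as GNAT's
    default naming convention expects, and a zero exit status counts as a pass.
    Raises FileNotFoundError if gnatmake is not installed.
    """
    with tempfile.TemporaryDirectory() as workdir:
        (Path(workdir) / file_name).write_text(source, encoding="utf-8")
        try:
            result = subprocess.run(
                build_command(file_name),
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            # Treat a hung build as a failed example.
            return False
        return result.returncode == 0
```

Warnings enabled by -gnatwa do not change the exit status on their own, so a stricter filter would also have to inspect the compiler's output. The post does not say which interpretation was used.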
Training configuration: LoRA rank 32, alpha 64, targeting the q/k/v/o/gate/up/down projections. Each round retrained the model from the base checkpoint on the accumulated dataset, because continuing from the previous adapter caused catastrophic forgetting at R2. Each round ran for 1 epoch at a learning rate of 2e-5 with a constant schedule, taking about 49 minutes on a rented H100. There were five rounds in total (R1–R5), with R2 discarded.
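For reference, the reported hyperparameters can be collected in one place. The sketch below is illustrative only: the LoraSettings class is hypothetical, and the full module names are an assumption about how the q/k/v/o/gate/up/down shorthand maps onto the projection layers of Qwen2-style models in Hugging Face Transformers. The scaling property uses the standard LoRA formulation, in which the adapter update is multiplied by alpha / rank.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container for the hyperparameters reported in the post."""

    rank: int = 32
    alpha: int = 64
    # Assumed expansion of the q/k/v/o/gate/up/down shorthand.
    target_modules: tuple[str, ...] = (
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    )
    learning_rate: float = 2e-5
    lr_schedule: str = "constant"
    epochs: int = 1

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank.
        return self.alpha / self.rank


settings = LoraSettings()
print(f"LoRA scaling factor: {settings.scaling}")
```

With rank 32 and alpha 64 the scaling factor is 2, the common "alpha equals twice the rank" choice.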
Benchmark Results
Custom Ada Compilation Benchmark (1,000 prompts, first-attempt clean compile):
- Steelman R5 (14B): 68.6% compile rate
- Claude Opus 4.6: 42.1% compile rate
- Claude Sonnet 4.6: 37.2% compile rate
- Qwen2.5-Coder-14B (base, untuned): ~35% compile rate
- Claude Sonnet 4: 27.5% compile rate
MultiPL-E HumanEval-Ada (157 problems, pass@1):
- Steelman R5: 47.1% pass@1, 74.5% compile rate
- Qwen2.5-Coder-14B (base): 34.4% pass@1, 51.0% compile rate
According to the author, these are the first published Ada pass@1 results on HumanEval for any open model.
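The HumanEval-Ada percentages can be turned back into approximate problem counts, which makes the gap between compiling and passing easier to see. The counts below are inferred by rounding the reported percentages against 157 problems. They are not figures from the post, and the helper names are hypothetical.

```python
TOTAL_PROBLEMS = 157  # size of MultiPL-E HumanEval-Ada as reported above

# Percentages as reported in the post.
reported = {
    "Steelman R5": {"pass_at_1": 47.1, "compile": 74.5},
    "Qwen2.5-Coder-14B (base)": {"pass_at_1": 34.4, "compile": 51.0},
}


def implied_count(percent: float, total: int = TOTAL_PROBLEMS) -> int:
    """Nearest whole number of problems consistent with a reported percentage."""
    return round(percent / 100 * total)


def pass_rate_given_compile(scores: dict) -> float:
    """Share of compiling solutions that also pass the tests, in percent.

    Relies on the fact that a solution cannot pass without compiling.
    """
    passed = implied_count(scores["pass_at_1"])
    compiled = implied_count(scores["compile"])
    return round(passed / compiled * 100, 1)


for name, scores in reported.items():
    print(
        name,
        implied_count(scores["pass_at_1"]),
        implied_count(scores["compile"]),
        pass_rate_given_compile(scores),
    )
```

Under this reading, Steelman R5 solves about 74 problems and compiles about 117, against roughly 54 and 80 for the base model.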
Usage and Availability
Run the model with: ollama run hf.co/the-clanker-lover/steelman-14b-ada-v0.1-GGUF
The GGUF version fits in 12GB VRAM with Q4_K_M quantization.
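Once the model has been pulled with the command above, it can also be called from a script through Ollama's local HTTP API. The following is a minimal sketch: it assumes an Ollama server on the default port 11434 and that the model is registered under the same name used with ollama run. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: the model name matches the one used with `ollama run`.
MODEL = "hf.co/the-clanker-lover/steelman-14b-ada-v0.1-GGUF"
# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str) -> dict:
    """Request body for a single, non-streaming completion."""
    return {"model": MODEL, "prompt": prompt, "stream": False}


def generate(prompt: str, timeout: float = 300.0) -> str:
    """Send one prompt to the local Ollama server and return the generated text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

A call such as generate("Write an Ada procedure that prints the first ten squares.") returns the model's reply as a string, which can then be written to a file and compiled.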
Limitations
- Compilation ≠ correctness: 68.6% of outputs compile on the custom benchmark, and on HumanEval-Ada 74.5% compile but only 47.1% produce correct output
- Error-fix capability is weak (5.1%) - don't expect it to debug Ada code
- SPARK contracts compile but aren't verified with gnatprove
- Synthetically generated training data - no human Ada developers wrote these examples
- 14B model size means it may miss things a larger model would catch
Resources
- Model: https://huggingface.co/the-clanker-lover/steelman-14b-ada-v0.1
- GGUF: https://huggingface.co/the-clanker-lover/steelman-14b-ada-v0.1-GGUF
- Dataset: https://huggingface.co/datasets/the-clanker-lover/steelman-sft-ada
📖 Read the full source: r/LocalLLaMA