Kvaser: An Open-Source Local-First AI Orchestrator with Sub-Agent Routing and Wolfram Integration

Kvaser is an open-source orchestration server that started as an experiment with Qwen 3.6 35B and evolved into a full Man-in-the-Middle proxy for local AI workflows. It sits between your frontend (like Open WebUI) and backend (llama.cpp), exposing a standard OpenAI endpoint.
Key Technical Features
- Zero-Embedding RAG: Queries local Kiwix datasets (Wikipedia, StackOverflow) directly via an MCP server, avoiding vector database overhead.
- Wolfram Engine Integration: Augmented with Mathematica StackOverflow dump from Kiwix to improve query structuring for symbolic math.
- GEDCOM MCP: Custom genealogy tool that combines family tree data with Kiwix for historical context.
- Sub-Agent Routing: Each sub-agent can be configured individually and routed to different machines or models.
- Smart Tool Whitelisting: Limit which tools each sub-agent sees — allows smaller models like Qwen 3.5 4B to stay focused while the 35B model handles complex tasks.
- Algorithmic Augmentation: Implements algorithmic tools for complex tasks like finding common ancestors or calculating relationships, instead of relying on LLM inference.
Architecture
The system moves beyond a single agent to a full orchestration model with sub-agents. This solves "tool bloat" and complex tree traversal issues that arose as more tools were added.
Use Case: Genealogy with Historical Context
By combining GEDCOM family tree data with Kiwix, the model can augment ancestor records with historical context — a powerful example of local-first orchestration.
Source Code
Available on GitHub: https://github.com/Na1w/kvaser-core
📖 Read the full source: r/LocalLLaMA
👀 See Also

Open Source AI Agent Prompt Library Reaches 100 GitHub Stars
A community repository called ai-setup provides shared system prompts, Cursor rules, Claude configs, and local model workflow setups for AI agents. The project has 100 GitHub stars and 90 merged PRs.

Code Evolution Method Triples LLM Performance on ARC-AGI-2 Benchmark
Researchers achieved a 2.8x improvement on the ARC-AGI-2 benchmark using code evolution with open-weight models, reaching 34% accuracy at $2.67 per task. The same method pushed Gemini 3.1 Pro to 95% accuracy at $8.71 per task.

Sonarly: AI-driven Production Alert Triage and Resolution
Sonarly connects with observability tools to triage and resolve production alerts, reducing noise and focusing on critical issues.

PACT: A Programmatic Governance Framework for Claude Code After Agent Failure Patterns
A developer built PACT (Programmatic Agent Constraint Toolkit) after three months of recurring Claude Code failures on a 350+ file mobile app. The framework replaces unenforceable rules with mechanical constraints that physically block violations through pre-tool-use hooks.