Kvaser Review: Open-Source AI Orchestrator with Sub-Agent Routing

Kvaser is an open-source orchestration server that started as an experiment with Qwen 3.6 35B and evolved into a full Man-in-the-Middle proxy for local AI workflows. It sits between your frontend (like Open WebUI) and backend (llama.cpp), exposing a standard OpenAI endpoint.

Key Technical Features

Zero-Embedding RAG: Queries local Kiwix datasets (Wikipedia, StackOverflow) directly via an MCP server, avoiding vector database overhead.
Wolfram Engine Integration: Augmented with Mathematica StackOverflow dump from Kiwix to improve query structuring for symbolic math.
GEDCOM MCP: Custom genealogy tool that combines family tree data with Kiwix for historical context.
Sub-Agent Routing: Each sub-agent can be configured individually and routed to different machines or models.
Smart Tool Whitelisting: Limit which tools each sub-agent sees — allows smaller models like Qwen 3.5 4B to stay focused while the 35B model handles complex tasks.
Algorithmic Augmentation: Implements algorithmic tools for complex tasks like finding common ancestors or calculating relationships, instead of relying on LLM inference.

Architecture

The system moves beyond a single agent to a full orchestration model with sub-agents. This solves "tool bloat" and complex tree traversal issues that arose as more tools were added.

Use Case: Genealogy with Historical Context

By combining GEDCOM family tree data with Kiwix, the model can augment ancestor records with historical context — a powerful example of local-first orchestration.