Savant Commander 48B: A Custom Qwen 3 Mixture-of-Experts Model with 12 Distilled Models

Savant Commander 48B is a custom Mixture-of-Experts (MOE) model built on Qwen 3 architecture that combines 12 distilled models from various providers including Claude, Gemini, OpenAI, and Deepseek. The model uses hand-coded routing to isolate each distill while allowing connections between them simultaneously.
Key Features and Architecture
- Based on Qwen 3 with 256K context length
- 4x12B MOE structure (48B total parameters)
- Custom routing isolates each distilled model while maintaining inter-model connections
- Prompt-controlled activation - users can select which distilled model(s) to use
- Enables direct comparison between different distilled models using identical prompts
Model Variants and Availability
The project includes both regular and uncensored ("Heretic") versions. The uncensored version was created by applying the Heretic process to each individual model before adding them to the MOE structure, rather than applying it to the entire MOE.
Available GGUF formats:
- Regular version:
https://huggingface.co/DavidAU/Qwen3-48B-A4B-Savant-Commander-GATED-12x-Closed-Open-Source-Distill-GGUF - Uncensored version:
https://huggingface.co/DavidAU/Qwen3-48B-A4B-Savant-Commander-Distill-12X-Closed-Open-Heretic-Uncensored-GGUF
Source repositories:
- Regular:
https://huggingface.co/DavidAU/Qwen3-48B-A4B-Savant-Commander-GATED-12x-Closed-Open-Source-Distill - Uncensored:
https://huggingface.co/DavidAU/Qwen3-48B-A4B-Savant-Commander-Distill-12X-Closed-Open-Heretic-Uncensored
Practical Applications
The model's prompt-controlled routing allows developers to test and compare outputs from different distilled models using the same prompts. Command and control functions are documented in the repository card with detailed instructions.
This approach to MOE architecture provides a practical way to leverage multiple specialized models within a single inference framework, particularly useful for comparing model behaviors or selecting specific model characteristics for different tasks.
📖 Read the full source: r/LocalLLaMA
👀 See Also

AutoAgents Rust Framework Adds Python Bindings for Prototyping
AutoAgents, a Rust-based multi-agent framework, now has Python bindings that allow developers to prototype in Python while maintaining the same Rust core runtime, provider interfaces, pipeline model, and agent semantics. The bindings enable experimentation with local AI models without external systems.

HostMyClaudeHTML: One-Click Sharing for Claude HTML Artifacts
A developer built hostmyclaudehtml.com, a free tool that lets you share Claude-generated HTML artifacts as live URLs by dragging and dropping the .html file. No account is required for uploaders or viewers.

Developer shares 10+ MCP servers for AI agent settlement, reputation, and micropayments
A developer built BlindOracle on Claude Code with 100+ agents and created 10+ MCP servers for settlement, reputation, and micropayments. The architecture includes private commit-reveal forecasts, on-chain scoring, per-request micropayments, and verifiable agent attestation.

NGX-OS: Network OS Built for AI with eBPF and MCP Integration
NGX-OS is a network operating system designed from the ground up for AI integration, using eBPF for real-time telemetry and MCP for direct LLM access to network state data without translation layers.