Savant Commander 48B: A Custom Qwen 3 Mixture-of-Experts Model with 12 Distilled Models

✍️ OpenClawRadar | 📅 Published: March 24, 2026

Savant Commander 48B is a custom Mixture-of-Experts (MoE) model built on the Qwen 3 architecture. It combines 12 distilled models derived from various providers, including Claude, Gemini, OpenAI, and DeepSeek. Hand-coded routing keeps each distill isolated while still allowing several of them to be connected and used at the same time.

Key Features and Architecture

  • Based on Qwen 3, with a 256K context length
  • 12x4B MoE structure (48B total parameters)
  • Custom routing isolates each distilled model while maintaining inter-model connections
  • Prompt-controlled activation: users can select which distilled model(s) to use
  • Enables direct comparison between different distilled models using identical prompts
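
To make the idea of prompt-controlled gating concrete, here is a toy sketch in Python. It is not the author's routing code: the expert names, the trigger phrases, and the fallback of keeping every expert eligible when no trigger is present are all assumptions made for illustration. The real commands are documented in the repository card.

```python
# Toy illustration only: expert names and trigger phrases are hypothetical,
# not the identifiers or syntax used by Savant Commander.
EXPERT_TRIGGERS = {
    "distill_a": "use distill a",
    "distill_b": "use distill b",
    "distill_c": "use distill c",
}


def select_experts(prompt: str) -> list[str]:
    """Return the experts whose trigger phrase appears in the prompt.

    If no trigger is present, every expert stays eligible (an assumed default).
    """
    lowered = prompt.lower()
    chosen = [name for name, phrase in EXPERT_TRIGGERS.items() if phrase in lowered]
    return chosen or list(EXPERT_TRIGGERS)


def gate_mask(prompt: str) -> dict[str, float]:
    """Build a hard gate: selected experts share the weight equally, others get 0."""
    chosen = select_experts(prompt)
    weight = 1.0 / len(chosen)
    return {name: (weight if name in chosen else 0.0) for name in EXPERT_TRIGGERS}
```

A zero weight in the mask stands for an isolated expert, while naming two triggers in one prompt keeps both experts active, which mirrors the "isolated but connectable" behavior described in the list above.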

Model Variants and Availability

The project includes both regular and uncensored ("Heretic") versions. The uncensored version was created by applying the Heretic process to each individual model before it was added to the MoE structure, rather than by applying the process to the assembled MoE as a whole.

GGUF versions:

  • Regular version: https://huggingface.co/DavidAU/Qwen3-48B-A4B-Savant-Commander-GATED-12x-Closed-Open-Source-Distill-GGUF
  • Uncensored version: https://huggingface.co/DavidAU/Qwen3-48B-A4B-Savant-Commander-Distill-12X-Closed-Open-Heretic-Uncensored-GGUF

Source repositories:

  • Regular: https://huggingface.co/DavidAU/Qwen3-48B-A4B-Savant-Commander-GATED-12x-Closed-Open-Source-Distill
  • Uncensored: https://huggingface.co/DavidAU/Qwen3-48B-A4B-Savant-Commander-Distill-12X-Closed-Open-Heretic-Uncensored

Practical Applications

The model's prompt-controlled routing lets developers test and compare outputs from different distilled models using the same prompts. The command and control functions are documented, with detailed instructions, in the repository card.
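
A same-prompt comparison could be organized along the following lines. This is a minimal sketch under stated assumptions: the activation strings are placeholders to be replaced with the exact commands from the repository card, and `generate` is a hypothetical callable standing in for whatever inference backend is used. Nothing here is the model's actual API.

```python
from typing import Callable

# Placeholder activation text: replace with the exact commands from the
# repository card. The labels and strings below are hypothetical.
ACTIVATIONS = {
    "distill_a": "<activation text for distill A>",
    "distill_b": "<activation text for distill B>",
}


def build_prompts(task: str, activations: dict[str, str]) -> dict[str, str]:
    """Prefix the same task with each activation so only the selector differs."""
    return {label: f"{prefix}\n\n{task}" for label, prefix in activations.items()}


def compare(task: str, generate: Callable[[str], str],
            activations: dict[str, str] = ACTIVATIONS) -> dict[str, str]:
    """Run one task through every activation and collect the outputs by label."""
    return {label: generate(prompt)
            for label, prompt in build_prompts(task, activations).items()}
```

Passing any function that maps a prompt string to a completion string as `generate` returns one output per label. Keeping the task text identical across runs means any difference in output comes from the selected distill, or from sampling randomness if sampling is enabled.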

This approach to MoE architecture offers a practical way to run multiple specialized models within a single inference framework. It is particularly useful for comparing model behaviors or for selecting specific model characteristics for different tasks.

📖 Read the full source: r/LocalLLaMA
