Pantheon-Reasoning-27B: Dense Reasoning RP Model

Gryphe has released Pantheon-Reasoning-27B, a fine-tuned reasoning model for roleplay built on llmfan46/Qwen3.6-27B-uncensored-heretic-v2-Native-MTP-Preserved. The model aims to bring structured reasoning to character work — weighing tone, planning narrative beats, and considering how a character would actually respond before generating a line.

The training data composition (all with full reasoning traces):

Pantheon data (~28%) — core roleplay corpus with back-generated reasoning traces
Opus-4.6-Reasoning-24k (~21%) — cleaned Claude Opus 4.6 reasoning traces for STEM, coding, and instruction-following
WorldSim data (~16%) — long-form Opus 4.6 narrative roleplay with native reasoning, mainly third-person present tense
Text adventure data (~16%) — interactive fiction and text adventure content with back-generated reasoning
General roleplay data (~16%) — varied roleplay transcripts with back-generated reasoning
Tiamat data (~3%) — character/RP dataset from Tiamat-24B-Magistral with multi-step improvement pipeline, reasoning back-generated per exchange

The model was trained with preserve_thinking: true, so thinking tags remain active across all assistant turns in multi-turn conversations — not just the first.

GGUF quants are available for local inference. The base model choice (Qwen 3.6 27B) was intentional for refusal reduction and writing capability. Gryphe notes they considered Gemma 4 31B but found it “an absolute pain to train” due to architectural quirks.

📖 Read the full source: r/LocalLLaMA

Pantheon-Reasoning-27B: A Dense Reasoning RP Model from Gryphe

👀 See Also

OpenClaw contributor criticizes project's focus on pixel-perfect parity over modern features

Claude-Code v2.1.74 Release: Memory Leak Fixes, Context Optimization, and Plugin Improvements

Unlocking OpenClaw's Potential: Integrating with CodeX

Anthropic Acquires Stainless for $300M+ — Now Owns Dominant MCP Server Generator