Open-source solo RPG engine uses three Claude instances for parsing, narration, and direction

Architecture and Pipeline
EdgeTales is a Python-based solo RPG engine where players type character actions, dice mechanics resolve outcomes behind the scenes, and Claude AI writes atmospheric prose based on results. The core design principle is "AI narrates, it does not decide"—dice determine success or failure, while Claude only turns outcomes into story.
The system uses a triple-AI pipeline built from three Claude instances. The Brain and Narrator run on every player turn; the Director runs only after selected events:
- Brain (Claude Haiku): Parses free-text input into structured JSON with fields like RPG move, stat, target NPC, position/effect level. Takes ~300ms and costs ~$0.0002.
- Narrator (Claude Sonnet): Receives structured prompts with dice results, NPC context, and story arc, then writes atmospheric prose. Also embeds hidden metadata (<new_npcs>, <memory_updates>) that the parser extracts for game state updates. Takes ~2s and costs ~$0.003.
- Director (Claude Haiku): Runs asynchronously after the player sees narration. Analyzes scenes like a TV showrunner for NPC behavior hints, plot thread tracking, and scene summaries. Only triggers on specific events (failed rolls, new NPCs, every 3rd scene) with zero player-facing latency.
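To make the hidden-metadata step and the Director trigger concrete, here is a minimal Python sketch. It is not taken from the EdgeTales source: the function names `split_narration` and `should_run_director` are hypothetical, the tags are assumed to come as open/close pairs, and the payload inside each tag is returned as raw text because its format is not described here.

```python
import re

# Tag names come from the write-up; paired closing tags are an assumption.
HIDDEN_TAGS = ("new_npcs", "memory_updates")


def split_narration(raw: str) -> tuple[str, dict[str, str]]:
    """Separate player-facing prose from hidden metadata blocks (hypothetical helper)."""
    metadata: dict[str, str] = {}
    prose = raw
    for tag in HIDDEN_TAGS:
        pattern = re.compile(rf"<{tag}>(.*?)</{tag}>", re.DOTALL)
        match = pattern.search(prose)
        if match:
            # Keep the raw inner text; its real format is not documented.
            metadata[tag] = match.group(1).strip()
            prose = pattern.sub("", prose)
    return prose.strip(), metadata


def should_run_director(scene_number: int, roll_failed: bool, new_npc_seen: bool) -> bool:
    """Trigger rule as described: failed roll, new NPC, or every 3rd scene."""
    return roll_failed or new_npc_seen or scene_number % 3 == 0
```

Only the stripped prose would be shown to the player, while the metadata dictionary feeds game state updates.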
Total cost per turn is ~$0.003–0.004, making a 20-scene session cost ~6–8 cents. The Director's output goes into <director_guidance> tags in the next Narrator prompt, with graceful degradation if the Director fails.
Technical Implementation Details
Prompt Engineering Lessons:
- Structured XML context injection (<world>, <character>, <npc>, <story_arc>, <director_guidance>) made Sonnet's output more consistent than prose instructions.
- Haiku is effective for structured parsing: the Brain returns valid JSON with 8+ fields from free-form multilingual input.
- JSON repair is essential. Both models occasionally produce malformed JSON (missing commas in German text, unescaped newlines, trailing commas). A _repair_json() function with a try-first approach handles this with zero overhead for valid JSON.
- NPC deduplication uses three safety nets: explicit <npc_rename> tags, fuzzy substring matching before creation, and alias-aware search.
NPC Memory System: Each NPC has importance-weighted memory calculated as Score = 0.40 × Recency + 0.35 × Importance + 0.25 × Relevance. The Director generates "reflections" (how an NPC feels) alongside factual observations. Memory stays bounded at 25 entries per NPC with intelligent consolidation.
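The scoring formula can be written out as a small Python sketch. The weights and the 25-entry cap come from the description above; everything else is an assumption: the `Memory` class and `prune` function are hypothetical, the three inputs are assumed to be normalized to [0, 1], and keeping the top-scoring entries is only a stand-in for the "intelligent consolidation" that is not detailed here.

```python
from dataclasses import dataclass

MAX_MEMORIES = 25  # cap stated in the write-up


@dataclass
class Memory:
    text: str
    recency: float     # assumed normalized to [0, 1]
    importance: float  # assumed normalized to [0, 1]
    relevance: float   # assumed normalized to [0, 1]


def memory_score(m: Memory) -> float:
    """Score = 0.40 * Recency + 0.35 * Importance + 0.25 * Relevance."""
    return 0.40 * m.recency + 0.35 * m.importance + 0.25 * m.relevance


def prune(memories: list[Memory], limit: int = MAX_MEMORIES) -> list[Memory]:
    """Keep the highest-scoring entries (simplified stand-in for consolidation)."""
    return sorted(memories, key=memory_score, reverse=True)[:limit]
```

Since the weights sum to 1.0, scores stay in [0, 1] whenever the inputs do, which keeps memories comparable across NPCs.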
Technical Stack: Python 3.11+, NiceGUI, Anthropic SDK, EdgeTTS/Chatterbox (TTS), Faster-Whisper (STT). The codebase is ~6,800 lines across 5 files. Features include 20+ narration languages, voice I/O, PDF export, kid-friendly mode, and Raspberry Pi compatibility.
📖 Read the full source: r/ClaudeAI