Case Study: Using LLM Prompts Instead of Programmatic Scaffolding for Multi-Agent Software Builds

System Overview and Results
A multi-agent system consisting of a Claude Opus orchestrator and Codex worker agents completed 10 fully autonomous software builds without human code intervention. The system produced 10 TypeScript browser games totaling over 50,000 lines of code and hundreds of passing tests.
The orchestrator, a frontier LLM given a prompt and CLI access, decomposed objectives, dispatched parallel workers, analyzed results, triaged errors, and coordinated integration. No programmatic scaffold, state machine, or task-routing infrastructure was used: the orchestration logic is a prompt, not a program.
Key Findings from the Case Study
- Scope enforcement through prompts fails completely under compiler pressure (0/20), while mechanical enforcement via post-hoc file reversion is trivially effective (20/20)
- Type contracts are not required for integration at any scale tested (6–36 modules) when the integration agent has unrestricted edit access
- The orchestrator maintained perfect task continuity across 11 context compaction events
- Cost analysis reveals a statefulness premium: with ~95% cache hit rates, the majority of orchestrator processing is re-reading prior conversation context
- A bare-prompt ablation falsifies the strong claim that models independently discover coordination patterns; it also shows that solo execution outperforms coordinated builds below ~30K LOC
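The post-hoc file reversion in the first finding can be illustrated with a minimal sketch. The source does not show how orch-minimal implements it, so every function name and the directory-prefix notion of "scope" here are assumptions made for illustration. The idea is that after a worker exits, the orchestrator compares the list of changed files against the directories assigned to that worker and restores anything outside them, regardless of what the worker was told in its prompt.

```typescript
// Hypothetical sketch: names and scope rules are illustrative, not taken from orch-minimal.

/** Normalize a repo-relative path so prefix checks behave consistently. */
function normalize(path: string): string {
  return path.replace(/\\/g, "/").replace(/^\.\//, "");
}

/** True when `file` sits inside one of the directories the worker may edit. */
function inScope(file: string, allowedDirs: string[]): boolean {
  const f = normalize(file);
  return allowedDirs.some((dir) => {
    const d = normalize(dir).replace(/\/+$/, "");
    // Require a path-separator boundary so "src/physics-extra" does not match "src/physics".
    return f === d || f.startsWith(d + "/");
  });
}

/** Files the worker touched that fall outside its assigned scope. */
function filesToRevert(changed: string[], allowedDirs: string[]): string[] {
  return changed.filter((file) => !inScope(file, allowedDirs));
}

/** Argument list that restores tracked files to their committed state, or null if nothing to do. */
function revertCommand(files: string[]): string[] | null {
  return files.length === 0 ? null : ["git", "checkout", "--", ...files];
}
```

In use, the changed-file list might come from `git diff --name-only`, and the returned argument list would be passed to a process spawner. Note that `git checkout -- <file>` only restores tracked files; files newly created outside scope would need to be deleted separately.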
System Architecture and Data
The system uses a tree architecture: a human provides objectives to a Claude Opus orchestrator, which decomposes work into parallel tasks dispatched to Codex workers. Workers operate fully autonomously and communicate exclusively through the file system.
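A rough sketch of what file-system-only coordination could look like follows. The source does not describe its file formats or how workers are launched, so the file names (`*.objective.md`, `*.result.json`), the `Task` and `Result` shapes, and the `Launch` callback are all hypothetical. In the system described, this logic lives in the orchestrator's prompt rather than in code; the sketch only makes the data flow concrete.

```typescript
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical shapes; the source does not specify any file format.
type Task = { id: string; objective: string };
type Result = { id: string; status: "ok" | "failed"; summary: string };
// Caller-supplied launcher, e.g. a wrapper that spawns a worker CLI process.
type Launch = (objectivePath: string, resultPath: string) => Promise<void>;

const makeWorkDir = (): string => mkdtempSync(join(tmpdir(), "orch-"));
const objectivePath = (dir: string, id: string): string => join(dir, `${id}.objective.md`);
const resultPath = (dir: string, id: string): string => join(dir, `${id}.result.json`);

/** Orchestrator side: write one objective file per task. */
function writeObjectives(dir: string, tasks: Task[]): void {
  for (const t of tasks) writeFileSync(objectivePath(dir, t.id), t.objective);
}

/** Orchestrator side: read whatever result files the workers left behind. */
function collectResults(dir: string, tasks: Task[]): Result[] {
  return tasks.map((t): Result => {
    const p = resultPath(dir, t.id);
    if (!existsSync(p)) return { id: t.id, status: "failed", summary: "no result file" };
    return JSON.parse(readFileSync(p, "utf8")) as Result;
  });
}

/** Launch all workers in parallel; a crashed worker simply leaves no result file. */
async function dispatchAll(dir: string, tasks: Task[], launch: Launch): Promise<Result[]> {
  writeObjectives(dir, tasks);
  await Promise.all(
    tasks.map((t) => launch(objectivePath(dir, t.id), resultPath(dir, t.id)).catch(() => undefined)),
  );
  return collectResults(dir, tasks);
}
```

Because the only shared channel is the directory, the orchestrator never needs a handle on a worker's internal state: a missing or malformed result file is itself the failure signal to triage.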
The complete dataset includes:
- 10 Claude orchestrator sessions (52 MB)
- 88 Codex worker sessions (89 MB)
- 62 worker stdout logs (186.7 MB, 6.1M lines)
- 55 objective files with full prompt text
- 1 TUI event log (21 MB, 173K lines)
Total corpus: 295M tokens across 88 Codex worker sessions and 10 Claude orchestrator sessions.
System Evolution
The system evolved through five phases over approximately six months. The operator began with manual copy-paste between dual LLM chat windows, graduated to terminal CLI tools for file system access, then built a programmatic scaffold with memory and routing. The scaffold worked but was brittle: every edge case required new code. A single Claude session with CLI access outperformed it.
The resulting system, orch-minimal, retains 62,792 lines of supporting code, but the core orchestration logic is a prompt, not a program.
📖 Read the full source: r/LocalLLaMA