GSD-Lite: TDD State Machine for Claude Code Enforces Testing

GSD-Lite is an open-source MCP server that bolts onto Claude Code and runs projects through a 12-state workflow machine. The tool is MIT licensed and consists of about 15 source files total.

How It Works

After planning what to build in conversation with Claude, GSD-Lite takes over automatically: write code, review it, verify it, advance to the next phase. The execution loop follows this pattern:

Orchestrator picks next task
Executor writes code (TDD, checkpoint)
Reviewer checks (separate context, spec + quality)
Accept? Next task. Reject? Rework.
All tasks done? Phase gate check
Gate passes? Next phase
All phases done? You're done

Key Features

TDD Enforcement: The "Iron Law" is baked into every task dispatch: no production code without a failing test first. The prompt lists exact rationalizations Claude uses to skip tests ("This is just a config change," "The existing tests already cover this") and flags them as known excuses.

Separate Agent Contexts: Reviews run in a separate agent context where the reviewer never sees the executor's reasoning—only the diff and task spec. This prevents rubber-stamping and helps catch real bugs.

Debugger Agent: When a task fails 3 times, instead of another retry, a debugger agent gets dispatched. This separate agent reproduces the failure, forms hypotheses, tests them, identifies where the fix should go, then provides findings to the executor.

Dependency Tracking: If one task changes an API signature, anything downstream gets invalidated and re-queued automatically.

Technical Details

The system uses 6 commands, 4 agents, and 11 MCP tools. State is managed in one JSON file with schema validation and version conflicts handled via optimistic concurrency.

Why Not the Original Version

The first version had 32 commands, 12 agents, over 100 source files, and a 2400-line installer. The author threw it away and rewrote from scratch because most of that complexity was burning context window without providing value.

Unexpected Findings

The anti-rationalization approach works—listing specific phrases Claude uses to skip steps directly in the agent prompt reduced the skip rate. The author notes that negative examples seem to steer the model better than just saying "always write tests." Session persistence was the hardest implementation challenge.

📖 Read the full source: r/ClaudeAI