GSD-Lite: A State Machine for Claude Code That Enforces TDD and Prevents Test Skipping

GSD-Lite is an open-source MCP server that bolts onto Claude Code and runs projects through a 12-state workflow machine. The tool is MIT-licensed and consists of about 15 source files.
How It Works
After you plan what to build in conversation with Claude, GSD-Lite takes over automatically: it writes code, reviews it, verifies it, and advances to the next phase. The execution loop follows this pattern:
- Orchestrator picks next task
- Executor writes code (TDD, checkpoint)
- Reviewer checks (separate context, spec + quality)
- Accept? Next task. Reject? Rework.
- All tasks done? Phase gate check
- Gate passes? Next phase
- All phases done? You're done
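The loop above can be sketched as a small driver function. This is a minimal illustration, not GSD-Lite's actual code: the names `Task`, `Phase`, `run_project`, and the `execute`, `review`, and `gate` callbacks are all hypothetical, and the attempt cap is an assumption added only so the sketch cannot loop forever.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Task:
    name: str
    done: bool = False
    attempts: int = 0


@dataclass
class Phase:
    name: str
    tasks: list[Task] = field(default_factory=list)


def run_project(
    phases: list[Phase],
    execute: Callable[[Task], str],        # returns a diff (hypothetical)
    review: Callable[[Task, str], bool],   # True means "accept"
    gate: Callable[[Phase], bool],         # True means the phase gate passes
    max_attempts: int = 5,                 # assumption: cap to avoid endless rework
) -> list[str]:
    """Run every phase in order and return a log of what happened."""
    log: list[str] = []
    for phase in phases:
        for task in phase.tasks:
            while not task.done:
                if task.attempts >= max_attempts:
                    raise RuntimeError(f"task {task.name} exceeded {max_attempts} attempts")
                task.attempts += 1
                diff = execute(task)
                if review(task, diff):
                    task.done = True
                    log.append(f"accept:{task.name}")
                else:
                    log.append(f"rework:{task.name}")
        # All tasks in the phase are done: run the phase gate check.
        if not gate(phase):
            log.append(f"gate-failed:{phase.name}")
            return log
        log.append(f"gate-passed:{phase.name}")
    log.append("done")
    return log
```

The orchestrator role here is just the two nested `for` loops; in the real tool that role presumably also persists state between steps, which this sketch leaves out.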
Key Features
TDD Enforcement: The "Iron Law" is baked into every task dispatch: no production code without a failing test first. The prompt lists exact rationalizations Claude uses to skip tests ("This is just a config change," "The existing tests already cover this") and flags them as known excuses.
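One way to picture the anti-rationalization prompt is as a template that embeds the rule plus the known excuses. The sketch below is an assumption about the general shape only: `build_executor_prompt` and the exact wording are hypothetical, and the two excuses are the ones quoted in the post, not the full list.

```python
IRON_LAW = "No production code without a failing test first."

# Only the two rationalizations quoted in the post; the real prompt lists more.
KNOWN_EXCUSES = [
    "This is just a config change",
    "The existing tests already cover this",
]


def build_executor_prompt(task_spec: str) -> str:
    """Assemble a task prompt that names the rule and flags known excuses."""
    excuses = "\n".join(f'- "{excuse}"' for excuse in KNOWN_EXCUSES)
    return (
        f"Task: {task_spec}\n\n"
        f"Iron Law: {IRON_LAW}\n\n"
        "The following are known excuses for skipping tests. "
        "Treat them as red flags, not as reasons:\n"
        f"{excuses}\n"
    )
```

Because the excuses live in a plain list, adding a newly observed rationalization is a one-line change.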
Separate Agent Contexts: Reviews run in a separate agent context where the reviewer never sees the executor's reasoning, only the diff and the task spec. This is meant to prevent rubber-stamping and help the reviewer catch real bugs.
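The isolation described above amounts to filtering what crosses from executor to reviewer. A minimal sketch, assuming hypothetical `ExecutorResult` and `ReviewRequest` types that are not taken from the GSD-Lite source:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutorResult:
    task_spec: str
    diff: str
    reasoning: str  # the executor's own notes; must not reach the reviewer


@dataclass(frozen=True)
class ReviewRequest:
    task_spec: str
    diff: str


def to_review_request(result: ExecutorResult) -> ReviewRequest:
    """Copy only the spec and the diff; the reasoning field is dropped."""
    return ReviewRequest(task_spec=result.task_spec, diff=result.diff)
```

Giving the reviewer a type with no `reasoning` field makes the separation structural rather than a matter of prompt discipline.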
Debugger Agent: When a task fails 3 times, instead of another retry, a debugger agent gets dispatched. This separate agent reproduces the failure, forms hypotheses, tests them, identifies where the fix should go, then provides findings to the executor.
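The escalation rule can be sketched as a retry loop that switches strategy at the failure limit. The function and callback names below are hypothetical, and what happens if the debugger-assisted attempt also fails is an assumption (here the function simply reports failure).

```python
from typing import Callable, Optional

FAILURE_LIMIT = 3  # from the post: the debugger is dispatched after 3 failures


def run_with_escalation(
    attempt: Callable[[Optional[str]], bool],  # executor; receives debugger findings or None
    debug: Callable[[], str],                  # debugger agent; returns its findings
    failure_limit: int = FAILURE_LIMIT,
) -> bool:
    """Retry a task, then hand off to a debugger instead of retrying blindly."""
    failures = 0
    while failures < failure_limit:
        if attempt(None):
            return True
        failures += 1
    # Limit reached: gather findings and give the executor one informed attempt.
    findings = debug()
    return attempt(findings)
```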
Dependency Tracking: If one task changes an API signature, anything downstream gets invalidated and re-queued automatically.
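Invalidating "anything downstream" is a graph traversal. The sketch below assumes the dependency data is available as a mapping from each task to the tasks that directly depend on it, and that invalidation is transitive; both the data shape and the function name are assumptions, not GSD-Lite's actual schema.

```python
from collections import deque


def invalidate_downstream(dependents: dict[str, list[str]], changed: str) -> list[str]:
    """Return every task that directly or transitively depends on `changed`.

    Tasks are returned in breadth-first order, which is also a reasonable
    order in which to re-queue them.
    """
    seen = {changed}  # never re-queue the changed task itself, even in a cycle
    order: list[str] = []
    queue = deque(dependents.get(changed, []))
    while queue:
        task = queue.popleft()
        if task in seen:
            continue
        seen.add(task)
        order.append(task)
        queue.extend(dependents.get(task, []))
    return order
```

For example, with `{"api": ["client", "docs"], "client": ["ui"]}`, a change to `api` re-queues `client`, `docs`, and then `ui`.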
Technical Details
The system uses 6 commands, 4 agents, and 11 MCP tools. State lives in a single JSON file that is checked against a schema, and conflicting writes are handled with optimistic concurrency: a write is rejected if the file's version has changed since it was read.
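Optimistic concurrency on a JSON state file can be sketched as a version check before each write. Everything named here (`load_state`, `save_state`, `VersionConflict`, the `version` field) is a hypothetical illustration; schema validation is omitted, and the check-then-write step is not atomic across processes without an additional lock.

```python
import json
import os
import tempfile


class VersionConflict(Exception):
    """Raised when the file changed since the caller last read it."""


def load_state(path: str) -> dict:
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def save_state(path: str, new_state: dict, expected_version: int) -> int:
    """Write new_state only if the on-disk version still matches."""
    current = load_state(path)
    if current["version"] != expected_version:
        raise VersionConflict(
            f"expected version {expected_version}, found {current['version']}"
        )
    to_write = {**new_state, "version": expected_version + 1}
    # Write to a temp file in the same directory, then rename over the
    # original so readers never see a half-written file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(to_write, f)
    os.replace(tmp_path, path)
    return to_write["version"]
```

A caller that hits `VersionConflict` would reload the state, reapply its change, and try again with the new version number.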
Why Not the Original Version
The first version had 32 commands, 12 agents, over 100 source files, and a 2,400-line installer. The author threw it away and rewrote it from scratch because most of that complexity was burning context window without providing value.
Unexpected Findings
The anti-rationalization approach works, according to the author: listing the specific phrases Claude uses to skip steps directly in the agent prompt reduced the skip rate. The author notes that negative examples seem to steer the model better than just saying "always write tests." Session persistence was the hardest implementation challenge.
📖 Read the full source: r/ClaudeAI