How to Use AI Agents for TDD in Website Projects

Development Workflow with AI Agents

A developer outlines their approach to website development using AI coding agents with a test-driven development (TDD) methodology. They use Claude Code for work projects and local models for private projects, specifically Qwen Code on top of Qwen3.5-27B running on llama.cpp with 2xRTX 3090 GPUs.

Initial Project Setup

At the beginning of a project, they implement basic modules:

Basic DB schema
Basic auth API
UI routing
UI basic layout
Basic API (admins and users)
Basic API/E2E tests (written manually or by AI)
Context files for coding agents (AGENTS.md, CLAUDE.md)

Iterative Development Process

After setup, the iterative process begins:

Write detailed specs of API/E2E tests in markdown for a feature
Generate API/E2E tests from the markdown test descriptions
Start coding agent session with ability to run tests
Ask agent to implement functionality until tests pass
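
As an illustration of the first two steps, here is a minimal sketch of a markdown spec and the pytest-style tests that might be generated from it. The feature, the endpoint, the status codes, and `FakeAuthApi` are all hypothetical; the in-memory class only stands in for a real API so that the example runs without a server.

```python
# Hypothetical markdown spec the developer writes first (step 1).
SPEC = """
## Feature: user registration
- POST /register with a new email returns 201
- POST /register with an existing email returns 409
"""


class FakeAuthApi:
    """In-memory stand-in for the real auth API (an assumption for this sketch)."""

    def __init__(self):
        self._emails = set()

    def register(self, email: str) -> int:
        """Return an HTTP-like status code for a registration attempt."""
        if email in self._emails:
            return 409
        self._emails.add(email)
        return 201


# Tests generated from the spec (step 2): one test per bullet.
def test_register_new_email_returns_201():
    api = FakeAuthApi()
    assert api.register("a@example.com") == 201


def test_register_existing_email_returns_409():
    api = FakeAuthApi()
    api.register("a@example.com")
    assert api.register("a@example.com") == 409
```

In the real workflow the tests would exist before the implementation and fail at first; the agent session in steps 3 and 4 is what makes them pass.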

Model Capabilities and Trade-offs

The developer notes that more capable models like Claude allow skipping the markdown files entirely for simple websites, while the threshold is different for Qwen3.5-27B. Less capable models require more specific instructions to mitigate their failure modes, including "locking" logic by instructing the agent not to touch certain files or to use only specific wrappers.
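
The source describes locking only as an instruction to the agent. One way to check afterwards that the instruction was followed could look like the sketch below; the patterns and the function name are hypothetical, and the list of changed paths is assumed to come from something like `git diff --name-only`.

```python
from fnmatch import fnmatch

# Hypothetical patterns for files the agent was told not to modify.
# Note: fnmatch's "*" also matches "/" so "tests/*" covers nested paths.
LOCKED_PATTERNS = ["tests/*", "migrations/*", "app/auth/*.py"]


def locked_files_touched(changed_paths, locked_patterns=LOCKED_PATTERNS):
    """Return the changed paths that match any locked pattern, in input order."""
    return [
        path
        for path in changed_paths
        if any(fnmatch(path, pattern) for pattern in locked_patterns)
    ]
```

A non-empty result would be a signal to revert those files or restart the session with stricter instructions.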

They hypothesize that developers shouldn't obsess over code patterns and quality as long as the code is covered by tests and works, comparing AI agents to managing 10-100 junior or mid-level developers for the cost of an AI subscription.

Local Model Specifics

For local models running on 2xRTX 3090, they use Qwen3.5-27B-GGUF-Q8_0 with parallel = 1 and the full context window, which they believe is important so that agentic sessions are not auto-compressed early. They note that less capable models force clearer articulation of the E2E tests and the desired implementation, while Claude fills in design choices automatically, which can lead to a loss of control.
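
The source does not give the exact launch command. As a rough sketch, assuming the model is served through llama.cpp's `llama-server`, the settings above could be expressed as the argument list below. The model path and port are placeholders, and `--n-gpu-layers 99` (offload all layers) is an assumption rather than something the source states.

```python
from typing import List


def llama_server_args(model_path: str, port: int = 8080) -> List[str]:
    """Build a llama-server command line with one slot and the model's full context."""
    return [
        "llama-server",
        "--model", model_path,
        "--parallel", "1",        # a single slot, so the context is not split
        "--ctx-size", "0",        # 0 = take the context size from the model
        "--n-gpu-layers", "99",   # assumption: offload all layers to the GPUs
        "--port", str(port),
    ]


# Placeholder path; the actual GGUF file name depends on the download.
args = llama_server_args("models/qwen3.5-27b-q8_0.gguf")
print(" ".join(args))
```

With more than one parallel slot, llama.cpp divides the context among slots, which is presumably why a single slot matters for long agentic sessions.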

Coding TDD Loop Implementation

The developer provides a draft of their coding TDD loop:

Outer loop begins: runs all pytest tests with `pytest tests/ -x`, which stops at the first failure; the default log level is warning, so there is not much output
If everything passes, exits the outer loop; if something failed, extracts the name of the failed test
Runs the failed test with full logs, like `pytest tests/../test_first_failing_test.py --log-level DEBUG`, and collects the test output into a file
Extracts lines near the 'error'/'fail' strings with `egrep -i -C 10 '(error|fail)' <fail
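
The steps listed above can be sketched in Python as follows. This is an approximation under stated assumptions, not the developer's actual script: the function names are hypothetical, the failing test is taken from pytest's short summary line and rerun by node ID rather than by file path, and the `egrep -i -C 10` step is emulated in Python. How the trimmed log is handed to the agent is not covered by the source and is left out.

```python
import re
import subprocess
from typing import Optional

# Matches pytest short-summary lines such as
# "FAILED tests/api/test_users.py::test_create_user - AssertionError: ..."
FAILED_LINE = re.compile(r"^(?:FAILED|ERROR) (\S+)", re.MULTILINE)
ERROR_WORDS = re.compile(r"error|fail", re.IGNORECASE)


def first_failed_test(pytest_output: str) -> Optional[str]:
    """Return the node ID of the first failing test, or None if none is listed."""
    match = FAILED_LINE.search(pytest_output)
    return match.group(1) if match else None


def lines_near_errors(log_text: str, context: int = 10) -> str:
    """Keep lines within `context` lines of an 'error'/'fail' match."""
    lines = log_text.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if ERROR_WORDS.search(line):
            keep.update(range(max(0, i - context), min(len(lines), i + context + 1)))
    return "\n".join(lines[i] for i in sorted(keep))


def run_once(tests_dir: str = "tests/") -> Optional[str]:
    """One pass of the outer loop: None when everything passes,
    otherwise the trimmed debug log of the first failing test."""
    quick = subprocess.run(["pytest", tests_dir, "-x"], capture_output=True, text=True)
    if quick.returncode == 0:
        return None
    failed = first_failed_test(quick.stdout)
    if failed is None:
        return quick.stdout  # no summary line found; return the raw output
    debug = subprocess.run(
        ["pytest", failed, "--log-level", "DEBUG"], capture_output=True, text=True
    )
    return lines_near_errors(debug.stdout + debug.stderr)
```

An outer loop would call `run_once()` repeatedly, pass any returned log to the coding agent, and stop once it returns `None`.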

This approach represents a practical implementation of TDD with AI agents, balancing automation with necessary oversight to maintain codebase control.

📖 Read the full source: r/LocalLLaMA