Culpa: Open Source Deterministic Replay Engine for AI Agent Debugging

Culpa is an open source deterministic replay engine designed specifically for debugging AI agent sessions. The core problem it addresses is the nondeterminism of LLM agents: when an agent fails, simply re-running the session does not reproduce the exact failure.
How It Works
The tool records every LLM call, along with the full execution context, during an agent session. To debug a failure, it replays the session using the recorded responses as stubs instead of making new API calls. Because the replay never hits the real APIs, it is deterministic and incurs no API cost.
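The record-then-replay idea can be illustrated with a minimal Python sketch. This is not Culpa's actual API: `Recorder`, `Replayer`, `request_key`, `FakeLLM`, and `run_agent` are hypothetical names invented for illustration, and the "model" is a local function that returns random text to stand in for a nondeterministic LLM.

```python
import hashlib
import json
import random
from typing import Callable, Dict, List

LLMFn = Callable[[dict], str]


def request_key(request: dict) -> str:
    """Stable fingerprint of a request, used to detect divergence on replay."""
    canonical = json.dumps(request, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Recorder:
    """Forwards each call to the real model and logs the response in order."""

    def __init__(self, call_llm: LLMFn) -> None:
        self._call_llm = call_llm
        self.log: List[Dict[str, str]] = []

    def __call__(self, request: dict) -> str:
        response = self._call_llm(request)
        self.log.append({"key": request_key(request), "response": response})
        return response


class Replayer:
    """Serves recorded responses as stubs; never calls the real model."""

    def __init__(self, log: List[Dict[str, str]]) -> None:
        self._log = list(log)
        self._cursor = 0

    def __call__(self, request: dict) -> str:
        if self._cursor >= len(self._log):
            raise RuntimeError("replay ran past the end of the recording")
        entry = self._log[self._cursor]
        if entry["key"] != request_key(request):
            raise RuntimeError(f"replay diverged at step {self._cursor}")
        self._cursor += 1
        return entry["response"]


class FakeLLM:
    """Hypothetical stand-in for a nondeterministic model; counts its calls."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, request: dict) -> str:
        self.calls += 1
        return f"answer-{random.randint(0, 10**9)}"


def run_agent(llm: LLMFn) -> str:
    """Toy two-step agent: the second request depends on the first response."""
    plan = llm({"prompt": "make a plan"})
    return llm({"prompt": f"act on {plan}"})


model = FakeLLM()
recorder = Recorder(model)
original = run_agent(recorder)                 # live run: 2 model calls
replayed = run_agent(Replayer(recorder.log))   # replay: 0 model calls
print(original == replayed, model.calls)
```

In this sketch the replay checks each request against the recording, so an agent whose code has changed since the recording fails loudly instead of silently receiving mismatched responses. Whether Culpa performs a similar check is not stated in the source.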
Key Features
- Proxy Mode: Works with tools like Claude Code and Cursor without requiring any code changes
- Python SDK: Available for developers building their own agents
- API Support: Compatible with Anthropic and OpenAI APIs
- Forking Capability: You can fork at any recorded decision point, inject a different response, and see what would have happened
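To make the forking feature concrete, here is a minimal Python sketch under stated assumptions. It is not Culpa's implementation: `make_fork`, `run_agent`, and `live_llm` are hypothetical names, and the sketch assumes that calls after the fork point are handed to some other responder (a live model or a stub), since the recorded responses no longer correspond to the new trajectory.

```python
from typing import Callable, List

LLMFn = Callable[[str], str]


def make_fork(
    recorded: List[str],
    fork_at: int,
    injected: str,
    live_llm: LLMFn,
) -> LLMFn:
    """Build an LLM stub that replays up to fork_at, then changes course.

    Steps before fork_at return the recorded responses, step fork_at returns
    the injected response, and later steps are delegated to live_llm.
    """
    if not 0 <= fork_at < len(recorded):
        raise ValueError("fork_at must index a recorded decision point")
    step = 0

    def llm(prompt: str) -> str:
        nonlocal step
        index = step
        step += 1
        if index < fork_at:
            return recorded[index]
        if index == fork_at:
            return injected
        return live_llm(prompt)

    return llm


def run_agent(llm: LLMFn) -> str:
    """Toy agent: picks a tool, then picks arguments for that tool."""
    tool = llm("choose a tool")
    args = llm(f"arguments for {tool}")
    return f"{tool}({args})"


# Responses captured from a hypothetical failing session.
recorded = ["delete_file", "/tmp/data"]


def echo_llm(prompt: str) -> str:
    """Hypothetical stand-in for whatever answers post-fork calls."""
    return f"<live answer to: {prompt}>"


# Fork at the first decision: what if the agent had chosen another tool?
forked_early = run_agent(make_fork(recorded, 0, "read_file", echo_llm))

# Fork at the second decision: same tool, different arguments.
forked_late = run_agent(make_fork(recorded, 1, "/tmp/other", echo_llm))

print(forked_early)
print(forked_late)
```

Forking at the last recorded step needs no live calls at all, while forking earlier leaves the remaining steps to be answered by something other than the recording.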
Practical Benefits
Because the replay serves recorded responses rather than making real API calls, debugging sessions incur zero API costs. Deterministic replay also makes it possible to reliably reproduce and analyze failures that could not otherwise be recreated, given the inherent randomness of LLM responses.
The project is actively seeking feedback, particularly from developers building agent workflows. The creator notes that they are a CS freshman and are looking to improve the tool.
📖 Read the full source: r/LocalLLaMA