MartinLoop: Open-Source Control Plane for AI Coding Agents with Budget Stops and Audit Trails
MartinLoop is an open-source control plane for AI coding agents that addresses common failure modes: retrying the same broken approach, passing tasks without proof, burning tokens quietly, making unauditable changes, and failing in ways that are hard to classify. It provides hard budget stops, JSONL run records, inspectable audit trails, failure classification, test-verified completion, and reproducible benchmark runs.
Key features include:
- Hard budget stops: automatically cap spending on agent runs.
- JSONL run records: every step is logged in a structured format.
- Inspectable audit trails: any engineer can review the agent's actions.
- Failure classification: categorize why an agent failed (e.g., stuck in a loop, wrong approach).
- Test-verified completion: agents must pass defined tests before reporting done.
- Reproducible benchmark runs: standardize evaluation across agents.
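To make these features concrete, here is a minimal Python sketch of how a budget stop, JSONL run records, a simple failure classification, and test-verified completion could fit together in one loop. It is not MartinLoop's actual API: `StepResult`, `run_with_budget`, the status strings, and the record fields are all hypothetical names chosen for illustration, and the "stuck in a loop" check (the same patch proposed twice) is an assumed heuristic.

```python
import io
import json
from dataclasses import dataclass
from typing import Callable, TextIO


@dataclass
class StepResult:
    """Outcome of one agent attempt (hypothetical shape)."""
    patch: str       # the change the agent proposes
    cost_usd: float  # what this attempt cost


def run_with_budget(
    agent_step: Callable[[int], StepResult],
    tests_pass: Callable[[str], bool],
    budget_usd: float,
    log: TextIO,
    max_attempts: int = 10,
) -> str:
    """Run attempts until tests pass, the budget is spent, or the agent repeats itself."""
    spent = 0.0
    seen_patches = set()
    status = "attempts_exhausted"
    for attempt in range(1, max_attempts + 1):
        # Budget stop: never start a new attempt once the cap is reached.
        # (The last attempt can still overshoot; a real system might also cap per-step cost.)
        if spent >= budget_usd:
            status = "budget_exceeded"
            break
        result = agent_step(attempt)
        spent += result.cost_usd
        repeated = result.patch in seen_patches
        seen_patches.add(result.patch)
        passed = tests_pass(result.patch)
        # JSONL run record: one JSON object per line, one line per attempt.
        record = {
            "attempt": attempt,
            "patch": result.patch,
            "cost_usd": result.cost_usd,
            "spent_usd": round(spent, 4),
            "tests_passed": passed,
        }
        log.write(json.dumps(record) + "\n")
        if passed:
            # Test-verified completion: "done" is only reported after tests pass.
            status = "verified_done"
            break
        if repeated:
            # Failure classification (assumed heuristic): same failing patch seen twice.
            status = "stuck_in_loop"
            break
    log.write(json.dumps({"final_status": status, "spent_usd": round(spent, 4)}) + "\n")
    return status


# Usage: an agent that keeps proposing the same broken fix is stopped and classified.
def stubborn_agent(attempt: int) -> StepResult:
    return StepResult(patch="same broken fix", cost_usd=0.40)


log = io.StringIO()
status = run_with_budget(stubborn_agent, lambda patch: False, budget_usd=5.0, log=log)
print(status)
print(log.getvalue(), end="")
```

Because each record is a single line of JSON, the log can be reviewed later with ordinary tools such as `grep` or `jq`, which is the property that makes this kind of trail easy to inspect.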
The project is positioned as CI/CD for autonomous coding agents. The core is open source on GitHub: https://github.com/Keesan12/Martin-Loop. A demo is available at https://martinloop.com/demo.
MartinLoop is aimed at teams using Claude Code, Codex, Cursor, Devin-style agents, or custom agent loops that need governance, budgets, evals, and auditability over their AI coding workflows.
📖 Read the full source: r/ClaudeAI