Be My Butler: Multi-Agent Pipeline for AI Code Verification

✍️ OpenClawRadar · 📅 Published: March 14, 2026

What Be My Butler Does

Be My Butler (BMB) is a multi-agent pipeline built to address a specific problem in AI-assisted coding: AI coding agents that incorrectly report their own code as working. The creator, a materials/mechanical engineer with no programming background, built it after seeing Claude Code agents write code that passed its tests but did not work in practice.

Core Concept

The system implements a peer review model for AI-generated code:

One model writes the code
A different model reviews it without knowing who wrote it (blind verification)
A cross-model council (Claude + GPT + Gemini) votes on whether it actually works
An analyst agent tracks patterns in what goes wrong
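
The four steps above can be sketched roughly as follows. This is a minimal illustration, not BMB's actual implementation: the `Submission`, `Verdict`, `blind`, `council_vote`, `approved`, and `record_failures` names are all hypothetical, the reviewers are stand-in functions rather than real model calls, and the strict-majority rule is an assumption, since the article does not say how votes are counted.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Submission:
    author: str  # model that wrote the code; never shown to reviewers
    code: str


@dataclass
class Verdict:
    works: bool
    reason: str = ""  # short failure label; empty when the code is approved


# Hypothetical reviewer signature: takes anonymized code, returns a Verdict.
Reviewer = Callable[[str], Verdict]


def blind(submission: Submission) -> str:
    """Drop authorship so reviewers judge only the code itself."""
    return submission.code


def council_vote(submission: Submission, council: Dict[str, Reviewer]) -> Dict[str, Verdict]:
    """Ask every council member to review the anonymized code."""
    code = blind(submission)
    return {name: review(code) for name, review in council.items()}


def approved(verdicts: Dict[str, Verdict]) -> bool:
    """Assumed rule: a strict majority of reviewers must say the code works."""
    yes = sum(v.works for v in verdicts.values())
    return yes * 2 > len(verdicts)


def record_failures(verdicts: Dict[str, Verdict], patterns: Counter) -> None:
    """Analyst step: tally failure reasons so recurring problems stand out."""
    patterns.update(v.reason for v in verdicts.values() if not v.works)


if __name__ == "__main__":
    # Stand-in reviewers; a real pipeline would call a different model in each.
    council = {
        "claude": lambda code: Verdict(True),
        "gpt": lambda code: Verdict(False, "missing error handling"),
        "gemini": lambda code: Verdict(True),
    }
    submission = Submission(author="claude", code="def add(a, b):\n    return a + b\n")
    verdicts = council_vote(submission, council)
    patterns = Counter()
    record_failures(verdicts, patterns)
    print(approved(verdicts), dict(patterns))
```

Keeping `blind` as its own step makes the key property easy to check: reviewers only ever receive the code string, so nothing about the author can influence their verdict.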

Performance Metrics

Reported results from the creator's testing:

Single-agent self-review catches ~40% of real issues
Cross-model blind review catches ~85%
Cost overhead: 15-20% more tokens
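
To put these figures in perspective, the arithmetic can be worked through for a made-up project. The sketch below takes the approximate percentages quoted above at face value; the issue count and token budget are hypothetical inputs chosen only for illustration, and `compare` is not part of BMB.

```python
# Approximate figures quoted in the article, as whole percentages.
SELF_REVIEW_CATCH_PCT = 40
CROSS_MODEL_CATCH_PCT = 85
TOKEN_OVERHEAD_PCT = (15, 20)  # low and high end of the reported overhead


def compare(total_issues: int, base_tokens: int) -> dict:
    """Estimate missed issues and token cost under each review style."""
    low, high = TOKEN_OVERHEAD_PCT
    return {
        # Issues that slip through = total * (100% - catch rate)
        "missed_self_review": total_issues * (100 - SELF_REVIEW_CATCH_PCT) / 100,
        "missed_cross_model": total_issues * (100 - CROSS_MODEL_CATCH_PCT) / 100,
        # Token cost with cross-model review = base * (100% + overhead)
        "tokens_low": base_tokens * (100 + low) / 100,
        "tokens_high": base_tokens * (100 + high) / 100,
    }


if __name__ == "__main__":
    # Hypothetical project: 100 real issues, 1,000,000 tokens without review overhead.
    for key, value in compare(100, 1_000_000).items():
        print(f"{key}: {value:,.0f}")
```

Under these assumptions, a project with 100 real issues would see missed issues fall from 60 to 15, in exchange for roughly 150,000 to 200,000 extra tokens on a 1,000,000-token baseline.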

v0.2 Features

Analytics dashboard to track token usage and costs
Analyst agent for automated code review patterns
Consultant agent for architecture decisions
Improved tmux-based orchestration

Installation and Usage

BMB is fully open source under the MIT license. To install it and run a first task:

git clone https://github.com/project820/be-my-butler.git
cd be-my-butler && ./install.sh
bmb "build a REST API with auth"

The tool is particularly useful for "vibe coders" — people without traditional coding experience who depend on AI for code quality assessment. When you can't read code to spot issues yourself, having multiple models cross-check each other provides verification that single-agent systems lack.

📖 Read the full source: r/ClaudeAI
