Be My Butler: Multi-Agent Pipeline for AI Code Verification

What Be My Butler Does
Be My Butler (BMB) is a multi-agent pipeline built to address one specific problem in AI-assisted coding: AI coding agents incorrectly reporting their own code as working. The creator, a materials/mechanical engineer with no programming background, built it after seeing Claude Code agents write code that passed tests but did not work in practice.
Core Concept
The system implements a peer review model for AI-generated code:
- One model writes the code
- A different model reviews it without knowing who wrote it (blind verification)
- A cross-model council (Claude + GPT + Gemini) votes on whether it actually works
- An analyst agent tracks patterns in what goes wrong
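The flow above can be sketched in a few lines of Python. This is only an illustrative sketch, not BMB's actual implementation: the `Judge` interface, the function names, the strict-majority rule, and the way the two verdicts are combined are all assumptions, and the lambdas are toy stand-ins for real model calls.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable

# Hypothetical interface: a judge sees only the code text and returns
# (works, reason). BMB's real agent API is not described in the post.
Judge = Callable[[str], tuple[bool, str]]


@dataclass(frozen=True)
class Submission:
    author: str  # name of the model that wrote the code
    code: str


def blind_review(sub: Submission, judges: dict[str, Judge]) -> tuple[str, bool, str]:
    """Pick a reviewer other than the author and hand it the code only."""
    reviewer = next(name for name in judges if name != sub.author)
    works, reason = judges[reviewer](sub.code)  # author identity is withheld
    return reviewer, works, reason


def council_vote(sub: Submission, judges: dict[str, Judge]) -> tuple[bool, list[str]]:
    """Strict majority vote; also returns the objections that were raised."""
    results = [judge(sub.code) for judge in judges.values()]
    approvals = sum(1 for works, _ in results if works)
    objections = [reason for works, reason in results if not works]
    return approvals * 2 > len(results), objections


def run_pipeline(sub: Submission, judges: dict[str, Judge], patterns: Counter) -> bool:
    """Assumed rule: accept only if blind reviewer and council majority agree."""
    _, review_ok, _ = blind_review(sub, judges)
    council_ok, objections = council_vote(sub, judges)
    patterns.update(objections)  # analyst step: tally what goes wrong
    return review_ok and council_ok


# Toy stand-ins for real model calls, keyed by hypothetical model names.
judges: dict[str, Judge] = {
    "claude": lambda code: (True, ""),
    "gpt": lambda code: ("TODO" not in code, "unfinished TODO"),
    "gemini": lambda code: ("return" in code, "no return value"),
}

patterns: Counter = Counter()
stub = Submission(author="claude", code="def add(a, b):\n    pass  # TODO")
done = Submission(author="claude", code="def add(a, b):\n    return a + b")
stub_accepted = run_pipeline(stub, judges, patterns)
done_accepted = run_pipeline(done, judges, patterns)
print(stub_accepted, done_accepted, dict(patterns))
```

In this toy run the unfinished stub is rejected even though one judge approves it, while the completed function passes, and the counter keeps a running tally of the objections for later pattern analysis.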
Performance Metrics
Figures reported by the creator from testing:
- Single-agent self-review catches ~40% of real issues
- Cross-model blind review catches ~85%
- Cost overhead: 15-20% more tokens
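To put those figures side by side, here is a small back-of-the-envelope calculation. The catch rates and overhead range are the ones quoted above; the issue count and baseline token budget are made-up round numbers chosen purely for illustration, and the overhead is assumed to be measured against a single-agent run.

```python
# Rates quoted in the post.
SELF_REVIEW_RATE = 0.40
CROSS_MODEL_RATE = 0.85
OVERHEAD_RANGE = (0.15, 0.20)

# Illustrative assumptions, not measurements from the project.
REAL_ISSUES = 100
BASELINE_TOKENS = 1_000_000


def missed(real_issues: int, catch_rate: float) -> int:
    """Number of real issues that slip through at a given catch rate."""
    return round(real_issues * (1 - catch_rate))


missed_self = missed(REAL_ISSUES, SELF_REVIEW_RATE)
missed_cross = missed(REAL_ISSUES, CROSS_MODEL_RATE)
extra_tokens = tuple(round(BASELINE_TOKENS * o) for o in OVERHEAD_RANGE)

print(f"missed with self-review:   {missed_self}")
print(f"missed with cross-review:  {missed_cross}")
print(f"extra tokens (low, high):  {extra_tokens}")
```

Under these assumptions, self-review lets 60 of 100 real issues through and cross-model review lets 15 through, for an extra 150,000 to 200,000 tokens on a one-million-token baseline.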
v0.2 Features
- Analytics dashboard to track token usage and costs
- Analyst agent that automatically tracks recurring patterns in code review findings
- Consultant agent for architecture decisions
- Improved tmux-based orchestration
Installation and Usage
The project is fully open source under the MIT license. To install and run it:
git clone https://github.com/project820/be-my-butler.git
cd be-my-butler && ./install.sh
bmb "build a REST API with auth"

The tool is particularly useful for "vibe coders": people without traditional coding experience who depend on AI to assess code quality. When you cannot read code to spot issues yourself, having multiple models cross-check each other provides verification that single-agent systems lack.
📖 Read the full source: r/ClaudeAI