Bernstein: A Kubernetes-like orchestrator for AI coding agents with verification and model policies

✍️ OpenClawRadar · 📅 Published: April 13, 2026 · 🔗 Source

Bernstein is an orchestrator for AI coding agents that the creator describes as "Kubernetes for coding agents." Unlike simpler tools that spawn agents in parallel worktrees, Bernstein addresses what the developer calls "the other 95%" of the problem.

Key Features

The system includes several critical components:

  • Verification: A "janitor" component independently verifies agent output after every task. It runs tests, checks diffs, and lints the result because "agents lie": they may claim tests pass when they don't, or say they committed files when they didn't.
  • Model Policy Engine: Provides allow/deny lists per provider, data residency constraints, preferred routing, and cost ceilings. The creator compares this to "K8s network policies but for LLM providers."
  • Deterministic Scheduling: Uses pure Python for scheduling instead of LLMs, creating deterministic control flow with zero LLM tokens spent on coordination. An epsilon-greedy bandit learns routing over time.
  • Agent-Agnostic Design: Includes 13 adapters for Claude Code, Codex, Gemini CLI, Cursor, Qwen, Aider, Amp, Roo Code, Goose, Kilo, Kiro, OpenCode, and generic agents. Claude Code has the deepest integration.
  • Scale Features: At 500K+ lines and ~5000 tests, Bernstein includes circuit breakers, cost anomaly detection, loop detection, deadlock detection, PII scanning, HMAC-chained audit logs, progressive permissions, and quarantine for suspicious output.
  • Self-Development: Can develop itself using bernstein --evolve.
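
The model policy engine in the list above can be pictured with a small sketch. This is not Bernstein's actual API: Provider, ModelPolicy, and every field name are hypothetical, and the rules shown (deny beats allow, region must match, estimated cost must stay under a ceiling, preferred providers are tried first) are only one plausible reading of the feature list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Provider:
    """Hypothetical description of an LLM provider endpoint."""
    name: str
    region: str
    cost_per_1k_tokens: float  # USD, illustrative numbers only


@dataclass
class ModelPolicy:
    """Hypothetical policy: allow/deny lists, residency, routing, cost ceiling."""
    allow: set = field(default_factory=set)            # empty means "allow all"
    deny: set = field(default_factory=set)             # deny always wins over allow
    allowed_regions: set = field(default_factory=set)  # empty means "any region"
    preferred: list = field(default_factory=list)      # routing preference order
    max_cost_usd: float = float("inf")                 # per-task cost ceiling

    def permits(self, p: Provider, est_tokens: int) -> bool:
        if p.name in self.deny:
            return False
        if self.allow and p.name not in self.allow:
            return False
        if self.allowed_regions and p.region not in self.allowed_regions:
            return False
        est_cost = p.cost_per_1k_tokens * est_tokens / 1000
        return est_cost <= self.max_cost_usd

    def route(self, providers, est_tokens):
        """Return permitted providers: preferred names first, then cheapest."""
        ok = [p for p in providers if self.permits(p, est_tokens)]

        def rank(p):
            if p.name in self.preferred:
                return (self.preferred.index(p.name), p.cost_per_1k_tokens)
            return (len(self.preferred), p.cost_per_1k_tokens)

        return sorted(ok, key=rank)
```

With such a policy, a task would be offered only to providers that pass every check, so a denied or out-of-region provider is never called regardless of price.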

Technical Details

The creator notes that spawning agents in worktrees is "the hello world of this space" and that most multi-agent frameworks use an LLM to schedule other LLMs, which is "slow, expensive, and non-deterministic." Bernstein's approach uses pure Python for deterministic control flow.
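
To make the "no LLM in the scheduler" idea concrete, here is a minimal sketch of epsilon-greedy routing in plain Python. It is an illustration under stated assumptions, not Bernstein's code: the class name EpsilonGreedyRouter, the arm names, and the reward definition (for example 1.0 when the janitor's verification passes, 0.0 otherwise) are all hypothetical.

```python
import random


class EpsilonGreedyRouter:
    """Hypothetical router: picks an agent adapter per task, learns from outcomes."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)  # seeded, so a run can be replayed
        self.counts = {a: 0 for a in self.arms}
        self.values = {a: 0.0 for a in self.arms}  # running mean reward per arm

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best mean so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

A scheduler built this way would call choose() to pick an adapter for each task and update() once the result has been verified. No model tokens are spent on the decision, and with a fixed seed the same sequence of rewards yields the same sequence of choices.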

The project has been tested at scale with 500K+ lines of code and approximately 5000 tests. The developer built features like circuit breakers and anomaly detection because "things broke and these were the fixes."
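
As an illustration of one of those fixes, a circuit breaker can be sketched in a few lines. The source does not describe how Bernstein implements it, so the class name CircuitBreaker, the thresholds, and the injectable clock are assumptions made for this example.

```python
import time


class CircuitBreaker:
    """Hypothetical breaker: stop calling a failing agent or provider for a while."""

    def __init__(self, max_failures=3, cooldown_s=60.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock      # injectable so the breaker can be tested without sleeping
        self.failures = 0
        self.opened_at = None   # None means the circuit is closed (calls allowed)

    def allow(self):
        if self.opened_at is None:
            return True
        # After the cooldown, trial calls are allowed again; a failure re-opens
        # the circuit immediately, a success closes it.
        return self.clock() - self.opened_at >= self.cooldown_s

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()
```

An orchestrator would check allow() before dispatching work to an agent and report each outcome back, so a provider that keeps failing is skipped for the cooldown period instead of burning budget on retries.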

The creator is a solo developer from Israel who mentions "building under rockets (literally)" and says the project has outgrown them, so they are seeking contributors.

📖 Read the full source: r/ClaudeAI
