Feynman: Open Source Research Agent with Paper-Codebase Audit Tool

What Feynman Does
Feynman is an open-source research agent CLI that answers research questions with a multi-agent architecture. When you ask a research question, it dispatches four subagents in parallel:
- Researcher: Searches papers and web
- Reviewer: Runs simulated peer review with severity grading
- Writer: Produces structured output
- Verifier: Checks every citation and kills dead links
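The parallel dispatch described above can be sketched in a few lines of Python. This is only an illustration of the fan-out pattern, not Feynman's actual code: the function names run_subagent and dispatch, the ROLES tuple, and the shape of the returned dictionaries are all hypothetical, and the subagent body is a placeholder where a real agent would call a model or tool.

```python
import asyncio

# Role names taken from the list above; everything else here is hypothetical.
ROLES = ("researcher", "reviewer", "writer", "verifier")


async def run_subagent(role: str, question: str) -> dict:
    """Placeholder for one subagent; a real one would call a model or tool."""
    await asyncio.sleep(0)  # yield to the event loop, standing in for I/O-bound work
    return {"role": role, "question": question, "status": "done"}


async def dispatch(question: str) -> dict:
    """Start all subagents concurrently and collect their results by role."""
    results = await asyncio.gather(*(run_subagent(role, question) for role in ROLES))
    return {result["role"]: result for result in results}


if __name__ == "__main__":
    output = asyncio.run(dispatch("Does the released code match the paper?"))
    for role, result in output.items():
        print(role, result["status"])
```

asyncio.gather returns results in the order the coroutines were passed in, so the mapping back to role names stays stable even when the subagents finish at different times.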
Key Features from Source
The standout feature in the source is the audit tool: Feynman audit [arxiv-id] pulls a paper's claims and compares them against the actual public codebase. It answers a common question: does the published code actually implement what the paper claims?
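To make the idea of a paper-versus-code comparison concrete, here is a minimal sketch. It assumes the claims and the code settings have already been extracted into plain dictionaries, which is the hard part and is not shown. The function audit_claims, the report keys, and the sample values are all made up for illustration; the source does not describe how Feynman performs the comparison internally.

```python
def audit_claims(paper_claims: dict, code_values: dict) -> dict:
    """Compare claimed settings against values found in a codebase.

    Both arguments are hypothetical, pre-extracted name -> value mappings.
    """
    report = {"match": [], "mismatch": [], "missing_in_code": []}
    for name, claimed in paper_claims.items():
        if name not in code_values:
            # The paper states it, but nothing in the code corresponds to it.
            report["missing_in_code"].append(name)
        elif code_values[name] == claimed:
            report["match"].append(name)
        else:
            # Record the name, the claimed value, and the value actually found.
            report["mismatch"].append((name, claimed, code_values[name]))
    return report


if __name__ == "__main__":
    # Illustrative values only; not taken from any real paper or repository.
    claims = {"learning_rate": 3e-4, "batch_size": 256, "warmup_steps": 1000}
    found = {"learning_rate": 3e-4, "batch_size": 128}
    for category, items in audit_claims(claims, found).items():
        print(category, items)
```

Separating mismatches from settings that are missing entirely is a deliberate choice in this sketch: a wrong value and an unimplemented feature usually call for different follow-up.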
Other capabilities include:
- Experiment replication on local or cloud GPUs via Modal or RunPod
- Literature reviews that separate consensus, disagreements, and open questions
- Deep research mode with multi-agent parallel investigation
- Option to install just the research skills into Claude Code or Codex without the full terminal app
Technical Details
- One-command installation
- MIT license
- Built on pi for the agent runtime
- Uses alphaXiv for paper search
- 2.3k stars on GitHub at time of source publication
- Launch tweet, posted by an account with 1,400 followers, received 2,768 bookmarks
The architecture specifically addresses hallucination issues common in AI research tools by dedicating an entire agent to catching incorrect citations before they reach the user.
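A citation check of this kind can be approximated with the Python standard library. The sketch below is an assumption about what such a verifier might do, not a description of Feynman's Verifier: url_is_live and filter_citations are hypothetical names, a citation is assumed to be a dictionary with a "url" key, and a reachable URL says nothing about whether the cited page actually supports the claim.

```python
import urllib.request
from typing import Callable, Iterable


def url_is_live(url: str, timeout: float = 5.0) -> bool:
    """Return True if a HEAD request to the URL succeeds."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (ValueError, OSError):
        # ValueError: malformed URL. OSError covers URLError, HTTPError, timeouts.
        return False


def filter_citations(
    citations: Iterable[dict],
    check: Callable[[str], bool] = url_is_live,
) -> tuple[list, list]:
    """Split citations into (kept, dropped) according to the link check."""
    kept, dropped = [], []
    for citation in citations:
        (kept if check(citation["url"]) else dropped).append(citation)
    return kept, dropped
```

The check function is passed in as a parameter so the filter can be exercised without network access, for example with check=lambda url: url.startswith("https://").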
📖 Read the full source: r/LocalLLaMA