ProofShot: CLI for AI Agents to Verify UI Code with Browser Recording

What ProofShot Does
ProofShot is a CLI tool that gives AI coding agents visual verification capabilities. It allows agents to see what the UI they build actually looks like in the browser, detect layout issues, and capture console errors.
How It Works
The tool operates through three main commands:
- proofshot start --run "npm run dev" --port 3000: Launches your dev server, opens headless Chromium, and starts recording video.
- proofshot exec: Your AI agent then executes actions such as proofshot exec navigate "http://localhost:3000" and proofshot exec screenshot "homepage" to navigate, click, fill forms, and take screenshots.
- proofshot stop: Collects errors, stops recording, trims dead time, and generates proof artifacts.
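The start, exec, and stop steps can be chained into one scripted session. The following is a minimal sketch that uses only the commands quoted above; the function name verify_homepage and the PROOFSHOT override variable are illustrative assumptions, not part of ProofShot itself.

```shell
#!/usr/bin/env bash
# Sketch of one ProofShot session, using only the commands quoted above.
# PROOFSHOT is a hypothetical override so the flow can be dry-run
# (for example PROOFSHOT=echo) without the CLI installed.
PROOFSHOT="${PROOFSHOT:-proofshot}"

verify_homepage() {
  # 1. Launch the dev server, open headless Chromium, start recording.
  "$PROOFSHOT" start --run "npm run dev" --port 3000 || return 1

  # 2. Drive the browser: load the page, then capture a named screenshot.
  "$PROOFSHOT" exec navigate "http://localhost:3000" &&
    "$PROOFSHOT" exec screenshot "homepage"
  local status=$?

  # 3. Always stop, so the recording is finalized even if step 2 failed.
  "$PROOFSHOT" stop || status=$?
  return "$status"
}
```

Calling verify_homepage from a project directory would run the whole session and return a non-zero status if any step failed. Stopping unconditionally is a design choice in this sketch, so that a failed navigation still leaves a recording to inspect.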
Output and Features
ProofShot generates a standalone HTML file containing:
- Video playback of the browser session synced with an action timeline
- Screenshots taken during the session
- Element labels for each action
- Browser console errors captured during the session
- Server logs scanned with pattern matching for JavaScript, Python, Go, Rust, and other languages
- PR-ready artifacts including SUMMARY.md and formatted output for pull requests
- Visual diff comparison against baselines
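The server-log scanning listed above can be pictured as a multi-language pattern match. The sketch below uses grep with a few widely seen error signatures; the patterns and the scan_log function are illustrative assumptions and are not ProofShot's actual rule set.

```shell
#!/usr/bin/env bash
# Hypothetical error signatures, one alternation per language.
# These are NOT ProofShot's real patterns, only well-known examples.
ERROR_PATTERNS='(TypeError|ReferenceError|SyntaxError):'              # JavaScript
ERROR_PATTERNS="$ERROR_PATTERNS|Traceback \(most recent call last\)"  # Python
ERROR_PATTERNS="$ERROR_PATTERNS|^panic: "                             # Go
ERROR_PATTERNS="$ERROR_PATTERNS|thread '.*' panicked at"              # Rust

# scan_log FILE: print matching lines prefixed with their line numbers.
# Exit status follows grep: 0 if any error-like line was found, 1 if none.
scan_log() {
  grep -nE "$ERROR_PATTERNS" "$1"
}
```

For example, after redirecting a dev server's output to a file named server.log (a hypothetical name), scan_log server.log would list the suspicious lines and where they occur.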
Technical Details
The tool is:
- Built on agent-browser from Vercel Labs (described as "far better and faster than Playwright MCP")
- Not a testing framework: the agent doesn't decide pass/fail; it just provides evidence
- Agent-agnostic - works with Claude Code, Cursor, Codex, Gemini CLI, Windsurf, and any MCP-compatible agent
- Packaged as a skill so AI agents know exactly how it works
- Open source with MIT license
Installation and Setup
$ npm install -g proofshot
$ proofshot install
The tool automatically trims dead time from recordings, so you see only what the agent actually did, not idle waiting periods.
📖 Read the full source: HN LLM Tools