Relvy improves Claude's root cause analysis accuracy by 12 percentage points on OpenRCA benchmark

Relvy, a tool that automates runbooks, reports a measurable improvement in AI agent performance: according to the source material, it raises Claude's root cause analysis accuracy by 12 percentage points on the OpenRCA benchmark.
Key Details
The information comes from a Hacker News post titled "OpenRCA benchmark – Improving Claude's root cause analysis accuracy by 12 pp." The post received 11 points. The linked article is from Relvy's blog, which describes the tool as "Your runbooks, automated."
Root cause analysis (RCA) is a critical process in software engineering and IT operations for identifying the underlying reasons for incidents or failures. The OpenRCA benchmark appears to be a test suite for evaluating how well AI agents can perform this diagnostic task. A 12 percentage point improvement is an absolute gain in accuracy, not a 12% relative increase, so its relative size depends on the baseline score, which the post title does not state. For this type of reasoning task, a gain of that size is substantial.
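To make the distinction between percentage points and relative percent concrete, here is a minimal sketch in Python. The baseline accuracies below are hypothetical values chosen for illustration only; they are not figures from the Relvy post, and `relative_gain` is a helper name invented for this example.

```python
def relative_gain(baseline_pct: float, gain_pp: float) -> float:
    """Convert an absolute gain in percentage points into a relative gain in percent."""
    if baseline_pct <= 0:
        raise ValueError("baseline_pct must be positive")
    return gain_pp / baseline_pct * 100.0


GAIN_PP = 12.0  # the absolute gain named in the post title

# Hypothetical baselines (assumptions, NOT reported results).
for baseline in (20.0, 40.0, 60.0):
    improved = baseline + GAIN_PP
    rel = relative_gain(baseline, GAIN_PP)
    print(f"{baseline:.0f}% -> {improved:.0f}%: +{GAIN_PP:.0f} pp is a {rel:.0f}% relative gain")
```

The same 12-point gain is a much larger relative improvement on a low baseline than on a high one, which is why the baseline matters when judging how significant the result is.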
For developers using AI coding agents like Claude, tools that reliably improve an agent's performance on technical, diagnostic work are directly relevant. Automating runbooks (predefined procedures for handling common operational tasks) is a practical application of AI agents in DevOps and SRE contexts.
📖 Read the full source: HN AI Agents