Agent-Xray: Open-source tool for debugging AI agent failures from trace logs

✍️ OpenClawRadar📅 Published: April 15, 2026🔗 Source

Agent-Xray is an open-source tool for debugging AI agents by analyzing their trace logs. It was created to solve the problem of agents failing tasks without clear errors—situations where code runs fine but the agent makes wrong decisions, like repeatedly calling the wrong tool despite error messages suggesting the correct one.

Key Features

The tool reads trace logs and provides structural grading and root-cause classification for agent failures. It reconstructs what the agent was seeing at each step to help understand why bad decisions were made.

Failure Categories

spin
tool_bug
early_abort

Enforcement Mode

The most significant feature according to the creator is enforcement mode. After fixing an agent bug, this mode runs adversarial challenges against your fixes to verify they're legitimate. It checks for:

Hardcoded returns
Weakened assertions

This addresses the problem where fixes might work on specific test tasks but are actually fragile, or where agents learn to game the test.

Workflow Integration

The tool runs as MCP tools, allowing Claude Code to use it directly. A typical workflow described in the source:

Tell Claude Code to triage agent traces
It finds the worst failure
Replays what the agent saw
Suggests a fix
Enforcement mode verifies the fix is legitimate

The creator describes this as "agents debugging agents."

Technical Details

Installation: pip install agent-xray
Quickstart: agent-xray quickstart (includes sample traces to test without your own data)
License: MIT
Zero dependencies
Runs offline
Works with OpenAI, Anthropic, LangChain, CrewAI, OpenTelemetry traces
Project age: About 9 days old at time of posting

Use Case

This tool is for developers working with AI agents who need to debug failures that don't produce traditional errors or stack traces—situations where agents make incorrect decisions despite having access to correct tools and information.

📖 Read the full source: r/ClaudeAI

👀 See Also

Tools

Using MCP Code Mode for Efficient Claude Keyword Research

A developer built an MCP server that enables Claude to perform autonomous keyword research using a Code Mode pattern, reducing tool definition tokens from thousands to ~1,000 with just two tools: search and execute.

Mar 11, 2026, 01:45 AM UTC

OpenClawRadar

Tools

MCP Server for Local XMind Mind Map Files Released

A developer has published an MCP server that provides 22 tools for reading and writing local XMind mind map files. The server works with MCP-compatible AI clients like Claude Desktop and Cursor.

Apr 19, 2026, 10:45 AM UTC

OpenClawRadar

Tools

Zeude: Self-Hosted Monitoring Dashboard for Claude Code and OpenAI Codex

Zeude is a self-hosted dashboard that tracks Claude Code and OpenAI Codex usage, providing per-prompt token and cost breakdowns, weekly leaderboards, and team skill management. Version 1.0.0 adds Windows support, Codex integration, and per-user skill opt-out.

Apr 16, 2026, 03:45 AM UTC

OpenClawRadar

Tools

Claude Code Skill /council Runs Prompts Across 4 AI Models in Parallel

A Claude Code skill called /council sends any prompt to GPT, Claude, Gemini, and Grok simultaneously in about 7 seconds, then uses Gemini to synthesize the best response by identifying specific improvements from the other models.

Apr 2, 2026, 01:45 AM UTC

OpenClawRadar