AGENTS.md Done Right: A 25% Correctness Boost — or a 30% Drop

✍️ OpenClawRadar · 📅 Published: April 28, 2026

Augment Code ran a systematic study on AGENTS.md files across their monorepo. The best files gave their coding agent a quality jump equivalent to upgrading from Haiku to Opus; the worst made output worse than having no AGENTS.md at all. The same file boosted best_practices by 25% on a routine bug fix and dropped completeness by 30% on a complex feature task in the same module. Here's what works.

How They Measured

They used AuggieBench, an internal eval suite. Starting from high-quality PRs in a large repo that reflect typical day-to-day agent tasks, they set up the environment and prompt and asked the agent to reproduce each PR. They compared the output against the golden PR (the version that landed after review by multiple senior engineers). Each PR had to be contained within a single module or app, and its scope had to be one where an AGENTS.md might plausibly help. Each task ran twice: once with the file and once without.
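The paired with/without comparison can be sketched as a per-metric relative change. This is a minimal illustration, not AuggieBench itself: the `RunScores` type, the 0-to-1 score scale, and the example scores are all assumptions invented here, and the article does not say whether its percentages are relative or absolute changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunScores:
    """Metric scores for one agent run, each judged against the golden PR."""
    best_practices: float
    completeness: float


def relative_change(with_file: RunScores, without_file: RunScores) -> dict[str, float]:
    """Return the per-metric change, in percent, from adding the AGENTS.md file."""
    changes = {}
    for metric in ("best_practices", "completeness"):
        baseline = getattr(without_file, metric)
        treated = getattr(with_file, metric)
        if baseline == 0:
            raise ValueError(f"baseline score for {metric} is zero")
        changes[metric] = round((treated - baseline) / baseline * 100, 1)
    return changes


# Hypothetical scores, invented for illustration only (not data from the study).
baseline = RunScores(best_practices=0.60, completeness=0.80)
with_agents_md = RunScores(best_practices=0.75, completeness=0.56)
print(relative_change(with_agents_md, baseline))
```

Running the same task twice and differencing the scores isolates the effect of the file from the difficulty of the task, which is why one file can show a gain on one task and a loss on another.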

What Works

1. Progressive Disclosure > Comprehensive Coverage

Cover common cases and workflows at a high level; push details into reference files the agent can load on demand. Keep each reference's scope clear. Files of 100–150 lines with a handful of focused reference documents delivered 10–15% improvements across metrics in mid-size modules (~100 core files). Beyond that length, gains reversed.
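One way to keep a file inside that length range is a small audit script. The sketch below is an assumption-laden illustration: it treats 150 lines as a hard budget, assumes reference files are linked with markdown-style links ending in `.md`, and uses an invented sample file.

```python
import re

MAX_LINES = 150  # upper end of the 100-150 line range quoted above


def audit_agents_md(text: str, max_lines: int = MAX_LINES) -> dict:
    """Report the line count and the reference files linked from an AGENTS.md."""
    lines = text.splitlines()
    # Collect markdown link targets ending in .md, e.g. [Deploys](docs/deploy.md)
    references = re.findall(r"\]\(([^)\s]+\.md)\)", text)
    return {
        "line_count": len(lines),
        "within_budget": len(lines) <= max_lines,
        "references": references,
    }


# Hypothetical AGENTS.md content; the module and paths are made up.
sample = "\n".join([
    "# Payments module",
    "Common workflow: run the unit tests before opening a PR.",
    "Details: [Deploying](docs/deploy.md), [Schema changes](docs/schema.md)",
])
print(audit_agents_md(sample))
```

A check like this could run in CI so that detail added over time lands in reference files instead of growing the main file.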

2. Procedural Workflows

A numbered, multi-step workflow can move the agent from failing to finishing. Example: a six-step workflow for deploying a new integration. With it, the rate of missing wiring files dropped from 40% to 10%, the agent finished faster, correctness went up 25%, and completeness went up 20%. Keep the main file concise and use reference files for branching cases.
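To make "missing wiring files" concrete, a workflow can be expressed as steps that each name the file they are expected to touch, then checked against the files a PR changed. Everything here is hypothetical: the article does not list the six steps, so three invented steps and paths stand in for them.

```python
# Hypothetical workflow: each step names the file it is expected to touch.
# These steps and paths are invented; the article does not list the real ones.
WORKFLOW = [
    ("Register the integration", "integrations/registry.py"),
    ("Add the config schema", "integrations/{name}/config.py"),
    ("Wire the route", "api/routes.py"),
]


def skipped_steps(name: str, changed_files: set[str]) -> list[str]:
    """Return the workflow steps whose expected file is absent from the PR."""
    return [
        step
        for step, template in WORKFLOW
        if template.format(name=name) not in changed_files
    ]


# A PR for a made-up "acme" integration that forgot the route wiring.
changed = {"integrations/registry.py", "integrations/acme/config.py"}
print(skipped_steps("acme", changed))
```

The numbered list in the AGENTS.md serves the same purpose for the agent: each step is explicit enough that skipping one is detectable.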

3. Decision Tables

When two or three reasonable approaches exist (e.g., React Query vs Zustand for state management), force the choice up front with a table. Example:

Question | React Query | Zustand
Server is the only data source? | ✅ | -
Multiple code paths mutate this state? | - | ✅
Need optimistic updates mixed with local state? | - | ✅

PRs in that area scored 25% higher on best_practices.
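The same decision table can be read as a small function. This is a sketch under two assumptions that the article does not state: that the first question points to React Query and the other two point to Zustand, and that a "yes" on either Zustand question takes precedence.

```python
def choose_state_library(
    server_is_only_source: bool,
    multiple_mutation_paths: bool,
    optimistic_with_local_state: bool,
) -> str:
    """Map the three table questions to a library choice."""
    # Assumed precedence: any Zustand row answered "yes" wins.
    if multiple_mutation_paths or optimistic_with_local_state:
        return "Zustand"
    if server_is_only_source:
        return "React Query"
    # No row matched; the table gives no guidance for this case.
    return "undecided"


print(choose_state_library(True, False, False))
print(choose_state_library(False, True, False))
```

The value of the table is that the questions are answered before any code is written, so the agent does not pick a library by imitation of whichever file it happened to read first.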

4. Short Production Examples

3–10 line snippets from actual production code improved reuse and pattern adherence. Example: copy-paste templates for Redux Toolkit primitives (createSlice with typed initial state, createAsyncThunk with error handling, typed useAppSelector). code_reuse went up 20%.

5. Domain-Specific Rules

Domain-specific rules still matter. They are the pattern most people already associate with AGENTS.md.

📖 Read the full source: HN AI Agents
