Fix Opus 4.7 40% Prompt Degradation via CLAUDE.md & Skills

When Opus 4.7 dropped in April, about 40% of the prompts across 6 mid-market company setups broke overnight. Token burn went up, outputs became weirdly literal — 4.6 had been filling in ambiguous instructions, but 4.7 didn't. The fix wasn't rewriting prompts; it was finally taking CLAUDE.md and Skill files seriously.

What broke and why

Prompts written for 4.6 assumed the model would be charitable about vague instructions. 4.7 interpreted them literally, causing outputs that needed 3-4 turns to correct. Prompts that survived were those baked into Skill files with explicit output formats, length caps, and worked examples.

The rebuild approach

Across the 6 setups, three structural changes were made:

Skills replaced standalone prompts — anything done more than 3 times got a Skill file (50–200 lines) with audience, output format, length, and a 2-3 sentence worked example. Skills are loaded on demand instead of bloating context.
Hierarchical CLAUDE.md — one global file for user identity, business, voice rules; a project-level CLAUDE.md per engagement; session-level instructions for one-offs. Model reads in order and builds a mental model that survives across sessions.
Memory files broken out — kept CLAUDE.md under 400 lines; detailed institutional knowledge lives in separate files that CLAUDE.md points to, loaded on demand.
Verification step in long Skills — model generates output, checks against a 5–7 item checklist, revises. Adds 30s per call but cut downstream cleanup ~70%.

Results after 3 weeks

Average prompt-to-acceptable-output dropped from 3-4 turns to 1-2.
Token usage dropped 22% across workspaces.
"This output is weird, let me try again" rate dropped from once per 4 prompts to once per 15.
Next model release should now be a net positive, not net negative.