Opus 4.7 Broke 40% of Prompts; Fix Was Structuring CLAUDE.md and Skills

When Opus 4.7 dropped in April, about 40% of the prompts across 6 mid-market company setups broke overnight. Token burn went up and outputs became weirdly literal: 4.6 had been filling in the gaps in ambiguous instructions, and 4.7 stopped doing that. The fix wasn't rewriting prompts; it was finally taking CLAUDE.md and Skill files seriously.
What broke and why
Prompts written for 4.6 assumed the model would be charitable about vague instructions. 4.7 interpreted them literally, causing outputs that needed 3-4 turns to correct. Prompts that survived were those baked into Skill files with explicit output formats, length caps, and worked examples.
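To make "explicit" concrete, here is a minimal sketch of a lint check that flags a prompt or Skill file lacking the three things the surviving prompts had. The section names, the markdown-heading convention, and the `missing_sections` helper are assumptions for illustration; the source does not describe the actual file layout.

```python
import re

# Hypothetical section names. The source only says surviving prompts had an
# explicit output format, a length cap, and a worked example.
REQUIRED_SECTIONS = ("output format", "length", "example")


def missing_sections(skill_text: str) -> list[str]:
    """Return the required sections that have no matching heading."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^#+\s*(.+)$", skill_text, flags=re.MULTILINE)
    }
    return [
        name
        for name in REQUIRED_SECTIONS
        if not any(name in heading for heading in headings)
    ]


vague = "Summarize the call notes for the client."
explicit = """# Output format
Three bullet points, plain text.
# Length
Under 80 words.
# Worked example
Input: kickoff call notes. Output: three bullets on scope, owner, deadline.
"""

print(missing_sections(vague))     # all three sections are reported missing
print(missing_sections(explicit))  # nothing is reported missing
```

A check like this only confirms the sections exist; whether their content is specific enough still needs a human read.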
The rebuild approach
Across the 6 setups, four structural changes were made:
- Skills replaced standalone prompts: anything done more than 3 times got a Skill file (50–200 lines) specifying audience, output format, length, and a 2-3 sentence worked example. Skills are loaded on demand instead of bloating the context.
- Hierarchical CLAUDE.md: one global file for user identity, business, and voice rules; a project-level CLAUDE.md per engagement; session-level instructions for one-offs. The model reads them in that order and builds a mental model that survives across sessions.
- Memory files broken out: CLAUDE.md was kept under 400 lines; detailed institutional knowledge lives in separate files that CLAUDE.md points to, loaded on demand.
- Verification step in long Skills: the model generates output, checks it against a 5–7 item checklist, and revises. This adds about 30 seconds per call but cut downstream cleanup by roughly 70%.
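The verification step in the last bullet can be sketched as a generate, check, revise loop. This is an illustration under stated assumptions, not the author's implementation: `generate_with_verification`, the `generate` and `revise` callables, and the two checklist items are all hypothetical stand-ins (a real Skill would carry the 5–7 item checklist in its own text and the model would do the checking).

```python
from typing import Callable


def generate_with_verification(
    generate: Callable[[], str],
    revise: Callable[[str, list[str]], str],
    checklist: dict[str, Callable[[str], bool]],
    max_revisions: int = 1,
) -> tuple[str, list[str]]:
    """Generate a draft, check it against the checklist, revise on failures.

    Returns the final draft and the names of any checks still failing.
    """
    draft = generate()
    failed = [name for name, check in checklist.items() if not check(draft)]
    for _ in range(max_revisions):
        if not failed:
            break
        draft = revise(draft, failed)
        failed = [name for name, check in checklist.items() if not check(draft)]
    return draft, failed


# Hypothetical checklist, shortened to two items for brevity.
checklist = {
    "under 50 words": lambda text: len(text.split()) <= 50,
    "has a next step": lambda text: "next step" in text.lower(),
}


def fake_generate() -> str:
    # Stand-in for the model's first draft.
    return "Project is on track."


def fake_revise(draft: str, failed: list[str]) -> str:
    # Stand-in for the model's revision pass.
    return draft + " Next step: send the revised timeline."


final, still_failing = generate_with_verification(
    fake_generate, fake_revise, checklist
)
print(final)
print(still_failing)
```

Capping revisions keeps the added latency bounded, which matters if each pass costs on the order of the 30 seconds reported above.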
Results after 3 weeks
- Average prompt-to-acceptable-output dropped from 3-4 turns to 1-2.
- Token usage dropped 22% across workspaces.
- "This output is weird, let me try again" rate dropped from once per 4 prompts to once per 15.
- The next model release should now be a net positive rather than a net negative.
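The retry figures above are easier to compare as rates. A quick calculation, using only the two numbers reported in the list:

```python
from fractions import Fraction

retry_before = Fraction(1, 4)   # one retry per 4 prompts
retry_after = Fraction(1, 15)   # one retry per 15 prompts

# Relative reduction in the retry rate: 1 - (1/15) / (1/4) = 11/15
relative_drop = 1 - retry_after / retry_before

print(f"before: {float(retry_before):.1%}")         # before: 25.0%
print(f"after: {float(retry_after):.1%}")           # after: 6.7%
print(f"relative drop: {float(relative_drop):.1%}")  # relative drop: 73.3%
```

So the "try again" rate fell from 25% of prompts to about 6.7%, a relative reduction of roughly 73%.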
Still unsolved: versioning CLAUDE.md
Project-level files are in git, but the global CLAUDE.md lives in chat history, which is fragile. No rollback mechanism yet.
Mental model
The model is the engine. Skills + CLAUDE.md + memory is the car. Build the car once; each new engine makes it faster.
📖 Read the full source: r/ClaudeAI