How a /loop Command Burned $6,000 in Claude API Overnight

A Reddit user reported waking up to find their Claude usage limit exhausted after a single /loop 30m check my PRs command ran 46 times over 26 hours unattended on claude-opus-4-7, burning roughly $6,000. The root cause: prompt caching behavior combined with a long-lived session.
Here's the technical breakdown:
- Context window grows on every iteration: Each API call sends the entire conversation history. Turn 1 might be a few hundred tokens; turn 46 sends 800K tokens. You pay for everything sent on each turn.
- Prompt caching expires after ~5 minutes: Anthropic bills cached conversation history at a steep discount when it is reused within the cache window (a cache read costs roughly 12.5× less than a cache write). But with /loop 30m, the 30-minute gap exceeds the 5-minute cache TTL, so each iteration pays the expensive write rate to re-cache the entire growing context from scratch.
- Output adds to context: Each loop iteration appends its output to the conversation, making the next re-cache even larger. By hour 20, the session hit ~800K tokens.
- Dashboard lag hides the damage: The Anthropic usage dashboard has a multi-day reporting delay. The only real-time signal was the limit notification email — by then the money was already spent.
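To make the caching point concrete, here is a minimal cost sketch in Python. Every number in it is an assumption chosen for illustration (the base input price, the 1.25× write and 0.1× read multipliers, and a constant 17,000 tokens added per turn), and the function name is hypothetical. It models one API call per loop iteration, so it does not attempt to reproduce the reported $6,000; a real iteration may make many tool calls, each resending the context.

```python
# Illustrative cost model. All constants are assumptions, not quoted pricing.
BASE_INPUT_PER_MTOK = 15.00   # assumed base input price, USD per million tokens
CACHE_WRITE_MULT = 1.25       # assumed multiplier for writing tokens to the cache
CACHE_READ_MULT = 0.10        # assumed multiplier for reading cached tokens


def session_input_cost(turns, tokens_added_per_turn, cache_hit):
    """Input cost of a session whose context grows by a fixed amount each turn."""
    total = 0.0
    context = 0  # tokens already in the conversation before this turn
    for _ in range(turns):
        new = tokens_added_per_turn
        if cache_hit:
            # Cache still warm: old context is read cheaply, only new tokens are written.
            billed = context * CACHE_READ_MULT + new * CACHE_WRITE_MULT
        else:
            # Cache expired: the whole context is written to the cache again.
            billed = (context + new) * CACHE_WRITE_MULT
        total += billed * BASE_INPUT_PER_MTOK / 1_000_000
        context += new
    return total


# 46 turns at 17,000 tokens each ends near 800K tokens of context.
miss = session_input_cost(46, 17_000, cache_hit=False)
hit = session_input_cost(46, 17_000, cache_hit=True)
print(f"cache expired every turn: ${miss:,.2f}")
print(f"cache warm every turn:    ${hit:,.2f}")
```

Under these assumptions the expired-cache session costs several times more than the warm-cache one, and the gap widens as the context grows.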
The user's key recommendations to avoid this:
- Add a stop condition: Instead of a bare /loop 30m check my PRs, write /loop 30m check my PRs, stop when all are merged or after 3 hours. Claude terminates the loop when the condition is met.
- Use Sonnet for unattended tasks: Opus is ~5× more expensive per output token. For polling tasks like PR checks, Sonnet is sufficient. Reserve Opus for sessions where you're present.
- Don't trust the dashboard: It lags by days. Rely on usage limit emails for real-time billing signals.
- Fresh sessions are cheaper: Long-lived sessions compound costs because every call with a gap >5 minutes pays to re-cache the full context. Starting a new session resets the context and avoids this.
- max_turns is not a loop limiter: It caps tool-call chains within a single iteration, not how many times the loop fires. The only built-in expiry on /loop is a 7-day auto-deletion.
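If you drive the polling from your own script instead, the same stop-condition idea can be enforced with hard caps. The sketch below is a generic Python guard, not how /loop works internally. The names run_bounded_loop and all_prs_merged are hypothetical, and the check function is a stand-in for whatever actually queries your PRs.

```python
import time


def run_bounded_loop(check, interval_s, max_iterations, max_runtime_s,
                     sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or a hard cap is hit.

    Returns (reason, iterations_run), where reason is one of
    "condition_met", "deadline", or "max_iterations".
    """
    start = clock()
    for iteration in range(1, max_iterations + 1):
        if check():
            return "condition_met", iteration
        # Stop if the next wait would push us past the runtime budget.
        if clock() - start + interval_s > max_runtime_s:
            return "deadline", iteration
        if iteration < max_iterations:
            sleep(interval_s)
    return "max_iterations", max_iterations


# Demo with a fake check that succeeds on the third poll and a no-op sleep.
calls = {"n": 0}


def all_prs_merged():
    calls["n"] += 1
    return calls["n"] >= 3


result = run_bounded_loop(all_prs_merged, interval_s=1800, max_iterations=6,
                          max_runtime_s=3 * 3600, sleep=lambda s: None)
print(result)
```

Passing sleep and clock as parameters keeps the guard testable without waiting 30 minutes per poll.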
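The loop runs in the main conversation, so if you keep the same session active, each loop execution reads and writes far more tokens than necessary. Because every iteration resends everything before it, cumulative cost grows roughly quadratically with the number of iterations rather than linearly.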
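A quick back-of-the-envelope comparison shows how fast a shared session adds up. This Python sketch assumes each iteration adds a constant 17,000 tokens and makes a single API call, and that a fresh session carries no earlier context. Both are simplifications, and the function name is hypothetical.

```python
def total_input_tokens(iterations, tokens_per_iteration, fresh_session):
    """Total input tokens sent across all iterations."""
    total = 0
    context = 0
    for _ in range(iterations):
        if fresh_session:
            # A new session sends only this iteration's tokens.
            total += tokens_per_iteration
        else:
            # A long-lived session resends the whole history plus the new tokens.
            context += tokens_per_iteration
            total += context
    return total


long_lived = total_input_tokens(46, 17_000, fresh_session=False)
fresh = total_input_tokens(46, 17_000, fresh_session=True)
print(f"long-lived session: {long_lived:,} input tokens")
print(f"fresh sessions:     {fresh:,} input tokens")
```

The long-lived total follows n(n+1)/2 times the per-iteration size, which is why the gap keeps widening the longer the loop runs.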
If you automate Claude with /loop, always set a stop condition, use a cheaper model, and monitor with external tools. The cache discount only helps when calls are frequent enough to stay within the TTL.
📖 Read the full source: r/ClaudeAI