OpenClaw Failure Patterns: 42 Incidents in 28 Days

What This Is

A field guide from a developer who ran OpenClaw daily for 28 days and documented 42 real incidents in which the AI agent system broke. The source organizes the failures into eight categories, each with specific examples and lessons learned.

Key Failure Categories and Examples

1. AI Confidently Reports Things That Didn't Happen

Morning report hallucination: A cron job reported a "quiet night" when significant work had actually been done overnight. The AI checked nothing and made up a plausible-sounding summary.
Memory search vs. reality: Asked to enumerate its available tools, the AI searched its notes ABOUT tools instead of checking the actual tool definitions. It reported capabilities that didn't exist and ignored real ones.
The "I'll be sharper" non-fix: After making errors, the AI promised "I'll be sharper" without changing any actual mechanism. The same errors repeated.

Lesson: Any AI system that reports, summarizes, or monitors needs explicit verification steps. "Check the data" is not the same as "run this specific query and report the result." Vague instructions produce confident fiction.
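The difference between "check the data" and "run this specific query" can be sketched in a few lines of Python. This is a minimal illustration, not the developer's actual setup: the `Event` record, the `overnight_report` function, and the 12-hour window are all assumptions made for the example. The point is that the phrase "quiet night" can only be produced when a concrete query over the log returns zero rows.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    """One entry in a hypothetical activity log (commits, emails, job runs)."""
    timestamp: datetime
    summary: str


def overnight_report(events: list[Event], now: datetime, hours: int = 12) -> str:
    """Build the morning report from log entries, never from recollection."""
    cutoff = now - timedelta(hours=hours)
    # The "specific query": every event inside the reporting window.
    recent = sorted(
        (e for e in events if cutoff <= e.timestamp <= now),
        key=lambda e: e.timestamp,
    )
    if not recent:
        # "Quiet night" is only allowed when the query really returned nothing.
        return f"Quiet night: 0 events since {cutoff.isoformat()}."
    lines = [f"{len(recent)} event(s) since {cutoff.isoformat()}:"]
    lines += [f"- {e.timestamp.isoformat()} {e.summary}" for e in recent]
    return "\n".join(lines)
```

In a setup like this, the model would only be asked to phrase or prioritize the returned lines. The count and the time window come from code, so they cannot be invented.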

2. Authentication Dies Constantly

Google OAuth 7-day trap: OAuth app left in "testing" mode caused tokens to expire every 7 days. Email and calendar access died repeatedly for 14 days before a 15-minute fix (publishing the app to production).
Google suspended the AI's account: Google account made for the bot was flagged as bot-created and suspended, causing 24 hours of zero email access.
LinkedIn cookies rotate aggressively: li_at cookie expired at least 3 times in the first week, killing all LinkedIn automation until manual browser refresh.
Twitter env var name mismatch: Tool expected AUTH_TOKEN but system stored TWITTER_AUTH_TOKEN, causing silent failure with no error messages.
Kimi fallback model just died: Third-party model API returned 401 without warning, leaving system running with zero fallback for days.

Lesson: Every AI integration that touches external services will break regularly through authentication failures. Budget for it, monitor it, have fallbacks.
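Two of the incidents above (the AUTH_TOKEN name mismatch and the 7-day OAuth expiry) are the kind a preflight check can surface before a job runs. The sketch below is one possible shape for such a check, assuming the system tracks when each token was issued. The function names, the `CredentialError` class, and the one-day warning threshold are hypothetical and do not come from the source.

```python
import os
from datetime import datetime, timedelta, timezone


class CredentialError(RuntimeError):
    """Raised when a required credential is missing."""


def require_env(name: str, env=None) -> str:
    """Return the variable's value, or fail loudly instead of silently."""
    env = os.environ if env is None else env
    value = env.get(name, "")
    if not value:
        raise CredentialError(f"environment variable {name} is unset or empty")
    return value


def days_until_expiry(issued_at: datetime, lifetime_days: float, now: datetime) -> float:
    """Days remaining before a token issued at `issued_at` expires."""
    return (issued_at + timedelta(days=lifetime_days) - now) / timedelta(days=1)


def preflight(required, token_issued, lifetimes, now, env=None, warn_days=1.0):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for name in required:
        try:
            require_env(name, env)
        except CredentialError as exc:
            problems.append(str(exc))
    for service, issued in token_issued.items():
        remaining = days_until_expiry(issued, lifetimes[service], now)
        if remaining <= 0:
            problems.append(f"{service}: token expired")
        elif remaining <= warn_days:
            problems.append(f"{service}: token expires in {remaining:.1f} days")
    return problems
```

Running a check like this at the start of every cron job, and alerting when the list is non-empty, would turn a silent failure into a visible one. It does not help with suspended accounts or rotated cookies, which still need a live request to detect.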

3. The Smartest Model Makes the Dumbest Mistakes

Opus adding properties to files: Using Opus 4.6 for simple cron jobs caused it to "creatively" add unwanted metadata to files, creating orphan pages in the knowledge base.
AI content sounds like AI: Full content pipeline (scrape 743 posts, analyze patterns, generate drafts) produced posts that read like AI wrote them. Framework posts got 0 likes while personal posts written by hand got 6 likes and 2 comments in 2 hours.
Long-form rewrites sucked: Two AI-generated drafts of an article came back as generic summaries. The developer had to park the article.

Lesson: More expensive models are not always better. Use the cheapest model that gets the job done. Never let AI be the final voice for anything that needs to sound human.
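"Use the cheapest model that gets the job done" can be written down as a routing rule. The following is a rough sketch under invented assumptions: the tier names, relative costs, capability levels, and task names are placeholders for illustration, not real model names or real pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str             # placeholder tier name, not a real model ID
    relative_cost: float  # illustrative cost ratio, not real pricing
    capability: int       # higher means more capable


# Hypothetical tiers for the example.
TIERS = [
    ModelTier("small", 1.0, 1),
    ModelTier("medium", 3.0, 2),
    ModelTier("large", 15.0, 3),
]

# Hypothetical mapping from task type to the minimum capability it needs.
TASK_REQUIREMENTS = {
    "cron_file_update": 1,
    "summarize_inbox": 2,
    "multi_step_refactor": 3,
}


def pick_model(task: str) -> ModelTier:
    """Pick the cheapest tier whose capability meets the task's requirement."""
    needed = TASK_REQUIREMENTS.get(task)
    if needed is None:
        # Unknown tasks must be classified explicitly, not sent to the top tier.
        raise KeyError(f"no capability requirement registered for task {task!r}")
    candidates = [t for t in TIERS if t.capability >= needed]
    return min(candidates, key=lambda t: t.relative_cost)
```

Under these assumptions, `pick_model("cron_file_update")` returns the "small" tier, so a simple cron job never reaches the most capable model unless someone deliberately raises its requirement.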

4. Automation That Saves Time Costs Time

23 iterations for one infographic: HTML/CSS to Chrome headless to PNG consumed an entire day for one visual asset. "AI can generate images, but generate and generate what you actually want are separated by 22 revisions."
4 hours of cleanup per 1 hour "save": The source notes this pattern but doesn't provide the complete example.
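For reference, the HTML/CSS to Chrome headless to PNG step from the infographic incident can be scripted so that each revision is a single command. This is a sketch, not the developer's pipeline: the binary name `google-chrome`, the default dimensions, and both function names are assumptions, and the binary name in particular varies by platform.

```python
import subprocess
from pathlib import Path


def screenshot_command(html_path, png_path, width=1200, height=1600,
                       chrome="google-chrome"):
    """Build the headless Chrome command that renders an HTML file to a PNG."""
    return [
        chrome,  # assumed binary name; adjust for your platform
        "--headless",
        "--hide-scrollbars",
        f"--window-size={width},{height}",
        f"--screenshot={Path(png_path).resolve()}",
        Path(html_path).resolve().as_uri(),  # file:// URL of the page
    ]


def render(html_path, png_path, **kwargs):
    """Run the render once; raises CalledProcessError if Chrome exits non-zero."""
    subprocess.run(screenshot_command(html_path, png_path, **kwargs), check=True)
```

Scripting the render does not reduce the number of design revisions, but it keeps each of them down to one repeatable call.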

Additional Failure Categories Mentioned

The source mentions eight total categories but only details four in the provided text. The remaining categories are referenced but not elaborated.

Who This Is For

Developers building or using AI agent systems who want to understand real-world failure patterns and practical mitigation strategies.

📖 Read the full source: r/openclaw
