OpenClaw LLM Timeout Fix for Cold Model Loading

Issue: Cold Model Timeouts at 60 Seconds
Users reported that cold-loaded local models in OpenClaw consistently failed after approximately 60 seconds, even when the general agent timeout was set much higher. The same failure also occurred with cloud models served via Ollama and, occasionally, with OpenAI Codex.
The typical failure pattern:
- Models work if already warm
- Cold models die at around 60 seconds
- Logs mention timeout / embedded failover / status: 408
- Fallback model takes over
Misleading Configurations
The source warns that several obvious configuration options are NOT the real fix and can send developers down the wrong path:
- agents.defaults.timeoutSeconds
- .zshrc exports
- LLM_REQUEST_TIMEOUT
- Blaming LM Studio / Ollama immediately
Root Cause
The issue stems from OpenClaw having a separate embedded-runner LLM idle timeout for the period before the model emits the first streamed token.
Source trace found in:
src/agents/pi-embedded-runner/run/llm-idle-timeout.ts
Default value:
DEFAULT_LLM_IDLE_TIMEOUT_MS = 60_000
The configuration path resolves from:
cfg?.agents?.defaults?.llm?.idleTimeoutSeconds
So the actual configuration parameter is:
agents.defaults.llm.idleTimeoutSeconds
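The resolution described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of llm-idle-timeout.ts: the interface name, the function name resolveLlmIdleTimeoutMs, and the handling of missing or invalid values are all assumptions. Only the config path and the 60_000 ms default come from the post.

```typescript
// Hypothetical config shape: only the path
// agents.defaults.llm.idleTimeoutSeconds is taken from the post.
interface OpenClawConfigSketch {
  agents?: {
    defaults?: {
      timeoutSeconds?: number;
      llm?: { idleTimeoutSeconds?: number };
    };
  };
}

// Default value quoted in the post.
const DEFAULT_LLM_IDLE_TIMEOUT_MS = 60_000;

// Hypothetical helper: reads the configured seconds and converts to ms.
// Falling back to the default for missing or non-positive values is an
// assumption, not confirmed behavior.
function resolveLlmIdleTimeoutMs(cfg?: OpenClawConfigSketch): number {
  const seconds = cfg?.agents?.defaults?.llm?.idleTimeoutSeconds;
  if (typeof seconds !== "number" || !Number.isFinite(seconds) || seconds <= 0) {
    return DEFAULT_LLM_IDLE_TIMEOUT_MS;
  }
  return seconds * 1000;
}
```

Under this sketch, a config that only sets agents.defaults.timeoutSeconds still resolves to the 60-second idle default, which is consistent with the symptom reported above.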
The Fix
After testing, the working configuration is:
{
  "agents": {
    "defaults": {
      "llm": {
        "idleTimeoutSeconds": 180
      }
    }
  }
}
Testing showed that a cold Gemma call that previously failed around 60 seconds survived past that threshold and eventually responded successfully without immediate failover.
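To see why raising only the general agent timeout does not help, here is a small model of the two limits. It is a simplified sketch, not OpenClaw code: the names simulateCall, TimeoutsMs, and CallOutcome are hypothetical, and it assumes the idle limit applies only until the first streamed token while the agent limit covers the whole call.

```typescript
type CallOutcome = "responded" | "idle-timeout" | "agent-timeout";

interface TimeoutsMs {
  idleMs: number;  // limit on the wait for the first streamed token
  agentMs: number; // limit on the whole call
}

// Hypothetical model: given when the first token arrives and when the
// response finishes, report which limit (if any) fires first.
function simulateCall(
  firstTokenAtMs: number,
  finishedAtMs: number,
  t: TimeoutsMs,
): CallOutcome {
  // The idle limit fires at idleMs only if no token has arrived by then.
  const idleFiresAt = firstTokenAtMs > t.idleMs ? t.idleMs : Infinity;
  // The agent limit fires at agentMs only if the call is still running.
  const agentFiresAt = finishedAtMs > t.agentMs ? t.agentMs : Infinity;

  if (idleFiresAt === Infinity && agentFiresAt === Infinity) {
    return "responded";
  }
  return idleFiresAt <= agentFiresAt ? "idle-timeout" : "agent-timeout";
}

// A cold model that needs 90 s to emit its first token (illustrative numbers).
const coldWithDefault = simulateCall(90_000, 100_000, { idleMs: 60_000, agentMs: 300_000 });
const coldWithFix = simulateCall(90_000, 100_000, { idleMs: 180_000, agentMs: 300_000 });
console.log(coldWithDefault, coldWithFix);
```

In this model, the call with the 60-second idle default is cut off even though the agent timeout is 300 seconds, while the same call with idleTimeoutSeconds at 180 completes.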
Recommended Permanent Configuration
{
  "agents": {
    "defaults": {
      "timeoutSeconds": 300,
      "llm": {
        "idleTimeoutSeconds": 300
      }
    }
  }
}
The 300-second recommendation reflects how unpredictable local model load times are: a false failover is more disruptive than waiting longer for a genuinely cold model.
📖 Read the full source: r/openclaw