Fix Ollama Cloud maxTokens: Real Cap is 16,384

PSA for anyone seeing unexpected EOF from agents on production turns: if your openclaw.json has cloud model entries like { "id": "deepseek-v4-pro:cloud", "maxTokens": 500000 }, that maxTokens value is never honored. Ollama cloud caps output at 16,384 tokens server-side regardless of your config. When an agent tries to emit more than that, the upstream closes the socket mid-stream and you see a transport error from ollama.com:443. OpenClaw treats that as a timeout-shaped failover, so it will try your fallback if one is configured, but if the fallback is also a :cloud model, it hits the same wall.
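To spot entries like this before they cause a failure, a small script can walk the config and flag them. This is a minimal sketch, not an OpenClaw tool. It assumes only that model entries are JSON objects carrying id and maxTokens keys somewhere in the file; the nesting in SAMPLE_CONFIG is made up for illustration, and the 16,384 constant is the cap reported in this post rather than an official value.

```python
import json
from typing import Any, Iterator

# Cap reported in this post; treat it as an assumption, not an official constant.
OLLAMA_CLOUD_OUTPUT_CAP = 16_384

# Hypothetical layout for illustration only; a real openclaw.json may nest
# its model entries differently.
SAMPLE_CONFIG = """
{
  "models": [
    {"id": "deepseek-v4-pro:cloud", "maxTokens": 500000},
    {"id": "kimi-k2.6:cloud", "maxTokens": 14000}
  ]
}
"""


def find_oversized_cloud_entries(
    node: Any, cap: int = OLLAMA_CLOUD_OUTPUT_CAP
) -> Iterator[dict]:
    """Yield every object whose id ends in ':cloud' and whose maxTokens exceeds cap."""
    if isinstance(node, dict):
        model_id = node.get("id")
        max_tokens = node.get("maxTokens")
        if (
            isinstance(model_id, str)
            and model_id.endswith(":cloud")
            and isinstance(max_tokens, int)
            and max_tokens > cap
        ):
            yield node
        # Keep descending: entries may sit at any depth in the file.
        for value in node.values():
            yield from find_oversized_cloud_entries(value, cap)
    elif isinstance(node, list):
        for item in node:
            yield from find_oversized_cloud_entries(item, cap)


if __name__ == "__main__":
    config = json.loads(SAMPLE_CONFIG)
    for entry in find_oversized_cloud_entries(config):
        print(f'{entry["id"]}: maxTokens {entry["maxTokens"]} exceeds {OLLAMA_CLOUD_OUTPUT_CAP}')
```

The walk is deliberately structure-agnostic, so it does not depend on where the model list lives in the file.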

What Helped

Lower maxTokens on cloud entries so OpenClaw doesn't request an output budget the service won't honor:
{ "id": "deepseek-v4-pro:cloud", "maxTokens": 14000 }
{ "id": "kimi-k2.6:cloud", "maxTokens": 14000 }
14k, not 16k: that leaves a little headroom, because models sometimes behave erratically right at the hard cap.
Restructure large structured outputs (long JSON, multi-section content) to emit one section per turn instead of batching everything. Each turn stays under the cap, and a failed turn only needs that one section retried.
Route heavy agents to a direct provider via a per-agent model override in agents.list[] instead of going through :cloud. Leave small-output agents on Ollama cloud. One-time setup:
openclaw onboard --auth-choice deepseek-api-key
Then in agents.list override the ones that need it:
"list": [ { "id": "your-agent", "model": "deepseek/deepseek-v4-pro" } ]
Trade-off: per-token billing instead of a flat fee, but only for the agents that need the headroom.
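The one-section-per-turn idea above can be sketched as a simple loop. This is an illustration under stated assumptions, not OpenClaw code: generate_section is a hypothetical callable standing in for whatever sends one turn to the model, and the sketch assumes a dropped stream surfaces as a Python ConnectionError. The 14,000 budget is the value suggested above.

```python
from typing import Callable, Dict, List

# Per-turn output budget suggested in this post (below the 16,384 cap).
SAFE_MAX_TOKENS = 14_000


def emit_by_section(
    section_names: List[str],
    generate_section: Callable[[str, int], str],  # hypothetical: (name, max_tokens) -> text
    max_tokens: int = SAFE_MAX_TOKENS,
    max_attempts: int = 2,
) -> Dict[str, str]:
    """Request each section in its own turn and retry only the section that failed."""
    results: Dict[str, str] = {}
    for name in section_names:
        last_error = None
        for _attempt in range(max_attempts):
            try:
                results[name] = generate_section(name, max_tokens)
                break  # this section is done; move on to the next one
            except ConnectionError as exc:
                # Assumption: a mid-stream socket close shows up as ConnectionError.
                last_error = exc
        else:
            # Every attempt for this section failed; earlier sections stay in results
            # only until the exception propagates, so callers may want to checkpoint.
            raise RuntimeError(
                f"section {name!r} failed after {max_attempts} attempts"
            ) from last_error
    return results
```

Because each section is its own request, a transport error costs one section's worth of tokens rather than the whole document.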

Takeaway

If your agents fail partway through long outputs and you've checked the obvious stuff, look at your provider's actual output cap before going down the OpenClaw-bug rabbit hole. The error message doesn't point at the cap, and the config field gives no hint that it's being overridden server-side.
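One rough way to check whether a failure fits this pattern is to compare how many tokens the turn had emitted when the stream died against the cap. The helper below is a hypothetical heuristic, not something OpenClaw or Ollama provides: it assumes you can recover an approximate emitted-token count from your own logs, and the 2% tolerance is an arbitrary starting point.

```python
# Cap reported in this post; an assumption, not an official constant.
OLLAMA_CLOUD_OUTPUT_CAP = 16_384


def failure_near_cap(
    tokens_emitted: int,
    cap: int = OLLAMA_CLOUD_OUTPUT_CAP,
    tolerance: float = 0.02,
) -> bool:
    """Return True when a stream died within `tolerance` of the output cap."""
    return tokens_emitted >= cap * (1 - tolerance)
```

A failure at a few thousand tokens points elsewhere; repeated failures clustered just under the cap are a strong hint that the provider limit is the cause.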

📖 Read the full source: r/openclaw

