OpenClaw v2026.3.13: Per-Agent Cache Retention Cuts Token Costs 90%

What changed in v2026.3.13

OpenClaw version 2026.3.13 added proper configuration validation for params.cacheRetention on per-agent entries. This allows you to set cache retention declaratively in your openclaw.json configuration file.

The problem with default cache behavior

OpenAI supports extended prompt cache retention (24 hours) via prompt_cache_retention: "24h" in their API, which keeps your prompt prefix cached for 24 hours instead of the default 5-10 minutes. Cached input tokens are billed at 50% off.

If you're running agents on heartbeat cycles longer than 10 minutes (which the source notes is "basically everyone"), your cache goes completely cold between every single turn. This means you're paying full price for the entire input context on every heartbeat.

The source describes a setup with 15 agents on GPT-5.2 with heartbeats every 60-90 minutes where every heartbeat was a guaranteed cold start. The system prompt, bootstrap context, HEARTBEAT.md, AGENTS.md, SOUL.md, tool definitions — all of it was re-sent at full price every cycle because the cache expired in the gap between heartbeats.

How to configure it

You can now set cache retention in your openclaw.json:

{
  "agents": {
    "list": [
      {
        "agentId": "my-agent",
        "params": {
          "cacheRetention": "long"
        }
      }
    ]
  }
}

The "long" value maps to OpenAI's prompt_cache_retention: "24h" through the pi-ai library.

Important caveat: runtime patch required

OpenClaw's resolveCacheRetention() function has a guard clause that blocks OpenAI providers by default. It only allows Anthropic and Bedrock through. So even with the config set, the value gets filtered out before it reaches the API.

You need the runtime patch from issue #27515 to make it work. The patch adds OpenAI to the allowed provider list in the guard clause. Without both the config AND the patch, nothing happens.

The source author notes they had the patch applied for weeks but never set the config value — meaning the patch was checking extraParams?.cacheRetention !== void 0, getting undefined, and still blocking OpenAI. The patch was doing nothing without the configuration.

Cost savings potential

With 15 agents doing heartbeats, each sending ~128K-170K input tokens per turn:

Without 24h cache: 100% of input tokens at full price, every turn. Cache dies in the ~60-90 min gap between heartbeats.
With 24h cache: The stable prefix (system prompt, agent config, tool definitions — typically 80-90% of input) stays cached across heartbeats. Those tokens are billed at half price.

On a system running 15 agents across a full business day, that's hundreds of heartbeat cycles per day where the bulk of input tokens shift from full price to half price. The input cost reduction compounds fast.

📖 Read the full source: r/openclaw