Slack Rate Limit Changes Break OpenClaw Context Retrieval

A recent Slack API change has broken context retrieval for OpenClaw agents running in Slack workspaces. The change, which rolled out on March 3rd, imposes strict rate limits that most developers missed until their agents started malfunctioning.
The Problem
For non-Marketplace apps, Slack now limits conversations.history and conversations.replies to 1 request per minute, with at most 15 messages returned per request. Most OpenClaw agents are non-Marketplace apps, so:
- Agents that previously pulled 50-100 messages for context now only get 15
- That is a 70-85% reduction in the messages available as context per request
- The agent loses historical conversation context
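To make the cost concrete, here is a rough back-of-the-envelope sketch. It assumes only the two figures quoted above (15 messages per request, 1 request per minute); the function name `polling_cost` is made up for illustration and is not part of any Slack or OpenClaw API.

```python
import math

# Limits as quoted above for non-Marketplace apps (assumed, not queried from Slack).
MAX_MESSAGES_PER_REQUEST = 15
SECONDS_BETWEEN_REQUESTS = 60


def polling_cost(messages_wanted: int) -> tuple[int, int]:
    """Return (requests_needed, minimum_wait_seconds) to page through
    `messages_wanted` messages under the limits above."""
    requests_needed = math.ceil(messages_wanted / MAX_MESSAGES_PER_REQUEST)
    # The first request can go out immediately; every later one waits a full interval.
    wait_seconds = max(requests_needed - 1, 0) * SECONDS_BETWEEN_REQUESTS
    return requests_needed, wait_seconds


for wanted in (15, 50, 100):
    print(wanted, polling_cost(wanted))
# 15 (1, 0)
# 50 (4, 180)
# 100 (7, 360)
```

Under these assumptions, rebuilding a 100-message context by polling takes 7 requests and at least 6 minutes of waiting, for a single channel or thread.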
Symptoms
- Agent forgets what was discussed earlier in the day
- Thread replies lose coherence once a thread passes 15 messages
- Agent asks questions you already answered
- Intermittent latency spikes while the client waits out HTTP 429 responses and retries
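The latency spikes come from waiting out rate-limit responses. Slack answers a throttled request with HTTP 429 and a Retry-After header giving the wait in seconds. The sketch below shows that retry loop in isolation. The `send` callable and the `(status, headers, body)` tuple shape are assumptions made for illustration, standing in for whatever HTTP client the agent actually uses.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape: (HTTP status, headers, parsed JSON body).
Response = Tuple[int, Mapping[str, str], dict]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `send` until it returns something other than HTTP 429.

    `send` is a placeholder for one API request. `sleep` is injectable
    so the loop can be tested without actually waiting.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return body
        # Honor Retry-After; the 60-second fallback is an assumption based on
        # the 1-request-per-minute limit. Header lookup here is case-sensitive,
        # so adapt it if your HTTP client normalizes header names differently.
        delay = float(headers.get("Retry-After", 60))
        if attempt < max_attempts - 1:
            sleep(delay)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Each retry adds up to a minute before the agent can answer, which matches the stalls users see.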
Workarounds Attempted
- Caching messages locally: helped, but only once the first request had populated the cache
- Pre-fetching during idle time: works well, building up context over an hour
- Switching to the Events API: the real fix. Event delivery is not subject to the conversations.history throttle. Subscribe to message events and maintain your own message store.
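A minimal sketch of the "own message store" idea, assuming the agent already receives Slack `message` events as dicts, for example from an Events API handler. The table layout and the function names `open_store`, `record_message`, and `recent_context` are invented for this example. The event fields used (`channel`, `ts`, `thread_ts`, `user`, `text`) are the standard ones on Slack message events.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local message store."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               channel   TEXT NOT NULL,
               ts        TEXT NOT NULL,
               thread_ts TEXT,
               user      TEXT,
               text      TEXT,
               PRIMARY KEY (channel, ts)
           )"""
    )
    return conn


def record_message(conn: sqlite3.Connection, event: dict) -> None:
    """Store one plain `message` event; other types and subtypes are skipped."""
    if event.get("type") != "message" or "subtype" in event:
        return
    # INSERT OR REPLACE makes redelivered events harmless (same channel + ts).
    conn.execute(
        "INSERT OR REPLACE INTO messages VALUES (?, ?, ?, ?, ?)",
        (
            event["channel"],
            event["ts"],
            event.get("thread_ts"),
            event.get("user"),
            event.get("text", ""),
        ),
    )
    conn.commit()


def recent_context(conn: sqlite3.Connection, channel: str, limit: int = 100) -> list[str]:
    """Return up to `limit` most recent messages in a channel, oldest first."""
    rows = conn.execute(
        # Slack ts strings are fixed-width for current timestamps, so text
        # ordering matches chronological ordering.
        "SELECT user, text FROM messages WHERE channel = ? ORDER BY ts DESC LIMIT ?",
        (channel, limit),
    ).fetchall()
    return [f"{user}: {text}" for user, text in reversed(rows)]
```

With a store like this, building context is a local query, so the 15-message cap and the one-request-per-minute limit no longer sit on the agent's response path.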
Recommended Solution
The author switched to SlackClaw (slackclaw.ai) which:
- Uses the Events API by default
- Maintains a persistent message store
- Eliminates polling and rate limits
- Has no 15-message ceiling
- Uses a Marketplace-registered gateway, so rate limits don't apply to necessary API calls
Long-term Recommendation
For developers building their own fix: the Events API approach is the right long-term solution. Slack is clearly moving toward restricting polling-based access. Build around events and local state, not API calls.
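Keeping local state accurate also means applying edits and deletions, not just new messages. The sketch below is one way to do that with an in-memory dict, assuming the events arrive as dicts. `apply_event` and the stored record shape are hypothetical. The `message_changed` and `message_deleted` subtypes and their `message` and `deleted_ts` fields follow Slack's message event format.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # (channel, ts)


def apply_event(state: Dict[Key, dict], event: dict) -> None:
    """Apply one Slack message event to the local state in place."""
    if event.get("type") != "message":
        return
    channel = event.get("channel")
    subtype = event.get("subtype")
    if subtype is None:
        # A new, ordinary message.
        state[(channel, event["ts"])] = {
            "user": event.get("user"),
            "text": event.get("text", ""),
        }
    elif subtype == "message_changed":
        # The edited version is nested under "message".
        msg = event["message"]
        state[(channel, msg["ts"])] = {
            "user": msg.get("user"),
            "text": msg.get("text", ""),
        }
    elif subtype == "message_deleted":
        state.pop((channel, event["deleted_ts"]), None)
    # Other subtypes (joins, bot messages, and so on) are ignored in this sketch.
```

A production version would persist this state rather than hold it in memory, but the event-to-state mapping stays the same.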
Documentation Note
The conversations.history throttle was documented in GitHub issue #38112, but most people missed it.
📖 Read the full source: r/openclaw