💡 Tips
Quick tips and tricks to boost productivity

Fix Ollama Cloud Model maxTokens: Cap is 16K, Not Config Value
Ollama cloud caps output at 16,384 tokens regardless of the maxTokens config value. Set maxTokens to 14,000 to avoid EOF errors; for longer outputs, restructure them or route to a direct provider.
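As a rough illustration of the tip above, a client-side guard could clamp any requested value to the suggested headroom before a request is sent. The function names here are hypothetical and not part of Ollama or any client library; only the two numbers come from the tip.

```python
# Values taken from the tip above; everything else is an illustrative assumption.
OLLAMA_CLOUD_OUTPUT_CAP = 16_384  # hard output cap reported for Ollama cloud
SAFE_MAX_TOKENS = 14_000          # suggested setting, leaves headroom under the cap


def effective_max_tokens(requested: int, safe_limit: int = SAFE_MAX_TOKENS) -> int:
    """Clamp a requested maxTokens value to the safe limit (hypothetical helper)."""
    if requested <= 0:
        raise ValueError("requested must be a positive token count")
    return min(requested, safe_limit)


def needs_restructuring(estimated_output_tokens: int,
                        safe_limit: int = SAFE_MAX_TOKENS) -> bool:
    """True when the expected output will not fit and should be split or rerouted."""
    return estimated_output_tokens > safe_limit
```

When `needs_restructuring` returns True, the options named in the tip apply: split the output into smaller pieces or send the request to a direct provider.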

Most People Use Claude at 5% of Its Capacity – Here's How to Fix It
After 60+ hours testing prompts on Claude Opus 4.7, a user shares a 5-step recipe: assign role, load specific context, set constraints, define output format, add forcing function.
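One way to make the five steps above repeatable is a small template that keeps them in a fixed order. This is only a sketch: the class name, field names, and section labels are assumptions for illustration, not the original author's wording.

```python
from dataclasses import dataclass


@dataclass
class PromptRecipe:
    """Holds the five parts named in the tip (field names are illustrative)."""
    role: str
    context: str
    constraints: list[str]
    output_format: str
    forcing_function: str

    def render(self) -> str:
        # One bullet per constraint so each stays easy to scan.
        constraint_lines = "\n".join(f"- {c}" for c in self.constraints)
        sections = [
            ("Role", self.role),
            ("Context", self.context),
            ("Constraints", constraint_lines),
            ("Output format", self.output_format),
            ("Before answering", self.forcing_function),
        ]
        # Sections are joined in the same order as the five steps.
        return "\n\n".join(f"{title}:\n{body}" for title, body in sections)
```

Calling `render()` returns a single string that can be pasted or sent as the prompt.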

OpenClaw Dashboard Disconnecting After 2026.5.27 Update? Fix: Remove Stuck Update Launchd Job
After the 2026.5.27 update, a stuck update launchd job causes dashboard WebSocket disconnects and Telegram failures. Removing the job restores stability.

Claude CLI v2.1.154 Breaks Local vLLM — One-Line Patch Fixes It
Claude CLI ≥2.1.154 adds three new API roles (ctx, msg, system) that break local vLLM compatibility. A one-line patch to vLLM's Anthropic protocol restores it.

Worker Agents Shouldn't Write Memory Directly: A Curator-Agent Pattern
A Reddit post details a Memory Curator pattern that prevents worker agents from writing directly to shared memory, routing events through a validation and scoping layer.

Don't Just Paste the AI — Write Your Own Take
A direct plea to developers: stop copying AI chatbot answers verbatim. Use AI as a drafting partner, then rewrite the reply in your own words.

How to Stop Hitting Claude Limits: Treat Each Session Like a Token Budget
A user shares how they stopped hitting daily Claude limits by cutting message bloat: scope the task, load only relevant context, and clear after each session. Includes a practical workflow and an infographic.

100K Lines of Rust with AI: Contracts, Spec-Driven Dev, and Performance
Cheng Huang built a Rust multi-Paxos engine with AI agents, achieving 300K ops/sec. Key techniques: AI-written code contracts, lightweight spec-driven development, and aggressive optimization.

Good AI-Assisted Development Happens at the Systems Level, Not the Task Level
A Reddit user explains how shifting from fixing AI agent output to designing constraints—like a linter rule that forces UI navigation—prevents entire classes of bugs permanently.

3 weeks of OpenClaw: token costs, loops, and compaction — lessons from the trenches
After burning tokens on heartbeat checks with Opus, fighting agent loops, and losing context to compaction, a Reddit user shares the hard-won fixes: use cheaper models for trivial tasks, write anti-loop rules, and save decision logs.

Routing Agent Subtasks to Cheaper Models Dropped Cost from $18 to $4 on Same Refactor
A developer cut agent run costs from $18 to $4 by routing routine subtasks (lint, rename, config edits) to cheap models like DeepSeek V4 Pro and Tencent Hunyuan Hy3, reserving Opus 4.7 for complex reasoning.
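A minimal sketch of the routing idea above, assuming each subtask carries a simple kind label. The kind labels and the placeholder model identifiers are hypothetical; substitute whatever names your provider actually uses.

```python
# Subtask kinds treated as routine, following the examples in the summary above.
ROUTINE_KINDS = {"lint", "rename", "config_edit"}


def pick_model(kind: str,
               cheap: str = "cheap-model",      # placeholder identifier
               strong: str = "strong-model"     # placeholder identifier
               ) -> str:
    """Send routine subtasks to the cheap model and everything else to the strong one."""
    return cheap if kind in ROUTINE_KINDS else strong


def plan_routes(subtask_kinds: list[str]) -> dict[str, list[str]]:
    """Group subtask kinds by the model chosen for them, preserving input order."""
    routes: dict[str, list[str]] = {}
    for kind in subtask_kinds:
        routes.setdefault(pick_model(kind), []).append(kind)
    return routes
```

Defaulting unknown kinds to the strong model is a deliberate choice in this sketch: a misrouted complex task is usually costlier than an overpaid trivial one.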

Treating Agent Runs as Review Packets: A Practical Pattern for Claude Code & Codex
A developer shares how producing a structured folder per agent run (research, drafts, evals, approval packet, metrics, memory) makes failures visible and iterations faster.
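The folder-per-run idea above could be scaffolded with a few lines of standard-library code. The section names follow the summary; the directory names, function name, and run-ID format are assumptions for illustration.

```python
from pathlib import Path

# One subfolder per section listed in the summary above.
RUN_SECTIONS = ("research", "drafts", "evals", "approval_packet", "metrics", "memory")


def scaffold_run(base_dir: Path, run_id: str) -> Path:
    """Create an empty review-packet folder for one agent run and return its path."""
    run_dir = Path(base_dir) / run_id
    for section in RUN_SECTIONS:
        # exist_ok keeps the call idempotent if a run is re-scaffolded.
        (run_dir / section).mkdir(parents=True, exist_ok=True)
    return run_dir
```

A run that leaves a section empty is then visible at a glance when the folder is reviewed.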