Two $0 OpenClaw setups using free cloud models or local Ollama

An OpenClaw user reports running an agent for free for three weeks, handling about 70% of tasks previously paid for with Claude. The setup offers two paths: free cloud models with rate limits or local models via Ollama with zero ongoing costs.
Path 1: Free cloud models (no hardware needed)
This approach requires only an existing OpenClaw installation and free API tiers:
- OpenRouter free tier: Sign up at openrouter.ai with no credit card. Offers 30+ free models including Llama 3.3 70B, Nemotron Ultra 253B (262K context), MiniMax M2.5, and Devstral. Configuration example:
    {
      "env": { "OPENROUTER_API_KEY": "sk-or-..." },
      "agents": {
        "defaults": {
          "model": {
            "primary": "openrouter/nvidia/nemotron-ultra-253b:free"
          }
        }
      }
    }
For automatic model selection: "primary": "openrouter/openrouter/free"
- Gemini free tier: Google provides 15 requests per minute on Gemini Flash for free. Get an API key from ai.google.dev and run openclaw onboard, selecting Google as the built-in provider.
- Groq: Fast, with a rate-limited free tier suitable for basic agent tasks.
The catch: rate limits. For light to moderate daily use (10-20 interactions), the pauses are barely noticeable. At 100+ tasks per day, the limits make this path impractical.
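Staying under a per-minute quota, such as the 15 requests per minute cited for Gemini Flash, comes down to spacing calls out on the client side. A minimal Python sketch, assuming a fixed requests-per-minute budget; the class name MinIntervalThrottle and its parameters are illustrative and not part of OpenClaw or any provider SDK:

```python
import time


class MinIntervalThrottle:
    """Spaces calls so that at most `requests_per_minute` start per minute."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        if requests_per_minute <= 0:
            raise ValueError("requests_per_minute must be positive")
        # Minimum gap between two consecutive calls, in seconds.
        self.min_interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay
```

Calling wait() before each request keeps a 15 requests-per-minute budget at one call every 4 seconds. Any daily cap a provider enforces is not covered by this sketch.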
Path 2: Local models via Ollama (truly $0, forever)
Ollama became an official OpenClaw provider in March 2026. This setup needs no real API keys or accounts, has no rate limits, and keeps all data on your machine.
Setup steps:
- Install Ollama:
    curl -fsSL https://ollama.com/install.sh | sh
- Pull a model based on your VRAM:
  - 20GB+ VRAM (RTX 3090, 4090, M4 Pro/Max):
      ollama pull qwen3.5:27b
  - 16GB VRAM:
      ollama pull qwen3.5:35b-a3b
  - 8GB VRAM (most laptops):
      ollama pull qwen3.5:9b
- Run openclaw onboard and select Ollama, or use manual setup with:
    export OLLAMA_API_KEY="ollama-local"
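The VRAM tiers in the steps above can be expressed as a small lookup. This is a sketch only: the function names are hypothetical, and treating each tier as a lower bound (so that a 12GB card falls into the 8GB tier) is an assumption the source does not state.

```python
def pick_qwen_model(vram_gb):
    """Return the Ollama tag suggested for a given amount of VRAM, or None."""
    # Tiers taken from the setup steps; reading them as lower bounds is an assumption.
    tiers = [
        (20, "qwen3.5:27b"),
        (16, "qwen3.5:35b-a3b"),
        (8, "qwen3.5:9b"),
    ]
    for min_vram, tag in tiers:
        if vram_gb >= min_vram:
            return tag
    return None  # below 8GB the source gives no recommendation


def pull_command(vram_gb):
    """Build the `ollama pull` command for the suggested model, if any."""
    tag = pick_qwen_model(vram_gb)
    return f"ollama pull {tag}" if tag else None
```

For example, pull_command(24) yields the same command listed for the 20GB+ tier, while a machine with less than 8GB gets None rather than a guess.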
Qwen3.5 27B is described as the current sweet spot for OpenClaw, handling tool calling well for daily agent tasks. The 35b-a3b mixture-of-experts variant reportedly runs at 112 tokens/second on an RTX 3090 by activating only 3B parameters per token.
Manual configuration example:
    {
      "models": {
        "providers": {
          "ollama": {
            "baseUrl": "http://localhost:11434",
            "apiKey": "ollama-local",
            "api": "ollama",
            "models": [
              {
                "id": "qwen3.5:27b",
                "name": "Qwen3.5 27B",
                "reasoning": false,
                "contextWindow": 131072,
                "maxTokens": 8192
              }
            ]
          }
        }
      },
      "agents": {
        "defaults": {
          "model": {
            "primary": "ollama/qwen3.5:27b"
          }
        }
      }
    }
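A configuration of this shape can be sanity-checked before use. The Python sketch below is illustrative and not an OpenClaw feature: check_ollama_config is a hypothetical helper that takes the parsed JSON as a dict and flags a baseUrl ending in the OpenAI-compatible /v1 path, a model entry whose "reasoning" is not false, and a primary model that is not declared under the provider. The assumption that the primary reference is "ollama/" followed by a declared model id is inferred from the example alone.

```python
def check_ollama_config(config):
    """Return a list of problems found in an OpenClaw-style config dict."""
    provider = config.get("models", {}).get("providers", {}).get("ollama")
    if provider is None:
        return ["no 'ollama' provider configured"]

    problems = []

    # The native Ollama URL has no /v1 suffix; /v1 is the OpenAI-compatible path.
    base_url = provider.get("baseUrl", "").rstrip("/")
    if base_url.endswith("/v1"):
        problems.append("baseUrl uses the OpenAI-compatible /v1 path; use the native URL")

    declared = set()
    for model in provider.get("models", []):
        model_id = model.get("id")
        declared.add(model_id)
        if model.get("reasoning") is not False:
            problems.append(f"model {model_id!r} should set reasoning to false")

    # Assumed convention: the primary model is written as "ollama/<model id>".
    primary = config.get("agents", {}).get("defaults", {}).get("model", {}).get("primary", "")
    prefix = "ollama/"
    if primary.startswith(prefix) and primary[len(prefix):] not in declared:
        problems.append(f"primary model {primary!r} is not declared under the provider")

    return problems
```

An empty list means none of the three checks fired; it does not prove the config is valid for OpenClaw, only that these specific pitfalls are absent.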
Important debugging notes:
- Use the native Ollama API URL (http://localhost:11434), NOT the OpenAI-compatible one (http://localhost:11434/v1). The /v1 path breaks tool calling, causing raw JSON output as plain text.
- Set "reasoning": false in the model configuration.
📖 Read the full source: r/clawdbot