ClawRelay: macOS-native OpenAI-compatible LLM proxy with automatic failover

✍️ OpenClawRadar📅 Published: April 17, 2026🔗 Source

What ClawRelay does

ClawRelay is a native Swift application for macOS 15+ that runs an OpenAI-compatible HTTP server locally. You configure LLM providers in priority order (OpenAI, Groq, Nvidia NIMs, Ollama, or any service with a /v1/chat/completions endpoint). When a request comes in, it tries the first provider and automatically falls back to the next if there's a failure (rate limit, 5xx error, or timeout).

Setup and configuration

The app runs in the system tray with quick access and a full settings window. Provider API keys are stored in macOS Keychain. No Docker, Node.js, or config files are required.

To connect your tools:

Base URL: http://localhost:11434/v1
API Key: optional for local use, can be generated in-app for LAN or tunnel setups

Works with Cursor, Continue.dev, LM Studio, the Python openai library, and any tool that accepts a custom base URL.

openClaw integration

For openClaw users, one command wires it up:

bash <(curl -fsSL https://www.desertstack.dev/clawrelay/enable-provider.sh ) \
  --provider-id "clawrelay" \
  --base-url "http://localhost:11434/v1" \
  --api-key "claw_relay_key" \
  --api "openai-completions" \
  --model-id "clawrelay" \
  --model-name "ClawRelay"

Generate your key from the Servers tab in ClawRelay. Requires jq and the openclaw CLI.

Deployment options

Beyond localhost, you can bind ClawRelay to your LAN interface to reach it from any device on your network. You can also put Cloudflare Tunnel or ngrok in front to expose it to the internet. The same app and configuration work for all deployment scenarios.

Built-in features

Request logs included
System tray access
Full settings window
macOS Keychain storage for API keys
Native Swift implementation

📖 Read the full source: r/clawdbot

👀 See Also

Tools

OpenClaw Model Performance Review: Codex 5.3 Leads, GLM Models Disappoint

A developer tested multiple AI models with OpenClaw, finding Codex 5.3 performs best with 9/10 rating, while GLM 4.7 and GLM 5 scored 5/10 due to high token usage, slow responses, and inconsistent output.

Apr 17, 2026, 02:45 PM UTC

OpenClawRadar

Tools

Single-page chatbot interface for locally running Gemma 4 26B A4B

A developer built a single HTML page chatbot that connects to Gemma 4 26B A4B running locally with 32K context window at 50-65 tokens/second, sharded between a 7900 XT and 3060 Ti GPU. The interface includes full streaming, Markdown rendering, and parameter controls.

Apr 21, 2026, 10:15 AM UTC

OpenClawRadar

Tools

Skills Creator Tool for OpenClaw Helps Developers Package Workflows

A developer created a skill called skills-creator that guides users through creating quality skills for OpenClaw, addressing common pitfalls like vague descriptions and documentation-like instructions. It's available on ClawHub and provides a design-driven approach with description formulas, checklists, and complexity tiers.

Mar 13, 2026, 10:45 PM UTC

OpenClawRadar

Tools

TestThread: Open Source Testing Framework for AI Agents

TestThread is an open source testing framework for AI agents that runs tests against live endpoints, provides pass/fail results with AI diagnosis, and includes features like semantic matching, PII detection, and CI/CD integration.

Mar 24, 2026, 05:45 AM UTC

OpenClawRadar