RelayPlane Open Source Proxy Shows 73% Cost Reduction with Claude Model Routing

✍️ OpenClawRadar📅 Published: April 7, 2026🔗 Source
RelayPlane Open Source Proxy Shows 73% Cost Reduction with Claude Model Routing
Ad

Open Source Proxy for Claude API Routing

RelayPlane is an open source, npm-native proxy that sits in front of the Anthropic API. The tool was built using Claude Code, which accelerated development. It's free to self-host and designed to handle routing between different Claude models based on prompt complexity.

Benchmark Results and Configuration

The benchmark used a mixed workload with 60% simple tasks and 40% complex tasks. Two scenarios were compared:

  • Direct (all Sonnet): p50 latency 1.55s, cost per 10 requests $0.0323
  • Via RelayPlane with routing: p50 latency 0.78s, cost per 10 requests $0.0086

This represents a 73.4% cost reduction. At 10,000 requests per day, this translates to approximately $712 in monthly savings.

Ad

Routing Configuration

The routing configuration is straightforward:

{
  "routing": {
    "complexity": {
      "enabled": true,
      "simple": "claude-haiku-4-5",
      "moderate": "claude-sonnet-4-6",
      "complex": "claude-opus-4-6"
    }
  }
}

The routing logic uses a complexity classifier that examines token count, code indicators, and analytical keywords. Response headers include x-relayplane-routed-model to verify which model actually processed the request.

Model Pricing and Routing Logic

The routing system directs prompts to appropriate models based on complexity:

  • Simple prompts → Haiku ($0.80 per million tokens)
  • Moderate prompts → Sonnet ($3 per million tokens)
  • Complex prompts → Opus ($15 per million tokens)

The author notes the classifier isn't perfect but is "good enough to capture most of the savings." The full benchmark methodology is available in a Gist linked in the source material.

📖 Read the full source: r/ClaudeAI

Ad

👀 See Also