RelayCode VS Code Extension Routes Claude Code Through Sovereign RDUs

✍️ OpenClawRadar · 📅 Published: March 27, 2026

OpenGPU has released RelayCode, a VS Code extension that acts as a local proxy for AI coding agents. The tool intercepts requests from Claude Code or GitHub Copilot and routes them through the OpenGPU Relay network to open-weight models running on sovereign infrastructure.
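
To make the routing idea concrete, here is a minimal sketch of what "intercept and reroute" could look like inside a local proxy. It is not RelayCode's actual code: the relay URL, the `MODEL_MAP` table, the `OutboundRequest` type, and the `reroute` function are all hypothetical names chosen for illustration, and the article does not describe the real request format.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit, urlunsplit

# Hypothetical relay endpoint; the article does not give the real URL.
RELAY_BASE = "https://relay.example.invalid"

# Hypothetical mapping from the model name an agent requests to an
# open-weight model served by the relay. The model names on the right
# appear in the article; the mapping itself is an assumption.
MODEL_MAP = {
    "claude-default": "DeepSeek-R1",
}


@dataclass(frozen=True)
class OutboundRequest:
    """A simplified view of one request leaving the coding agent."""
    url: str
    body: dict


def reroute(request: OutboundRequest,
            fallback_model: str = "MiniMax M2.5") -> OutboundRequest:
    """Return a copy of the request aimed at the relay instead of the original host."""
    original = urlsplit(request.url)
    relay = urlsplit(RELAY_BASE)
    # Keep the original path and query string; swap only scheme and host.
    new_url = urlunsplit(
        (relay.scheme, relay.netloc, original.path, original.query, "")
    )
    requested = request.body.get("model", "")
    # Replace the requested model with a relay-hosted one, leaving
    # every other field of the body untouched.
    new_body = {**request.body, "model": MODEL_MAP.get(requested, fallback_model)}
    return OutboundRequest(url=new_url, body=new_body)
```

The sketch rewrites only the destination and the model name, so everything else in the request passes through unchanged. A real proxy would also need to handle authentication and streaming responses, which are left out here.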

Key Details

The extension provides several specific features and performance characteristics:

  • Infrastructure: Workloads are routed through Infercom's reconfigurable dataflow units (RDUs), described as dedicated sovereign compute that sits outside US jurisdiction and is GDPR-compliant by design.
  • Performance: Reported benchmarks show 250+ tokens per second on DeepSeek-R1 (671B) and 400+ tokens per second on MiniMax M2.5. Model switching reportedly takes only milliseconds, which the team attributes to the dataflow architecture.
  • Context Management: The extension automatically manages CLAUDE_AUTOCOMPACT settings to keep agents within model context windows without crashing.
  • Privacy: The codebase stays on the local machine; only inference requests are sent to the relay network, which is said to retain no data.
  • Current Status: The team reports about 23 installs and is seeking feedback on relay latency from the community.
  • Access: Promo credits are available for testing RDU speeds for free.
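
The context-management point in the list above comes down to simple arithmetic: compact the conversation before it outgrows the model's window. The following is a rough sketch under stated assumptions, not the extension's logic. The function names, the reserved-output budget, and the safety margin are hypothetical, and the article does not say which values RelayCode actually writes into the CLAUDE_AUTOCOMPACT settings.

```python
def compaction_threshold(context_window: int,
                         reserved_output: int,
                         safety_margin_pct: int = 10) -> int:
    """Token count at which a conversation should be compacted.

    All parameters are illustrative assumptions:
    context_window     total tokens the target model accepts
    reserved_output    tokens held back for the model's reply
    safety_margin_pct  extra headroom, as a whole-number percentage
    """
    if context_window <= 0 or reserved_output < 0:
        raise ValueError("token counts must be positive")
    if not 0 <= safety_margin_pct < 100:
        raise ValueError("safety_margin_pct must be in [0, 100)")
    usable = context_window - reserved_output
    if usable <= 0:
        raise ValueError("reserved_output leaves no room for input")
    # Integer arithmetic avoids floating-point rounding surprises.
    return usable * (100 - safety_margin_pct) // 100


def should_compact(used_tokens: int,
                   context_window: int,
                   reserved_output: int,
                   safety_margin_pct: int = 10) -> bool:
    """True once the conversation has reached the compaction threshold."""
    threshold = compaction_threshold(
        context_window, reserved_output, safety_margin_pct
    )
    return used_tokens >= threshold
```

For example, with a hypothetical 100,000-token window and 10,000 tokens reserved for output, a 10% margin puts the threshold at 81,000 tokens. Switching to a model with a smaller window lowers the threshold in proportion, which is presumably why the setting has to be managed per model.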

The tool is positioned as a way to reduce Anthropic API costs while keeping existing Claude CLI workflows, and it is pitched as particularly useful for refactoring work.

📖 Read the full source: r/LocalLLaMA
