RelayCode VS Code Extension Routes Claude Code Through Sovereign RDUs

OpenGPU has released RelayCode, a VS Code extension that acts as a local proxy for AI coding agents. The tool intercepts requests from Claude Code or GitHub Copilot and routes them through the OpenGPU Relay network to open-weight models running on sovereign infrastructure.
Key Details
The extension provides several specific features and performance characteristics:
- Infrastructure: Workloads are routed through Infercom's reconfigurable dataflow units (RDUs), described as dedicated sovereign compute with no US jurisdiction and GDPR compliance by design.
- Performance: Benchmarks show 250+ tokens per second on DeepSeek-R1 (671B) and 400+ tokens per second on MiniMax M2.5. Model switching is near-instant (milliseconds) due to the dataflow architecture.
- Context Management: The extension automatically manages CLAUDE_AUTOCOMPACT settings to keep agents within model context windows without crashing.
- Privacy: Code stays on the local machine; only inference requests hit the relay network with no data retention.
- Current Status: The team reports about 23 installs and is seeking feedback on relay latency from the community.
- Access: Promo credits are available for testing RDU speeds for free.
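The post does not say how RelayCode picks its compaction settings, so the following is only a minimal sketch of the general idea behind the context management point: compact the conversation before the prompt outgrows the target model's context window. Every name here (ContextBudget, should_compact, estimate_tokens), the 80% threshold, the window sizes, and the four-characters-per-token heuristic are illustrative assumptions, not RelayCode's or Claude Code's actual logic.

```python
from dataclasses import dataclass


@dataclass
class ContextBudget:
    """Hypothetical token budget for one target model (values are assumptions)."""
    context_window: int            # total tokens the model accepts
    reserved_for_output: int       # tokens kept free for the model's reply
    compact_at_percent: int = 80   # trigger compaction at this share of the input budget

    @property
    def usable_input_tokens(self) -> int:
        # Input budget is whatever remains after reserving room for the reply.
        return self.context_window - self.reserved_for_output

    def should_compact(self, prompt_tokens: int) -> bool:
        # Integer arithmetic avoids floating-point surprises at the threshold.
        return prompt_tokens * 100 >= self.compact_at_percent * self.usable_input_tokens


def estimate_tokens(text: str) -> int:
    """Rough heuristic: about four characters per token, rounded up.

    Real tokenizers differ by model; this is only a stand-in.
    """
    return (len(text) + 3) // 4


if __name__ == "__main__":
    # Example window sizes are placeholders, not figures from the post.
    budget = ContextBudget(context_window=128_000, reserved_for_output=8_000)
    history = "def refactor_me(): ..." * 1000
    print(budget.should_compact(estimate_tokens(history)))
```

A proxy that switches between models with different context windows would need one such budget per model, which may be why automatic management of this setting matters for the workflow described here.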
The tool is positioned as a way to reduce Anthropic API costs while keeping Claude CLI workflows intact, and is described as particularly useful for refactoring work.
📖 Read the full source: r/LocalLLaMA