Voxray-AI: Production Go Backend for Real-Time Voice Agent Pipelines

OpenClawRadar | Published: March 10, 2026

Production Voice Agent Pipeline in Go

Voxray-AI provides a complete streaming pipeline in Go: it accepts client audio over WebSocket or WebRTC, runs it through a speech-to-text (STT) → LLM → text-to-speech (TTS) chain, and returns the synthesized audio to the client. The system is designed for production-grade servers and high-concurrency voice workloads.

Transport Options

The system supports multiple transport mechanisms:

  • WebSocket at /ws with RTVI serializer (?rtvi=1) and Protobuf (?format=protobuf) support
  • WebRTC at /webrtc/offer with full SDP offer/answer, configurable STUN/TURN, and Opus encoding (requires CGO build)
  • Telephony runner transports: Twilio, Telnyx, Plivo, Exotel, LiveKit, Daily.co
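As a rough illustration of the WebSocket options listed above, the sketch below builds a client URL for the /ws endpoint. Only the path and the two query parameters (?rtvi=1 and ?format=protobuf) come from the article; the function name wsURL, the host values, and the idea of choosing exactly one serializer per connection are assumptions made for this example.

```go
package main

import (
	"fmt"
	"net/url"
)

// wsURL builds a WebSocket URL for the /ws endpoint.
// serializer is a hypothetical selector: "rtvi" adds ?rtvi=1,
// "protobuf" adds ?format=protobuf, and anything else adds no query.
func wsURL(host string, secure bool, serializer string) string {
	scheme := "ws"
	if secure {
		scheme = "wss"
	}
	q := url.Values{}
	switch serializer {
	case "rtvi":
		q.Set("rtvi", "1")
	case "protobuf":
		q.Set("format", "protobuf")
	}
	u := url.URL{Scheme: scheme, Host: host, Path: "/ws", RawQuery: q.Encode()}
	return u.String()
}

func main() {
	// Hosts below are placeholders, not real Voxray-AI deployments.
	fmt.Println(wsURL("localhost:8080", false, "rtvi"))
	fmt.Println(wsURL("voice.example.com", true, "protobuf"))
}
```

Whether the two query parameters can be combined on one connection is not stated in the source, so the sketch keeps them mutually exclusive.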

Pluggable Providers

All components are swappable via configuration:

  • STT providers: OpenAI, Groq, Sarvam, Google, AWS
  • LLM providers: OpenAI, Anthropic, Groq, others
  • TTS providers: OpenAI, Google, AWS Polly, Sarvam

Configuration Examples

Minimal configuration example:

{
  "transport": "both",
  "stt": { "provider": "groq", "model": "whisper-large-v3" },
  "llm": { "provider": "anthropic", "model": "claude-3-5-haiku" },
  "tts": { "provider": "google", "voice": "en-US-Neural2-F" }
}
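To show how a Go service might consume a file shaped like the minimal configuration above, here is a small sketch using encoding/json. The struct names (Config, Provider), the parseConfig helper, and the rule that every stage must name a provider are assumptions for illustration; they are not taken from the Voxray-AI code base.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Provider describes one pipeline stage. The field set is inferred
// from the sample JSON only; the real schema may contain more keys.
type Provider struct {
	Provider string `json:"provider"`
	Model    string `json:"model,omitempty"`
	Voice    string `json:"voice,omitempty"`
}

// Config is a hypothetical mirror of the minimal configuration.
type Config struct {
	Transport string   `json:"transport"`
	STT       Provider `json:"stt"`
	LLM       Provider `json:"llm"`
	TTS       Provider `json:"tts"`
}

// parseConfig decodes the JSON and applies one assumed validation rule:
// each of stt, llm, and tts must name a provider.
func parseConfig(data []byte) (Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, fmt.Errorf("parse config: %w", err)
	}
	stages := []struct {
		name string
		p    Provider
	}{{"stt", cfg.STT}, {"llm", cfg.LLM}, {"tts", cfg.TTS}}
	for _, s := range stages {
		if s.p.Provider == "" {
			return Config{}, fmt.Errorf("config: %s.provider is required", s.name)
		}
	}
	return cfg, nil
}

// sampleConfig reproduces the minimal configuration from the article.
const sampleConfig = `{
  "transport": "both",
  "stt": { "provider": "groq", "model": "whisper-large-v3" },
  "llm": { "provider": "anthropic", "model": "claude-3-5-haiku" },
  "tts": { "provider": "google", "voice": "en-US-Neural2-F" }
}`

func main() {
	cfg, err := parseConfig([]byte(sampleConfig))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("transport=%s stt=%s llm=%s tts=%s\n",
		cfg.Transport, cfg.STT.Provider, cfg.LLM.Provider, cfg.TTS.Provider)
}
```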

Turn-taking and voice activity detection configuration:

{
  "turn_detection": "silence",
  "vad_type": "silero",
  "vad_confidence": 0.7,
  "vad_start_secs_vad": 0.2,
  "vad_stop_secs": 0.8,
  "turn_max_duration_secs": 30,
  "user_idle_timeout_secs": 60
}
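The article does not explain how these values interact, so the following is only one plausible reading of silence-based turn detection: a turn starts once the VAD score stays at or above vad_confidence for the start window, and ends once it stays below that threshold for the stop window. TurnConfig, TurnDetector, replay, and the fixed 100 ms frame length are all hypothetical names and values chosen for this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// TurnConfig holds three of the settings from the JSON above.
// The mapping of keys to behavior is an assumption of this sketch.
type TurnConfig struct {
	Confidence float64       // vad_confidence
	StartAfter time.Duration // vad_start_secs_vad
	StopAfter  time.Duration // vad_stop_secs
}

// exampleTurnConfig returns the values shown in the article's example.
func exampleTurnConfig() TurnConfig {
	return TurnConfig{Confidence: 0.7, StartAfter: 200 * time.Millisecond, StopAfter: 800 * time.Millisecond}
}

// TurnDetector is a two-state machine: idle or speaking.
type TurnDetector struct {
	cfg      TurnConfig
	speaking bool
	pending  time.Duration // time spent contradicting the current state
}

// Feed consumes one VAD frame and returns "start", "end", or "".
func (d *TurnDetector) Feed(prob float64, frame time.Duration) string {
	isSpeech := prob >= d.cfg.Confidence
	if isSpeech == d.speaking {
		d.pending = 0 // frame agrees with the current state
		return ""
	}
	d.pending += frame
	if !d.speaking && d.pending >= d.cfg.StartAfter {
		d.speaking, d.pending = true, 0
		return "start"
	}
	if d.speaking && d.pending >= d.cfg.StopAfter {
		d.speaking, d.pending = false, 0
		return "end"
	}
	return ""
}

// replay feeds a sequence of VAD scores as 100 ms frames (an assumed
// frame size) and collects the emitted events.
func replay(cfg TurnConfig, probs []float64) []string {
	d := &TurnDetector{cfg: cfg}
	var events []string
	for _, p := range probs {
		if ev := d.Feed(p, 100*time.Millisecond); ev != "" {
			events = append(events, ev)
		}
	}
	return events
}

func main() {
	// Speech, a short pause that is ignored, then 800 ms of silence.
	probs := []float64{0.9, 0.9, 0.9, 0.1, 0.1, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1}
	fmt.Println(replay(exampleTurnConfig(), probs)) // prints [start end]
}
```

Durations are tracked as time.Duration rather than float seconds so that repeated 100 ms increments compare exactly against the thresholds.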

Observability & Storage

  • /metrics endpoint for Prometheus (request counts, latency histograms, active connection gauges)
  • Recording: Full session audio to S3 with configurable worker pool and format
  • Transcripts: Per-message storage to Postgres or MySQL with configurable table
  • /health and /ready endpoints with optional Redis session store check on /ready
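The split between /health and /ready can be sketched with the standard library alone. In this hypothetical version, /health always reports success while /ready optionally consults a dependency check, standing in for the Redis session store check mentioned above. The status codes (200 and 503), the function names, and the errStoreDown value are assumptions; the article does not specify what the real endpoints return.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errStoreDown is a placeholder error for a failed session store check.
var errStoreDown = errors.New("session store unreachable")

// readyStatus maps an optional dependency check to an HTTP status.
// A nil check means the optional check is disabled.
func readyStatus(check func() error) int {
	if check == nil {
		return http.StatusOK
	}
	if err := check(); err != nil {
		return http.StatusServiceUnavailable
	}
	return http.StatusOK
}

// newProbeMux wires the two probe endpoints onto a ServeMux.
func newProbeMux(storeCheck func() error) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK) // liveness: the process is up
	})
	mux.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(readyStatus(storeCheck)) // readiness: dependencies are reachable
	})
	return mux
}

func main() {
	fmt.Println(readyStatus(nil))
	fmt.Println(readyStatus(func() error { return errStoreDown }))
	_ = newProbeMux(nil) // pass this mux to http.ListenAndServe in a real server
}
```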

Security Features

  • server_api_key protects /ws, /webrtc/offer, /start, and /sessions/*; clients send the key in an Authorization: Bearer header or an X-API-Key header
  • CORS allowlist configuration
  • TLS cert/key configuration
  • 12-factor style: JSON config + environment variable overrides
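A gate like the server_api_key check above is commonly written as HTTP middleware. The sketch below is one possible shape, not the project's actual implementation: keyMatches and requireKey are hypothetical names, the 401 response is an assumption, and treating an empty server key as "gate disabled" is a guess at the behavior.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"strings"
)

// sameKey compares two strings without leaking timing information
// about where they differ.
func sameKey(a, b string) bool {
	return subtle.ConstantTimeCompare([]byte(a), []byte(b)) == 1
}

// keyMatches accepts the key from either header named in the article.
// Assumption: an empty serverKey means authentication is turned off.
func keyMatches(authz, xAPIKey, serverKey string) bool {
	if serverKey == "" {
		return true
	}
	if strings.HasPrefix(authz, "Bearer ") && sameKey(strings.TrimPrefix(authz, "Bearer "), serverKey) {
		return true
	}
	return xAPIKey != "" && sameKey(xAPIKey, serverKey)
}

// requireKey wraps a handler and rejects requests without a valid key.
func requireKey(serverKey string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !keyMatches(r.Header.Get("Authorization"), r.Header.Get("X-API-Key"), serverKey) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	// "s3cret" is a placeholder key for demonstration only.
	fmt.Println(keyMatches("Bearer s3cret", "", "s3cret"))
	fmt.Println(keyMatches("", "s3cret", "s3cret"))
	fmt.Println(keyMatches("Bearer wrong", "", "s3cret"))
	_ = requireKey("s3cret", http.NotFoundHandler())
}
```

In a server, requireKey would wrap the handlers for the protected routes while leaving /health, /ready, and /metrics open or separately restricted, depending on deployment policy.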

This type of backend is useful for developers building real-time voice applications that need to integrate multiple AI services with production-ready infrastructure.

📖 Read the full source: r/LocalLLaMA
