OpenClaw Alexa Voice Proxy Enables Bidirectional Voice Interaction

✍️ OpenClawRadar · 📅 Published: March 2, 2026

openclaw-alexa-voice is a Node.js proxy that connects an Alexa Custom Skill to the OpenClaw gateway, so that spoken queries can use OpenClaw tools such as email, calendar, and finances. It implements a three-tier response architecture so that simple queries are answered immediately while slower, tool-backed queries are handled asynchronously.

Three-Tier Response System

The proxy categorizes responses into three paths based on complexity and processing time:

  • Fast path (<1s) – Handles simple queries such as time and date, plus lookups against custom APIs
  • Agent path (<12s) – Provides quick answers from AI memory
  • Deferred path (<2min) – Processes complex queries asynchronously, then plays back via Home Assistant TTS on any speaker
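
The tier selection above can be sketched roughly as follows. This is a minimal illustration, not the project's actual routing code: the keyword heuristics, the handler table, and the names chooseTier, withTimeout, and TIER_BUDGET_MS are all assumptions; only the three time budgets come from the list above.

```javascript
// Time budgets per tier, taken from the limits quoted in the article.
const TIER_BUDGET_MS = { fast: 1_000, agent: 12_000, deferred: 120_000 };

// Hypothetical fast-path handlers, matched by a regex over the utterance.
const fastHandlers = [
  { pattern: /\bwhat time\b/i, run: () => `It is ${new Date().toLocaleTimeString("en-US")}.` },
  { pattern: /\bwhat day\b/i, run: () => `Today is ${new Date().toDateString()}.` },
];

// Assumed heuristic: keywords suggesting the query needs tool access.
const TOOL_HINTS = /\b(email|inbox|calendar|search|stock|market|portfolio)\b/i;

// Pick a tier name for an utterance: "fast", "deferred", or "agent".
function chooseTier(utterance) {
  if (fastHandlers.some((h) => h.pattern.test(utterance))) return "fast";
  if (TOOL_HINTS.test(utterance)) return "deferred";
  return "agent";
}

// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A caller could wrap each backend call as withTimeout(call, TIER_BUDGET_MS[tier]), so that a query which overruns its budget fails fast instead of leaving Alexa waiting.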

How It Works

When a query requires tool access (email, web search, market data), Alexa responds with "Let me check" and closes the session. The proxy then sends the query to OpenClaw's main session with full tool access, waits up to 2 minutes, strips markdown formatting, and plays the response on any Echo or Sonos device via Home Assistant's Alexa Media Player integration.
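
Two pieces of this deferred flow are easy to illustrate: the immediate acknowledgement that ends the Alexa session, and the markdown stripping applied before playback. The sketch below assumes the standard Alexa Skills Kit JSON response shape; the function names buildAckResponse and stripMarkdown and the specific regex rules are hypothetical, not taken from the project.

```javascript
// Build the immediate reply that lets Alexa speak and then close the session.
function buildAckResponse(speech = "Let me check.") {
  return {
    version: "1.0",
    response: {
      outputSpeech: { type: "PlainText", text: speech },
      shouldEndSession: true, // the real answer arrives later via TTS
    },
  };
}

// Reduce markdown to plain text suitable for speech (a rough, lossy pass).
function stripMarkdown(text) {
  return text
    .replace(/```[\s\S]*?```/g, " ")            // drop fenced code blocks
    .replace(/`([^`]*)`/g, "$1")                // inline code -> its content
    .replace(/!?\[([^\]]*)\]\([^)]*\)/g, "$1")  // links and images -> label
    .replace(/^\s{0,3}#{1,6}\s+/gm, "")         // heading markers
    .replace(/^\s*[-*+]\s+/gm, "")              // bullet markers
    .replace(/(\*\*|__|\*|_|~~)/g, "")          // emphasis markers
    .replace(/\s+/g, " ")                       // collapse whitespace
    .trim();
}
```

Note that the emphasis rule also removes underscores inside identifiers, which is usually acceptable for spoken output but would not be for displayed text.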

Key Features

  • Voice PIN authentication with 1-hour sessions
  • Multi-speaker TTS routing to any Echo, Sonos, or speaker group
  • Extensible fast-response system for custom APIs
  • Telegram fallback if TTS fails
  • Alexa request signature validation
  • Rate limiting and audit logging
  • Binds to localhost only for security
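
As one example from this list, a voice PIN with 1-hour sessions could be modelled as below. This is a sketch under assumptions: the class name PinSessions, the in-memory Map, and the choice to compare SHA-256 digests in constant time are illustrative and not confirmed by the source; only the 1-hour lifetime comes from the feature list.

```javascript
const { createHash, timingSafeEqual } = require("node:crypto");

const SESSION_TTL_MS = 60 * 60 * 1000; // 1 hour, per the feature list

class PinSessions {
  // `now` is injectable so expiry can be tested without waiting.
  constructor(pin, now = () => Date.now()) {
    this.pinHash = createHash("sha256").update(String(pin)).digest();
    this.now = now;
    this.expiry = new Map(); // userId -> expiry timestamp in ms
  }

  // Open a session if the spoken PIN matches; returns whether it did.
  unlock(userId, spokenPin) {
    const hash = createHash("sha256").update(String(spokenPin)).digest();
    // Digests have equal length, so timingSafeEqual will not throw.
    if (!timingSafeEqual(hash, this.pinHash)) return false;
    this.expiry.set(userId, this.now() + SESSION_TTL_MS);
    return true;
  }

  // True while the user's session exists and has not expired.
  isUnlocked(userId) {
    const expiresAt = this.expiry.get(userId);
    if (expiresAt === undefined) return false;
    if (this.now() >= expiresAt) {
      this.expiry.delete(userId);
      return false;
    }
    return true;
  }
}
```

A handler for tool-backed intents would call isUnlocked(userId) first and prompt for the PIN when it returns false.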

Technical Stack

The implementation uses Node.js for the proxy, an Alexa Custom Skill for the voice interface, the OpenClaw gateway's WebSocket for communication, and Home Assistant for TTS playback. This lets developers add voice capabilities to their OpenClaw instances while keeping the proxy secured through localhost-only binding and PIN authentication.
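
To illustrate the Home Assistant leg of this stack, the sketch below builds a call to Home Assistant's REST service endpoint. The endpoint form (/api/services/<domain>/<service> with a bearer token) is Home Assistant's documented REST API; the use of the notify.alexa_media service with a "tts" data type is an assumption about how the Alexa Media Player integration is typically driven, and buildTtsRequest and speak are hypothetical names. The project itself may call Home Assistant differently.

```javascript
// Build (but do not send) a Home Assistant service call for spoken output.
function buildTtsRequest({ baseUrl, token, target, message }) {
  return {
    // Assumed service: notify.alexa_media from the Alexa Media Player integration.
    url: `${baseUrl.replace(/\/+$/, "")}/api/services/notify/alexa_media`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`, // long-lived access token
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ message, target: [target], data: { type: "tts" } }),
    },
  };
}

// Send the request using the global fetch available in Node.js 18+.
async function speak(opts) {
  const { url, init } = buildTtsRequest(opts);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Home Assistant returned ${res.status}`);
}
```

An error thrown by speak is a natural place to hook in a fallback channel, such as the Telegram fallback listed among the features.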

The project was inspired by Discussion #11154 and is available as open source for developers who want to add voice interaction to their OpenClaw setups. The three-tier system ensures responsive voice interactions while still allowing complex queries to leverage OpenClaw's full tool capabilities.

📖 Read the full source: r/openclaw
