Local voice-to-text transcription for OpenClaw using Parakeet TDT 0.6b v3

Local transcription setup for OpenClaw
A community developer has adapted NVIDIA's Parakeet TDT 0.6b v3 model for local voice-to-text transcription within OpenClaw. The model runs via ONNX inference on CPU, eliminating API costs and supporting 25 European languages.
Technical implementation
The solution uses a GitHub repository (groxaxo/parakeet-tdt-0.6b-v3-fastapi-openai) that provides a Docker container for CPU deployment. The container exposes an OpenAI-compatible API endpoint at http://127.0.0.1:5092/v1.
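To make "OpenAI-compatible" concrete: in the OpenAI API, transcription is a multipart/form-data POST to the /audio/transcriptions route under the base URL. The sketch below builds such a request with only the Python standard library. It assumes the container implements that standard route, and the model name and placeholder key are the ones used in the article's script. The helper names build_multipart and build_request are hypothetical, and nothing is sent over the network until the request is passed to urlopen.

```python
import io
import os
import urllib.request
import uuid

BASE_URL = "http://127.0.0.1:5092/v1"  # endpoint given in the article


def build_multipart(fields, file_name, file_bytes, boundary=None):
    """Encode text fields plus one file part as multipart/form-data.

    Returns a (body_bytes, content_type_header) tuple.
    """
    boundary = boundary or uuid.uuid4().hex
    buf = io.BytesIO()
    # One part per plain text field (model, response_format, ...).
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        buf.write(f"{value}\r\n".encode())
    # The audio itself goes in a part named "file".
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'.encode()
    )
    buf.write(b"Content-Type: application/octet-stream\r\n\r\n")
    buf.write(file_bytes)
    # Closing boundary marks the end of the body.
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"


def build_request(audio_path, audio_bytes):
    """Build (but do not send) the POST request for one audio file."""
    fields = {"model": "parakeet-tdt-0.6b-v3", "response_format": "text"}
    body, content_type = build_multipart(
        fields, os.path.basename(audio_path), audio_bytes
    )
    return urllib.request.Request(
        f"{BASE_URL}/audio/transcriptions",
        data=body,
        headers={
            "Content-Type": content_type,
            # Placeholder key, assumed not to be checked by the local server.
            "Authorization": "Bearer sk-no-key-required",
        },
        method="POST",
    )


# To send it (requires the container to be running):
#   with open(path, "rb") as f:
#       req = build_request(path, f.read())
#   print(urllib.request.urlopen(req, timeout=60).read().decode())
```

In practice the openai client library hides all of this; the raw form is mainly useful for debugging the container with other HTTP tools.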
Supported languages include: Bulgarian (bg), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), German (de), Greek (el), Hungarian (hu), Italian (it), Latvian (lv), Lithuanian (lt), Maltese (mt), Polish (pl), Portuguese (pt), Romanian (ro), Slovak (sk), Slovenian (sl), Spanish (es), Swedish (sv), Russian (ru), and Ukrainian (uk).
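For callers that want to check a language code before sending audio, the list above can be kept as a small lookup table. This is only a sketch: the names SUPPORTED_LANGUAGES and language_name are hypothetical, and the source does not say whether the server accepts or needs a language hint.

```python
# Language codes and names exactly as listed in the article.
SUPPORTED_LANGUAGES = {
    "bg": "Bulgarian", "hr": "Croatian", "cs": "Czech", "da": "Danish",
    "nl": "Dutch", "en": "English", "et": "Estonian", "fi": "Finnish",
    "fr": "French", "de": "German", "el": "Greek", "hu": "Hungarian",
    "it": "Italian", "lv": "Latvian", "lt": "Lithuanian", "mt": "Maltese",
    "pl": "Polish", "pt": "Portuguese", "ro": "Romanian", "sk": "Slovak",
    "sl": "Slovenian", "es": "Spanish", "sv": "Swedish", "ru": "Russian",
    "uk": "Ukrainian",
}


def language_name(code):
    """Return the language name for a two-letter code, or None if unsupported."""
    return SUPPORTED_LANGUAGES.get(code.strip().lower())
```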
Integration with OpenClaw
The developer provides a Python script for transcription:
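#!/home/openclaw/.local/share/pipx/venvs/openai/bin/python
"""Transcribe one audio file via the local Parakeet server."""
import sys

from openai import OpenAI

# Point the OpenAI client at the local server; the API key is a placeholder.
client = OpenAI(
    base_url="http://127.0.0.1:5092/v1",
    api_key="sk-no-key-required",
)

# Open the audio file given as the first argument and close it when done.
with open(sys.argv[1], "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="parakeet-tdt-0.6b-v3",
        file=audio_file,
        response_format="text",
    )

print(transcript)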
This script can be registered as a CLI transcription tool in OpenClaw's openclaw.json file:
"tools": {
  "media": {
    "audio": {
      "enabled": true,
      "models": [
        {
          "type": "cli",
          "command": "/home/openclaw/.local/bin/transcribe",
          "args": ["{{MediaPath}}"],
          "timeoutSeconds": 60
        }
      ]
    }
  }
}
Alternatively, OpenClaw can be configured to directly use the OpenAI-compatible API endpoint with the model name and dummy API key from the script.
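One plausible reading of the CLI entry in the configuration above: the {{MediaPath}} placeholder is replaced with the path of the incoming audio file, the command runs with a time limit of timeoutSeconds, and its standard output is taken as the transcript. The sketch below illustrates that reading. It is an assumption about the behavior, not OpenClaw's actual code, and the helper names build_argv and run_cli_model are hypothetical.

```python
import subprocess


def build_argv(entry, media_path):
    """Expand the {{MediaPath}} placeholder and return the full argument list."""
    args = [arg.replace("{{MediaPath}}", media_path) for arg in entry.get("args", [])]
    return [entry["command"], *args]


def run_cli_model(entry, media_path):
    """Run the configured command and return its stdout as the transcript.

    Raises subprocess.TimeoutExpired if the command exceeds timeoutSeconds,
    and subprocess.CalledProcessError if it exits with a non-zero status.
    """
    result = subprocess.run(
        build_argv(entry, media_path),
        capture_output=True,
        text=True,
        timeout=entry.get("timeoutSeconds", 60),
        check=True,
    )
    return result.stdout.strip()
```

Under this reading, the 60-second timeout bounds the whole transcription, so long recordings on a slow CPU may need a larger value.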
Deployment notes
The developer tested this on an ARM64 Ubuntu Linux VM on a Mac Mini with M4 Pro, noting it should run reasonably fast on any decent Intel-compatible CPU. The Docker container is built following the README instructions in the GitHub repository.
📖 Read the full source: r/openclaw