Building a Local Voice-to-Text macOS App with Claude Code: Vext Case Study

✍️ OpenClawRadar | 📅 Published: April 29, 2026

A developer shared their experience building Vext, a native macOS voice-to-text app that runs entirely on-device using Whisper on the Apple Neural Engine, with no cloud, no accounts, and no subscription. The app pairs a Rust core with a Swift/SwiftUI interface and uses Core ML for inference. Claude Code served as the primary coding partner.

Key Features

  • Hold a hotkey anywhere → speak → release → text appears at cursor
  • Transcribes 60 seconds of audio in about 400 ms (roughly 150x real time)
  • Smart cleanup: removes filler words, restructures speech for readability
  • Real-time translation to 99+ languages
  • Meeting transcription with speaker diarization + auto-summaries
  • Screen recording during voice recordings (auto-attaches screenshots)
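
The "150x real-time" figure in the list above is the ratio of audio length to processing time. A minimal sketch of that arithmetic follows; the function name `real_time_factor` is illustrative and is not taken from Vext's code.

```rust
use std::time::Duration;

/// Real-time factor: seconds of audio transcribed per second of compute.
/// Hypothetical helper for illustration, not part of Vext itself.
fn real_time_factor(audio: Duration, processing: Duration) -> f64 {
    audio.as_secs_f64() / processing.as_secs_f64()
}
```

Called with 60 seconds of audio and 400 ms of processing time, this yields approximately 150, which matches the number quoted in the post.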

Claude Code Wins

  • Whisper on Apple Silicon: Helped iterate through quantization strategies, model chunking, and memory layout for Core ML conversion to run efficiently on the Neural Engine.
  • Hotkey system architecture: Suggested using a CGEventTap with proper accessibility permissions, and helped debug race conditions between recording start/stop and clipboard injection.
  • Rust ↔ Swift FFI: Generated FFI bindings and caught several memory safety issues in the C interface layer.
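
The post does not show Vext's FFI layer, so the following is only a sketch of one common pattern for a Rust core called from Swift through a C interface. The names `vext_trim_text` and `vext_string_free` are hypothetical. The pattern guards against null pointers and invalid UTF-8 at the boundary, and pairs every Rust-allocated string with a Rust-side free function so the caller never releases it with a different allocator.

```rust
use std::ffi::{c_char, CStr, CString};
use std::ptr;

// In a real library build, both functions would also carry `#[no_mangle]`
// (spelled `#[unsafe(no_mangle)]` on the 2024 edition) so that Swift can
// link them by name. It is left out here to keep the sketch edition-neutral.

/// Hypothetical entry point: takes a NUL-terminated UTF-8 string and returns
/// a newly allocated, whitespace-trimmed copy, or null on any failure.
/// The caller must release the result with `vext_string_free`.
pub unsafe extern "C" fn vext_trim_text(input: *const c_char) -> *mut c_char {
    // Never dereference a null pointer handed over from the other side.
    if input.is_null() {
        return ptr::null_mut();
    }
    // Borrow the caller's bytes; reject anything that is not valid UTF-8.
    let text = match unsafe { CStr::from_ptr(input) }.to_str() {
        Ok(s) => s,
        Err(_) => return ptr::null_mut(),
    };
    // Transfer ownership of a fresh allocation to the caller.
    match CString::new(text.trim()) {
        Ok(owned) => owned.into_raw(),
        Err(_) => ptr::null_mut(),
    }
}

/// Releases a string previously returned by `vext_trim_text`.
/// Passing null is a no-op; passing any other pointer is undefined behavior.
pub unsafe extern "C" fn vext_string_free(s: *mut c_char) {
    if !s.is_null() {
        // Reconstitute the CString so Rust's allocator frees it.
        drop(unsafe { CString::from_raw(s) });
    }
}
```

On the Swift side, the assumed usage would be to copy the returned C string into a Swift `String` and then immediately call the free function. Forgetting that call leaks memory, and calling it twice is a double free. Both are typical of the memory safety issues that show up in a C interface layer.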

Claude Code Limitations

  • Struggled with macOS-specific API nuances that are poorly documented online. CGEventTap edge cases required reading Apple's headers directly.
  • Context window became a bottleneck across the full Rust + Swift codebase; the developer split the project into modules and worked on one at a time.

Pricing

Vext is free to download and try at getvext.app. Keeping it costs a one-time $49 with no subscription, and the code VEXT50 takes 50% off.

📖 Read the full source: r/ClaudeAI

