Pair Programmer Plugin Adds Live Screen & Audio to Claude Code

A developer has released Pair Programmer, a plugin that addresses Claude Code's lack of real-time context by providing live desktop perception. The tool captures three data streams: screen content (with visual indexing generating short scene descriptions), microphone input (transcription plus lightweight intent classification for questions, explanations, or commands), and system audio (indexing meetings, tutorials, or other audio playing on the machine).
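To make the microphone stream's "lightweight intent classification" concrete, here is a minimal sketch that sorts a transcript into the three categories named above. The source does not say how the plugin classifies intent; the keyword rules, the word lists, and the function name `classify_intent` are all assumptions chosen purely for illustration.

```python
# Hypothetical rule-based stand-in; the plugin's real classifier is not described.
QUESTION_WORDS = {"what", "why", "how", "when", "where", "which", "who",
                  "can", "could", "does", "is"}
COMMAND_VERBS = {"run", "open", "fix", "add", "remove", "rename",
                 "refactor", "show", "create", "delete"}


def classify_intent(transcript: str) -> str:
    """Label a transcript as 'question', 'command', 'explanation', or 'unknown'."""
    text = transcript.strip().lower()
    if not text:
        return "unknown"
    first_word = text.split()[0]
    # A trailing question mark or a leading interrogative suggests a question.
    if text.endswith("?") or first_word in QUESTION_WORDS:
        return "question"
    # A leading imperative verb suggests a command.
    if first_word in COMMAND_VERBS:
        return "command"
    # Anything else is treated as the speaker explaining something.
    return "explanation"


if __name__ == "__main__":
    for line in ("Why does this test fail?", "Run the tests again"):
        print(line, "->", classify_intent(line))
```

A production system would more plausibly use a small model than keyword rules; the point here is only the shape of the output, a single label attached to each transcribed utterance.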

Architecture and Implementation

The system uses a multi-agent pipeline rather than a single-model approach. Specialized agents run in parallel, and an orchestrator combines their output:

Screen reader for visual context
Voice processor for microphone transcription and intent classification
Audio classifier for system audio
Orchestrator that correlates all inputs and synthesizes a single response
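The fan-out and merge pattern described by this agent list can be sketched with `asyncio`. This is an illustrative outline under stated assumptions, not the plugin's code: the `Observation` type, the agent function names, and the canned return strings are hypothetical placeholders for real capture and model calls.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Observation:
    source: str   # which agent produced this observation
    summary: str  # short text description of what the agent perceived


# Each agent below is a hypothetical placeholder returning a canned result.
async def screen_reader() -> Observation:
    await asyncio.sleep(0)  # stands in for a visual-description call
    return Observation("screen", "editor open on a failing test")


async def voice_processor() -> Observation:
    await asyncio.sleep(0)  # stands in for transcription plus intent labeling
    return Observation("voice", "question: why does this test fail?")


async def audio_classifier() -> Observation:
    await asyncio.sleep(0)  # stands in for system-audio indexing
    return Observation("system_audio", "no meeting or tutorial audio detected")


async def orchestrate() -> str:
    # Run the three specialized agents concurrently, then merge their
    # observations into one context string for the assistant.
    observations = await asyncio.gather(
        screen_reader(), voice_processor(), audio_classifier()
    )
    return "\n".join(f"[{o.source}] {o.summary}" for o in observations)


if __name__ == "__main__":
    print(asyncio.run(orchestrate()))
```

`asyncio.gather` returns results in the order the coroutines were passed, so the merged context is stable regardless of which agent finishes first.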

The plugin is built on VideoDB infrastructure. Indexing currently uses cloud models, but the design is model-agnostic: the Index layer can swap in any VLM or LLM. The developer mentions interest in wiring in local models for the visual description and transcription layers.
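One common way to get this kind of model-agnostic layer is to have the index depend on a small interface rather than on a specific model. The sketch below shows that idea only; `SceneDescriber`, `CloudDescriber`, `LocalDescriber`, and `VisualIndex` are hypothetical names and do not reflect the VideoDB API or the plugin's actual classes.

```python
from typing import Protocol


class SceneDescriber(Protocol):
    """Anything that can turn a captured frame into a short description."""

    def describe(self, frame: bytes) -> str: ...


class CloudDescriber:
    # Hypothetical stand-in for a hosted vision-language model.
    def describe(self, frame: bytes) -> str:
        return f"cloud description of {len(frame)}-byte frame"


class LocalDescriber:
    # Hypothetical stand-in for a model running on the user's machine.
    def describe(self, frame: bytes) -> str:
        return f"local description of {len(frame)}-byte frame"


class VisualIndex:
    """Stores scene descriptions without knowing which model produced them."""

    def __init__(self, describer: SceneDescriber) -> None:
        self.describer = describer
        self.entries: list[str] = []

    def add_frame(self, frame: bytes) -> str:
        description = self.describer.describe(frame)
        self.entries.append(description)
        return description


if __name__ == "__main__":
    index = VisualIndex(LocalDescriber())
    print(index.add_frame(b"\x00" * 16))
```

Under this arrangement, moving from a cloud model to a local one is a one-line change at construction time, which matches the swap the developer describes wanting to make.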

Current Status and Installation

The plugin is currently macOS only. Installation requires three commands. The GitHub repository is available at https://github.com/video-db/claude-code/tree/main.

The developer is seeking feedback on the architecture, specifically whether developers prefer the multi-agent pipeline, with specialized models and an orchestration layer, or a push toward a single end-to-end model for desktop perception.

📖 Read the full source: r/ClaudeAI

