Local-First Movie Recap Pipeline Using Whisper + CLIP + Ollama

A developer built an automated pipeline that turns any movie into a narrated recap video. The stack is entirely local-first: Whisper for transcription, CLIP for scene matching, Ollama (or OpenAI/Gemini/Anthropic) for script generation, Edge TTS for voiceover, and FFmpeg for rendering.
How it works
- Input: Drop in any movie file via a simple web UI.
- Transcription: Whisper extracts dialogue and timestamps.
- Scene matching: CLIP identifies visual scenes that match the narrative.
- Script generation: Ollama (or any API provider) writes a concise recap script.
- Voiceover + rendering: Edge TTS generates narration, FFmpeg composites everything into a final video.
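The scene-matching step can be pictured as a nearest-neighbour search over embeddings. The following is a minimal sketch, not the developer's actual code: it assumes that sampled frames and script lines have already been embedded (for example with CLIP) into vectors of the same dimension, and the function names `cosine` and `match_scenes` plus the toy vectors are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def match_scenes(line_embeddings, frame_embeddings):
    """For each script line, pick the frame whose embedding is most similar.

    frame_embeddings: list of (timestamp_seconds, vector) pairs (assumed input).
    Returns one (timestamp_seconds, similarity) pair per script line.
    """
    matches = []
    for line_vec in line_embeddings:
        best_time, best_vec = max(
            frame_embeddings, key=lambda frame: cosine(line_vec, frame[1])
        )
        matches.append((best_time, cosine(line_vec, best_vec)))
    return matches

# Toy 3-dimensional vectors standing in for real CLIP embeddings.
frames = [
    (12.0, [1.0, 0.0, 0.0]),
    (95.5, [0.0, 1.0, 0.0]),
    (301.2, [0.6, 0.8, 0.0]),
]
lines = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]
matched = match_scenes(lines, frames)
for (timestamp, score), line_index in zip(matched, range(len(lines))):
    print(f"script line {line_index} -> frame at {timestamp}s (score {score:.2f})")
```

The matched timestamps could then be used to choose which clips FFmpeg cuts for each narrated line.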
With Ollama, the entire process runs locally; remote LLM APIs (OpenAI, Gemini, Anthropic) can be swapped in for script generation. Total runtime is approximately 15 minutes, and no manual editing is required.
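As an illustration of the local script-generation step, here is a minimal sketch that calls Ollama's `/api/generate` endpoint on its default local port using only the standard library. The model name `llama3`, the prompt wording, the `max_words` parameter, and the helper names are assumptions made for illustration; the original post does not specify them.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_recap_request(transcript, model="llama3", max_words=300):
    """Build (but do not send) a request asking the model for a recap script.

    The model name and prompt text are placeholders, not the project's own.
    """
    prompt = (
        f"Write a concise narrated recap of this movie in at most {max_words} "
        f"words, based on the transcript below.\n\n{transcript}"
    )
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate_recap(transcript, **kwargs):
    """Send the request to a running Ollama server and return the text."""
    request = build_recap_request(transcript, **kwargs)
    with urllib.request.urlopen(request) as response:
        # With "stream": False, the reply is a single JSON object whose
        # "response" field holds the generated text.
        return json.loads(response.read())["response"]

# Building the request needs no running server; sending it does.
example_request = build_recap_request("[00:01:12] We have to leave tonight.")
```

Calling `generate_recap(transcript_text)` requires an Ollama server running locally with the chosen model pulled. Swapping in a remote provider would mean replacing only this function, which is presumably how the pipeline keeps the other stages unchanged.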
Who it's for
Developers building automated video generation pipelines or anyone who wants to batch-produce movie recaps without cloud dependencies.
📖 Read the full source: r/LocalLLaMA