Developer builds local AI research agent that creates podcasts from topics or YouTube links

A developer on r/LocalLLaMA built an autonomous research and podcast agent that runs entirely locally. What started as an attempt to avoid paying for text-to-speech (TTS) services evolved into a full system that researches a topic and presents the findings in human-friendly formats: a written report and a spoken podcast.
What the agent does
The system takes either a topic or a YouTube link as input and produces three outputs:
- A deep-dive report
- A conversational podcast-style script
- Generated audio for the podcast
How it works differently from fixed pipelines
The developer aimed to make the agent behave less like a fixed pipeline and more like a system that decides its next action dynamically. Rather than following a rigid step-by-step sequence, it:
- Searches and pulls content
- Extracts insights (including from videos)
- Refines summaries in multiple passes
- Converts that into a natural back-and-forth conversation
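The post does not include the agent's code, so the following is only a minimal sketch of the general idea: a loop that inspects the current state and picks the next action, instead of calling each stage once in a fixed order. Every name here (`AgentState`, `choose_next_action`, the stub tools) is hypothetical, and the rule-based chooser is a stand-in for whatever decision logic or model call the developer actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Everything the agent knows so far (hypothetical structure)."""
    topic: str
    sources: list = field(default_factory=list)
    insights: list = field(default_factory=list)
    summary: str = ""
    refine_passes: int = 0
    script: str = ""
    history: list = field(default_factory=list)


def choose_next_action(state, min_insights=3, max_passes=2):
    """Rule-based stand-in for the decision a real agent might delegate to a model."""
    if len(state.sources) > len(state.insights):
        return "extract"   # some sources have not been read yet
    if len(state.insights) < min_insights:
        return "search"    # not enough material, go find more
    if state.refine_passes < max_passes:
        return "refine"
    if not state.script:
        return "converse"
    return "done"


# Stub tools: each one only mutates the state so the loop can be followed.
def search(state):
    state.sources.append(f"source {len(state.sources) + 1} on {state.topic}")


def extract(state):
    for source in state.sources[len(state.insights):]:
        state.insights.append(f"insight from {source}")


def refine(state):
    state.refine_passes += 1
    state.summary = f"pass {state.refine_passes}: " + "; ".join(state.insights)


def converse(state):
    state.script = (
        f"Host A: What did we learn about {state.topic}?\n"
        f"Host B: {state.summary}"
    )


TOOLS = {"search": search, "extract": extract, "refine": refine, "converse": converse}


def run_agent(topic, max_steps=20):
    """Repeatedly pick and run an action until the chooser reports 'done'."""
    state = AgentState(topic)
    for _ in range(max_steps):
        action = choose_next_action(state)
        if action == "done":
            break
        TOOLS[action](state)
        state.history.append(action)
    return state
```

Calling `run_agent("local TTS").history` shows the loop returning to `search` several times before it moves on to refinement, which is the behavior a fixed pipeline would not have. The `max_steps` cap is a design choice to keep a misbehaving chooser from looping forever.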
Key challenges and solutions discovered during development
- Speed issues: Initial performance was slow, but parallelizing tasks made a significant difference
- Shallow summaries: Early summaries lacked depth; adding multi-step refinement helped substantially
- Robotic audio: The audio initially sounded robotic; switching to a 2-speaker format made it much more natural
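The three fixes above can be illustrated with a small sketch. It is not the developer's implementation: `fetch_source` stands in for any slow fetch, `model` is a hypothetical callable that takes a prompt and returns text (for example, a wrapper around a local LLM), and the speaker names are placeholders. Only standard-library calls are used.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle


def fetch_source(url):
    """Hypothetical stand-in for a slow fetch (web page, video transcript)."""
    return f"notes from {url}"


def fetch_all(urls, max_workers=4):
    """Fetch sources concurrently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_source, urls))


def refine_summary(notes, model, passes=3):
    """Draft once, then ask the model to improve its own draft on each later pass."""
    draft = model(f"Summarize these notes:\n{notes}")
    for _ in range(passes - 1):
        draft = model(
            "Deepen this summary using the notes.\n"
            f"Notes:\n{notes}\nDraft:\n{draft}"
        )
    return draft


def to_dialogue(points, speakers=("Host A", "Host B")):
    """Alternate talking points between two speakers, one line per turn."""
    return [f"{speaker}: {point}" for speaker, point in zip(cycle(speakers), points)]
```

A thread pool suits this kind of work because fetching pages and transcripts is I/O-bound. The dialogue helper only assigns turns; making the exchange sound natural would still depend on how the script itself is written and on the TTS voices used for each speaker.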
The developer noted that this project demonstrates how close we're getting to doing powerful AI work entirely on local machines, without relying on cloud services.
📖 Read the full source: r/LocalLLaMA