Qwen3.5 35B-A3B MoE runs 27-step agentic workflow locally on mid-range hardware

Local agentic workflow demonstration
A developer on r/LocalLLaMA reported successfully running a complex agentic workflow locally using Qwen3.5 35B-A3B MoE. The model executed a 27-step video processing chain autonomously on mid-range hardware.
Workflow details
The task involved processing a video from a single natural language prompt:
- Upload a video
- Transcribe with Whisper
- Edit the subtitles
- Burn subtitles back into video with custom styling
The workflow consisted of 27 sequential tool calls, including extract_audio, transcribe, read_file, edit_file, and burn_subtitles, plus verification steps. The model planned the chain, executed and verified each step, and self-corrected when needed.
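The shape of such a tool chain can be sketched in a few lines of Python. This is a minimal illustration, not the developer's code: the tool names come from the post, but the stub bodies, the `ToolCall` type, the `verify` check, and the fixed `plan` are all assumptions. In the reported run the model chose each next call itself, whereas here the plan is hard-coded.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ToolCall:
    name: str
    args: Dict[str, str]


# Hypothetical stubs: each returns the path the real tool would have produced.
def extract_audio(video: str) -> str:
    return video.rsplit(".", 1)[0] + ".wav"


def transcribe(audio: str) -> str:
    return audio.rsplit(".", 1)[0] + ".srt"


def burn_subtitles(video: str, subtitles: str) -> str:
    return video.rsplit(".", 1)[0] + "_subtitled.mp4"


TOOLS: Dict[str, Callable[..., str]] = {
    "extract_audio": extract_audio,
    "transcribe": transcribe,
    "burn_subtitles": burn_subtitles,
}


def verify(output: str) -> bool:
    # Placeholder check (assumption); a real verifier might confirm the file exists.
    return bool(output)


def run_chain(plan: List[ToolCall], max_attempts: int = 2) -> List[str]:
    """Run each tool call in order, verifying the result and retrying on failure."""
    outputs: List[str] = []
    for call in plan:
        for _ in range(max_attempts):
            output = TOOLS[call.name](**call.args)
            if verify(output):
                outputs.append(output)
                break
        else:
            # Every attempt failed verification.
            raise RuntimeError(f"{call.name} failed after {max_attempts} attempts")
    return outputs


plan = [
    ToolCall("extract_audio", {"video": "talk.mp4"}),
    ToolCall("transcribe", {"audio": "talk.wav"}),
    ToolCall("burn_subtitles", {"video": "talk.mp4", "subtitles": "talk.srt"}),
]
print(run_chain(plan))
```

The verify-then-retry loop stands in for the "verification steps" and self-correction the post describes; how the actual agent implemented them is not stated in the source.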
Technical specifications
Hardware:
- Lenovo ThinkPad P53 mobile workstation
- Intel i7-9850H processor
- Quadro RTX 3000 (6GB VRAM)
- 48GB DDR4 2666MT/s RAM
Software stack:
- Full local implementation with llama.cpp + whisper.cpp
- No cloud APIs used
Model configuration:
- Qwen3.5 35B-A3B MoE at Q4_K_M quantization
- MoE architecture with ~3B active parameters per token
- Runs with only 6GB of VRAM by splitting layers between the GPU and system RAM
- All 35B parameters remain available, even though only ~3B are active for any given token
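A rough back-of-the-envelope estimate shows why the weights cannot sit entirely in VRAM. The sketch below assumes an average of about 4.8 bits per weight for Q4_K_M. That figure is an approximation and not from the post, and the calculation ignores the KV cache and runtime overhead.

```python
BITS_PER_WEIGHT = 4.8  # assumed average for Q4_K_M; real GGUF files vary


def weight_gb(params_billions: float, bits_per_weight: float = BITS_PER_WEIGHT) -> float:
    """Approximate weight size in GB (10^9 bytes), ignoring KV cache and overhead."""
    return params_billions * bits_per_weight / 8


total_gb = weight_gb(35)   # all parameters
active_gb = weight_gb(3)   # parameters touched per token (~3B active)

print(f"total weights:    ~{total_gb:.1f} GB")
print(f"active per token: ~{active_gb:.1f} GB")
print(f"fits in 6 GB VRAM: {total_gb <= 6}")
print(f"fits in 48 GB RAM: {total_gb <= 48}")
```

Under these assumptions the full model is several times larger than the 6GB of VRAM but comfortably within the 48GB of system RAM, while the weights exercised per token are a small fraction of the total.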
Performance results
The complete workflow ran in approximately 10 minutes, with most of that time spent on inference. The developer reported zero errors and no human intervention during the 27-step chain. The MoE architecture made this feasible on mid-range hardware by keeping the active parameter count low (~3B per token) while all 35B parameters remain available.
This demonstrates that local agentic workflows are becoming practical on consumer-grade hardware, particularly with MoE models that balance active parameter count for speed against full parameter count for capability.
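For a sense of scale, the reported figures imply an average of a little over 20 seconds per tool call. The snippet below only divides the approximate total by the step count; the post does not give per-step timings, so the real distribution is unknown.

```python
def avg_seconds_per_step(total_minutes: float, steps: int) -> float:
    """Average wall-clock seconds per step, given a total runtime in minutes."""
    return total_minutes * 60 / steps


avg = avg_seconds_per_step(10, 27)  # "approximately 10 minutes", 27 tool calls
print(f"~{avg:.0f} s per tool call on average")
```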
📖 Read the full source: r/LocalLLaMA