Local Book Translation Pipeline Uses Qwen 32B and Mistral 24B with Contextual RAG

A developer has built a fully local, automated book translation pipeline that takes a PDF and produces a translated ePub using eight Python scripts. The multi-step workflow targets two common problems in machine translation of books: loss of context between segments and broken formatting.
Workflow Details
The scripts cover the following stages:
- PDF Extraction: Uses Marker to extract content from PDFs while preserving formatting elements like bold text, chapters, and images
- Text Segmentation: Splits the extracted text into manageable chunks
- Context Creation: Before translation, sends excerpts from throughout the book to Qwen 32B to generate a "Super Bible", a global glossary covering characters, tone, and atmosphere
- Translation: Qwen 32B translates each text segment while referencing the Super Bible to maintain consistency
- Style Editing: Mistral 24B acts as an editor, reviewing Qwen's translations and rewriting them to improve literary style
- Assembly: A final script reassembles all translated segments, reinserts images, and uses Pandoc to output a polished ePub file
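The segmentation and context steps above can be sketched roughly as follows. This is a minimal illustration, not the developer's actual code: the function names, the 2000-character chunk size, the number of excerpts, the prompt wording, and the default target language are all assumptions, and the call to the local model is left out entirely.

```python
import re


def split_into_chunks(text: str, max_chars: int = 2000) -> list[str]:
    """Pack whole paragraphs into chunks of at most max_chars (hypothetical limit).

    A single paragraph longer than max_chars is kept intact as its own chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def sample_excerpts(chunks: list[str], count: int = 5) -> list[str]:
    """Pick evenly spaced chunks so the glossary prompt sees the whole book."""
    count = max(1, count)
    if count >= len(chunks):
        return list(chunks)
    step = len(chunks) / count
    return [chunks[int(i * step)] for i in range(count)]


def build_translation_prompt(chunk: str, glossary: str,
                             target_language: str = "English") -> str:
    """Prepend the global glossary to every segment (assumed prompt wording)."""
    return (
        f"You are translating a book into {target_language}.\n"
        "Follow this glossary for names, tone, and atmosphere:\n"
        f"{glossary}\n\n"
        "Translate the passage below. Keep Markdown formatting such as "
        "**bold** and image links unchanged.\n\n"
        f"{chunk}"
    )
```

Sending the same glossary with every segment is what lets each chunk be translated independently while names and tone stay consistent across the book.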
Automation Features
The system includes a monitoring script that watches a designated folder. Users simply drop a PDF into this folder, and the pipeline automatically processes it. After several hours, the system outputs both the translated ePub and a receipt showing processing time.
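A folder watcher of this kind could look something like the sketch below, which polls a directory using only the standard library. The names `find_new_pdfs`, `format_receipt`, and `watch`, the polling interval, and the receipt format are hypothetical; `run_pipeline` stands in for whatever function runs the translation scripts and returns the path of the finished ePub.

```python
import time
from pathlib import Path
from typing import Callable, Optional


def find_new_pdfs(inbox: Path, seen: set[str]) -> list[Path]:
    """Return PDFs in the inbox that have not been processed yet."""
    new = sorted(p for p in inbox.glob("*.pdf") if p.name not in seen)
    seen.update(p.name for p in new)
    return new


def format_receipt(pdf_name: str, seconds: float) -> str:
    """Render the processing time as hours, minutes, and seconds."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{pdf_name}: processed in {hours}h {minutes:02d}m {secs:02d}s"


def watch(inbox: Path,
          run_pipeline: Callable[[Path], Path],
          poll_seconds: float = 10.0,
          max_polls: Optional[int] = None) -> None:
    """Poll the inbox and run the pipeline on each new PDF.

    max_polls=None loops forever; a finite value is handy for testing.
    """
    seen: set[str] = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for pdf in find_new_pdfs(inbox, seen):
            start = time.monotonic()
            epub = run_pipeline(pdf)  # hypothetical: returns the ePub path
            receipt = format_receipt(pdf.name, time.monotonic() - start)
            receipt_path = epub.with_name(epub.stem + ".receipt.txt")
            receipt_path.write_text(receipt + "\n", encoding="utf-8")
        polls += 1
        time.sleep(poll_seconds)
```

A real watcher would likely also need to wait until a dropped file has finished copying before starting, which this sketch does not handle.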
The developer notes the results are surprisingly effective, though not 100% perfect, and mentions having several improvement ideas. The entire system runs locally on a personal computer without requiring external services.
📖 Read the full source: r/LocalLLaMA