Building an AI Receptionist for a Mechanic Shop: RAG Pipeline and Voice Integration

Building the RAG Pipeline
The first step was creating an accurate knowledge base to prevent hallucinations. The developer scraped the mechanic shop's website service pages and pricing into markdown files, creating a structured knowledge base covering 21+ documents including service types, pricing, turnaround times, hours, payment methods, cancellation policies, warranty info, loaner vehicles, and specialized car makes.
Each document was converted into a 1024-dimensional vector using Voyage AI (voyage-3-large) and stored in MongoDB Atlas alongside the raw text, with an Atlas Vector Search index on the embedding field.
When a customer asks a question, the query gets embedded using the same Voyage AI model and runs against the Atlas Vector Search index, returning the top 3 most semantically similar documents. Retrieved documents get passed as context to Anthropic Claude (claude-sonnet-4-6) with a strict system prompt: answer only from the knowledge base, keep responses short and conversational, and if you don't know — say so and offer to take a message.
Example response: "How much is an oil change?" → "$45 for conventional, $75 for synthetic. Includes oil filter, fluid top-off, and tire pressure check. Takes about 30 minutes."
Connecting to a Real Phone Line
The developer used Vapi as the voice platform to handle telephony: purchasing a phone number, speech-to-text (via Deepgram), text-to-speech (via ElevenLabs), and real-time function calling back to the server.
A FastAPI webhook server was built with a /webhook endpoint. When a caller asks a question, Vapi sends a tool-calls request to this endpoint with the caller's query. The server routes that to the RAG pipeline, gets a response from Claude, and sends it back to Vapi, which reads it aloud to the caller.
During development, the server runs locally on port 8000 and is exposed using Ngrok, which creates a tunnel to a public HTTPS URL that gets pasted into the Vapi dashboard as the webhook endpoint.
In the Vapi dashboard, the assistant was configured with a greeting ("Hi, thanks for calling Dane's Motorsport, how can I help you today?") and two tools: answerQuestion for RAG-backed responses and saveCallback for collecting a name and number when a question can't be answered.
Vapi sends the full conversation history with each request, enabling conversation memory.
📖 Read the full source: HN AI Agents
👀 See Also

Independent Researcher Uses Claude AI to Write Quantum Mechanics Paper and 30-50k Lines of Rust Code
An independent researcher used Claude AI as a collaborator to write a research paper titled 'Clifford Geometry as the Foundation of Quantum Mechanics' and develop 30-50k lines of Rust code with zero external dependencies. The code verifies Bell correlations and wave dynamics in a phase lattice.

Multi-Agent Systems Fail Silently with Garbage Output, Requiring Metadata Validation
A developer running a 39-agent system for two weeks found that when one agent produces garbage output, downstream agents process it confidently, creating polished but fabricated results. The solution involves wrapping output in metadata envelopes that declare task completion and source counts.

Direct Mobile Document Ingestion to OpenClaw: iOS to Raspberry Pi Health Workflow
A developer shares an architecture for pushing documents directly from an iOS mobile client to a local OpenClaw instance on a Raspberry Pi, using QR-based pairing and dedicated endpoints for health record processing.

OpenClaw agent replaces multiple SaaS tools for LinkedIn lead generation at 5x lower cost
A developer replaced €250/month in SaaS subscriptions with an OpenClaw agent running on a VPS for under €2/day, using model routing between Haiku and Sonnet for LinkedIn lead generation with 60-70% connection acceptance rates.