How to Build Persistent AI Knowledge with OpenClaw

A developer has built a knowledge infrastructure system called 'Brain' on top of OpenClaw to address a common problem in AI agent setups: statelessness, where nothing learned in one session carries over to the next. Brain provides persistent memory across sessions, so users can query past decisions and workflow history.

Core Architecture

Brain serves as the central knowledge service where documents are ingested, chunked, and embedded locally using Ollama. Data is stored across multiple databases: Postgres, MongoDB, and Qdrant, with relationships mapped in a Memgraph graph database. This makes every decision, session, and workflow run searchable and connected.
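The chunking step can be illustrated with a minimal sketch. The source does not say how Brain splits documents, so the fixed-size character windows, the overlap, and the function name chunk_text are all assumptions made for illustration. Embedding each chunk with Ollama would happen afterwards and is omitted here.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the window size and overlap are illustrative
    defaults, not values taken from Brain.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once the current window reaches the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", size=4, overlap=2):
        print(chunk)
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice.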

Search and Retrieval

Search in Brain uses hybrid retrieval combining semantic search via Qdrant with BM25 full-text search from Postgres, merged using reciprocal rank fusion. Results are automatically deduplicated and context-budgeted before synthesis.
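Reciprocal rank fusion can be sketched in a few lines. Each document scores 1 / (k + rank) in every ranked list where it appears, and the scores are summed across lists. The constant k = 60 is a commonly used default and is an assumption here, as are the function name and the sample document IDs; the source does not describe Brain's actual parameters.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Illustrative sketch, not Brain's implementation.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    semantic_hits = ["a", "b", "c"]   # e.g. a vector search ranking
    keyword_hits = ["b", "d", "a"]    # e.g. a BM25 ranking
    print(reciprocal_rank_fusion([semantic_hits, keyword_hits]))
```

Because the fusion uses only ranks, the raw similarity and BM25 scores never need to be put on a common scale.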

RAG Agent and Plugin System

On top of Brain sits a RAG Agent that runs a complete pipeline: retrieve → graph expand → fuse → synthesize, all powered by local Ollama models. The agent estimates confidence on every answer and automatically logs 'knowledge gaps' to a pending queue when confidence is low.
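The low-confidence path might look like the sketch below. The names RagAnswer, PendingGapQueue, and log_gap_if_uncertain are hypothetical. So is the 0.6 threshold, and so is the assumption that confidence is a number between 0 and 1. The source says only that gaps are logged to a pending queue when confidence is low.

```python
from dataclasses import dataclass, field


@dataclass
class RagAnswer:
    question: str
    text: str
    confidence: float  # assumed to lie in [0, 1]


@dataclass
class PendingGapQueue:
    """In-memory stand-in for the pending queue (hypothetical)."""
    items: list[dict] = field(default_factory=list)

    def add(self, answer: RagAnswer) -> None:
        self.items.append({
            "question": answer.question,
            "confidence": answer.confidence,
            "status": "pending",
        })


def log_gap_if_uncertain(answer: RagAnswer, queue: PendingGapQueue,
                         threshold: float = 0.6) -> bool:
    """Record a knowledge gap when confidence falls below the threshold."""
    if answer.confidence < threshold:
        queue.add(answer)
        return True
    return False


if __name__ == "__main__":
    queue = PendingGapQueue()
    log_gap_if_uncertain(RagAnswer("Why did we pick Qdrant?", "Unclear.", 0.3), queue)
    print(queue.items)
```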

Brain exposes a plugin system with 33+ typed tools that agents can call, including brain_search, brain_ingest, brain_rag_query, brain_graph_slice, and brain_condense_domain. Every operation has a strict, well-typed interface.
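One way to picture a strictly typed tool is a validated input type plus a registry keyed by tool name. Only the tool name brain_search comes from the source. The input fields (query, limit), the registry, the decorator, and the placeholder return value are all assumptions made for illustration, not the plugin's real schema.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical registry mapping tool names to callables.
TOOLS: dict[str, Callable] = {}


def tool(name: str):
    """Register a function under a tool name (illustrative decorator)."""
    def register(fn: Callable) -> Callable:
        TOOLS[name] = fn
        return fn
    return register


@dataclass(frozen=True)
class BrainSearchInput:
    query: str
    limit: int = 5  # assumed parameter, not from the source

    def __post_init__(self) -> None:
        # Reject malformed input before any store is touched.
        if not self.query.strip():
            raise ValueError("query must not be empty")
        if self.limit < 1:
            raise ValueError("limit must be at least 1")


@tool("brain_search")
def brain_search(params: BrainSearchInput) -> list[str]:
    # Placeholder body: a real tool would query the stores.
    return [f"result for {params.query!r}"][: params.limit]


if __name__ == "__main__":
    print(TOOLS["brain_search"](BrainSearchInput(query="past decisions")))
```

Validating at construction time means an agent that passes bad arguments fails immediately with a clear error, rather than deep inside a database call.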

Workflows and Observability

Workflows are first-class citizens in this system. Multi-step pipelines (orient, fetch, inspect, synthesize, log) can be run either through agents or via a deterministic runner on a cron schedule with zero LLM involvement. Telemetry and observability remain consistent either way.

Each agent has a strict mandate and communicates through structured handoffs, with all activity tracked back into Brain as searchable history. A Python drift checker compares live agent configs against Brain snapshots, automatically logging structured events when tool allowlists or plugin versions change.
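The drift check can be sketched as a comparison of two dictionaries. The config keys tool_allowlist and plugin_versions, the event field names, and the function detect_drift are assumptions. The source states only that tool allowlist and plugin version changes are logged as structured events.

```python
def detect_drift(snapshot: dict, live: dict) -> list[dict]:
    """Compare a stored config snapshot with the live config.

    Returns one structured event per detected change (illustrative shape).
    """
    events: list[dict] = []

    old_tools = set(snapshot.get("tool_allowlist", []))
    new_tools = set(live.get("tool_allowlist", []))
    added = sorted(new_tools - old_tools)
    removed = sorted(old_tools - new_tools)
    if added or removed:
        events.append({
            "type": "tool_allowlist_changed",
            "added": added,
            "removed": removed,
        })

    old_versions = snapshot.get("plugin_versions", {})
    new_versions = live.get("plugin_versions", {})
    for plugin in sorted(set(old_versions) | set(new_versions)):
        before, after = old_versions.get(plugin), new_versions.get(plugin)
        if before != after:
            events.append({
                "type": "plugin_version_changed",
                "plugin": plugin,
                "from": before,
                "to": after,
            })
    return events


if __name__ == "__main__":
    snapshot = {"tool_allowlist": ["brain_search"],
                "plugin_versions": {"brain": "1.0.0"}}
    live = {"tool_allowlist": ["brain_search", "brain_ingest"],
            "plugin_versions": {"brain": "1.1.0"}}
    for event in detect_drift(snapshot, live):
        print(event)
```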

Local Deployment and Future Plans

The entire system runs locally: Ollama handles embeddings and synthesis, and every data store runs in Docker. The core intelligence layer makes no OpenAI calls and uses no other external APIs.

Next steps include migrating the RAG agent to LlamaIndex Workflows, building out a shared brain-client SDK, and tightening the API surface. RAG endpoints are moving to a /v1/rag/ prefix, realm is becoming a header, and leaky DB facades are getting properly abstracted.

📖 Read the full source: r/openclaw