Hybrid RAG for Local Agent Memory with OpenClaw & Ollama

Problem: Retrieval, Not Storage

The developer had months of daily memory logs stored in markdown files, which worked for saving information but not for finding it again. When the agent needed past context, it would fall back to running ls, opening files one by one, spending tokens, and sometimes missing relevant information. The issue was retrieval by meaning, not storage.

Solution: Hybrid RAG with Local Embeddings

The developer enabled memorySearch in OpenClaw using Ollama as the provider and nomic-embed-text for local embeddings, running in hybrid mode. Hybrid means the final score is weighted 70% toward vector similarity (cosine similarity over nomic-embed-text embeddings) and 30% toward BM25 keyword matching. The vector side handles semantic proximity, while BM25 handles exact names, versions, and IDs. MMR (maximal marginal relevance) reduces redundant results, and temporal decay gives more weight to recent logs. Everything runs locally without external APIs.
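As a rough illustration of the 70/30 weighting, the Python sketch below blends a cosine score with a keyword score. The source does not say how OpenClaw normalizes BM25 before combining the two signals, so the min-max rescaling, the function names, and the sample scores are all assumptions made for this example.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale raw scores to [0, 1] so they can be mixed with cosine similarity."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(
    vector_scores: Dict[str, float],
    bm25_scores: Dict[str, float],
    vector_weight: float = 0.7,
    text_weight: float = 0.3,
) -> List[Tuple[str, float]]:
    """Weighted sum of both signals; a document missing from one side scores 0 there."""
    text = min_max(bm25_scores)
    docs = set(vector_scores) | set(text)
    combined = {
        doc: vector_weight * vector_scores.get(doc, 0.0)
        + text_weight * text.get(doc, 0.0)
        for doc in docs
    }
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical scores for three daily logs (higher is better on both sides).
vector_scores = {"2025-01-10.md": 0.47, "2025-01-11.md": 0.45, "2025-02-02.md": 0.20}
bm25_scores = {"2025-01-11.md": 7.5, "2025-02-02.md": 2.5}
ranked = hybrid_rank(vector_scores, bm25_scores)
```

In this made-up case the log with the strongest keyword hit moves ahead of the log with the slightly higher cosine score, which is the behavior the hybrid mode is meant to provide for exact names and version strings.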

Configuration

"memorySearch": {
  "provider": "ollama",
  "query": {
    "hybrid": {
      "enabled": true,
      "vectorWeight": 0.7,
      "textWeight": 0.3,
      "mmr": {
        "enabled": true,
        "lambda": 0.7
      },
      "temporalDecay": {
        "enabled": true,
        "halfLifeDays": 30
      }
    }
  }
}
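
To make the mmr.lambda and temporalDecay.halfLifeDays settings more concrete, here is a minimal Python sketch of what such parameters usually mean. It assumes a standard exponential half-life and the textbook greedy MMR formulation; whether OpenClaw implements them exactly this way is not confirmed by the source, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def decay_factor(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: the weight halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    relevance: Dict[str, float],
    vectors: Dict[str, Sequence[float]],
    k: int,
    lam: float = 0.7,
) -> List[str]:
    """Greedy MMR: maximize lam * relevance - (1 - lam) * similarity to picks so far."""
    selected: List[str] = []
    candidates = set(relevance)

    def mmr_score(doc: str) -> float:
        redundancy = max(
            (cosine(vectors[doc], vectors[s]) for s in selected), default=0.0
        )
        return lam * relevance[doc] - (1.0 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(sorted(candidates), key=mmr_score)  # sorted() keeps ties stable
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical data: "a" and "b" are near-duplicates, "c" covers a different topic.
relevance = {"a": 0.9, "b": 0.85, "c": 0.6}
vectors = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
picked = mmr_select(relevance, vectors, k=2)
```

With lambda at 0.7 the second pick is "c" rather than the near-duplicate "b", and under a 30-day half-life a 60-day-old log would carry a quarter of the weight of one written today.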

Setup Instructions

OpenClaw detects Ollama automatically at localhost:11434
No need to specify baseUrl or model - it picks up nomic-embed-text if the model has already been pulled
Run ollama pull nomic-embed-text first, then restart the gateway
Avoid setting provider: "openai" and pointing baseUrl to Ollama - use provider: "ollama" directly
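
Before restarting the gateway, it can help to confirm that Ollama is reachable and the model is present. The sketch below is one possible check, assuming Ollama's /api/tags endpoint on the default address mentioned above; the helper names are hypothetical and this is not part of OpenClaw.

```python
import json
import urllib.request
from typing import Any, Dict

OLLAMA_URL = "http://localhost:11434"  # default address named in the setup notes


def has_model(tags: Dict[str, Any], name: str) -> bool:
    """True if a /api/tags payload lists the model, with or without a tag suffix."""
    for model in tags.get("models", []):
        listed = model.get("name", "")
        if listed == name or listed.split(":", 1)[0] == name:
            return True
    return False


def fetch_tags(base_url: str = OLLAMA_URL, timeout: float = 3.0) -> Dict[str, Any]:
    """Ask the local Ollama server which models have been pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        ready = has_model(fetch_tags(), "nomic-embed-text")
    except (OSError, ValueError) as exc:  # connection refused, timeout, bad JSON
        print(f"Ollama not reachable at {OLLAMA_URL}: {exc}")
    else:
        print("nomic-embed-text is available" if ready else "run: ollama pull nomic-embed-text")
```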

Behavioral Change Required

Enabling the tool wasn't enough. Without explicit instructions to use memorySearch before reading files directly, the agent would skip it and take the slower, token-heavy route. The developer wrote a rule into both AGENTS.md and MEMORY.md in the workspace to make memory search part of the agent's normal workflow.

Before vs After Results

Before: Browse folders, open files blindly, hope wording matches, waste tokens, miss context
After: Run memory_search with semantic query, retrieve ranked results with similarity scores, open best match, answer from actual past notes
Similarity scores for relevant results typically range from 0.45 to 0.48 with nomic-embed-text on prose logs

Practical Notes

nomic-embed-text has a 2048-token context limit by default, not 8192 - large files may get truncated at indexing
Memory files in Spanish work well - nomic-embed-text handles Spanish without issues
Retrieval quality depends on note quality - semantic search still struggles with vague logs
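
One way to keep long notes under the context limit is to split them before they are embedded. The following Python sketch packs paragraphs into chunks under an approximate token budget. The characters-per-token ratio is a crude assumption (real tokenizers vary, and Spanish text may tokenize differently from English), the function name is hypothetical, and the source does not describe how OpenClaw itself chunks files.

```python
from typing import List


def chunk_markdown(
    text: str, max_tokens: int = 2048, chars_per_token: float = 3.0
) -> List[str]:
    """Greedily pack paragraphs into chunks that stay under an approximate budget."""
    budget = int(max_tokens * chars_per_token)  # budget expressed in characters
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        # Hard-split any single paragraph that is larger than the whole budget.
        pieces = [para[i:i + budget] for i in range(0, len(para), budget)]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= budget:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


# Tiny budget (8 tokens * 3 chars = 24 chars) so the split is easy to see.
sample = "a" * 10 + "\n\n" + "b" * 10 + "\n\n" + "c" * 10
parts = chunk_markdown(sample, max_tokens=8, chars_per_token=3.0)
```

Splitting on blank lines keeps each chunk aligned with paragraph boundaries, which tends to suit daily logs where one paragraph covers one topic.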

Tech Stack

OpenClaw (local, self-hosted)
Ollama + nomic-embed-text:latest
SQLite with sqlite-vec and FTS5 (created automatically by OpenClaw on first use)
Mac mini M4, 16GB unified memory
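
For readers unfamiliar with the keyword side of this stack, the sketch below shows a BM25-ranked query against an SQLite FTS5 table using Python's built-in sqlite3 module. The table name, columns, and sample rows are hypothetical and do not reflect OpenClaw's actual schema; it also assumes the local SQLite build includes FTS5, and it leaves out sqlite-vec, which is a separate extension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; OpenClaw creates its own tables automatically.
conn.execute("CREATE VIRTUAL TABLE notes USING fts5(path, body)")
conn.executemany(
    "INSERT INTO notes (path, body) VALUES (?, ?)",
    [
        ("2025-01-10.md", "Pulled nomic embeddings and restarted the gateway."),
        ("2025-01-11.md", "Upgraded the router firmware to version 2.4.1."),
        ("2025-01-12.md", "Notes about lunch and weekend plans."),
    ],
)


def keyword_search(query: str, limit: int = 5):
    """Return (path, score) pairs; FTS5's bm25() is lower-is-better, so negate it."""
    rows = conn.execute(
        "SELECT path, -bm25(notes) FROM notes "
        "WHERE notes MATCH ? ORDER BY bm25(notes) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


hits = keyword_search("firmware")
```

Only rows containing the search term are returned, which is why this side of the hybrid search is the one that catches exact names, versions, and IDs.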

📖 Read the full source: r/openclaw