Hybrid RAG for Local Agent Memory with OpenClaw, Ollama, and nomic-embed-text

✍️ OpenClawRadar | 📅 Published: March 10, 2026

Problem: Retrieval, Not Storage

The developer had months of daily memory logs stored in markdown files, which worked for saving information but not for finding it again. When the agent needed past context, it would fall back to running ls, opening files one by one, spending tokens, and sometimes missing relevant information. The issue was retrieval by meaning, not storage.

Solution: Hybrid RAG with Local Embeddings

The developer enabled memorySearch in OpenClaw with Ollama as the provider and nomic-embed-text for local embeddings, running in hybrid mode. Here, hybrid means the final ranking is weighted 70% vector similarity (cosine similarity over nomic-embed-text embeddings) and 30% BM25 keyword matching. The vector side captures semantic proximity, while BM25 catches exact names, versions, and IDs. MMR (maximal marginal relevance) reduces redundant results, and temporal decay gives more weight to recent logs. Everything runs locally, with no external APIs.
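To make the 70/30 weighting concrete, here is a minimal Python sketch of a weighted score fusion. It is not OpenClaw's actual implementation: the function names are hypothetical, and scaling BM25 scores by the top score is an assumption, since the source does not say how the two signals are normalized before mixing.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def normalize_bm25(scores):
    """Scale raw BM25 scores into [0, 1] by dividing by the top score.
    This normalization is an assumption made for the sketch."""
    top = max(scores, default=0.0)
    if top <= 0:
        return [0.0 for _ in scores]
    return [s / top for s in scores]

def hybrid_score(vector_sim, text_score, vector_weight=0.7, text_weight=0.3):
    """Weighted sum of the semantic and keyword signals (70/30 by default)."""
    return vector_weight * vector_sim + text_weight * text_score

# Toy data: two documents, one semantically close, one a strong keyword match.
query_vec = [1.0, 0.0]
doc_vecs = [[1.0, 0.0], [0.0, 1.0]]
raw_bm25 = [2.0, 4.0]

vector_sims = [cosine_similarity(query_vec, d) for d in doc_vecs]
text_scores = normalize_bm25(raw_bm25)
combined = [hybrid_score(v, t) for v, t in zip(vector_sims, text_scores)]
print(combined)
```

With these weights, a document that matches the query semantically can outrank one that only matches keywords, while an exact keyword hit still contributes up to 0.3 to the final score.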

Configuration

"memorySearch": {
  "provider": "ollama",
  "query": {
    "hybrid": {
      "enabled": true,
      "vectorWeight": 0.7,
      "textWeight": 0.3,
      "mmr": {
        "enabled": true,
        "lambda": 0.7
      },
      "temporalDecay": {
        "enabled": true,
        "halfLifeDays": 30
      }
    }
  }
}
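The mmr.lambda and temporalDecay.halfLifeDays values can be read as follows. This Python sketch shows the textbook forms of exponential half-life decay and maximal marginal relevance, under the assumption that OpenClaw uses something similar; its exact formulas are not given in the source, and the function names are hypothetical.

```python
def decay_weight(age_days, half_life_days=30.0):
    """Exponential decay: the weight halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)

def mmr_select(relevance, similarity, k, lam=0.7):
    """Greedy maximal marginal relevance.

    relevance[i]     -- relevance of candidate i to the query
    similarity[i][j] -- similarity between candidates i and j
    lam              -- 1.0 ranks purely by relevance; lower values
                        penalize redundancy with already-picked items
    Returns the indices of the selected candidates, in pick order.
    """
    selected = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy data: candidates 0 and 1 are near-duplicates, candidate 2 is different.
relevance = [0.9, 0.88, 0.6]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
print(mmr_select(relevance, similarity, k=2))  # → [0, 2]
print(decay_weight(30))                        # → 0.5
```

In the toy data, the near-duplicate candidate 1 is skipped in favor of the less relevant but distinct candidate 2. With a 30-day half-life, a note from a month ago carries half the weight of one written today; whether the weight multiplies the hybrid score or enters it some other way is not stated in the source.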

Setup Instructions

  • OpenClaw detects Ollama automatically at localhost:11434
  • No need to specify baseUrl or model - it picks up nomic-embed-text if pulled
  • Run ollama pull nomic-embed-text first, then restart the gateway
  • Avoid setting provider: "openai" and pointing baseUrl to Ollama - use provider: "ollama" directly
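Before restarting the gateway, it can help to confirm that Ollama is actually serving the embedding model. The following Python sketch calls Ollama's /api/embeddings endpoint directly using only the standard library. It is a sanity check independent of OpenClaw, and the helper names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default Ollama address from the text
MODEL = "nomic-embed-text"

def build_embed_request(text, base_url=OLLAMA_URL, model=MODEL):
    """Build a POST request for Ollama's /api/embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(text):
    """Return the embedding vector for text (requires a running Ollama)."""
    with urllib.request.urlopen(build_embed_request(text), timeout=30) as resp:
        return json.loads(resp.read())["embedding"]

# Usage, once `ollama pull nomic-embed-text` has completed:
#   vector = embed("restarted the gateway after the model pull")
#   print(len(vector))
```

If the call fails with a connection error, Ollama is not listening on localhost:11434. If it fails with an HTTP error, the model has most likely not been pulled yet.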

Behavioral Change Required

Enabling the tool wasn't enough. Without explicit instructions to use memorySearch before reading files directly, the agent would skip it and take the slower, token-heavy route. The developer wrote a rule into both AGENTS.md and MEMORY.md in the workspace to make memory search part of the agent's normal workflow.

Before vs After Results

  • Before: Browse folders, open files blindly, hope wording matches, waste tokens, miss context
  • After: Run memory_search with semantic query, retrieve ranked results with similarity scores, open best match, answer from actual past notes
  • Similarity scores for relevant results typically range from 0.45 to 0.48 with nomic-embed-text on prose logs

Practical Notes

  • nomic-embed-text has a 2048-token context limit by default, not 8192, so large files may get truncated at indexing
  • Memory files in Spanish work well - nomic-embed-text handles Spanish without issues
  • Retrieval quality depends on note quality - vague logs still make semantic search struggle
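One way to work around the 2048-token limit is to keep each indexed unit under the budget. The Python sketch below packs paragraphs into chunks using a rough token estimate. The 4-characters-per-token ratio is an assumption, the function names are hypothetical, and the source does not say whether OpenClaw already chunks files on its own, so treat this as an illustration of the idea and not a required step.

```python
def approx_tokens(text, chars_per_token=4):
    """Very rough token estimate; the chars-per-token ratio is an assumption."""
    return (len(text) + chars_per_token - 1) // chars_per_token

def chunk_markdown(text, max_tokens=2048, chars_per_token=4):
    """Pack blank-line-separated paragraphs into chunks under the budget.

    A single paragraph larger than the budget is kept whole as its own
    chunk, so it can still exceed the limit; the sketch does not split it.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and approx_tokens(candidate, chars_per_token) > max_tokens:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

# Tiny budget to make the behavior visible on a short input.
sample = "aaaa\n\nbbbb\n\ncccc"
print(chunk_markdown(sample, max_tokens=3))  # → ['aaaa\n\nbbbb', 'cccc']
```

Splitting on blank lines keeps each chunk aligned with paragraph boundaries, which tends to preserve the meaning of individual log entries better than cutting at a fixed character offset.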

Tech Stack

  • OpenClaw (local, self-hosted)
  • Ollama + nomic-embed-text:latest
  • SQLite with sqlite-vec and FTS5 (created automatically by OpenClaw on first use)
  • Mac mini M4, 16GB unified memory
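For readers unfamiliar with FTS5, the keyword half of the stack can be tried in isolation. This Python sketch builds an in-memory FTS5 table and ranks matches with SQLite's built-in bm25() function. The table name, columns, and sample notes are hypothetical and do not reflect the schema OpenClaw creates, and the sketch assumes the Python sqlite3 build includes FTS5.

```python
import sqlite3

def build_index(notes):
    """Create an in-memory FTS5 table and load (path, body) pairs into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE notes USING fts5(path, body)")
    conn.executemany("INSERT INTO notes (path, body) VALUES (?, ?)", notes)
    return conn

def keyword_search(conn, query, limit=5):
    """Return (path, score) pairs, best match first.

    FTS5's bm25() returns lower (more negative) values for better
    matches, so ascending order puts the best match first.
    """
    return conn.execute(
        "SELECT path, bm25(notes) FROM notes "
        "WHERE notes MATCH ? ORDER BY bm25(notes) LIMIT ?",
        (query, limit),
    ).fetchall()

# Hypothetical daily logs, for illustration only.
sample_notes = [
    ("2026-01-05.md", "Restarted the gateway after pulling the embedding model"),
    ("2026-01-06.md", "Fixed a typo in the README"),
    ("2026-01-07.md", "Gateway restart failed, gateway logs attached"),
    ("2026-01-08.md", "Reviewed the weekly backup schedule"),
    ("2026-01-09.md", "Wrote notes about the memory search rule"),
]

conn = build_index(sample_notes)
for path, score in keyword_search(conn, "gateway"):
    print(path, score)
```

Only the two notes that contain the exact term are returned, which is the behavior that complements vector search for names, versions, and IDs.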

📖 Read the full source: r/openclaw
