Local Semantic Memory Search: Harrier + OpenClaw Config

A new repo shows how to give an OpenClaw agent local semantic memory search without sending memory text to an external embedding service. The approach runs a small local embedding server around Microsoft's Harrier model (microsoft/harrier-oss-v1-0.6b), exposes an Ollama-compatible API, and wires it to OpenClaw's memorySearch config.

How it works

The embedding server runs Harrier locally and provides /api/embed and /api/embeddings endpoints that follow Ollama's request and response format. OpenClaw's memorySearch already supports Ollama-style endpoints, so pointing it at http://localhost:8000 gives the agent a semantic memory layer that runs entirely on the local machine.
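
As a rough illustration of what a client of such a server might look like, here is a minimal Python sketch that calls the /api/embed endpoint. It follows the Ollama convention of posting a JSON body with "model" and "input" and reading back an "embeddings" list. The model name passed to the server and the helper names (build_embed_request, parse_embed_response, embed) are assumptions for illustration, not taken from the repo.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # from the post; adjust if the server listens elsewhere
MODEL = "microsoft/harrier-oss-v1-0.6b"  # assumption: the server may expect a different name


def build_embed_request(texts, model=MODEL, base_url=BASE_URL):
    """Build a POST request for the Ollama-style /api/embed endpoint."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embed_response(body):
    """Extract the list of vectors from an Ollama-style /api/embed response body."""
    return json.loads(body)["embeddings"]


def embed(texts):
    """Send texts to the local server and return one vector per input text."""
    with urllib.request.urlopen(build_embed_request(texts), timeout=30) as resp:
        return parse_embed_response(resp.read())
```

With the server running, embed(["note about the staging deploy"]) should return a list containing one vector. Nothing leaves the machine because the request goes to localhost.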

Why this matters for agent memory

Most agent memory systems have two pain points:

Shoving too much memory into the prompt burns tokens and makes context messy.
Keeping memory files small and manual becomes hard to maintain as history grows.

Semantic memory search offers a middle path. Long-term memory stays in normal markdown files (MEMORY.md, daily logs, notes, project files) that are human-readable and editable. At runtime, the agent retrieves only relevant chunks.
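
The retrieval step can be sketched in a few lines of Python. This is only an illustration of the general idea (split files into chunks, embed them, rank by cosine similarity to the query); the post does not describe how OpenClaw actually chunks or ranks memory, so the paragraph-based splitting, the function names, and the embed callable passed in are all assumptions.

```python
import math


def split_chunks(markdown_text):
    """Split text into paragraph chunks on blank lines (a simple, assumed strategy)."""
    return [p.strip() for p in markdown_text.split("\n\n") if p.strip()]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_chunks(query, chunks, embed, k=3):
    """Return the k chunks whose embeddings are most similar to the query's.

    `embed` is any callable that maps a list of strings to a list of vectors,
    for example a thin client for a local embedding server.
    """
    vectors = embed([query] + chunks)
    query_vec, chunk_vecs = vectors[0], vectors[1:]
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]
```

Only the chunks returned by top_chunks would be placed in the prompt, while the markdown files themselves remain the editable source of truth.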

Benefits

Less token waste: the agent no longer stuffs every durable fact into every prompt.
Cleaner memory files: there is no need to compress everything into one giant context blob.
Better recall: search finds conceptually related notes even when the wording doesn't match exactly.
Easier debugging: the source of truth is plain text, not an opaque vector database.
Better privacy: embeddings are computed locally, so no data is shipped to a hosted API.