Pali v0.1: Open Source Memory Infrastructure for LLMs with Reproducible Benchmarks

What Pali Is
Pali is open source memory infrastructure for LLMs, built with an infrastructure-first approach. It is written in Go and ships as a single binary, with configuration for plug-and-play integrations such as Qdrant, Neo4j, Ollama, and OpenRouter. The project is MIT licensed and fully self-hostable.
Key Features
- Multi-tenant memory APIs with tenant-scoped isolation
- Hybrid retrieval across lexical, dense, fusion, reranking, and optional multi-hop expansion
- MCP server with memory-first tools and tenant-aware resolution
- REST API, with Python and JavaScript client packages already published
- Dashboard for operators inspecting tenants, memories, and system state
- Plug-and-play extension points for vector stores, embedders, entity-fact backends, and scoring/routing
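
To make the "fusion" step of hybrid retrieval concrete, here is a minimal sketch of reciprocal rank fusion (RRF), a common way to merge a lexical ranking with a dense ranking. The source does not say which fusion method Pali uses, so RRF is an assumption chosen for illustration, and the function name, the constant k=60, and the memory IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=5):
    """Fuse several best-first lists of memory IDs into one ranking.

    Each list contributes 1 / (k + rank) to every ID it contains,
    so items ranked highly by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [memory_id for memory_id, _ in ordered[:top_n]]


# Hypothetical rankings from a lexical and a dense retriever.
lexical = ["m3", "m1", "m7"]
dense = ["m1", "m4", "m3"]
fused = reciprocal_rank_fusion([lexical, dense])
print(fused)
```

RRF only needs rank positions, not comparable scores, which is why it is often used when lexical and embedding scores live on different scales. A reranker, if enabled, would then reorder the fused list.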
Benchmark Approach
To address common reproducibility problems in memory-stack benchmarks, the creator applies the following rules:
- Every run stores the exact config files used (profile + rendered)
- Hardware is fully disclosed (CPU, GPU, RAM, model versions)
- Paired comparisons only: same fixture, eval, and top_k across all profiles
- Speed lanes and retrieval quality lanes are kept separate
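
The rules above can be sketched as a small run record plus a pairing check. This is not Pali's benchmark code: the field names, the helper functions, and the example config strings, fixture names, and RAM value are hypothetical, and only the CPU and GPU come from the post.

```python
import hashlib


def make_run_record(profile, rendered_config, fixture, eval_set, top_k, hardware):
    """Bundle what is needed to reproduce one benchmark run."""
    return {
        "profile": profile,
        # Store the exact rendered config, plus a hash for quick comparison.
        "rendered_config": rendered_config,
        "config_sha256": hashlib.sha256(rendered_config.encode("utf-8")).hexdigest(),
        "fixture": fixture,
        "eval_set": eval_set,
        "top_k": top_k,
        "hardware": hardware,
    }


def is_paired(runs):
    """True when every run shares the same fixture, eval set, and top_k."""
    keys = {(r["fixture"], r["eval_set"], r["top_k"]) for r in runs}
    return len(keys) <= 1


# RAM is a placeholder; the post does not state it.
hardware = {"cpu": "Ryzen 9 7950X", "gpu": "RTX 5070", "ram_gb": None}
run_a = make_run_record("sqlite-lexical", "store: sqlite\n", "fixture-v1", "eval-v1", 5, hardware)
run_b = make_run_record("qdrant-ollama", "store: qdrant\n", "fixture-v1", "eval-v1", 5, hardware)
run_c = make_run_record("qdrant-ollama", "store: qdrant\n", "fixture-v1", "eval-v1", 10, hardware)

print(is_paired([run_a, run_b]))  # same fixture, eval set, and top_k
print(is_paired([run_a, run_c]))  # top_k differs, so not comparable
```

Refusing to compare runs that fail the pairing check keeps a change in top_k or fixture from being mistaken for a change in backend quality.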
Performance Numbers
Benchmarks from testing on a Ryzen 9 7950X + RTX 5070:
- sqlite + lexical: 208 store ops/s, Top1=0.32, Recall@5=0.54
- qdrant + ollama (all-minilm): 98 store ops/s, Top1=0.34, Recall@5=0.52
- parser+graph (structured memory stress lane): 2.4 store ops/s. This lane is slow because of structured extraction cost, but it averages ~30 on LoCoMo, with temporal highs of about 40
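
For readers unfamiliar with the two quality metrics, the sketch below shows one standard way to compute Top1 and Recall@k. The post does not spell out Pali's exact definitions (Recall@k is sometimes computed as a simple hit rate instead), so treat these formulas, the function names, and the toy data as assumptions.

```python
def top1_accuracy(results, relevant):
    """Fraction of queries whose first result is a relevant memory."""
    hits = sum(
        1 for qid, ranked in results.items()
        if ranked and ranked[0] in relevant[qid]
    )
    return hits / len(results)


def recall_at_k(results, relevant, k=5):
    """Mean fraction of each query's relevant memories found in the top k."""
    total = 0.0
    for qid, ranked in results.items():
        gold = relevant[qid]
        total += len(gold.intersection(ranked[:k])) / len(gold)
    return total / len(results)


# Toy data: ranked memory IDs per query, and the relevant IDs for each query.
results = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z"]}
relevant = {"q1": {"a", "c"}, "q2": {"y", "w"}}

print(top1_accuracy(results, relevant))     # 0.5
print(recall_at_k(results, relevant, k=5))  # 0.75
```

Under these definitions, Top1=0.32 would mean the first result is relevant for about a third of queries, and Recall@5=0.54 would mean roughly half of the relevant memories appear in the top five.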
Important Clarification
Pali is not LLM memory in the SaaS sense. It returns raw retrieval results you optimize for your own workflow — no black box scoring, no locked provider decisions. You can swap vector backends, embedders, and scorers through config without changing your app contract.
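
The idea of swapping backends through config while keeping the app contract fixed can be illustrated with a small registry. This is a toy sketch, not Pali's API or config schema: the `MemoryStore` protocol, the two backends, the `BACKENDS` registry, and the `"store"` config key are all hypothetical. It also shows tenant scoping, since each search only sees the calling tenant's memories.

```python
from typing import Protocol


class MemoryStore(Protocol):
    """The contract the application codes against (hypothetical)."""

    def add(self, tenant_id: str, text: str) -> None: ...
    def search(self, tenant_id: str, query: str, top_k: int) -> list[str]: ...


class KeywordStore:
    """Toy backend: ranks memories by number of shared lowercase tokens."""

    def __init__(self) -> None:
        self._by_tenant: dict[str, list[str]] = {}

    def add(self, tenant_id: str, text: str) -> None:
        self._by_tenant.setdefault(tenant_id, []).append(text)

    def search(self, tenant_id: str, query: str, top_k: int) -> list[str]:
        terms = set(query.lower().split())
        # Only this tenant's memories are ever considered.
        candidates = self._by_tenant.get(tenant_id, [])
        scored = [(len(terms & set(m.lower().split())), m) for m in candidates]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [m for _, m in ranked[:top_k]]


class RecencyStore(KeywordStore):
    """Toy backend: ignores the query and returns the newest memories."""

    def search(self, tenant_id: str, query: str, top_k: int) -> list[str]:
        return list(reversed(self._by_tenant.get(tenant_id, [])))[:top_k]


BACKENDS = {"keyword": KeywordStore, "recency": RecencyStore}


def build_store(config: dict) -> MemoryStore:
    """Pick a backend from config; callers never name a concrete class."""
    return BACKENDS[config["store"]]()


store = build_store({"store": "keyword"})
store.add("acme", "likes green tea")
store.add("globex", "likes black coffee")
acme_hits = store.search("acme", "likes coffee", top_k=5)
print(acme_hits)  # the globex memory is never visible to acme
```

Changing `{"store": "keyword"}` to `{"store": "recency"}` swaps the backend without touching the calling code, which is the property the post describes.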
Project Status
Version 0.1 was recently released and includes a full benchmark suite. The creator is looking for contributors.
📖 Read the full source: r/LocalLLaMA