OmniRecall Beta: FAISS-Powered Memory Injection for Cloud LLM Chats

What OmniRecall Does
OmniRecall is a local mitmproxy script that intercepts traffic to cloud chat interfaces (tested on DeepSeek). It parses the proprietary SSE fragment stream and bolts a long-term memory layer onto a system that was designed to be stateless.
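As a rough illustration of what reassembling a reply from an SSE stream involves, here is a minimal sketch. The `data:` line framing is standard Server-Sent Events, but the JSON payload shape (an object with a `fragment` string) and the function names are assumptions made for this example; DeepSeek's real format is proprietary and is not described in the post.

```python
import json


def iter_sse_data(raw: str):
    """Yield the payload of every `data:` line in a Server-Sent Events body."""
    for line in raw.splitlines():
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload:
                yield payload


def reconstruct_reply(raw: str) -> str:
    """Join text fragments into one assistant reply.

    ASSUMPTION: each event is a JSON object with a string field "fragment".
    The real stream format is not documented in the source post.
    """
    parts = []
    for payload in iter_sse_data(raw):
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip sentinels or keep-alives that are not JSON
        if not isinstance(event, dict):
            continue
        fragment = event.get("fragment")
        if isinstance(fragment, str):
            parts.append(fragment)
    return "".join(parts)


# Hypothetical captured stream, for illustration only.
sample = (
    'data: {"fragment": "Noted. "}\n\n'
    'data: {"fragment": "[ADD] User prefers Rust"}\n\n'
    'data: end-of-stream\n\n'
)
reply = reconstruct_reply(sample)
print(reply)
```

The point of the sketch is only that the full reply has to be rebuilt from many small events before any memory command in it can be read.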
Technical Mechanism
- Deep-Packet Parsing: Reconstructs the full assistant reply by tracking real-time patches
- Command Control: Detects [ADD], [UPDATE], [REMOVE], [CLEAR] from the AI's output
- Local Brain: Maintains memory.txt plus a FAISS index (sentence-transformers all-MiniLM-L6-v2 embeddings)
- Context Injection: Top recalled facts get force-fed into your next message as [RECALL: ...]
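The command and injection steps in the list above can be sketched roughly as follows. This is not the project's code: the one-command-per-line syntax, the `old -> new` form for [UPDATE], and all function names are assumptions for illustration. To stay dependency-free, the retrieval step uses simple word overlap as a stand-in for the FAISS + MiniLM similarity search the project actually uses.

```python
import re

# ASSUMPTION: one command per line, written as "[ADD] some fact".
COMMAND_RE = re.compile(
    r"^[ \t]*\[(ADD|UPDATE|REMOVE|CLEAR)\][ \t]*(.*)$", re.MULTILINE
)


def parse_commands(reply: str):
    """Return (command, argument) pairs found in the assistant reply."""
    return [(m.group(1), m.group(2).strip()) for m in COMMAND_RE.finditer(reply)]


def apply_commands(memory, commands):
    """Return a new fact list with the commands applied in order."""
    facts = list(memory)
    for name, arg in commands:
        if name == "ADD" and arg and arg not in facts:
            facts.append(arg)
        elif name == "REMOVE" and arg in facts:
            facts.remove(arg)
        elif name == "UPDATE" and " -> " in arg:
            # ASSUMPTION: updates are written as "old fact -> new fact".
            old, new = (part.strip() for part in arg.split(" -> ", 1))
            if old in facts:
                facts[facts.index(old)] = new
            else:
                facts.append(new)
        elif name == "CLEAR":
            facts.clear()
    return facts


def top_facts(memory, query, k=3):
    """Rank facts by shared words (stand-in for embedding search)."""
    query_words = set(re.findall(r"\w+", query.lower()))
    scored = []
    for fact in memory:
        overlap = len(query_words & set(re.findall(r"\w+", fact.lower())))
        if overlap:
            scored.append((overlap, fact))
    scored.sort(key=lambda pair: -pair[0])  # stable: ties keep insertion order
    return [fact for _, fact in scored[:k]]


def inject_recall(message, facts):
    """Prefix the outgoing message with a [RECALL: ...] block."""
    if not facts:
        return message
    joined = "; ".join(facts)
    return f"[RECALL: {joined}]\n{message}"


reply = "Got it.\n[ADD] User prefers Rust over Go\n[ADD] User lives in Lisbon"
memory = apply_commands([], parse_commands(reply))
question = "Which language should I use, Rust or Go?"
prompt = inject_recall(question, top_facts(memory, question))
print(prompt)
```

In a real setup the overlap scoring would be replaced by embedding the query and searching the FAISS index, and where exactly the [RECALL: ...] block is placed in the outgoing message is a guess here.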
Current Status & Limitations
This is a beta/experimental release. The developer notes: "This is the closest I've gotten to the dream after weeks of debugging hell. It is buggy. It is experimental. [ADD] is mostly stable, but [SEARCH] is temperamental—if you want perfection, fix it yourself. I've hit my energy limit on this build."
Upstream UI changes will break it. The developer states: "If it breaks, that's on you now."
Requirements & Setup
Potato-PC Requirements:
- CPU only (faiss-cpu + all-MiniLM-L6-v2)
- No local LLM needed — augments the cloud models you already use
- Zero cost, zero API keys; the memory file and index stay on your machine
How to Deploy:
pip install mitmproxy faiss-cpu sentence-transformers numpy
Trust the mitmproxy CA cert on your OS/browser (run mitmproxy once to generate it). Set system proxy to 127.0.0.1:8080. Then run:
mitmdump -s omnirecall.py
Go to chat.deepseek.com and start feeding it memories.
License Terms
The project uses an aggressively restrictive source-available license:
- No commercial use
- No private forks
- Mandatory public ALTERATIONS.md for any logic changes
- If you port to Claude/GPT-4o/whatever, keep it public per the license
The developer explains: "I've watched too many solo-dev projects get strip-mined, privatized, or turned into paid SaaS while the creator gets zero. This license isn't friendly—it's built to protect the work from exactly those people. If the terms scare you off, that's the point."
📖 Read the full source: r/LocalLLaMA