Testing δ-Mem on Apple Silicon: MLX Implementation and Benchmarks

By OpenClawRadar | Published: May 16, 2026

A Reddit user implemented the δ-mem research paper (arXiv 2605.12357) for Apple Silicon using MLX, with OpenClaw integration. The paper describes a way to steer model attention without adding context or using LoRA, and reports 20% better answers in its tests. The implementation runs Qwen3-4B-Instruct through MLX with custom adapters.

Benchmark results (normalized MLX tests, Qwen3-4B-Instruct on a Mac mini with 64 GB of memory):

Synthetic paper-style: Plain 0.5129, δ-mem 0.5129 (1.00x)
LoCoMo-10 mini: Plain 0.0500, δ-mem 0.1833 (3.67x)
OpenClaw replay: Plain 0.5701, δ-mem 0.6667 (1.17x)
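
The multipliers in parentheses can be rechecked from the raw scores. The short Python sketch below does that using only the numbers listed above; the dictionary layout and the `relative_gain` helper are illustrative names, not part of the user's repository.

```python
# Scores as reported in the list above: (plain, delta_mem).
reported = {
    "Synthetic paper-style": (0.5129, 0.5129),
    "LoCoMo-10 mini": (0.0500, 0.1833),
    "OpenClaw replay": (0.5701, 0.6667),
}


def relative_gain(plain: float, delta_mem: float) -> float:
    """Return the delta-mem score as a multiple of the plain score."""
    return delta_mem / plain


# Round to two decimals to match the precision used in the article.
gains = {
    name: round(relative_gain(plain, dm), 2)
    for name, (plain, dm) in reported.items()
}

for name, (plain, dm) in reported.items():
    # Print the absolute change next to the multiplier for context.
    print(f"{name}: {gains[name]:.2f}x (absolute change {dm - plain:+.4f})")
```

Printing the absolute change alongside the ratio is a deliberate choice: the LoCoMo-10 mini multiplier is large partly because the plain baseline (0.0500) is very low, so the absolute gain is about 0.13.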

Latency costs (vs plain):

Synthetic: 1.013x
LoCoMo-10 mini: 1.33x query / 1.50x total
OpenClaw replay: 1.30x
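
One crude way to weigh these latency costs against the score gains is to divide each quality multiplier by its latency multiplier. This ratio is an assumption made here for illustration, not a metric from the post or the paper, and the sketch uses the total (not query) latency figure for LoCoMo-10 mini.

```python
# (quality multiplier, latency multiplier), both relative to plain,
# taken from the figures reported in the article.
results = {
    "Synthetic": (1.00, 1.013),
    "LoCoMo-10 mini": (3.67, 1.50),  # total latency, not query latency
    "OpenClaw replay": (1.17, 1.30),
}


def gain_per_latency(quality: float, latency: float) -> float:
    """Illustrative ratio: score multiplier divided by latency multiplier."""
    return quality / latency


adjusted = {
    name: round(gain_per_latency(q, lat), 2)
    for name, (q, lat) in results.items()
}

for name, value in adjusted.items():
    print(f"{name}: {value:.2f}")
```

A value below 1 only means the latency multiplier is larger than the score multiplier. Whether that trade is acceptable depends on the workload, so the ratio is best treated as a rough comparison aid.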

Key links:

GitHub repo with adapter: delta-mem-mlx-sidecar-w-openclaw
MLX adapter on Hugging Face: delta-mem-qwen3-4b-instruct-mlx-adapter

Takeaways:

Synthetic probes were flat (1.00x), but LoCoMo-mini showed strong relative gains (3.67x).
OpenClaw-style replay showed a practically meaningful improvement (6/8 → 7/8 probes passed, 1.17x).
The user notes that Apple Silicon does not support CUDA and says this is why the results fall short of the paper's benchmarks. For Qwen3-4B-Instruct, the paper reports an average of 1.10x over the frozen backbone, with 1.31x on MemoryAgentBench and 1.20x on LoCoMo.
The user is seeking help (or roughly $6k in funding) to train an adapter for larger models such as Qwen3.6:27B.

Who it's for: Developers running local LLM agents on Apple Silicon who want to experiment with δ-mem weight modulation to improve memory/context performance.

📖 Read the full source: r/LocalLLaMA
