Galadriel: Open-Source Warm-Cache Harness for Persistent Claude Agents

A Reddit user has open-sourced Galadriel, a harness for persistent Claude agents that achieves 87% cost savings and sub-3s latency on 100K token contexts by optimizing prompt caching. The project, released under MIT license, targets the memory and cost issues often called the "Goldfish Problem" in AI coding agents.
Key Features
- 3-Tier Stacked Caching: Separate cache breakpoints for tool definitions, system prompts (
CLAUDE.md), and trailing conversation history. This avoids cache invalidation across different context segments. - Integrated MemPalace: A vector-based persistent memory system that does not break the prompt cache, enabling permanent recall.
- Privacy-first: Designed for private subnets — no middleman, no message caps, just your API key and rules.
- CLAUDE.md Guidelines (Karpathy-style): Built-in rules to prevent agent bloat (unnecessary context expansion).
Benchmarks
According to the author, tested against OpenClaw/Cursor workflows:
- Cost: $10 for every $100 normally spent (87% reduction).
- Latency: 100K token context drops from 11s to <3s (85% improvement).
Who It's For
Developers running persistent Claude agents for tasks like infrastructure management or codebase maintenance who are paying high API costs due to uncached context.
Setup
The harness is currently customized for Discord (the author's personal setup), but the caching logic is generic. Clone the repo and adapt the transport layer for your needs.
GitHub
github.com/avasol/galadriel-public (MIT License)
📖 Read the full source: r/openclaw
👀 See Also

Gemma Gem: On-Device AI Agent for Browser Automation via WebGPU
Gemma Gem is a Chrome extension that runs Google's Gemma 4 model (2B or 4B) entirely on-device using WebGPU, with no API keys or cloud dependencies. It provides tools to read page content, take screenshots, click elements, type text, scroll, and run JavaScript through a chat interface.

Banker Creates Credit Due Diligence Tool with 31 AI Prompts Using Only Claude
A banker with 17 years of MSME credit underwriting experience in India has open-sourced 31 AI prompts that transform a visiting card into a full credit due diligence report, compressing a 3-4 week process to 30 minutes. The tool was built entirely through conversation with Claude, without writing any code.

OpenClaw Agent Memory Plugin: Persistent Context Across Sessions
A developer built a memory layer plugin for OpenClaw that injects relevant context from past conversations before each turn and stores new facts and events after each turn, solving the problem of agents forgetting everything between sessions.

Claude Workflow Library Now Tracks and Rates Reddit- Sourced Workflows Automatically
A searchable, auto-updated index of Claude and Claude Code workflows from major subreddits, with steps, artifacts, and community ratings.