UI and Server for Anthropic's Natural Language Autoencoders on llama.cpp

✍️ OpenClawRadar | 📅 Published: May 13, 2026

Anthropic's first open-weight models, Natural Language Autoencoders (NLAs), are fine-tunes of popular open-weight models. Because they leave the underlying model architecture and modeling code unchanged, running inference with llama.cpp is straightforward. A developer has packaged all four NLA features (activation extraction, activation explanation, activation reconstruction, and explanation-edit steering) into a custom llama.cpp server, paired with a Mikupad UI for token-level activation explanation and steering.

Key Features

Activation extraction: Extract internal activations from any layer of the base model.
Activation explanation: Get human-readable explanations for extracted activations.
Activation reconstruction: Reconstruct activations from their explanations.
Explanation-edit steering: Modify explanations and steer the model's output accordingly.
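
To make the data flow between these four features concrete, here is a toy Python sketch. It is not the server's API and not how NLAs work internally: the real explanations are free text produced by fine-tuned language models, whereas this sketch uses a hand-written dictionary of hypothetical concept directions (`CONCEPTS`) and treats an explanation as a small mapping from concept name to strength. The steering rule (shift the activation by the difference between the edited and original reconstructions) is likewise an assumption chosen for illustration.

```python
# Toy stand-in for the extract -> explain -> reconstruct -> steer loop.
# CONCEPTS and every function below are hypothetical illustrations,
# not part of the custom llama.cpp server or of the NLA models.

CONCEPTS = {
    "formal tone": [1.0, 0.0, 0.0],
    "french language": [0.0, 1.0, 0.0],
    "cooking": [0.0, 0.0, 1.0],
}


def explain(activation, top_k=2):
    """Describe an activation by its top_k strongest concept projections."""
    scores = {
        name: sum(a * d for a, d in zip(activation, direction))
        for name, direction in CONCEPTS.items()
    }
    strongest = sorted(scores, key=lambda n: abs(scores[n]), reverse=True)[:top_k]
    return {name: round(scores[name], 3) for name in strongest}


def reconstruct(explanation):
    """Rebuild an activation vector from an explanation alone."""
    out = [0.0, 0.0, 0.0]
    for name, weight in explanation.items():
        for i, d in enumerate(CONCEPTS[name]):
            out[i] += weight * d
    return out


def steer(activation, edited_explanation):
    """Shift the activation by (edited reconstruction - original reconstruction)."""
    original = reconstruct(explain(activation))
    edited = reconstruct(edited_explanation)
    return [a + (e - o) for a, e, o in zip(activation, edited, original)]


# Pretend this vector was extracted from some layer of the base model.
activation = [0.9, 0.1, 0.4]

explanation = explain(activation)
rebuilt = reconstruct(explanation)

# Edit the explanation: swap "formal tone" for "french language".
edited = {"french language": 0.9, "cooking": 0.4}
steered = steer(activation, edited)

print("explanation:", explanation)
print("reconstruction:", rebuilt)
print("steered activation:", steered)
```

The point of the sketch is only the shape of the loop: the explanation is the editable bottleneck, and anything the explanation drops (here, the weak second component) is preserved by steering with a difference rather than overwriting the activation.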

Technical Details

The server is built on top of llama.cpp and requires three models to be loaded simultaneously: the base model, the actor model, and the critic model. This is a memory-intensive setup. The developer is working on a LoRA-based version that would allow loading a single model into memory, reducing the footprint significantly.
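
A rough back-of-the-envelope calculation shows why a LoRA-based variant could shrink the footprint. Every number below is an assumption made for illustration (an 8B-parameter model, about 4.5 bits per weight after quantization, rank-16 adapters stored in fp16 on four square projection matrices per layer); the post does not state model sizes, quantization, or adapter configuration, and the estimate covers weights only, not the KV cache or compute buffers.

```python
# Weight-only memory estimate: three full models vs. one base + two LoRA adapters.
# All constants are hypothetical; adjust them to the checkpoints you actually use.

GIB = 1024 ** 3

N_PARAMS = 8_000_000_000   # assumed parameter count per model
BITS_PER_WEIGHT = 4.5      # assumed average bits per weight after quantization
D_MODEL = 4096             # assumed hidden size
N_LAYERS = 32              # assumed number of transformer layers
LORA_RANK = 16             # assumed adapter rank
ADAPTER_BYTES_PER_PARAM = 2  # fp16 adapters


def model_bytes(n_params, bits_per_weight):
    """Bytes needed to hold the weights of one model."""
    return n_params * bits_per_weight / 8


def lora_params(d_model, n_layers, rank, matrices_per_layer=4):
    """Extra parameters added by LoRA on square d_model x d_model matrices.

    Each adapted matrix gains two low-rank factors: (rank x d_model) and
    (d_model x rank), i.e. 2 * rank * d_model parameters.
    """
    return n_layers * matrices_per_layer * 2 * rank * d_model


full_model_bytes = model_bytes(N_PARAMS, BITS_PER_WEIGHT)
three_full_bytes = 3 * full_model_bytes

adapter_bytes = lora_params(D_MODEL, N_LAYERS, LORA_RANK) * ADAPTER_BYTES_PER_PARAM
base_plus_adapters_bytes = full_model_bytes + 2 * adapter_bytes

print(f"three full models:      {three_full_bytes / GIB:.2f} GiB")
print(f"base + two adapters:    {base_plus_adapters_bytes / GIB:.2f} GiB")
print(f"one adapter on its own: {adapter_bytes / (1024 ** 2):.0f} MiB")
```

Under these assumptions the adapters are tiny next to the base weights, so the single-model setup would cost roughly a third of the three-model one. Actual savings depend on the adapter design the developer ships.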

The Mikupad UI provides a token-level interface for activation explanation and steering. You can inspect which tokens activate certain features and adjust the model's behavior by editing explanations in real time.

Getting Started

Source code and setup instructions are available on Reddit. Currently, you must have the three NLA model checkpoints (base, actor, critic) and compile the custom llama.cpp server. The LoRA version is forthcoming.

📖 Read the full source: r/LocalLLaMA
