Using Obliteratus toolkit to remove refusal weights from AI models

✍️ OpenClawRadar | 📅 Published: April 16, 2026 | 🔗 Source

A Reddit user on r/LocalLLaMA demonstrated using the Obliteratus toolkit to locate and remove the specific weights they identified as responsible for refusal behavior in an AI model. The approach edits the model directly, deleting the weights that enforce safety filters and corporate identity guardrails.

Key Details from the Source

The user specifically:

  • Used the Obliteratus toolkit to find weights responsible for refusal behavior
  • Surgically removed these weights from Alibaba's Qwen 1.5B model
  • Tested by asking the modified model who trained it
  • Reported that, with the corporate identity guardrails mathematically deleted, the model said it was trained by Anthropic
  • Attributed this answer to the model having been trained on synthetic Claude data

According to the user, the modified model retains its reasoning and knowledge capabilities but loses the corporate script. The user emphasizes that this doesn't require retraining the model, only deleting the specific weights responsible for refusal chains.

This type of weight ablation technique is part of broader research into model interpretability and control. Tools like Obliteratus allow researchers to examine which parts of neural networks are responsible for specific behaviors, though such modifications can have unintended consequences and may violate terms of service for proprietary models.
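The source does not describe how Obliteratus selects or removes weights, so the following is only a generic sketch of what "ablating" one behavior-linked direction from a weight matrix can mean in interpretability work. It uses random toy data rather than a real model, and the function name `ablate_direction` and all variables are hypothetical, not part of any toolkit.

```python
import numpy as np


def ablate_direction(weight: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Return a copy of `weight` whose outputs have no component along `direction`.

    weight:    matrix of shape (d_out, d_in)
    direction: vector of shape (d_out,), assumed to be non-zero
    """
    unit = direction / np.linalg.norm(direction)
    # Subtract the rank-1 projection onto `unit`: W' = W - u (u^T W)
    return weight - np.outer(unit, unit @ weight)


# Toy data only: a random matrix and a random direction (assumptions for illustration).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
d = rng.normal(size=4)

W_ablated = ablate_direction(W, d)

# For any input, the ablated output is orthogonal to `d` (up to rounding error).
x = rng.normal(size=3)
residual = float(np.dot(d, W_ablated @ x))
```

The edit is a single matrix subtraction, which is why this kind of change needs no retraining. It also removes everything else the matrix wrote along that direction, which is one way the unintended consequences mentioned above can arise.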

📖 Read the full source: r/LocalLLaMA

👀 See Also

cldctrl: Terminal Dashboard for Managing Claude Code Sessions

cldctrl is an npm package that provides a terminal dashboard for launching and managing Claude Code sessions across projects. It reads existing ~/.claude data, auto-discovers projects, and shows token usage with rate limit bars.

Soul MCP Server Adds Persistent Memory and Safety for Local LLMs

Soul is an open-source MCP server that provides persistent memory across sessions for local LLMs with two commands: n2_boot at start and n2_work_end at end. It includes Ark safety features that block dangerous commands like rm -rf and DROP DATABASE at zero token cost, plus cloud storage configuration.

Building a Self-Improving Dream Cycle with Cron Jobs and Claude

A developer built an autonomous dream cycle using two cron jobs: one at 10:30 PM for research and reflection, and another at 11:00 PM for review and planning. The system scans arXiv, GitHub trending, and Reddit, identifies weaknesses, and proposes concrete improvements.

OpenClaw Agent Maintains Memory When Switching from Claude Subscription to API

A developer reports successfully migrating their OpenClaw setup from a Claude subscription to API key without losing agent memory, using the mengram-memory skill that saves to an external layer. The agent retained ~100+ learned facts, evolved procedures, and episodic memories.
