Merlin Research releases Qwen3.5-4B-Safety-Thinking model for structured reasoning

Merlin Research has released Qwen3.5-4B-Safety-Thinking, a 4-billion-parameter, safety-aligned reasoning model built on Qwen3.5. The model is designed for structured 'thinking' and safety-sensitive use in real-world scenarios, with a particular focus on agent systems.
Key improvements and features
- Follows strict prompt instructions more accurately
- Built using Anthropic's Bloom and Petri frameworks
- Resistant to hacking attempts
- Increased resistance to 'abnormal' and adversarial prompts
- Context window of up to 1 million tokens
The model is available on Hugging Face at MerlinSafety/Qwen3.5-4B-Safety-Thinking.
For developers working with AI agents, this model is a specialized option for safety-critical applications where structured reasoning and resistance to prompt manipulation are priorities. Bloom and Petri are Anthropic's open-source tools for automated behavioral evaluation and auditing, so their use here suggests an evaluation-driven approach to alignment.
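
In an agent pipeline, the reasoning trace is typically separated from the final answer before anything acts on it. The following is a minimal sketch of that step, assuming the model wraps its reasoning in `<think>...</think>` tags as earlier Qwen thinking models do. The source does not confirm this model's output format, and `split_thinking` is a hypothetical helper, not part of any library.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think> tags, as in earlier
# Qwen thinking models. The source does not confirm this model's output format.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(raw: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from raw model output.

    Hypothetical helper for illustration only.
    """
    # Collect every reasoning block, trimmed, one per line.
    reasoning = "\n".join(block.strip() for block in THINK_BLOCK.findall(raw))
    # Whatever remains after removing the blocks is treated as the answer.
    answer = THINK_BLOCK.sub("", raw).strip()
    return reasoning, answer


if __name__ == "__main__":
    sample = "<think>The user asks for 2+2. Safe request.</think>\nThe answer is 4."
    reasoning, answer = split_thinking(sample)
    print("reasoning:", reasoning)
    print("answer:", answer)
```

Keeping the trace separate lets an agent log or inspect the reasoning while passing only the answer to downstream tools. If the model turns out to use a different delimiter, only the `THINK_BLOCK` pattern would need to change.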
📖 Read the full source: r/LocalLLaMA