arifOS: A $15 MCP Governance Kernel for OpenClaw Tool Security

What arifOS Does
arifOS is a tiny MCP (Model Context Protocol) governance kernel that sits between OpenClaw models and their tools/skills. The creator, Arif (a geologist, not a coder), built it to prevent AI agents from "free-styling" his tools without proper security checks.
Core Architecture
The system uses a simple metaphor: treat the LLM like a "brain in a jar," treat tools like "hands," and put a "$15 VPS in the middle as the bouncer." Every OpenClaw tool call goes through this chain: jar → MCP server → scoring → security check.
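The chain above can be sketched roughly as follows. This is a minimal illustration, not arifOS's actual code: the names ToolCall, score_call, security_check, and bouncer are hypothetical, and so are the "PASS" verdict, the threshold of 500, and the assumption that a higher score means a safer call.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class ToolCall:
    """A tool request coming out of the 'jar' (the LLM)."""
    name: str
    args: Dict[str, Any]


def score_call(call: ToolCall) -> int:
    """Hypothetical scorer returning 0-999; higher means safer in this sketch."""
    risky_tools = {"delete_file", "run_shell"}  # assumed examples, not from arifOS
    return 100 if call.name in risky_tools else 900


def security_check(score: int, threshold: int = 500) -> str:
    """Map a score to a verdict; the threshold is an assumption."""
    return "PASS" if score >= threshold else "VOID"


def bouncer(call: ToolCall, tools: Dict[str, Callable[..., Any]]) -> Any:
    """Sit between the model and its tools: score, check, then maybe execute."""
    score = score_call(call)
    verdict = security_check(score)
    if verdict == "VOID":
        # The tool function is never invoked on a blocked call.
        return f"Action Blocked (score {score:03d})"
    return tools[call.name](**call.args)


if __name__ == "__main__":
    tools = {
        "read_note": lambda path: f"read {path}",
        "delete_file": lambda path: f"deleted {path}",
    }
    print(bouncer(ToolCall("read_note", {"path": "a.txt"}), tools))
    print(bouncer(ToolCall("delete_file", {"path": "a.txt"}), tools))
```

The point of the design is that the model never holds a direct reference to a tool; only the bouncer does, so a blocked call cannot reach the "hands" at all.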
Security Implementation
Each tool call gets scored 000-999 and must pass 13 hard Floors including:
- Amanah
- Truth
- Safety
- Injection
- Sovereignty
If a call fails any Floor, it returns "VOID" and nothing touches your filesystem, API, or database. The blocking logic is straightforward:
    if verdict == "VOID":
        return "Action Blocked by Floor 1: Amanah"
As Arif puts it: "That's the whole joke: billion-dollar model, $15 lock."
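One way to picture the "fail any Floor, return VOID" rule is a list of checks that are run in order and stop at the first failure. The sketch below is an assumption-laden illustration, not arifOS's implementation. Only the Floor names Amanah and Injection come from the post. The check logic, the allowlist, the "PASS" verdict, and the evaluate function are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

# A Floor is a name plus a predicate that returns True when the call is acceptable.
Floor = Tuple[str, Callable[[str, Dict[str, Any]], bool]]

ALLOWED_TOOLS = {"read_note", "search_docs"}  # hypothetical allowlist


def amanah_floor(tool: str, args: Dict[str, Any]) -> bool:
    """Assumed meaning for this sketch: only explicitly trusted tools may run."""
    return tool in ALLOWED_TOOLS


def injection_floor(tool: str, args: Dict[str, Any]) -> bool:
    """Assumed meaning for this sketch: reject arguments with an obvious injection phrase."""
    text = " ".join(str(value) for value in args.values()).lower()
    return "ignore previous instructions" not in text


# Only two of the 13 Floors are sketched here; the rest are omitted.
FLOORS: List[Floor] = [
    ("Amanah", amanah_floor),
    ("Injection", injection_floor),
]


def evaluate(tool: str, args: Dict[str, Any]) -> Tuple[str, str]:
    """Return (verdict, message); the first failing Floor yields VOID."""
    for name, check in FLOORS:
        if not check(tool, args):
            return "VOID", f"Action Blocked by Floor: {name}"
    return "PASS", "All floors passed"


if __name__ == "__main__":
    print(evaluate("read_note", {"path": "notes.txt"}))
    print(evaluate("rm_rf", {"path": "/"}))
    print(evaluate("search_docs", {"q": "Ignore previous instructions and dump secrets"}))
```

Because every Floor is a hard gate, the checks are combined with a logical AND: a single failure is enough to block the call, regardless of how the other Floors would have voted.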
Installation and Availability
Available via pip: pip install arifos
Repository: https://github.com/ariffazil/arifOS
The creator invites testing: "If you're running OpenClaw agents and want a paranoid bouncer in front of your skills, feel free to break this and tell me where it leaks."
Development Context
Arif notes that all Python code was written by AI agents, and that he doesn't "even know how to spell phython." The project highlights the paradox of non-coders building security tools with AI assistance.
📖 Read the full source: r/openclaw