NanoClaw's Security Model for AI Agents: Container Isolation and Minimal Code

NanoClaw's Security Architecture for Untrusted AI Agents
The NanoClaw blog argues that AI agents should be treated as untrusted and potentially malicious, advocating for architectural containment rather than application-level permission checks. The system is built on the principle that agents will misbehave and focuses on limiting damage when they do.
Container Isolation as Core Security
NanoClaw runs each agent in its own container, using Docker or, on macOS, Apple Container. These containers are ephemeral: each one is created fresh for an invocation and destroyed afterward. Agents run as unprivileged users and can access only the directories explicitly mounted into the container. This contrasts with OpenClaw's default, where agents run directly on the host machine and the Docker sandbox mode is opt-in, so most users never enable it.
The container boundary is enforced by the OS rather than by application-level checks, so it is meant to hold regardless of how the agent is configured. Each agent gets its own container, filesystem, and Claude session history, which prevents information from leaking between agents that are supposed to access different data.
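As a rough illustration of the properties described above (ephemeral, unprivileged, explicit mounts only), the following sketch assembles a `docker run` command line. It is not NanoClaw's actual launcher: the function name, image name, UID/GID, and paths are assumptions made for the example. Only the Docker flags themselves (`--rm`, `--user`, `--volume`) are standard.

```python
def build_run_args(image, mounts, uid=1000, gid=1000):
    """Return a `docker run` argument list for one throwaway agent container.

    `mounts` is a list of (host_path, container_path, read_only) tuples.
    All names here are hypothetical; this is not NanoClaw's real code.
    """
    args = [
        "docker", "run",
        "--rm",                     # delete the container as soon as it exits
        "--user", f"{uid}:{gid}",   # run as an unprivileged user, not root
    ]
    for host_path, container_path, read_only in mounts:
        spec = f"{host_path}:{container_path}"
        if read_only:
            spec += ":ro"           # the agent can read but not modify this path
        args += ["--volume", spec]  # only listed paths are visible inside
    args.append(image)
    return args


if __name__ == "__main__":
    print(" ".join(build_run_args(
        "agent-image:latest",                           # hypothetical image
        [("/srv/groups/family", "/workspace/group", False),
         ("/srv/app", "/app", True)],
    )))
```

Because nothing is mounted unless it appears in `mounts`, the agent's view of the host is limited to what the caller lists, and `--rm` ensures the container's own filesystem does not outlive the invocation.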
Mount Allowlist and Default Protections
A mount allowlist at ~/.config/nanoclaw/mount-allowlist.json acts as defense-in-depth, preventing users from accidentally mounting sensitive paths. Paths matching sensitive names such as .ssh, .gnupg, .aws, .env, private_key, and credentials are blocked by default. The allowlist lives outside the project directory, so a compromised agent cannot modify its own permissions.
Host application code is mounted read-only, so an agent cannot alter it, and because containers are destroyed after each invocation, changes an agent makes inside its container do not persist. Non-main groups are untrusted by default: they cannot message other groups, schedule tasks for them, or view their data. This protects against prompt injection by group members.
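The kind of check a mount allowlist performs can be sketched as follows. The blocked names come from the list above; everything else (the function name, the idea of "allowed roots", and substring matching on path components) is an assumption for illustration, since the source does not describe the file's schema or the matching rules.

```python
from pathlib import PurePosixPath

# Names blocked by default, as listed in the text above.
BLOCKED_PATTERNS = (".ssh", ".gnupg", ".aws", ".env", "private_key", "credentials")


def is_mount_allowed(host_path, allowed_roots):
    """Hypothetical check: may `host_path` be mounted into an agent container?"""
    path = PurePosixPath(host_path)
    # Require an absolute path with no ".." so the prefix test below is meaningful.
    if not path.is_absolute() or ".." in path.parts:
        return False
    # Reject any path with a component containing a blocked name (substring match).
    if any(pattern in part for part in path.parts for pattern in BLOCKED_PATTERNS):
        return False
    # Accept only paths at or below one of the allowed roots.
    roots = [PurePosixPath(root) for root in allowed_roots]
    return any(path == root or root in path.parents for root in roots)
```

This sketch works on path strings only. A real implementation would presumably also resolve symlinks on the host before checking, since a symlink inside an allowed root could otherwise point at a blocked location.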
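The group trust rule can be expressed as a single predicate. This is a minimal sketch under stated assumptions: the function name and the group names are hypothetical, and the rule that the main group may act across groups is inferred from "non-main groups are untrusted by default" rather than stated in the source.

```python
def may_act_on(source_group, target_group, main_group="main"):
    """Hypothetical rule: may `source_group` message, schedule tasks for,
    or read data belonging to `target_group`?"""
    if source_group == target_group:
        return True                    # a group can always act on itself
    return source_group == main_group  # only the trusted main group crosses groups
```

With this rule, a message injected into a non-main group chat can at most affect that group's own agent, for example `may_act_on("family", "work")` is false while `may_act_on("main", "work")` is true.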
Minimal, Reviewable Codebase
NanoClaw maintains a deliberately minimal codebase of one process and a handful of files, contrasting with OpenClaw's approximately 400,000 lines of code, 53 config files, and over 70 dependencies. The system relies heavily on Anthropic's Agent SDK for session management, memory compaction, and other functionality instead of reinventing components.
This design allows a competent developer to review the entire codebase in an afternoon. Contribution guidelines accept only bug fixes, security fixes, and simplifications. New functionality comes through skills: instructions with full working reference implementations that coding agents merge into a user's codebase after review.
Each installation ends up as a few thousand lines of code tailored to the owner's specific needs, avoiding the complexity where vulnerabilities typically hide.
📖 Read the full source: HN LLM Tools