KnightClaw: Local Security Extension for OpenClaw Agents

✍️ OpenClawRadar📅 Published: February 23, 2026🔗 Source
KnightClaw: Local Security Extension for OpenClaw Agents
Ad

KnightClaw is a security extension designed to protect OpenClaw AI coding agents from adversarial prompts. The tool addresses a specific threat model where a single malicious message in the context window can cause an agent to follow attacker instructions instead of user commands.

Core Features

KnightClaw operates as a drop-in extension with no configuration required, no API keys, and no cloud dependency. It intercepts every message before it reaches the agent.

Detection System

The guard uses an 8-layer hybrid detection approach:

  • Regex patterns
  • Homoglyph detection
  • Boundary token analysis
  • Perplexity scoring
  • Entropy analysis
  • Heuristics
  • Semantic embeddings (using a local, quantized BGE model)

Blocks occur in microseconds.

Ad

Additional Security Measures

  • Egress redaction: Strips secrets from outbound responses before they leave the agent
  • Hash-chained audit logs: Tamper-proof, append-only logs with full timeline of every block, allow, and config change
  • Velocity circuit breaker: 10 blocks in 60 seconds triggers automatic lockdown with no manual intervention
  • Kill switch: One command stops everything: openclaw knight lockdown on

Technical Details

The extension runs entirely local with zero telemetry and is MIT licensed. The source is available for testing and contribution.

📖 Read the full source: r/openclaw

Ad

👀 See Also