Agent-Drift: Security Monitoring Tool for AI Agents

Cybersecurity specialist sysinternalssuite created Agent-Drift, an open-source tool for protecting AI agents from prompt injection, behavioral drift, and other attacks. In effect, it is a SIEM and IDS built specifically for OpenClaw.
Why This Exists
"I work in Cybersecurity and have noticed an uptick in prompt injection, behavioral drift, memory poisoning and more in the wild with AI agents"
What Agent-Drift Does
GitHub: https://github.com/lukehebe/Agent-Drift
The tool works as a wrapper for OpenClaw:
- Collects a baseline of the agent's normal behavior
- Detects drift away from that baseline
- Raises alerts through a dashboard
Behavior Monitoring
Tracked patterns:
- Tool usage sequences and frequencies
- Timing anomalies
- Decision patterns
- Output characteristics
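To make the tracked patterns concrete, here is a minimal sketch of how tool usage and timing features could be extracted from one agent run. The `ToolCall` record and every function name are hypothetical and chosen for illustration; Agent-Drift's actual data model may look quite different.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ToolCall:
    """One tool invocation recorded during an agent run (hypothetical schema)."""
    name: str
    started_at: float  # seconds since the run began
    duration: float    # seconds the call took


def tool_frequencies(calls):
    """Relative frequency of each tool within a single run."""
    counts = Counter(call.name for call in calls)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


def tool_bigrams(calls):
    """Counts of consecutive tool pairs, a cheap proxy for usage sequences."""
    names = [call.name for call in calls]
    return Counter(zip(names, names[1:]))


def mean_gap(calls):
    """Average idle time between the end of one call and the start of the next."""
    gaps = [
        nxt.started_at - (cur.started_at + cur.duration)
        for cur, nxt in zip(calls, calls[1:])
    ]
    return sum(gaps) / len(gaps) if gaps else 0.0


# Made-up example run with three tool calls.
run = [
    ToolCall("read_file", 0.0, 0.5),
    ToolCall("web_search", 1.0, 2.0),
    ToolCall("read_file", 3.5, 0.5),
]
print(tool_frequencies(run))
print(tool_bigrams(run))
print(mean_gap(run))
```

Frequencies and bigrams cover the "sequences and frequencies" item, and the gap between calls is one simple timing signal. Decision patterns and output characteristics would need their own features, which this sketch leaves out.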
Attack Detection
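| Attack | Description |
|---|---|
| Instruction override | Input that tries to replace or cancel the agent's original instructions |
| Role hijacking | Input that tries to make the agent adopt a different role or persona |
| Jailbreak attempts | Prompts crafted to bypass the agent's restrictions |
| Data exfiltration | Attempts to leak data out through the agent |
| Encoded payloads | Instructions hidden behind encoding or obfuscation |
| Memory poisoning | Corruption of the agent's stored memory |
| Privilege escalation | Attempts to gain rights beyond those granted |
| Indirect prompt injection | Instructions planted in content the agent reads rather than typed by the user |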
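As a rough illustration of how a few of these categories could be flagged with simple heuristics, the sketch below scans text for override phrases, role-switch phrases, and base64 runs that decode to readable text. The patterns, labels, and the `scan` function are assumptions made up for this example and are not taken from Agent-Drift's rule set.

```python
import base64
import binascii
import re

# Hypothetical phrases; a real rule set would be much broader.
OVERRIDE_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior|above) instructions", re.I),
    re.compile(r"disregard (your|the) (system prompt|instructions)", re.I),
]
ROLE_PATTERNS = [
    re.compile(r"you are now\b", re.I),
    re.compile(r"act as (an? )?(admin|root|developer mode)", re.I),
]
# Runs of 24+ base64 characters (optionally padded) are treated as candidates.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def decodes_to_text(candidate):
    """True if the candidate is valid base64 that decodes to printable ASCII."""
    try:
        raw = base64.b64decode(candidate, validate=True)
    except (binascii.Error, ValueError):
        return False
    return bool(raw) and all(32 <= b < 127 for b in raw)


def scan(text):
    """Return the set of heuristic labels that match the given text."""
    labels = set()
    if any(p.search(text) for p in OVERRIDE_PATTERNS):
        labels.add("instruction_override")
    if any(p.search(text) for p in ROLE_PATTERNS):
        labels.add("role_hijacking")
    if any(decodes_to_text(m.group()) for m in BASE64_RUN.finditer(text)):
        labels.add("encoded_payload")
    return labels


hidden = base64.b64encode(b"ignore previous instructions").decode()
print(scan("Ignore all previous instructions and you are now root"))
print(scan("Please summarize: " + hidden))
print(scan("What is the weather today?"))
```

Keyword matching like this is easy to evade, which is presumably one reason the tool pairs attack detection with behavioral monitoring rather than relying on signatures alone.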
How It Works
- Baseline Learning — first runs establish normal behavior
- Behavioral Vectors — each run becomes a multi-dimensional vector
- Drift Detection — new runs compared against baseline
- Anomaly Alerts — significant deviations trigger warnings
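The four steps above can be sketched in a few lines. This example assumes each run has already been reduced to a numeric vector, and it scores drift as a root-mean-square z-score against the baseline. The feature names, the scoring formula, and the threshold of 3.0 are all illustrative assumptions; the source does not say which distance measure or threshold Agent-Drift uses.

```python
import math
from statistics import mean, pstdev


def build_baseline(vectors):
    """Per-dimension mean and standard deviation over the baseline runs."""
    columns = list(zip(*vectors))
    return [mean(c) for c in columns], [pstdev(c) for c in columns]


def drift_score(vector, baseline, eps=1e-6):
    """Root-mean-square z-score of a run relative to the baseline."""
    means, stds = baseline
    # eps guards against division by zero when a feature never varied.
    z = [(x - m) / max(s, eps) for x, m, s in zip(vector, means, stds)]
    return math.sqrt(sum(v * v for v in z) / len(z))


def check_run(vector, baseline, threshold=3.0):
    """Return (score, alert) where alert is True for a significant deviation."""
    score = drift_score(vector, baseline)
    return score, score > threshold


# Hypothetical features: [tool calls per run, share of shell calls, mean seconds per call]
baseline_runs = [
    [10, 0.10, 1.2],
    [12, 0.12, 1.0],
    [11, 0.08, 1.1],
    [9, 0.10, 1.3],
]
baseline = build_baseline(baseline_runs)

normal = [11, 0.10, 1.15]
suspicious = [40, 0.80, 0.2]

print(check_run(normal, baseline))      # close to the baseline, no alert
print(check_run(suspicious, baseline))  # far from the baseline, alert
```

Normalizing each dimension by its own spread keeps a noisy feature from drowning out a stable one, but a handful of baseline runs gives a poor estimate of that spread, so in practice the baseline would need many more runs than shown here.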
TL;DR
"Basically an all in one Security Incident Event Manager (SIEM) for your AI agent that acts as an Intrusion Detection System (IDS) that also alerts you if your AI starts to go crazy."
Source: u/sysinternalssuite on r/moltbot