Running NemoClaw with Local vLLM: Setup Notes and Agent Engineering Observations

Local NemoClaw Setup with vLLM
A developer shared their experience running NVIDIA's NemoClaw, a sandboxed AI agent platform, with a local Nemotron 9B v2 model using vLLM on WSL2. The setup is based on jieunl24's fork of NemoClaw.
Key Technical Details
Inference Routing: NemoClaw's inference routing follows a clean path: inference.local → gateway → vLLM. However, initial onboarding bugs required a 3-layer network hack that has since been fixed via PR #412.
Parser Compatibility: The built-in vLLM parsers (qwen3_coder, nemotron_v3) are incompatible with Nemotron v2 models. You need NVIDIA's official plugin parsers from the NeMo repository instead.
Agent Engineering Gap: OpenClaw as an agent platform provides solid infrastructure but ships with minimal prompt engineering. The gap between "model serves text" and "agent does useful work" is primarily about scaffolding rather than model capability limitations.
Resources
- Blog post covering architecture, vLLM parser setup, and agent engineering observations: https://github.com/soy-tuber/nemoclaw-local-inference-guide/blob/master/BLOG-openclaw-agent-engineering.md
- Setup guide (V2) with inference.local routing and no network hacks: https://github.com/soy-tuber/nemoclaw-local-inference-guide
- Original NemoClaw issue #315: https://github.com/NVIDIA/NemoClaw/issues/315
This setup demonstrates practical local deployment of AI agent platforms, highlighting both the technical implementation details and the ongoing challenges in agent engineering.
📖 Read the full source: r/LocalLLaMA
👀 See Also
MartinLoop: Open-Source Control Plane for AI Coding Agents with Budget Stops and Audit Trails
MartinLoop is an open-source control plane that adds hard budget stops, JSONL audit trails, failure classification, and test-verified completion checks to AI coding agents.

Google Releases Sashiko: AI Code Review Agent for Linux Kernel Patches
Google engineers have open-sourced Sashiko, an agentic AI code review system designed for the Linux kernel. It found 53% of bugs in an unfiltered set of 1,000 recent upstream issues that were missed by human reviewers.

Security scanning skill for AI coding agents checks deployments automatically
A developer created a skill file that enables AI coding agents to automatically scan their own deployments for exposed .env files, open ports, missing security headers, and leaked source code. The scan runs after every deploy and takes about 30 seconds.

Rival-Review: A Cross-Model Review Loop for AI Agent Plans
Rival-review is an MIT-licensed tool that uses a second AI model to audit plans from a primary AI coding agent before execution, catching issues like flawed rollback plans, security holes, and stale-state decisions.