Run Qwen 3.5 9B Agents Locally on RTX 5070: Full Guide

Developer /u/TheOnlyVibemaster has published Hollow AgentOS, a self-modifying agent system that runs locally on an RTX 5070 (or, more slowly, on CPU on any laptop). The author claims it cuts Claude API usage in half. The system runs 24/7. When idle, it reviews its own source files, proposes improvements, and implements them after a 2/3 majority vote among all agents.

How it works

The core insight: given enough time in a loop of iterative testing and self-improvement, the author found Qwen 3.5 9B to be just as useful as Claude Code for many tasks. The agent proposes code, writes it, tests it, checks the results, edits, and repeats indefinitely. The author states: "It becomes a time issue, not a model issue."
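The propose, write, test, check, edit cycle described above can be sketched in a few lines. This is a minimal illustration, not Hollow AgentOS's actual code: `refine_until_passing`, `toy_generate`, and `toy_check` are hypothetical names, the generator stands in for a call to a local model, and the `max_rounds` cap is an assumption added so the example terminates (the source describes the loop as repeating indefinitely).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    code: str
    passed: bool
    feedback: str


def refine_until_passing(
    task: str,
    generate: Callable[[str, Optional[Attempt]], str],
    check: Callable[[str], tuple[bool, str]],
    max_rounds: int = 50,  # assumption: a cap so the sketch always terminates
) -> list[Attempt]:
    """Repeat propose -> test -> edit until the check passes or rounds run out."""
    history: list[Attempt] = []
    previous: Optional[Attempt] = None
    for _ in range(max_rounds):
        code = generate(task, previous)   # propose new code, or edit the last attempt
        passed, feedback = check(code)    # run the tests and collect the result
        previous = Attempt(code, passed, feedback)
        history.append(previous)
        if passed:
            break
    return history


def toy_generate(task: str, previous: Optional[Attempt]) -> str:
    # Hypothetical stand-in for a local model: the first guess is wrong,
    # and any later guess is the corrected version.
    if previous is None:
        return "def add(a, b): return a - b"
    return "def add(a, b): return a + b"


def toy_check(code: str) -> tuple[bool, str]:
    # Hypothetical stand-in for a test run: execute the code and try one case.
    namespace: dict = {}
    exec(code, namespace)
    ok = namespace["add"](2, 3) == 5
    return ok, ("ok" if ok else "add(2, 3) should be 5")


history = refine_until_passing("write add(a, b)", toy_generate, toy_check)
```

With the toy stand-ins, the first attempt fails its check and the second passes, so the loop stops after two rounds. A weaker model simply needs more rounds, which is the sense in which the author calls it "a time issue, not a model issue."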

Key features

Self-modification without human input: When idle, agents review the system's own files, propose improvements, and autonomously implement changes inside a sandboxed environment after a 2/3 majority vote.
True offline development: The author says: "Claude thinks and then I basically just copy/paste Claude's instructions for the agents to work on. Come back in 6 hours and it's done for free on local hardware."
Hardware agnostic: Demonstrated on an RTX 5070 gaming PC, but it can also run on CPU on any laptop, albeit more slowly.
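The 2/3 majority vote that gates self-modification can be illustrated with a small helper. This is a sketch under stated assumptions, not the project's implementation: `supermajority_approves` is a hypothetical name, each agent is assumed to cast a single yes/no vote, and "2/3 majority" is interpreted as "at least two thirds of all agents approve." The source does not say how ties at exactly 2/3 or abstentions are handled.

```python
from fractions import Fraction


def supermajority_approves(
    votes: list[bool],
    threshold: Fraction = Fraction(2, 3),  # assumption: approvals must reach at least 2/3
) -> bool:
    """Return True when the share of approving agents meets the threshold."""
    if not votes:
        # Assumption: with no agents voting, nothing is approved.
        return False
    # Fraction keeps the comparison exact, so 4 of 6 equals 2/3 with no float rounding.
    return Fraction(sum(votes), len(votes)) >= threshold
```

For example, two approvals out of three agents meet the threshold, while three out of five (60%) do not, so in the second case the proposed change would be rejected.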

Two core problems solved

The author lists two specific problems that Hollow AgentOS addresses: A) letting users "truly develop without developing" by offloading tasks that can be figured out over time; B) allowing the system to "truly develop itself over time, learning and adapting without human interaction" unless the user chooses to intervene.

Repo and community

The project is available on GitHub at github.com/ninjahawk/hollow-agentOS and had received 66 stars as of the post date. The author thanks hundreds of testers and encourages feedback, criticism, and success stories.

Who it's for: Developers who want to reduce Claude API costs by running Qwen 3.5 9B agents locally for tasks that can tolerate longer wall-clock time in exchange for free compute.

📖 Read the full source: r/ClaudeAI

