civStation: Open-Source VLM Harness for Natural Language Control of Civilization VI

✍️ OpenClawRadar | 📅 Published: April 13, 2026 | 🔗 Source

What civStation Does

civStation is an open-source, controllable computer-use stack and vision-language model (VLM) harness built specifically for Civilization VI. Instead of treating the game as a low-level UI automation problem, the project focuses on strategy-level control. You can give natural language inputs like "expand to the east", "focus on economy this turn", or "aim for a science victory", and the system translates that intent into actual in-game actions.

Core Architecture and Loop

The system implements a complete loop: screen observation → strategy interpretation → action planning → execution → human override. This shifts the interface upward from direct execution to intent expression and controllable delegation. The goal was not just to make an agent play Civ6, but to build a loop in which the model observes the game screen, interprets high-level strategy, plans actions, executes them through mouse and keyboard, and can be interrupted or guided live through human-in-the-loop (HitL) control or MCP.
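
To make the shape of that loop concrete, here is a minimal Python sketch of an observe → plan → execute cycle that checks for human overrides before every action. It is an illustration only and is not taken from the civStation codebase: the names HarnessLoop, Action, and overrides are hypothetical, and the observe, plan, and execute callables are stand-ins for screen capture, a VLM call, and mouse/keyboard control.

```python
from dataclasses import dataclass, field
from queue import Empty, Queue
from typing import Callable, List


@dataclass
class Action:
    """One low-level UI step (hypothetical representation)."""
    kind: str    # e.g. "click" or "key"
    target: str  # e.g. a UI element label or a key name


@dataclass
class HarnessLoop:
    """Hypothetical loop skeleton; not civStation's actual API."""
    observe: Callable[[], str]                # stand-in for screen observation
    plan: Callable[[str, str], List[Action]]  # (observation, strategy) -> actions
    execute: Callable[[Action], None]         # stand-in for mouse/keyboard control
    strategy: str = "balanced play"
    overrides: Queue = field(default_factory=Queue)  # human directives arrive here

    def _apply_overrides(self) -> bool:
        """Drain pending directives; return True if the strategy changed."""
        changed = False
        while True:
            try:
                self.strategy = self.overrides.get_nowait()
                changed = True
            except Empty:
                return changed

    def step(self) -> List[Action]:
        """Run one pass and return the actions that were actually executed."""
        self._apply_overrides()
        observation = self.observe()
        done: List[Action] = []
        for action in self.plan(observation, self.strategy):
            # Re-check before every action so a human can interrupt mid-plan.
            if self._apply_overrides():
                break
            self.execute(action)
            done.append(action)
        return done


# Usage with stubbed components (all values are made up for illustration).
executed: List[Action] = []
loop = HarnessLoop(
    observe=lambda: "turn 12, settler idle",
    plan=lambda obs, strat: [Action("click", "settler"), Action("key", "b")],
    execute=executed.append,
)
loop.overrides.put("expand to the east")
loop.step()
print(loop.strategy, [a.target for a in executed])
```

In this sketch a directive that arrives mid-plan abandons the remaining actions, so the next pass replans under the new strategy. Whether a real harness should abort, finish the current action, or queue the directive is a design choice the source does not specify.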

Current Features and Capabilities

  • Live desktop observation
  • Real UI interaction on the host machine
  • Runtime control interface
  • Human-in-the-loop control
  • MCP/skill extensibility
  • Natural language or voice-driven control

Research Questions and Motivation

The creator is exploring several questions: Where should the boundary be between strategy and execution? How controllable can a computer-use agent be before the loop becomes too slow or brittle? Does this approach make sense only for games, or also for broader desktop workflows?

The motivation stems from observing that most computer-use demos focus on "watch the model click," while civStation aims for something closer to a controllable runtime where you can operate at the level of strategy instead of raw UI interaction. Another motivation was testing whether voice and natural language, combined with computer-use, could open a different interaction layer where the player behaves more like a strategist giving directives rather than directly executing actions.

Repository and Availability

The project is available at: https://github.com/NomaDamas/civStation.git

📖 Read the full source: r/LocalLLaMA
