Deploy OpenClaw with llama.cpp on GEEKOM IT15 Mini PC

Deployment Architecture and Key Changes

This guide outlines a deployment where OpenClaw's gateway (port 18789) connects to a manually managed llama-server (port 8080) instead of the default Ollama server (port 11434). The goal is to run a local Qwen3-8B model using Intel Arc GPU acceleration via SYCL.
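
As a rough illustration of this layout, the sketch below probes the three ports named above to show which services are currently listening. The helper names (is_port_open, port_report) are hypothetical and not part of OpenClaw or llama.cpp; the host 127.0.0.1 is an assumption based on a purely local setup.

```python
import socket

# Ports named in this guide; the labels are only used for display.
PORTS = {
    "OpenClaw gateway": 18789,
    "llama-server": 8080,
    "Ollama (default, unused here)": 11434,
}


def is_port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def port_report(host: str = "127.0.0.1") -> dict:
    """Map each service label to whether its port accepts connections."""
    return {label: is_port_open(host, port) for label, port in PORTS.items()}
```

In the target setup, port_report() would be expected to show the gateway and llama-server as reachable. An open port only shows that something is listening, not that it is the intended service.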

Debugging and Solutions

Getting this setup to work required resolving three configuration problems:

Issue 1: Unsupported mcpServers Config: OpenClaw does not support the mcpServers configuration key. The solution was to remove this section from openclaw.json and start llama-server manually from a batch file, with its startup logic integrated into Python code.
Issue 2: Session Cache Conflict: A cached Feishu channel session was overriding the new global configuration, causing Ollama API errors. This was fixed by deleting the session cache file: del "C:\Users\JiugeAItest\.openclaw\agents\main\sessions\sessions.json".
Issue 3: Insufficient Context Length: The default llama-server context of 4096 tokens caused errors for longer conversations. This was resolved by starting the server with -c 32768 and setting contextWindow: 32768 in the OpenClaw configuration.
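
The fixes for Issues 2 and 3 can be combined in a small launcher. The following is a minimal sketch, not the original author's script: it deletes the stale session cache, starts llama-server with -c 32768, and polls the server's /health endpoint until the model has loaded. The function names are hypothetical, and the -ngl 99 value (offload all layers to the GPU) is an assumption that the source does not state.

```python
import subprocess
import time
import urllib.error
import urllib.request
from pathlib import Path


def build_server_command(server_exe, model, ctx_size=32768, port=8080,
                         gpu_layers=99):
    """Build the llama-server command line as an argument list."""
    return [
        str(server_exe),
        "-m", str(model),          # GGUF model file
        "-c", str(ctx_size),       # context size; the default was too small
        "--host", "127.0.0.1",
        "--port", str(port),
        "-ngl", str(gpu_layers),   # assumption: offload all layers to the GPU
    ]


def clear_session_cache(sessions_file: Path) -> bool:
    """Delete the cached sessions file; return True if a file was removed."""
    try:
        sessions_file.unlink()
        return True
    except FileNotFoundError:
        return False


def wait_until_healthy(base_url="http://127.0.0.1:8080", timeout_s=120.0):
    """Poll GET /health until it answers 200 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not listening yet, or still loading (503 raises HTTPError)
        time.sleep(1)
    return False


def launch(server_exe, model, sessions_file: Path) -> subprocess.Popen:
    """Clear the stale cache, start llama-server, and wait for readiness."""
    clear_session_cache(sessions_file)
    proc = subprocess.Popen(build_server_command(server_exe, model))
    if not wait_until_healthy():
        proc.terminate()
        raise RuntimeError("llama-server did not become healthy in time")
    return proc
```

On the machine described here, sessions_file would be Path.home() / ".openclaw" / "agents" / "main" / "sessions" / "sessions.json". The value passed to -c should match the contextWindow set in openclaw.json.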

Deployment Steps

The setup uses a specific directory structure on the GEEKOM IT15:

E:\Workspace_AI\Buildup_OpenClow
├── llama-b8245-bin-win-sycl-x64\ # llama.cpp SYCL version
│   ├── llama-server.exe
│   └── ... (DLLs)
├── models\Qwen3-8B-GGUF\
│   └── Qwen3-8B-Q4_K_M.gguf # Model file
└── start_openclaw_with_llamacpp.bat # Startup script

Note: The Qwen3-8B-Q4_K_M.gguf model has been verified to work with llama.cpp build b8245. Qwen3.5 models are incompatible with this build because of a rope.dimension_sections length mismatch.
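
Before the first launch, it may help to confirm that the files in the layout above are where the startup script expects them. This is a small sketch under that assumption; missing_files is a hypothetical helper, and the relative paths are copied from the directory tree.

```python
from pathlib import Path

# Relative paths taken from the directory tree in this guide.
EXPECTED = [
    "llama-b8245-bin-win-sycl-x64/llama-server.exe",
    "models/Qwen3-8B-GGUF/Qwen3-8B-Q4_K_M.gguf",
    "start_openclaw_with_llamacpp.bat",
]


def missing_files(root: Path) -> list:
    """Return the expected relative paths that are not files under root."""
    return [rel for rel in EXPECTED if not (root / rel).is_file()]
```

For example, missing_files(Path(r"E:\Workspace_AI\Buildup_OpenClow")) returns an empty list when all three files are present. It does not check the DLLs that ship alongside llama-server.exe.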

OpenClaw Configuration

The primary configuration change is in C:\Users\<Username>\.openclaw\openclaw.json. The model provider is switched from ollama to llama-cpp:

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "llama-cpp/qwen3-8b"
      }
    }
  },
  "models": {
    "providers": {
      "ollama": { ... },
      "llama-cpp": {
        "api": "openai-completions",
        "apiKey": "llama-cpp-local",
        "baseUrl": "http://127.0.0.1:8080/v1",
        "models": [
          {
            "contextWindow": 32768,
            "id": "qwen3-8b",
            "name": "qwen3-8b",
            ...
          }
        ]
      }
    }
  }
}
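
The same change can be applied programmatically. The sketch below adds the llama-cpp provider and switches the primary model on a config already loaded as a Python dict. It writes only the fields shown above; the fields elided with ... in the source are not filled in, so a real entry may need more. The function names are hypothetical.

```python
import copy


def llama_cpp_provider(ctx_size: int = 32768) -> dict:
    """Provider entry with only the fields shown in this guide."""
    return {
        "api": "openai-completions",
        "apiKey": "llama-cpp-local",
        "baseUrl": "http://127.0.0.1:8080/v1",
        "models": [
            {
                # Should match the -c value passed to llama-server.
                "contextWindow": ctx_size,
                "id": "qwen3-8b",
                "name": "qwen3-8b",
            }
        ],
    }


def switch_to_llama_cpp(config: dict, ctx_size: int = 32768) -> dict:
    """Return a copy of config that uses llama-cpp as the primary model.

    Existing providers (such as ollama) are kept, so switching back only
    requires changing the primary model again.
    """
    updated = copy.deepcopy(config)
    providers = updated.setdefault("models", {}).setdefault("providers", {})
    providers["llama-cpp"] = llama_cpp_provider(ctx_size)
    model = (
        updated.setdefault("agents", {})
        .setdefault("defaults", {})
        .setdefault("model", {})
    )
    model["primary"] = "llama-cpp/qwen3-8b"
    return updated
```

A typical use would be to read openclaw.json with json.load, pass the result through switch_to_llama_cpp, and write it back with json.dump after backing up the original file.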

The guide also includes sections on parameter reference, a pitfall avoidance guide, troubleshooting, and instructions for switching back to Ollama if needed.

📖 Read the full source: r/openclaw