Local LLM Setup Recommendations for OpenClaw

Setup Overview
A user on r/openclaw has shared their current configuration for integrating a local Large Language Model (LLM) with OpenClaw. They are using separate hardware: a GB10 device specifically for running the AI model and a Mac mini for the main OpenClaw installation.
Configuration Details
The setup process is described as mostly standard, with one key deviation: when prompted to choose an LLM, you must select the 'custom LLM' option. The user instructs to "put in ur ip" at this stage. They note that most setups will be using OpenAI-compatible endpoints via tools like vLLM, SGLang, or llama.cpp.
For the model selection, the user provides a specific warning and recommendation:
- Model Selection Advice: "don’t choose the biggest model that fit into your vram u need to find the balance between context token and model size."
- Current Model: They are using
unsloth/MiniMax-M2.5-GGUF:UD_Q2_K_XL + 24000. - Inference Server: They are using llama.cpp to run the model.
Server Endpoint
The local inference server is configured to run at localhost:8080/v1. This provides an OpenAI-compatible API endpoint that OpenClaw can connect to.
The user notes this is a work in progress, stating: "I am still testing openclaw though so I might change to another model if token isn’t enough." This highlights the practical, iterative nature of finding the right model for a specific workflow's context window requirements.
📖 Read the full source: r/openclaw
👀 See Also

Running OmniCoder-9B locally with llama.cpp configuration details
A developer achieved 96.7% average HumanEval score with OmniCoder-9B on mid-range hardware using specific llama.cpp flags including --reasoning-budget 0 to disable chain-of-thought output. The setup used a Q6_K quantized model running on an RTX 3080 with 10GB VRAM.

NemoClaw Windows Setup Issues and Fixes
NemoClaw installations on Windows fail with three specific errors: unsupported environment on Git Bash, port 18789 already in use, and Docker build failing on OpenClaw install. The root cause is that NemoClaw wasn't built with Windows in mind, requiring WSL2 Ubuntu for successful setup.

Practical Guide to Self-Hosting Your First LLM
A Reddit post outlines reasons for self-hosting LLMs including privacy for sensitive data, cost predictability for agent workloads, performance improvements by removing API roundtrips, and customization through fine-tuning methods like LoRA and QLoRA.

How to set up Qwen 3.6 Plus Preview on OpenRouter for free OpenClaw usage
Qwen 3.6 Plus Preview is currently free on OpenRouter with a 1 million token context window, suitable for AI agent work. The setup involves creating an OpenRouter account, adding the provider to OpenClaw, and configuring the model.