free-claude-code adds GLM-5 support via NVIDIA NIM, expands to OpenRouter and Discord

free-claude-code, a lightweight proxy that converts Claude Code's Anthropic API requests into other provider formats, has been updated with GLM-5 support through NVIDIA NIM and several new features. The tool allows developers to use Claude Code's agentic coding interface without an Anthropic subscription by routing requests to alternative backends.
Key updates
NVIDIA added tool calling fixes for z-ai/glm5 to their NIM inventory, and free-claude-code now fully supports it. The NVIDIA NIM free tier provides 40 requests per minute with no credit card required.
- OpenRouter support: Use any model on OpenRouter's platform as your backend, including their free models
- Discord bot integration: Control Claude Code remotely via Discord in addition to the existing Telegram bot support
- LMStudio local provider support: Run models fully locally
- Claude Code VSCode extension support
Technical advantages
- Zero cost options: NVIDIA NIM free tier (40 reqs/min) and Open Router free models require no payment
- Interleaved thinking preservation: Native interleaved thinking tokens are preserved across turns, allowing models like GLM-5 and Kimi-K2.5 to leverage reasoning from previous turns
- 5 built-in optimizations: Fast prefix detection, title generation skip, suggestion mode skip, and other optimizations reduce unnecessary LLM calls
- Remote control: Telegram and Discord bots enable sending coding tasks from mobile devices with session forking and persistence
- Configurable rate limiter: Sliding window rate limiting for concurrent sessions
- Easy model support: New models launching on NVIDIA NIM can be used with no code changes
- Extensibility: Modular code structure makes it easy to add custom providers or messaging platforms
Supported models
Popular models include z-ai/glm5, moonshotai/kimi-k2.5, minimaxai/minimax-m2.5, qwen/qwen3.5-397b-a17b, and stepfun-ai/step-3.5-flash. The full list is available in nvidia_nim_models.json. With OpenRouter and LMStudio, virtually any model can be used as a backend.
The developer is currently working on automatic model selection based on availability and quality. The project is open source with issues and PRs welcome.
📖 Read the full source: r/ClaudeAI
👀 See Also

AI Claw: Serverless Bridge Connects Alexa to Local OpenClaw with Dual Delivery
AI Claw is a Python AWS Lambda pipeline that connects Amazon Echo speakers to local OpenClaw instances, bypassing Amazon's 8-second timeout by using a fire-and-forget architecture with dual delivery to Telegram and native Echo audio output.

Log Reducer MCP Server Cuts Token Usage When Claude Code Reads Logs
Log Reducer is an MCP server that processes log files server-side before sending reduced output to Claude Code, avoiding raw logs in the context window. It applies 19 deterministic transforms that compress logs by 50-90%, with a 2000-line log representing 20,000+ tokens removed from sessions.

Dynamic Status Bar for Claude Code Shows Live Updates
A developer has improved their Claude Code status bar from static text to dynamic display with real-time updates showing what Claude is working on. The configuration is available as a GitHub gist.

Tocket CLI: A Context Engineering Framework for AI Coding Agents
Tocket is a CLI tool that creates a .context/ folder with markdown files for AI agents to maintain project memory across sessions. It auto-detects tech stacks from package.json and generates a pre-configured .cursorrules file.