VoidLLM: Zero-Knowledge Proxy for Ollama and vLLM

VoidLLM is a proxy server that sits between your applications and local LLM servers like Ollama and vLLM. It adds organization and team access control, API key management, usage tracking, and rate limiting without ever seeing your prompts or content.

Key Features

OpenAI-compatible — works with any SDK that supports the OpenAI API format
Provider adapters for Ollama, vLLM, Anthropic, Azure, and OpenAI
<2ms proxy overhead
Rate limiting per organization, team, or API key (distributed via Redis)
Cost tracking and analytics dashboard
Zero content logging — only metadata (who accessed what model and how many tokens were used)

Use Case

If you're running Ollama or vLLM locally and want to share it across a team with proper access control and usage visibility, this proxy provides those capabilities while maintaining privacy through its zero-knowledge architecture.

The tool is available on GitHub at github.com/voidmind-io/voidllm.

📖 Read the full source: r/LocalLLaMA

VoidLLM: Zero-Knowledge Proxy for Ollama and vLLM with Team Access Control

Key Features

Use Case

👀 See Also

Claude Code Lazy-Loads Tool Schemas via ToolSearch to Save Tokens

ModelFitAI: Deploy AI Agents Without VPS Setup, Built with Claude Code

Open-Sourced CLAUDE.md Keeps Claude Code Agents Productive for Hours, Not Looping

MCP Server for TypeScript Projects Replaces Claude Code's Grep Pattern with Indexed Symbol Lookups