Torrix: Self-Hosted LLM Observability Without Postgres or Redis

Torrix is a self-hosted LLM observability tool designed for teams who want to see what their agents are doing in production without the overhead of Postgres, Redis, or complex infrastructure. It runs as a single Docker container backed by SQLite. The full install is:

    curl -o docker-compose.yml https://raw.githubusercontent.com/torrix-ai/install/main/docker-compose.community.yml
    docker compose up

No external dependencies. All data stays in a local SQLite file on your machine. After startup, open http://localhost:8088 and create an account.

Key Features
- LLM call logging via HTTP proxy or Python/Node.js SDK: tokens, cost, latency, full prompt and response traces, reasoning token capture.
- Provider support: OpenAI, Anthropic, Gemini, Groq, Mistral, Azure OpenAI, and any OpenAI API-compatible endpoint.
- Cost forecasting and hard budget caps
- PII masking
- Model routing rules
- Evals with golden runs and AI judge
- Prompt library with version history
- Run tags for filtering by environment
- MCP server so AI assistants can query your logs
- OTLP/HTTP ingestion for apps already using OpenTelemetry
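
To make the logging feature concrete, here is a minimal sketch of recording per-call tokens, latency, and cost in a single SQLite file and summing spend. It is not Torrix's actual schema or pricing logic: the table name `llm_calls`, the helper functions, the model name `example-model`, and the prices are all hypothetical placeholders, and only Python's standard `sqlite3` module is used.

```python
import sqlite3
import time

# Hypothetical per-million-token prices in USD; placeholders, not real pricing.
PRICE_PER_MTOK = {"example-model": {"input": 0.50, "output": 1.50}}


def call_cost(model, input_tokens, output_tokens):
    """Cost of one call in USD under the placeholder price table."""
    price = PRICE_PER_MTOK[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def open_log(path=":memory:"):
    """Open (or create) a SQLite database holding one row per LLM call."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS llm_calls (
               id INTEGER PRIMARY KEY,
               ts REAL NOT NULL,
               model TEXT NOT NULL,
               input_tokens INTEGER NOT NULL,
               output_tokens INTEGER NOT NULL,
               latency_ms REAL NOT NULL,
               cost_usd REAL NOT NULL
           )"""
    )
    return conn


def log_call(conn, model, input_tokens, output_tokens, latency_ms):
    """Insert one call record and return its computed cost."""
    cost = call_cost(model, input_tokens, output_tokens)
    conn.execute(
        "INSERT INTO llm_calls (ts, model, input_tokens, output_tokens, latency_ms, cost_usd) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (time.time(), model, input_tokens, output_tokens, latency_ms, cost),
    )
    conn.commit()
    return cost


def total_cost(conn):
    """Total logged spend in USD (0.0 when the table is empty)."""
    (total,) = conn.execute("SELECT COALESCE(SUM(cost_usd), 0.0) FROM llm_calls").fetchone()
    return total
```

Passing a file path instead of `":memory:"` to `open_log` keeps the records in a local file, which is the general shape of the single-file storage described above. A budget cap could then be approximated by comparing `total_cost(conn)` against a limit before each call, though how Torrix enforces its caps is not described here.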
SDK Usage Example (Python)

    pip install torrix

    import torrix
    from openai import OpenAI

    torrix.init(api_key="<your-torrix-api-key>", base_url="http://localhost:8088")
    client = torrix.wrap(OpenAI(api_key="<your-openai-key>"))

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
        torrix_name="my-run",
    )
    print(response.choices[0].message.content)

A Node.js SDK is also available via npm.
Licensing and Scaling
The Community edition is free for one user with 7-day retention. Pro adds teams, RBAC, 30-day retention, API key management, full-text search, and audit logs. SQLite doesn't scale to high write throughput; this is aimed at teams logging hundreds to low thousands of LLM calls per day, not millions.
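
To put those volumes in perspective, the short calculation below converts calls per day into an average write rate. It is a back-of-the-envelope sketch: the function name and the `rows_per_call` parameter are illustrative, one row per call is an assumption, and real traffic arrives in bursts rather than evenly across the day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_writes_per_second(calls_per_day, rows_per_call=1):
    """Average write rate if calls were spread evenly over a day.

    rows_per_call is an assumption; a real logger may write several
    rows (spans, tags, traces) for each LLM call.
    """
    return calls_per_day * rows_per_call / SECONDS_PER_DAY


for calls in (500, 5_000, 5_000_000):
    rate = average_writes_per_second(calls)
    print(f"{calls:>9,} calls/day ~ {rate:.3f} writes/s on average")
```

A few thousand calls per day averages well under one write per second, while several million per day averages tens of writes per second before accounting for peaks.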
📖 Read the full source: HN LLM Tools