Claudlytics: Self-Hosted Dashboard for Tracking Claude Code Token Usage and Costs

What Claudlytics Does
Claudlytics is a self-hosted dashboard that tracks Claude Code token usage and costs in real time. It is aimed at developers who run Claude Code headlessly on a remote VPS or server, where the desktop app's local-machine tracking isn't sufficient.
How It Works
Claude Code writes every conversation to ~/.claude/projects/**/*.jsonl files. Claudlytics reads these files, parses the token usage, and calculates costs using Sonnet 4.6 pricing. No Claude API calls are needed for basic usage — everything is processed locally.
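The read-parse-price flow described above can be sketched in a few lines of Python. This is a minimal illustration, not Claudlytics' actual code (the project itself runs on Node). It assumes each JSONL record may carry a message.usage object with input_tokens and output_tokens fields; the rates in PRICE_PER_MTOK are placeholder values for illustration, not confirmed pricing; and the function names are hypothetical. Cache-related token fields are ignored for brevity.

```python
import json
from pathlib import Path

# Placeholder USD rates per million tokens (assumption, not official pricing).
PRICE_PER_MTOK = {
    "input_tokens": 3.00,
    "output_tokens": 15.00,
}


def sum_usage(lines):
    """Add up token counts from an iterable of JSONL lines."""
    totals = {key: 0 for key in PRICE_PER_MTOK}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are malformed or still being written
        if not isinstance(record, dict):
            continue
        # Assumed record shape: {"message": {"usage": {...}}}
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue  # records without usage data contribute nothing
        for key in totals:
            totals[key] += int(usage.get(key) or 0)
    return totals


def estimate_cost(totals, prices=PRICE_PER_MTOK):
    """Convert token totals to an estimated USD cost."""
    return sum(totals[key] * prices[key] / 1_000_000 for key in prices)


def scan_projects(root=Path.home() / ".claude" / "projects"):
    """Aggregate usage across every *.jsonl file under the projects directory."""
    totals = {key: 0 for key in PRICE_PER_MTOK}
    for path in root.rglob("*.jsonl"):
        with path.open(encoding="utf-8") as handle:
            for key, value in sum_usage(handle).items():
                totals[key] += value
    return totals
```

Calling estimate_cost(scan_projects()) would give a single all-time figure; per-session or per-day figures would need the records grouped by file or timestamp first.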
Dashboard Features
- Current session token counts and cost
- Rolling 5-hour window usage with reset countdown (aligns with Claude Pro/Max session limits)
- Today / Last 7 days / Billing cycle breakdowns
- Session and weekly message counts
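As a rough illustration of the rolling 5-hour window and its reset countdown, here is a small Python sketch. It rests on an assumption the source does not confirm: that a window opens at the first message sent after the previous window has expired and lasts exactly five hours. The names parse_ts and window_status are hypothetical, and how Claudlytics actually computes the window may differ.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(hours=5)


def parse_ts(value):
    """Parse an ISO 8601 timestamp such as 2025-01-01T09:00:00.000Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def window_status(timestamps, now):
    """Return (messages in the active window, time left until it resets).

    Assumes every timestamp is at or before `now`.
    """
    window_start = None
    count = 0
    for ts in sorted(timestamps):
        # A new window opens at the first message after the old one expired.
        if window_start is None or ts >= window_start + WINDOW:
            window_start = ts
            count = 0
        count += 1
    if window_start is None or now >= window_start + WINDOW:
        return 0, timedelta(0)  # no active window
    return count, window_start + WINDOW - now
```

The Today and Last 7 days breakdowns could follow the same pattern, filtering timestamps by calendar date instead of by window start.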
Setup and Installation
Setup requires three commands:
git clone https://github.com/iansugerman/Claudlytics.git
cd Claudlytics
node server.js
After running these commands, open http://localhost:3031 in your browser.
Security and Remote Access
The server binds to 127.0.0.1 only, so it is not reachable from other machines or exposed on public network interfaces. To reach a dashboard running on a remote server, open an SSH tunnel from your local machine:
ssh -L 3031:localhost:3031 user@your-server
Then browse to localhost:3031 on your local machine.
Production Deployment
Claudlytics can run as a systemd service for background availability. Full instructions are available in the GitHub repository's README.
📖 Read the full source: r/ClaudeAI