LLM Cost Profiler: Open-source tool tracks API spending to make case for local models

✍️ OpenClawRadar · 📅 Published: April 15, 2026

LLM Cost Profiler is an open-source Python tool that tracks every API call your code makes to OpenAI and Anthropic, showing exactly what you're spending, where, and why. The tool exposes which tasks are overpriced relative to their complexity, providing concrete data to make the case for local inference.
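
To make the idea of per-call tracking concrete, here is a minimal sketch of how a profiler of this kind might log calls and attribute cost to tasks. It is not the tool's actual code: the `calls` table, the function names, the model name, and the per-million-token prices are all hypothetical placeholders.

```python
import sqlite3
import time

# Hypothetical per-million-token prices in USD; placeholders, not real rates.
PRICES = {"example-model": {"input": 2.50, "output": 10.00}}


def open_store(path=":memory:"):
    """Open a SQLite database and create the (hypothetical) calls table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS calls (
               ts REAL, task TEXT, model TEXT,
               input_tokens INTEGER, output_tokens INTEGER, cost_usd REAL)"""
    )
    return conn


def call_cost(model, input_tokens, output_tokens):
    """Cost of one call in USD, from token counts and the price table."""
    price = PRICES[model]
    return (input_tokens * price["input"]
            + output_tokens * price["output"]) / 1_000_000


def record_call(conn, task, model, input_tokens, output_tokens):
    """Store one call with its computed cost and return that cost."""
    cost = call_cost(model, input_tokens, output_tokens)
    conn.execute(
        "INSERT INTO calls VALUES (?, ?, ?, ?, ?, ?)",
        (time.time(), task, model, input_tokens, output_tokens, cost),
    )
    conn.commit()
    return cost


def cost_by_task(conn):
    """Total spend per task, most expensive first."""
    return conn.execute(
        "SELECT task, SUM(cost_usd) FROM calls "
        "GROUP BY task ORDER BY SUM(cost_usd) DESC"
    ).fetchall()
```

Grouping spend by task, as `cost_by_task` does here, is one way an expensive but simple task would stand out in the data.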

Key Features and Findings

The tool is MIT-licensed and stores all of its data in a local SQLite database. According to the source, it surfaced several specific examples of wasted API spend:

  • A classifier using GPT-4o that outputs one of 5 labels — a task any decent 7B local model handles easily. Cost: ~$89/week on API calls.
  • Thousands of duplicate calls to the same prompt — zero caching. Local inference with caching would make this effectively free.
  • A summarizer where 34% of calls were retries from format errors. A well-tuned local model with constrained generation eliminates this entire class of waste.
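
Two of the findings above, duplicate prompts and retry-heavy tasks, reduce to simple counts over a call log. The sketch below shows one way to compute them. It assumes each call is a dict with `model` and `prompt` keys and an optional `is_retry` flag; that record shape and the function names are illustrative, not the tool's schema.

```python
import hashlib
from collections import Counter


def prompt_key(model, prompt):
    """Stable fingerprint for a (model, prompt) pair."""
    return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()


def duplicate_calls(calls):
    """Number of calls that repeat an earlier (model, prompt) pair.

    These are the calls a cache keyed on prompt_key could have served.
    """
    counts = Counter(prompt_key(c["model"], c["prompt"]) for c in calls)
    return sum(n - 1 for n in counts.values())


def retry_share(calls):
    """Fraction of calls flagged as retries (0.0 for an empty log)."""
    if not calls:
        return 0.0
    return sum(1 for c in calls if c.get("is_retry")) / len(calls)
```

Multiplying the duplicate count or the retry share by the average cost per call gives a rough dollar figure for each kind of waste.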

The author notes this tool gives teams concrete ammunition for investing in local inference infrastructure: "Here's the exact dollar amount we'd save by moving X task to a local model."
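
The arithmetic behind such a statement is short. The sketch below uses the ~$89/week classifier figure quoted above; the local running cost and the hardware price are hypothetical placeholders that a team would replace with its own numbers.

```python
def projected_savings(api_cost_per_week, local_cost_per_week, weeks=52):
    """Projected savings over `weeks` from moving a task off the API."""
    return (api_cost_per_week - local_cost_per_week) * weeks


def break_even_weeks(hardware_cost, api_cost_per_week, local_cost_per_week):
    """Weeks until a one-time hardware purchase pays for itself.

    Returns None when the local setup is not cheaper per week.
    """
    weekly_saving = api_cost_per_week - local_cost_per_week
    if weekly_saving <= 0:
        return None
    return hardware_cost / weekly_saving


API_COST = 89.0    # from the article: ~$89/week for the GPT-4o classifier
LOCAL_COST = 9.0   # hypothetical weekly power and upkeep for a local model
HARDWARE = 1600.0  # hypothetical one-time hardware cost

print(projected_savings(API_COST, LOCAL_COST))           # 4160.0
print(break_even_weeks(HARDWARE, API_COST, LOCAL_COST))  # 20.0
```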

The tool is available on GitHub at https://github.com/BuildWithAbid/llm-cost-profiler. The author also plans to add support for tracking local model inference costs, using compute-time-based costing, and asked the community whether this would be useful.
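
The source does not describe how compute-time-based costing would work. One plausible reading, sketched below purely as an assumption, is to time each local inference call and multiply the elapsed time by an hourly machine rate. The rate, the function names, and the `sink` list are hypothetical.

```python
import time
from contextlib import contextmanager


def compute_cost(seconds, hourly_rate_usd):
    """Cost in USD of `seconds` of compute at a given hourly rate."""
    return seconds * hourly_rate_usd / 3600.0


@contextmanager
def timed_cost(hourly_rate_usd, sink):
    """Time the enclosed block and append its compute cost to `sink`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        sink.append(compute_cost(elapsed, hourly_rate_usd))


costs = []
with timed_cost(hourly_rate_usd=0.50, sink=costs):  # hypothetical rate
    sum(range(100_000))  # stand-in for a local inference call
```

The hourly rate would need to cover whatever the team counts as the cost of the machine, such as hardware amortization and electricity.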

Cost profiling of this kind is particularly relevant for developers running AI coding agents, since it shows with data where API spending is inefficient relative to local alternatives.

📖 Read the full source: r/LocalLLaMA
