DeepSeek V4 Pricing: 178x Cheaper Cached Tokens vs Opus

DeepSeek V4 launched with pricing so low a Reddit user checked the math. Here are the verified numbers:

Pricing breakdown

V4-Pro standard input: $0.145 per million tokens. Opus 4.7 input: ~$5 per million. Ratio: 34x.
With 75% promotional discount (through end of May): V4-Pro input drops to $0.036 per million — 138x cheaper than Opus.
Cache hit pricing: V4-Pro is $0.0036 per million. Opus cached is $0.625 per million. Ratio: 173x.

The catch

As the original post notes, DeepSeek admits V4 is three to six months behind GPT-5.4 and Gemini 3.1 Pro on capability. You're not getting frontier quality at frontier-divided-by-178 — you're getting last summer's frontier quality.

What this means for agentic workflows

For agentic loops with heavy caching (system prompts, tool definitions), the cache hit discount is the real story. Reusable system prompts become essentially free. The key unknown: whether the claimed 1M context window holds up under real workloads or degrades to a usable 200K, as seen with many large-window models.

📖 Read the full source: r/LocalLLaMA

DeepSeek V4 pricing reality check: 178x cheaper cached tokens vs Opus, but capability lag acknowledged

Pricing breakdown

The catch

What this means for agentic workflows

👀 See Also

FairyFuse Achieves 29.6x Kernel Speedup on CPUs via Ternary Weight Multiplication-Free Inference

DeepSeek-V4-Flash Makes LLM Steering Practical for Local Models

Claude Code v2.1.157: Auto-Load Plugins from .claude/skills, Improved Agents & Worktrees

Snowflake lays off documentation staff after training AI replacement