Hy3 LLM Tops OpenRouter Rankings: Cheapest Model or Something Else?

A mysterious LLM called Hy3 preview has surged to the top of OpenRouter's AI Model Rankings, beating established models like Claude Opus 4.7 and DeepSeek V4 Flash by more than 50% in token usage. The model, an open-source release from Chinese megacorp Tencent, is priced at $0.066/1M input tokens on OpenRouter, making it the cheapest major model on the platform and undercutting even DeepSeek V4 Flash at $0.10/1M input tokens.
However, the model's quality doesn't match its popularity. Tencent's own Hugging Face repo shows oddly honest benchmark results that are not favorable for Hy3 compared to other Chinese open-source models. Testing by the author suggests the model's quality is on par with other Chinese models, but not close to Claude Opus 4.7 or GPT 5.5.
OpenRouter's data reveals several peculiarities:
- Usage spike: Hy3 preview had no usage before May 8, 2026, when it switched from a free SKU to paid. Usage has been steady since, indicating organic adoption.
- App usage minimal: The top 5 apps account for less than 1% of all Hy3 activity. This rules out a single app switching its default model (as happened with Grok Code Fast 1 earlier).
- Token ratio: 98% of tokens are input and 2% are output, an extreme ratio suggesting heavy usage in retrieval or preprocessing tasks, not agentic coding loops.
- Single provider: Hy3 preview is only available via SiliconFlow, a Singapore-based provider, which saw a massive usage spike coinciding with Hy3.
When Hy3 moved from free to paid, usage didn't drop significantly, suggesting users are willing to pay despite the model's lower quality — likely because it remains the cheapest option on OpenRouter. The author asks: is Hy3 preview actually the cheapest LLM backed by a major company on OpenRouter?
Developers using AI coding agents should be aware that cost savings may come at a quality cost. If you're running high-volume inference where output quality is less critical (e.g., data extraction, simple classification), Hy3 could be a viable option. But for complex agentic coding, expect significantly worse results compared to Claude or GPT.
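To put the price gap in concrete terms for an input-heavy workload, here is a minimal sketch that compares input-side cost using only the two input prices quoted above. The model keys are illustrative labels rather than OpenRouter model identifiers, the 1B-token workload is a hypothetical figure, and output-token pricing is left out because the article does not quote it.

```python
# Input prices in USD per 1M tokens, as quoted in this article.
# The dictionary keys are illustrative labels, not OpenRouter model slugs.
INPUT_PRICE_PER_M = {
    "hy3-preview": 0.066,
    "deepseek-v4-flash": 0.10,
}


def input_cost_usd(model: str, input_tokens: int) -> float:
    """Input-side cost only; output pricing is not quoted in the article."""
    return input_tokens / 1_000_000 * INPUT_PRICE_PER_M[model]


def relative_saving(cheaper: str, baseline: str) -> float:
    """Fraction saved on input tokens by choosing `cheaper` over `baseline`."""
    return 1 - INPUT_PRICE_PER_M[cheaper] / INPUT_PRICE_PER_M[baseline]


if __name__ == "__main__":
    tokens = 1_000_000_000  # hypothetical workload: 1B input tokens
    for model in INPUT_PRICE_PER_M:
        print(f"{model}: ${input_cost_usd(model, tokens):,.2f}")
    saving = relative_saving("hy3-preview", "deepseek-v4-flash")
    print(f"input-cost saving: {saving:.0%}")
```

At the quoted prices, the input-side saving is roughly a third. Whether that is worth the quality gap depends on the task, and a full comparison would also need each model's output price.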
📖 Read the full source: HN AI Agents