WCY format reduces LLM token overhead by 50-71% and adds structural 'I don't know' markers

WCY (Watch → Compute → Yield) is a line-oriented format designed to reduce LLM token overhead and provide structural markers for uncertainty in reasoning. It replaces JSON's brackets, quotes, and commas with one-marker-per-line syntax.
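To make the syntax difference concrete, here is a minimal sketch that encodes the same rows as JSON and as one line per record. It is not the WCY spec: the `.` marker and `key=value` slots are borrowed from the example trace in this post, while the `to_lines` helper and the sample rows are assumptions made for illustration. Character counts are only a crude proxy for token counts, so the printed sizes say nothing about the benchmark figures.

```python
import json


def to_lines(rows, marker="."):
    """Encode each dict as one line: a marker, then space-separated key=value slots.

    Hypothetical helper for illustration; the real WCY encoding may differ.
    """
    return "\n".join(
        marker + " " + " ".join(f"{key}={value}" for key, value in row.items())
        for row in rows
    )


# Made-up sample data, not taken from the benchmarks.
rows = [
    {"id": 1, "name": "alice", "score": 0.91},
    {"id": 2, "name": "bob", "score": 0.78},
]

as_json = json.dumps(rows)
as_lines = to_lines(rows)

print(as_lines)
print("JSON chars:", len(as_json), "line-format chars:", len(as_lines))
```

The line form drops the brackets, quotes, and commas that JSON repeats for every record, which is where the savings described here would come from.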
Token reduction benchmarks
From testing across 10-500 rows and MCP exchange types:
- Structured data vs JSON: 50 to 54% fewer tokens
- Tool-call schemas: 65 to 71% fewer tokens
- Full MCP protocol exchange: 61% fewer tokens
- Multi-agent output tokens: 40% fewer tokens
No fine-tuning is needed: three few-shot examples are enough for models to switch formats. With this approach, the reported parse_r metric rises from 0.29 to 1.00 on complex tasks.
The ? marker for uncertainty
WCY introduces a structural way for LLMs to mark what they don't know during reasoning. The ? (void-B) slot allows models to indicate uncertainty inline:
: ?diagnosis hint=labs+imaging conf_range=0.4..0.8
order CT_scan reason=from=3
. CT_result mass_in_RUL size=2.3cm
: diagnosis=adenocarcinoma conf=0.82 from=3,5
Testing showed:
- Zero-shot: models use ? markers 0% of the time, even with the spec in the prompt
- With 3 examples: 5.4 markers per trace, 67-97% resolved
- 48 pipeline traces across 8 domains: 95% resolution, 100% quality gate pass
The from= slot tracks which observations support which conclusions inline, which helps catch hallucination chains.
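The two checks described above (whether a `?` marker is later resolved, and whether each `from=` reference points at something already observed) can be sketched as follows. This is not `wcy_parser.py`. It assumes that `from=` values are 1-based line numbers within the same trace and that a void such as `?diagnosis` counts as resolved once a later line carries a `diagnosis=` slot. The `parse_line` and `check_trace` helpers and the sample trace are hypothetical.

```python
def parse_line(line):
    """Split one line into (marker, bare tokens, key=value slots)."""
    marker, _, rest = line.strip().partition(" ")
    tokens, slots = [], {}
    for part in rest.split():
        key, sep, value = part.partition("=")
        if sep:
            slots[key] = value  # e.g. conf=0.82 or from=2,4
        else:
            tokens.append(part)  # e.g. ?diagnosis or CT_result
    return marker, tokens, slots


def check_trace(text):
    """Collect void markers, their resolutions, and invalid from= references."""
    voids, resolved, bad_refs = [], [], []
    for lineno, line in enumerate(text.strip().splitlines(), start=1):
        _marker, tokens, slots = parse_line(line)
        for token in tokens:
            if token.startswith("?"):
                voids.append(token[1:])
        for name in voids:
            if name in slots and name not in resolved:
                resolved.append(name)
        if "from" in slots:
            for ref in slots["from"].split(","):
                # Assumption: a valid reference is an earlier line number.
                if not ref.isdigit() or not (1 <= int(ref) < lineno):
                    bad_refs.append((lineno, ref))
    return {"voids": voids, "resolved": resolved, "bad_refs": bad_refs}


# Illustrative trace, adapted from the example in this post.
trace = """
. patient age=64 smoker=yes
. symptom cough duration=6w
: ?diagnosis hint=labs+imaging conf_range=0.4..0.8
. CT_result mass_in_RUL size=2.3cm
: diagnosis=adenocarcinoma conf=0.82 from=2,4
"""

result = check_trace(trace)
print(result)
```

A conclusion that cites a line number that does not exist yet, or a void that is never filled, shows up in `bad_refs` or in the difference between `voids` and `resolved`, which is the kind of signal that could flag a hallucination chain.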
Available resources
- wcy_parser.py — pure Python, no external dependencies
- wcy_eval.py — 3-axis scoring (Structural / Meaning / Provenance)
- 60 reasoning traces with void-B cycles (CC BY 4.0 license, for fine-tuning experiments)
- Pipeline script to generate more traces
So far the format has only been tested on Claude Sonnet. The author is curious whether the 0% → 5.4 markers result holds on Qwen, Llama, and Mistral with the same few-shot examples.
📖 Read the full source: r/LocalLLaMA