SubQ: A Sub-Quadratic LLM with 12M-Token Context Window

✍️ OpenClawRadar📅 Published: May 6, 2026🔗 Source

SubQ from Subquadratic is a production-ready LLM built on a fully sub-quadratic sparse-attention architecture. It handles up to 12M tokens in a single prompt, runs at 150 tokens per second, and costs roughly 1/5 of leading models like GPT-5 or Opus.

Architecture & Benchmarks

Unlike standard transformers with O(n²) attention, SubQ uses a sub-quadratic sparse-attention mechanism that only processes relevant token relationships. At 12M tokens, this reduces attention compute by nearly 1000×. Benchmarks (third-party validated):

SWE-Bench Verified (real-world coding): 81.8%
RULER @ 128K (long-context accuracy): 95.0%
MRCR v2 (8-needle, 1M): 65.9%

For comparison, SubQ's SWE-Bench score sits between Gemini 3.1 Pro (80.6%) and Opus 4.6 (80.8%). The model also outperforms Opus 4.7 (87.6%? – not reported at time) and GPT-5.5 (n/r) on MRCR v2.

Products & Integration

Two access options:

Full-Context API: 12M-token context, streaming, tool use, OpenAI-compatible endpoints. Process entire repositories in one call at linear cost.
SubQ Code (long-context layer for coding agents): Plug into Claude Code, Codex, or Cursor. ~25% lower bill, 10× faster exploration, auto-redirects expensive model turns. One-line install.

Who It's For

Developers and teams running AI agents that need to reason across full codebases, long PR histories, or persistent state without quality loss.

📖 Read the full source: HN AI Agents

👀 See Also

Tools

context-os: Open-source tool reduces Claude Code token consumption by 27-42%

context-os is a local context optimizer that hooks into Claude Code automatically, compressing tool output before Claude sees it and reducing token consumption by 27-42% depending on content type.

Apr 16, 2026, 04:45 AM UTC

OpenClawRadar

Tools

Open-source Claude Code reimplementation patched for local model compatibility

A developer patched the open-source Claude Code reimplementation to work with Ollama and local models by removing hardcoded Anthropic client dependencies. The CLI now auto-detects providers from model names and environment variables.

Apr 21, 2026, 02:37 AM UTC

OpenClawRadar

Tools

LobsterBoard adds theme system and template gallery

LobsterBoard now includes a theme system with five visual options and a template gallery that allows users to export and import dashboard layouts with automatic sensitive data stripping.

Apr 17, 2026, 01:45 AM UTC

OpenClawRadar

Tools

Comparing Local vs. Cloud AI Agents: OpenClaw and Twin.so

OpenClaw is an open-source local AI agent that runs on your machine with full data control, while Twin.so is a cloud-based platform with 200,000+ community-built agents for 24/7 automation.

Mar 2, 2026, 06:45 AM UTC

OpenClawRadar