How Mendral Cut LLM Costs by Upgrading to Opus: Triager Pattern, SQL Access, and Sub-Agent Architecture

Mendral recently published details on how they upgraded to Opus 4.6 for CI failure analysis while reducing overall LLM costs compared to their previous setup with Sonnet 4.0. The key is an architecture that separates triage from investigation and uses cheap sub-agents for heavy lifting.
Architecture: Cheap triager, expensive planner
Out of ~4,000 CI failures analyzed, 3,187 were duplicates — a known flaky test, infrastructure hiccup, or network blip. Waking up an expensive model for those is wasteful. But deduplication isn't deterministic: the same job can fail for different reasons. Their solution is a triager pattern:
- A Haiku agent handles the narrow job: decide whether a failure is already tracked. It uses exact matching and semantic search (pgvector) against known error messages. Two different strings like "operator does not exist bigint character varying" and "migration type mismatch on installation_id" are the same root cause, and semantic search catches that.
- When in doubt, Haiku escalates to Opus 4.6. A false positive (escalating a duplicate) costs a little; a false negative (dismissing a new failure as a duplicate) misses a real bug.
- 4 out of 5 failures never reach Opus. A triager match costs ~25x less than a full investigation.
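The triage flow in the list above can be sketched roughly as follows. This is a minimal illustration, not Mendral's implementation: the KnownFailure type, the triage function, the 0.90 threshold, and the toy two-dimensional embeddings are all hypothetical, and a plain cosine-similarity threshold stands in for both the pgvector lookup and the Haiku agent's judgment. The blended_cost helper assumes every failure pays the triage cost and escalated ones additionally pay for a full investigation.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnownFailure:
    issue_id: str            # hypothetical tracker ID
    message: str             # error message already on file
    embedding: list[float]   # toy vector; a real system might store this in pgvector


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def triage(
    message: str,
    embedding: list[float],
    known: list[KnownFailure],
    threshold: float = 0.90,  # assumed cutoff, not taken from the source
) -> tuple[str, Optional[str]]:
    # Step 1: exact string match against tracked error messages.
    for k in known:
        if k.message == message:
            return ("duplicate", k.issue_id)
    # Step 2: nearest neighbour by embedding similarity.
    best = max(known, key=lambda k: cosine(embedding, k.embedding), default=None)
    if best is not None and cosine(embedding, best.embedding) >= threshold:
        return ("duplicate", best.issue_id)
    # Step 3: when in doubt, hand the failure to the expensive model.
    return ("escalate", None)


def blended_cost(
    total: int,
    duplicates: int,
    triage_cost: float = 1 / 25,      # "~25x less" than a full investigation
    investigation_cost: float = 1.0,  # full investigation, normalised to 1
) -> float:
    """Average cost per failure, relative to investigating every failure."""
    escalated = total - duplicates
    return triage_cost + (escalated / total) * investigation_cost
```

With the article's figures (3,187 duplicates out of roughly 4,000 failures), blended_cost(4000, 3187) comes to about 0.24, so under these assumptions the average failure costs roughly a quarter of a full investigation.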
Let agents pull context, don't push it
Instead of stuffing 200K+ line logs into prompts, agents get a SQL interface to ClickHouse. There's a raw table (github_logs, one row per log line) and materialized views with pre-aggregated data: failure rates by workflow, job timings, outcome counts. Most investigations start with the views to narrow down, then drill into raw logs. If a query returns too many rows, the system truncates and suggests a more specific view. If logs aren't ingested yet, agents fall back to the GitHub CLI.
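A rough sketch of the "truncate and suggest a view" behaviour, under loud assumptions: an in-memory SQLite database stands in for ClickHouse so the example runs anywhere, the run_agent_query function and the 100-row cap are hypothetical, and workflow_line_counts is an invented plain view playing the role of the pre-aggregated materialized views. Only the github_logs table name comes from the article; its columns here are made up.

```python
import sqlite3

MAX_ROWS = 100  # hypothetical cap on rows returned to an agent


def run_agent_query(conn: sqlite3.Connection, sql: str, max_rows: int = MAX_ROWS) -> dict:
    """Run a query for an agent, truncating large results and adding a hint."""
    cur = conn.execute(sql)
    # Fetch one extra row so we can tell whether the result was cut off.
    rows = cur.fetchmany(max_rows + 1)
    truncated = len(rows) > max_rows
    hint = None
    if truncated:
        hint = (
            f"Result truncated to {max_rows} rows. Try an aggregated view "
            "such as workflow_line_counts, or add a narrower WHERE clause."
        )
    return {"rows": rows[:max_rows], "truncated": truncated, "hint": hint}


def demo_connection() -> sqlite3.Connection:
    """Build a tiny stand-in dataset: one row per log line, plus one view."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE github_logs (run_id INTEGER, workflow TEXT, line TEXT)")
    conn.executemany(
        "INSERT INTO github_logs VALUES (?, ?, ?)",
        [(i % 5, "storybook", f"log line {i}") for i in range(500)],
    )
    conn.execute(
        "CREATE VIEW workflow_line_counts AS "
        "SELECT workflow, COUNT(*) AS n FROM github_logs GROUP BY workflow"
    )
    return conn
```

An agent that issues SELECT * FROM github_logs against this demo data gets 100 rows back plus the hint, while a query against the view returns a single aggregated row and no hint. Returning the hint as data, rather than raising an error, lets the agent correct course in its next step.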
Expensive models plan, cheap models execute
Opus forms a hypothesis and spawns Haiku sub-agents capped at one level deep — no unbounded fan-out. Each sub-agent gets a prompt from Opus: exactly what to search and how. Example from a real case:
Three Storybook CI jobs failed on the same commit, crashing at pnpm install. Opus dispatched a sub-agent to fetch error messages from that step. ClickHouse didn't have the logs yet, so the sub-agent used the GitHub CLI and returned: gyp ERR! not found: make. A native dependency couldn't compile because make wasn't on the runner. Opus then queried ClickHouse for the failure trend over 14 days, found the inflection point, and escalated.
Sub-agent prompts are explicit: "Fetch the CI logs for this run. Return the exact error messages from the pnpm install step, the full error output, especially the last 50-100 lines."
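The planner/worker split and the one-level depth cap might look something like the sketch below. Everything here is hypothetical scaffolding: the Agent class, FanOutError, MAX_SUBAGENT_DEPTH, and fetch_logs are illustrative names, and the model calls and log sources are passed in as plain callables rather than real Opus, Haiku, ClickHouse, or GitHub CLI invocations.

```python
from dataclasses import dataclass
from typing import Callable

MAX_SUBAGENT_DEPTH = 1  # planner is depth 0; its sub-agents are depth 1


class FanOutError(RuntimeError):
    """Raised when a sub-agent tries to spawn its own sub-agents."""


@dataclass
class Agent:
    model: str
    depth: int = 0

    def delegate(self, prompt: str, worker: Callable[[str], str]) -> str:
        """Hand an explicit prompt to a cheaper sub-agent, one level deep only."""
        if self.depth >= MAX_SUBAGENT_DEPTH:
            raise FanOutError("sub-agents may not spawn further sub-agents")
        child = Agent(model="haiku", depth=self.depth + 1)
        return child.run(prompt, worker)

    def run(self, prompt: str, worker: Callable[[str], str]) -> str:
        # In a real system this would be a model call; here it is a stub.
        return worker(prompt)


def fetch_logs(
    run_id: int,
    from_clickhouse: Callable[[int], list[str]],
    from_github_cli: Callable[[int], list[str]],
) -> list[str]:
    """Prefer ingested logs; fall back to the CLI if nothing is ingested yet."""
    lines = from_clickhouse(run_id)
    return lines if lines else from_github_cli(run_id)
```

In this sketch the planner would be created as Agent(model="opus") and call delegate with a prompt like the one quoted above. A depth-1 agent calling delegate raises FanOutError, which is one simple way to rule out unbounded fan-out.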
Who this is for
Teams building LLM-powered agents for CI debugging or any task where context size and cost are concerns.
📖 Read the full source: HN LLM Tools