LamBench: A Lambda Calculus Benchmark Suite for AI Coding Agents

Victor Taelin released LamBench v1, a benchmark framework designed to test AI coding agents on lambda calculus problems. The project is hosted on GitHub at github.com/VictorTaelin/LamBench and includes a live site at victortaelin.github.io/lambench/.
Key Details
- Metrics: The benchmark measures three axes: :intelligence, :speed, and :elegance.
- Components: A set of :problems and a :matrix for scoring results.
- Version: v1 (initial release).
LamBench is part of a broader effort by Taelin to create rigorous evaluations for AI systems in symbolic computation. For context, lambda calculus is a formal system in mathematical logic and computer science that expresses computation through function abstraction and application. It is often used to test reasoning and functional programming capabilities, which makes this benchmark particularly relevant for AI coding agents that need to handle symbolic manipulation, recursion, and higher-order functions.
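To give a feel for the kind of symbolic manipulation lambda calculus involves, here is a minimal sketch of Church numerals in Python. It is only an illustration of the formalism and is not taken from LamBench: the names `zero`, `succ`, `add`, `mul`, `to_int`, and `from_int` are hypothetical helpers chosen for this example, and the benchmark's actual problem format may look quite different.

```python
# Church numerals: the number n is encoded as a higher-order function
# that applies f to x exactly n times. Illustrative only, not LamBench code.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))

# Addition: apply f n times, then m more times.
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

# Multiplication: repeat "apply f n times" m times.
mul = lambda m: lambda n: lambda f: m(n(f))


def to_int(n):
    """Decode a Church numeral by counting how often it applies f."""
    return n(lambda k: k + 1)(0)


def from_int(k):
    """Encode a non-negative Python int as a Church numeral."""
    n = zero
    for _ in range(k):
        n = succ(n)
    return n


two = from_int(2)
three = from_int(3)

print(to_int(add(two)(three)))  # 5
print(to_int(mul(two)(three)))  # 6
```

Everything here is built from single-argument functions, which is why tasks of this kind exercise higher-order functions and symbolic reasoning rather than library knowledge.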
Who It's For
AI researchers and developers building or evaluating coding agents, especially those working with functional programming or symbolic reasoning tasks.
📖 Read the full source: HN AI Agents