Components of a Coding Agent: How Tools, Memory, and Context Extend LLMs

Sebastian Raschka outlines the architecture of coding agents: systems that wrap LLMs in application layers to improve performance on coding tasks. He distinguishes between LLMs, reasoning models, and agents, and argues that much of the practical progress in LLM systems comes from the components surrounding the model rather than from better models alone.

Key Components of Coding Agents
The article identifies six main building blocks that make coding agents effective:
- Repo context: Navigation and management of code repository information
- Tool design: Integration of external tools and functions
- Prompt-cache stability: Keeping the prompt prefix stable across turns so cached computation can be reused
- Memory: State retention and session continuity
- Long-session continuity: Maintaining context over extended interactions
- Model choice: Selection of appropriate LLM or reasoning model
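
As an illustration of the prompt-cache stability item above, here is a minimal sketch, assuming the cache is keyed on the leading bytes of the prompt. The `PromptBuilder` class and `prefix_fingerprint` function are hypothetical names invented for this example, not part of any tool the article mentions. The idea is to keep the system prompt and tool descriptions byte-identical on every turn and to append new material only at the end.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PromptBuilder:
    """Hypothetical helper: a fixed prefix plus an append-only history."""

    system_prompt: str
    tool_descriptions: str
    history: list = field(default_factory=list)

    def prefix(self) -> str:
        # The stable part: identical bytes on every turn, so a cache keyed
        # on the prompt prefix (an assumption of this sketch) can be reused.
        return f"{self.system_prompt}\n\n{self.tool_descriptions}\n\n"

    def add_turn(self, text: str) -> None:
        # Append only; never rewrite or reorder earlier turns.
        self.history.append(text)

    def render(self) -> str:
        return self.prefix() + "\n".join(self.history)


def prefix_fingerprint(builder: PromptBuilder) -> str:
    """Hash the stable prefix so changes to it are easy to detect."""
    return hashlib.sha256(builder.prefix().encode("utf-8")).hexdigest()


builder = PromptBuilder(
    system_prompt="You are a coding assistant.",
    tool_descriptions="tools: read_file(path), run_tests()",
)
before = prefix_fingerprint(builder)
first_prompt = builder.render()

builder.add_turn("user: fix the failing test")
builder.add_turn("tool: 1 test failed in test_math.py")
second_prompt = builder.render()
after = prefix_fingerprint(builder)
```

Because the history is append-only, each rendered prompt starts with the previous one. Putting volatile data such as timestamps into the prefix would change the fingerprint on every turn and defeat this kind of reuse.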

Architecture Layers
Raschka defines several key concepts in the agent ecosystem:
- LLM: The core next-token model
- Reasoning model: An LLM trained or prompted to spend more inference-time compute on intermediate reasoning, verification, or search over candidate answers
- Agent: A control loop around the model that decides what to inspect next, which tools to call, how to update its state, and when to stop
- Agent harness: The software scaffold around an agent that manages context, tool use, prompts, state, and control flow
- Coding harness: A special case of agent harness specifically for software engineering that manages code context, tools, execution, and iterative feedback
He notes that Claude Code and Codex CLI can be considered coding harnesses. The relationship is described as: the LLM is the engine, a reasoning model is a beefed-up engine, and an agent harness helps us use the model effectively.
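
The distinction between model, agent, and harness can be sketched in a few lines. This is a toy illustration, not how Claude Code or Codex CLI are implemented: `scripted_model` is a hard-coded stand-in for an LLM call, `REPO` is a hypothetical in-memory repository, and `Harness` is an invented class that owns the tools, the transcript, and the stop condition.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical in-memory "repository" so the sketch runs without touching disk.
REPO = {"math_utils.py": "def add(a, b):\n    return a - b\n"}


def read_file(path: str) -> str:
    return REPO.get(path, f"error: {path} not found")


def write_file(path: str, content: str) -> str:
    REPO[path] = content
    return f"wrote {path}"


def run_tests(_: str = "") -> str:
    # Stand-in for real test execution: load the code and check one case.
    namespace: dict = {}
    exec(REPO["math_utils.py"], namespace)
    return "pass" if namespace["add"](2, 3) == 5 else "fail: add(2, 3) != 5"


def scripted_model(transcript: list) -> dict:
    """Stand-in for an LLM: picks the next action from the last observation."""
    last = transcript[-1]
    if last.startswith("task:"):
        return {"tool": "run_tests", "args": [""]}
    if last.startswith("fail"):
        return {"tool": "read_file", "args": ["math_utils.py"]}
    if "a - b" in last:
        fixed = last.replace("a - b", "a + b")
        return {"tool": "write_file", "args": ["math_utils.py", fixed]}
    if last.startswith("wrote"):
        return {"tool": "run_tests", "args": [""]}
    return {"tool": "stop", "args": []}


@dataclass
class Harness:
    """Scaffold around the model: tools, state, and control flow."""

    model: Callable
    tools: dict
    max_steps: int = 10
    transcript: list = field(default_factory=list)

    def run(self, task: str) -> list:
        self.transcript.append(f"task: {task}")
        for _ in range(self.max_steps):  # hard cap so the loop always ends
            action = self.model(self.transcript)
            if action["tool"] == "stop":
                break
            observation = self.tools[action["tool"]](*action["args"])
            self.transcript.append(observation)  # update state
        return self.transcript


harness = Harness(
    model=scripted_model,
    tools={"read_file": read_file, "write_file": write_file, "run_tests": run_tests},
)
transcript = harness.run("make the tests pass")
```

In this sketch the model only proposes actions. The harness executes them, records the observations, and enforces the step limit, which is the division of labor the definitions above describe.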
Coding work involves more than next-token generation: it requires repo navigation, search, function lookup, diff application, test execution, error inspection, and context management. Coding harnesses combine three layers: the model family, an agent loop, and runtime supports.
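
Two of the tasks listed above, search and function lookup, can be sketched with the Python standard library. The `SOURCES` dictionary is a hypothetical in-memory stand-in for a repository, and `search` and `find_function` are illustrative helper names, not functions from any real coding harness.

```python
import ast
from typing import Optional

# Hypothetical repository contents, kept in memory so the sketch is runnable.
SOURCES = {
    "shapes.py": (
        "def area(w, h):\n"
        "    return w * h\n"
        "\n"
        "\n"
        "def perimeter(w, h):\n"
        "    return 2 * (w + h)\n"
    ),
    "main.py": "from shapes import area\n\nprint(area(2, 3))\n",
}


def search(pattern: str) -> list:
    """Return (filename, line number) for each line containing the pattern."""
    hits = []
    for name, text in sorted(SOURCES.items()):
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern in line:
                hits.append((name, lineno))
    return hits


def find_function(name: str) -> Optional[tuple]:
    """Locate a function definition by name using the parsed syntax tree."""
    for filename, text in sorted(SOURCES.items()):
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, ast.FunctionDef) and node.name == name:
                return filename, node.lineno
    return None


area_hits = search("area")
perimeter_location = find_function("perimeter")
missing_location = find_function("does_not_exist")
```

Plain text search finds every mention of a name, including imports and call sites, while the syntax-tree lookup returns only the definition. A harness would typically expose both kinds of result to the model as tool outputs.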
📖 Read the full source: HN AI Agents