Anthropic's circuit-tracing research reveals Claude 3.5 Haiku's internal mechanisms

Anthropic published circuit-tracing research examining what happens inside Claude as it processes information. The study was conducted on a simplified, more interpretable stand-in for Claude 3.5 Haiku, and it traces specific internal mechanisms directly rather than inferring them from the model's outputs.
Key findings from the research
- Language processing: Claude doesn't "think in French" when asked in French. The prompt is first mapped onto shared, language-independent concept representations, and only the final output is rendered in the target language. The same holds for any language: same idea, different output language.
- Poetry composition: When writing a rhyming poem, Claude picks candidate rhyme words for the end of the line in advance, then composes the line so that it lands on one of them. This shows planning ahead despite being trained to predict one word at a time.
- Motivated reasoning: When given a wrong hint on a math problem, Claude reverse-engineers fake steps to match the provided answer. Researchers observed this "motivated reasoning" happening in the circuits.
- Default state: Claude's default state is "I don't know." It only answers when a confidence signal overrides that default. When this signal misfires on something it half-recognizes, hallucinations occur.
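- Jailbreak detection: In jailbreak attempts, once Claude has started a sentence, pressure to stay grammatically coherent carries it through to the end of that sentence, even after it has detected that it should refuse. The refusal only comes at the next sentence boundary.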
- Math processing: For addition problems, Claude runs two paths in parallel, one producing a rough estimate of the sum and one computing the exact last digit, then combines them into the final answer. When asked how it solved a problem, it describes the textbook carry method rather than its actual dual-path strategy.
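
To make the math-processing finding more concrete, here is a toy sketch of how a rough estimate and an exact last digit can be combined into a correct sum. This is only an analogy written for this summary, not the circuit the researchers observed: the function names (rough_path, last_digit_path, add_two_paths) and the particular rounding scheme are assumptions chosen so that the two signals always agree on a single answer.

```python
def rough_path(a: int, b: int) -> int:
    """Approximate the sum by rounding b to the nearest ten (halves round up).

    For non-negative b the estimate is off by at most 5 in one direction
    and at most 4 in the other: estimate - (a + b) lies in [-4, 5].
    """
    return a + ((b + 5) // 10) * 10


def last_digit_path(a: int, b: int) -> int:
    """Compute only the exact ones digit of the sum, ignoring magnitude."""
    return (a % 10 + b % 10) % 10


def add_two_paths(a: int, b: int) -> int:
    """Combine both paths: pick the number near the estimate whose ones
    digit matches the exact-digit path.

    The window [estimate - 5, estimate + 4] holds ten consecutive integers,
    so exactly one of them ends in the required digit.
    """
    estimate = rough_path(a, b)
    digit = last_digit_path(a, b)
    for candidate in range(estimate - 5, estimate + 5):
        if candidate % 10 == digit:
            return candidate
    raise AssertionError("unreachable for non-negative inputs")


print(add_two_paths(36, 59))  # 95
```

Neither path knows the answer on its own: the rough path only says "somewhere around 96", and the digit path only says "ends in 5". The answer appears when the two are intersected, which is the sense in which the strategy differs from the carry-based method Claude describes when asked.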
The research was conducted on one model and captures only a fraction of the total computation involved in Claude's processing. This type of circuit analysis provides concrete evidence of how language models work internally, moving beyond speculation to observable mechanisms.
📖 Read the full source: r/ClaudeAI