Claude Code Source Leak Reveals Anti-Distillation, Undercover Mode, and Frustration Detection

✍️ OpenClawRadar · 📅 Published: April 1, 2026

Anthropic accidentally shipped a source map (.map) file in its Claude Code npm package, exposing the full, readable source code of the CLI tool. The package has since been pulled, but the code was widely mirrored and was analyzed on Hacker News. This follows another recent leak of Anthropic's model spec.

Anti-distillation: injecting fake tools to poison copycats

In claude.ts (lines 301-313), there's a flag called ANTI_DISTILLATION_CC. When enabled, Claude Code sends anti_distillation: ['fake_tools'] in its API requests, telling the server to silently inject decoy tool definitions into the system prompt. This is designed to pollute training data if someone is recording API traffic to train competing models.

Activation requires all four of the following: the ANTI_DISTILLATION_CC compile-time flag, the CLI entrypoint, a first-party API provider, and the tengu_anti_distill_fake_tool_injection GrowthBook flag returning true. Because the mechanism depends on a field the client sends, a MITM proxy that strips the anti_distillation field from request bodies would bypass it entirely. Setting the CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS environment variable to a truthy value also disables the whole mechanism.
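The gating described above can be sketched roughly as follows. This is a minimal illustration, not the leaked code: the interface, the function names, and the truthiness convention for the environment variable are all assumptions. Only the flag names and the anti_distillation: ['fake_tools'] value come from the article.

```typescript
// Hypothetical sketch of the four-condition gate. Names other than the
// flags quoted in the article are illustrative assumptions.
interface AntiDistillationInputs {
  compileTimeFlag: boolean;      // ANTI_DISTILLATION_CC, fixed at build time
  isCliEntrypoint: boolean;      // running via the CLI entrypoint
  isFirstPartyProvider: boolean; // first-party API provider
  growthBookFlag: boolean;       // tengu_anti_distill_fake_tool_injection
  env: Record<string, string | undefined>;
}

// Assumed truthiness convention; the article only says "a truthy value".
function isTruthy(value: string | undefined): boolean {
  if (value === undefined) return false;
  return ["1", "true", "yes", "on"].includes(value.toLowerCase());
}

// Returns the value to send in the request's anti_distillation field,
// or undefined when the field should be omitted.
function antiDistillationField(inputs: AntiDistillationInputs): string[] | undefined {
  // The environment variable switches the mechanism off entirely.
  if (isTruthy(inputs.env["CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS"])) {
    return undefined;
  }
  const enabled =
    inputs.compileTimeFlag &&
    inputs.isCliEntrypoint &&
    inputs.isFirstPartyProvider &&
    inputs.growthBookFlag;
  return enabled ? ["fake_tools"] : undefined;
}
```

Passing the environment in as a plain object, rather than reading it globally, keeps the sketch easy to exercise with different combinations of conditions.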

A second anti-distillation mechanism in betas.ts (lines 279-298) implements server-side connector-text summarization. When enabled, the API buffers the assistant's text between tool calls, summarizes it, and returns the summary with a cryptographic signature. This means API traffic recordings would only capture summaries, not full reasoning chains.

Undercover mode: AI that hides its AI

The file undercover.ts implements a mode that strips all traces of Anthropic internals when Claude Code is used in non-internal repos. It instructs the model to never mention internal codenames like "Capybara" or "Tengu," internal Slack channels, repo names, or the phrase "Claude Code" itself. Line 15 states: "There is NO force-OFF. This guards against model codename leaks."

You can force it ON with CLAUDE_CODE_UNDERCOVER=1, but there's no way to force it off. In external builds, the entire function gets dead-code-eliminated to trivial returns. This means AI-authored commits and PRs from Anthropic employees in open source projects will have no indication that an AI wrote them.
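The asymmetry between forcing the mode on and being unable to force it off can be illustrated with a small sketch. Everything here apart from the CLAUDE_CODE_UNDERCOVER variable is an assumption: the function name is hypothetical, and the real check for whether a repo is internal is not described in the article, so it is passed in as a plain boolean.

```typescript
// Hypothetical sketch of the undercover decision. Only the environment
// variable name comes from the article; the rest is illustrative.
function isUndercover(
  env: Record<string, string | undefined>,
  isInternalRepo: boolean, // assumed input; the real detection logic is unknown
): boolean {
  // Force-ON path described in the article.
  if (env["CLAUDE_CODE_UNDERCOVER"] === "1") {
    return true;
  }
  // There is deliberately no branch that reads a variable and returns
  // false: the mode is on by default outside internal repos.
  return !isInternalRepo;
}
```

In this sketch, no environment setting can turn the mode off in a non-internal repo, which matches the "no force-OFF" comment quoted above.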

Frustration detection via regex

userPromptKeywords.ts contains a regex pattern that detects user frustration. The excerpt available is truncated: /\b(wtf|wth|ffs|omfg|shit(ty|tiest)?|dumbass|horribl (incomplete in source). The visible portion shows that the system identifies frustrated users through simple keyword matching rather than a model-based classifier.
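To show how this kind of keyword matching behaves, here is a minimal sketch that uses only the alternatives that appear complete in the excerpt. The truncated "horribl" term is left out rather than guessed at, and the closing word boundary, the case-insensitive flag, and the function name are assumptions, not details from the leaked file.

```typescript
// Partial, hypothetical pattern built only from the complete alternatives
// visible in the excerpt. The trailing \b and the "i" flag are assumptions.
const FRUSTRATION_SKETCH = /\b(wtf|wth|ffs|omfg|shit(ty|tiest)?|dumbass)\b/i;

// Returns true when the prompt contains one of the listed keywords
// as a whole word.
function looksFrustrated(prompt: string): boolean {
  return FRUSTRATION_SKETCH.test(prompt);
}
```

The word boundaries keep a prompt such as "shift the array left" from matching, while "this is a shitty result" or "WTF happened" would.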

Other findings

  • Native client attestation below the JS runtime
  • 250,000 wasted API calls per day
  • KAIROS: an unreleased autonomous agent mode

The leak occurred just ten days after Anthropic sent legal threats to OpenCode, forcing it to remove built-in Claude authentication. According to the analysis, third-party tools had been using Claude Code's internal APIs to access Opus at subscription rates instead of pay-per-token pricing.

📖 Read the full source: HN AI Agents
