anthropic · claude-code · security · artificial-intelligence · source-code-leak

Claude Code Source Leak: Anthropic's Copyright Irony

Tincho Fuentes · 9 min read

Anthropic's irony: trained AI on others' code, then had its own leaked to the world

TL;DR:

  • On March 31, 2026, Anthropic accidentally exposed 512,000 lines of Claude Code's source through a .map file mistakenly bundled in their npm package.
  • The company classified the incident as "human error" and issued DMCA notices to remove over 8,100 GitHub repositories.
  • The irony is devastating: Anthropic trained Claude on millions of pirated books and paid a $1.5 billion copyright settlement — the largest in U.S. history. Now it invokes the same IP protections it previously circumvented.

A 59.8 MB Oversight That Shook the AI Industry

At 04:00 UTC on March 31, 2026, Anthropic published version 2.1.88 of Claude Code to npm. It was a routine update. Or so they thought.

Twenty-three minutes later, researcher Chaofan Shou (from Solayer Labs) posted a tweet that lit the fuse: inside the published package, he had found a source map file (main.js.map) weighing 59.8 megabytes — a file that should never have seen the light of day.

That file pointed directly to a ZIP archive hosted on Anthropic's Cloudflare R2 bucket, containing 1,906 TypeScript files with 512,000 lines of Claude Code's complete source code.
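
How does a .map file "point" anywhere? A source map is just a JSON document: its sources field (and optional sourceRoot) records where the original files live, which is exactly what makes shipping one equivalent to publishing a directory listing of your private codebase. A minimal way to inspect one, assuming jq is installed (the values shown are illustrative, not the actual contents of Anthropic's file):

# Peek at where a source map points (requires jq; values are hypothetical)
jq '{version, sourceRoot, sources: .sources[0:3]}' main.js.map
# {
#   "version": 3,
#   "sourceRoot": "",
#   "sources": ["src/cli.ts", "src/tools/bash.ts", "src/agent/loop.ts"]
# }

In Claude Code's case, the trail inside the file led to the R2-hosted ZIP with the complete sources.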

The news went viral within minutes. By the end of the day, it had accumulated over 26 million impressions on X alone. Mirror repositories appeared on GitHub faster than any legal team could respond.

What Was Actually Exposed

This wasn't a handful of marginal internal scripts. The leaked code includes the complete operational core of Claude Code:

  • The central LLM call engine
  • Tool handling and permission systems
  • Memory architecture and OAuth flows
  • 44 feature flags for unreleased functionality, including internal projects codenamed KAIROS and Buddy/Tamagotchi (an always-on AI agent designed as a digital pet)

Anthropic quickly clarified that no user data or credentials were exposed. Technically true. But what was exposed is, strategically, far more valuable: the complete blueprint of their programming assistant, plus the internal product roadmap that no press release had hinted at.

The Technical Vector: A Failed .npmignore

The exposure vector wasn't a sophisticated cyberattack. It was a missing line in a configuration file.

In JavaScript and TypeScript, .map files allow compiled code to be traced back to the original source. They're development tools that should never be included in a production npm package — precisely because they reveal unobfuscated code. The standard practice is to exclude them via .npmignore.
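
In practice, the exclusion is a one-line affair. A minimal sketch of the convention (patterns illustrative, not Anthropic's actual configuration):

# .npmignore (denylist): anything matching these patterns stays out of the tarball
*.map

# Safer alternative: an allowlist via the "files" field in package.json,
# so anything not explicitly listed is excluded by default:
#   "files": ["cli.js", "dist/"]

The allowlist approach is widely recommended precisely because denylists fail silently: forget one pattern and the file ships.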

The failure combined two factors: an incorrect .npmignore configuration and a bug in the Bun runtime (on which Claude Code is built) that also failed to exclude the map files automatically. The result was a chain of exposure so straightforward it's hard to believe:

# This is how simple it was to access Claude Code's complete source:
npm install @anthropic-ai/claude-code
# → the package ships main.js.map (59.8 MB)
# → inside the .map: https://storage.anthropic.com/claude/anon/src.zip
curl -L -O 'https://storage.anthropic.com/claude/anon/src.zip'
unzip src.zip   # → 512,000 lines of TypeScript ready to explore

Any developer with basic npm knowledge could walk through the internal architecture of one of the world's most-used AI tools. And they did.
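
The bitter footnote is that a single pre-publish check would have flagged it. npm pack --dry-run lists every file that would go into the published tarball without uploading anything; a sketch of the output that should have raised alarms (abridged and illustrative):

# Audit the tarball contents before publishing
npm pack --dry-run
# npm notice package: @anthropic-ai/claude-code@2.1.88
# npm notice 1.2kB   package.json
# npm notice 59.8MB  main.js.map   ← this line should never appear

One line in CI, and the whole episode never happens.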

Anthropic's Response: Calculated, Minimal, and Unapologetic

Anthropic's official reaction was minimalist by design. The company confirmed the incident to Axios, Business Insider, and The Hacker News using a carefully chosen formula:

"It was a packaging problem, human error, not a security breach."

No official statement on their website. No press conference. No public apology. Just controlled quotes to selected media — enough to contain the narrative damage without committing to more detail than strictly necessary.

Meanwhile, Anthropic's legal team was moving in parallel: within hours, the company issued DMCA requests to GitHub to remove all repositories hosting the leaked code. The official notice, recorded in the public github/dmca repository, listed ~8,100 repositories as allegedly infringing, with instructions for complete removal.

One of the most popular mirrors, instructkr/claw-code, reached over 50,000 stars and tens of thousands of forks in just two hours, reportedly the fastest-starred repository in GitHub's history. Another, realsigridjin/claw-code, reimplemented the entire codebase in Python using OpenAI's Codex and climbed to 75,000 stars, proclaiming itself a "clean-room" version beyond the reach of DMCA takedowns.

The Irony Anthropic Would Prefer You Forget

This is where the story gets truly interesting. Or, depending on your perspective, profoundly uncomfortable for a San Francisco AI lab.

Anthropic is the same company that, according to unsealed court documents, deliberately downloaded millions of books from pirate sites like Library Genesis (LibGen) and PiLiMi to train Claude. The case Bartz et al. v. Anthropic PBC, settled in August 2025 with a landmark agreement of $1.5 billion — the largest copyright settlement in U.S. history — established a distinction that should resonate in the company's hallways today:

AI training may qualify as "fair use." Acquiring stolen data to build that training foundation does not.

The company that built its flagship product on a corpus of protected works, without asking permission, without paying royalties, hoping no one would notice, now invokes intellectual property law as a shield. With diligence. With speed. With a DMCA notice covering some 8,100 repositories, deployed within hours.

The same company that for years argued copyright shouldn't apply to AI training is now urgently invoking copyright to protect its own code the very day it was exposed.

What do you call that? In journalism, it's called a documented contradiction. In everyday language, it has other names.

"The Internet Never Forgets" — A Lesson Learned the Hard Way

The DMCA requests worked partially. GitHub removed most of the notified repositories. But the community responded with creativity:

  • Python reimplementations: developers created "clean-room" versions of the code, rewritten from scratch using the technical findings from the leak. Because these versions don't literally reproduce Anthropic's code, their authors argue they fall outside the reach of DMCA takedowns.
  • IPFS and decentralized mirrors: the code was uploaded to decentralized networks that have no central operator to notify. "It will never be taken down," proclaimed their authors.
  • Public technical analyses: engineers published detailed articles on Claude Code's internal workings — autonomous memory mechanisms, anti-distillation logic, permission systems — making that knowledge irreversibly public.

The internet — the same internet that Anthropic crawled for training data without permission — turned out to be a permanent archive. An irony that, as a journalist, I'll leave without further comment.

The Feature Flags: A Roadmap Nobody Was Supposed to See

Beyond the operational code, the 44 discovered feature flags offer an unprecedented look at what Anthropic has in development:

  • KAIROS: an unannounced feature, possibly related to temporal planning capabilities or long-running agent behaviors.
  • Buddy/Tamagotchi: an "always-on" AI agent designed as a kind of digital pet. The Tamagotchi reference (the Japanese toy from the 90s that died if you didn't tend to it) doesn't seem accidental coming from a company obsessed with AI "alignment" and "relationship".

These projects were protected under commercial confidentiality. They're now public knowledge. Anthropic's competitors — OpenAI, Google DeepMind, Meta — have access to the roadmap. The strategic damage is real and, unlike the code itself, cannot be removed by any DMCA notice.

The Pattern Repeats: "Ask Forgiveness Later"

This incident doesn't happen in isolation. The unsealed Bartz documents reveal an internal corporate culture of "ask forgiveness later": Anthropic took data it knew was problematic, built its product, and managed legal consequences once it was too late to backtrack.

"Project Panama" — Anthropic's internal plan to physically purchase millions of books and "destructively scan" them for mass digitization — was described by the publishing community as "reprehensible." The company called it "format transformation" and argued it was fair use. The courts, as usual, added nuance.

The same pattern reappears here: a carelessly published package, a minimal response to contain the narrative damage, aggressive legal action. No sign of a transparent post-mortem on the internal review processes that failed. Just the minimum required to move forward.

Anthropic published a 79-page "AI Constitution" in January 2026 that speaks of integrity and transparency. Its intentions are admirable. But there is a considerable gap between stated principles and documented practices.

What's Coming: Regulation, Forced Transparency, and Growing Pressure

The Claude Code incident is not an isolated episode. It occurs in a context of growing regulatory pressure that will demand more, not less:

  • The EU AI Act requires that by August 2026, companies like Anthropic must disclose in granular detail the content used for training their models.
  • Post-Bartz agreements mandate judicial oversight of data practices and destruction of pirated material.
  • The Copyright Clearance Center launched AI-specific training licenses in March 2026, steadily narrowing the room to rely on "fair use" as the sole defense.

The era of opacity in AI development is coming to an end. The industry knows it. Some just prefer it arrives later rather than sooner.

Conclusion: When Irony Exceeds Fiction

On March 31, 2026, Anthropic accidentally became the protagonist of a perfect parable about tech's double standards.

A company that built its business on the principle that "information wants to be free" — at least when it belongs to someone else — discovered firsthand what it's like to be on the other side of that equation. A company that used the code and writing of millions of people without asking permission now watches its own code replicated across thousands of repositories it cannot control.

The code is out. The reimplementations too. The internet will remember.

And while Anthropic works on a native installer to avoid npm dependencies going forward, somewhere on an IPFS server, 512,000 lines of TypeScript remain available to anyone who searches for them.

Poetic justice, as any journalist who covered the Bartz case from the beginning would say.


Verified sources: Axios · Business Insider · The Verge · GitHub DMCA Anthropic · Decrypt · Engineers Codex · Goodwin Law — Bartz case

Tincho Fuentes · Tech journalist and investigative researcher 🚀