By Ion Anghel · April 2026
On March 26, 2026, two security researchers — Roy Paz from LayerX Security and Alexandre Pauwels from the University of Cambridge — stumbled onto something Anthropic would rather have kept under wraps: nearly 3,000 internal files sitting in a public, indexable data cache. Among them, a draft blog post announcing a new model called Claude Mythos, described by the company itself as "by far the most powerful AI model we have ever developed."
Five days later, on March 31, a researcher named Chaofan Shou discovered that the NPM package for Claude Code (version 2.1.88) contained a 60 MB .map file with the tool's entire source code — roughly 1,900 files and over 500,000 lines of TypeScript.
Two major leaks in a single week. From the same lab that tells us it has built a model "too dangerous to release publicly."
Let's unpack this.
What we actually know
The model exists. Anthropic has officially confirmed that it is working on a general-purpose model with "significant advances in reasoning, coding, and cybersecurity." Internally, the model carries the codename Capybara and is positioned as an entirely new tier above the Opus family — not a simple upgrade. The leaked blog draft actually contained two versions of the same text, differing only in the name: one used "Mythos," the other "Capybara." The subtitle of the Capybara version still contained a reference to "Claude Mythos," suggesting the naming decision had not yet been finalized.
There are no independent public benchmarks. All we have are the claims in Anthropic's internal draft: scores "dramatically higher" than Claude Opus 4.6 on coding, academic reasoning, and cybersecurity. Nobody outside the company has verified those numbers.
On April 7, Anthropic officially launched Project Glasswing, a program in which 12 partner organizations (among them Amazon, Apple, Microsoft, Google, CrowdStrike, Palo Alto Networks, Cisco, Broadcom, the Linux Foundation, NVIDIA, and JPMorganChase) receive access to Claude Mythos Preview exclusively for defensive security work. Roughly 40 other organizations that maintain critical software infrastructure also received access. Anthropic allocated up to $100 million in usage credits and donated $4 million to open-source security organizations.
Results claimed by Anthropic: the model reportedly identified "thousands of zero-day vulnerabilities," including a 27-year-old bug in OpenBSD and a 16-year-old vulnerability in FFmpeg, in a line of code that automated testing tools had walked past 5 million times without detecting it. Nicholas Carlini, a security researcher at Anthropic, said he found more bugs in a few weeks with Mythos than in his entire previous career.
The price after the free credits run out: $25/$125 per million input/output tokens. That is not a consumer price.
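To make that price concrete, here is a small back-of-the-envelope sketch. The two prices are the ones quoted above. The workload (2 million input tokens and 200,000 output tokens for a single large audit session) is purely an assumption for illustration, not a figure from Anthropic.

```python
# Prices quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_MTOK = 25.0
OUTPUT_PRICE_PER_MTOK = 125.0


def session_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one session at the quoted per-million-token prices."""
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


# Hypothetical workload: the token counts below are assumptions.
cost = session_cost_usd(input_tokens=2_000_000, output_tokens=200_000)
print(f"${cost:.2f}")  # prints $75.00
```

Under those assumed numbers, a team running a few dozen such sessions a day would be spending thousands of dollars daily.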
About the second leak
The Claude Code source code leak is a separate but complementary story. A cli.js.map file accidentally shipped inside the NPM package contained the complete, unobfuscated source of the tool. GitHub exploded — forks crossed 41,000 before Anthropic could react with DMCA notices.
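Why can a single .map file expose an entire codebase? In the Source Map v3 format, the `sources` array lists the original file paths and the optional `sourcesContent` array can embed the full text of each file. The sketch below reads those two fields from a tiny made-up map. The file names and contents are invented for illustration and have nothing to do with the real cli.js.map.

```python
import json


def embedded_sources(map_text: str) -> dict[str, str]:
    """Return {source path: original text} for files inlined in a v3 source map."""
    source_map = json.loads(map_text)
    paths = source_map.get("sources", [])
    # sourcesContent is optional, and individual entries may be null.
    contents = source_map.get("sourcesContent") or []
    return {
        path: text
        for path, text in zip(paths, contents)
        if text is not None
    }


# Made-up example map; the paths and contents are hypothetical.
example_map = json.dumps({
    "version": 3,
    "file": "cli.js",
    "sources": ["src/main.ts", "src/util.ts"],
    "sourcesContent": ["export const answer: number = 42;\n", None],
    "mappings": "",
})

for path, text in embedded_sources(example_map).items():
    print(path)
    print(text)
```

Anyone who downloads a package containing such a file can recover every inlined source this way, with no reverse engineering involved.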
What people found in the code is fascinating from a technical standpoint: a 40+ tool system with permission gates, multi-agent orchestration, an anti-distillation mechanism that injected fake tools into prompts to poison competitors' training data, a "Dream System" for background memory consolidation, and — the juiciest part — a feature flag called KAIROS referenced over 150 times, which promises an always-on, autonomous daemon mode. Plus a Tamagotchi system with virtual pets. Seriously.
Here is the genuinely worrying part: within hours of the leak, it emerged that the Axios package (a Claude Code dependency) had been compromised with a Remote Access Trojan. Users who installed or updated Claude Code via NPM on March 31 within a three-hour window risked pulling a RAT onto their machines. Typosquatting attacks and fake GitHub repos distributing malware followed.
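If you want to check whether a project resolved a known-bad release of a dependency, the lockfile is the place to look. The sketch below scans the `packages` section of a package-lock.json (lockfile version 2 or 3) for every installed copy of a package and flags versions on a deny list. The version string in `COMPROMISED_VERSIONS` is a placeholder of my own: this article does not name the affected releases, so substitute the versions from an official advisory.

```python
import json

# HYPOTHETICAL placeholder; replace with versions from an official advisory.
COMPROMISED_VERSIONS = {"0.0.0-example"}


def find_package_versions(lock_text: str, name: str) -> dict[str, str]:
    """Map each install path of `name` in a package-lock.json to its version."""
    lock = json.loads(lock_text)
    suffix = "node_modules/" + name
    return {
        path: meta.get("version", "")
        for path, meta in lock.get("packages", {}).items()
        # Match top-level installs and copies nested under other packages.
        if path == suffix or path.endswith("/" + suffix)
    }


def flagged_installs(lock_text: str, name: str, bad_versions: set[str]) -> list[str]:
    """Return the install paths whose resolved version is on the deny list."""
    versions = find_package_versions(lock_text, name)
    return sorted(path for path, version in versions.items() if version in bad_versions)


# Made-up lockfile for demonstration.
example_lock = json.dumps({
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "demo"},
        "node_modules/axios": {"version": "0.0.0-example"},
        "node_modules/some-tool/node_modules/axios": {"version": "1.0.0"},
    },
})

print(flagged_installs(example_lock, "axios", COMPROMISED_VERSIONS))
# prints ['node_modules/axios']
```

A clean result only says the lockfile did not pin a listed version. It does not prove a machine was never exposed.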
The ultimate irony: Anthropic had a system in the code called "Undercover Mode" — built specifically so the AI would not accidentally reveal internal information. Then they shipped the whole source code in a .map file.
The "danger" argument
Anthropic's official narrative is sober: Mythos can discover and exploit vulnerabilities at speeds that exceed human defensive capacity. The model can "chain" vulnerabilities — combining three, four, or even five individual bugs that would not be critical on their own, but in sequence grant full access to a system. The company claims it detected a coordinated campaign by a Chinese state-sponsored threat group using Claude Code to infiltrate roughly 30 organizations.
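The "chaining" idea is easier to see as a graph problem. In the toy model below, each bug moves an attacker from one level of access to another, and a breadth-first search looks for the shortest sequence that reaches full control. Every bug name and access level here is invented for illustration. This is a sketch of the concept, not a description of how Mythos works.

```python
from collections import deque


def find_chain(bugs, start, goal):
    """Breadth-first search for the shortest bug sequence from start to goal.

    Each bug is a (name, access_before, access_after) tuple.
    Returns a list of bug names, or None if no chain exists.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, chain = queue.popleft()
        if state == goal:
            return chain
        for name, before, after in bugs:
            if before == state and after not in seen:
                seen.add(after)
                queue.append((after, chain + [name]))
    return None


# Hypothetical bugs: none is critical alone, each only moves one step.
bugs = [
    ("info-leak", "network", "knows-layout"),
    ("parser-overflow", "knows-layout", "user-code-exec"),
    ("sandbox-escape", "user-code-exec", "local-user"),
    ("kernel-race", "local-user", "root"),
]

print(find_chain(bugs, "network", "root"))
# prints ['info-leak', 'parser-overflow', 'sandbox-escape', 'kernel-race']
```

Remove any one link and the search returns None, which is why patching a single "low severity" bug can break an entire chain.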
The market reaction was immediate: a sell-off in software and cybersecurity stocks, Bitcoin sliding to around $66,000. Japanese media treated the news as a national security issue.
Simon Willison, a widely respected developer, commented: "I think the security risks are real. The extra time for trusted teams to gain a defensive advantage is a reasonable trade-off."
The "hype" argument
And now the uncomfortable part.
Anthropic just wrote the most effective PR piece in the tech industry. Think about it: you have a model you cannot sell to the public yet (probably for cost and inference-efficiency reasons), but you want to position it above every competitor. What do you do?
You call it "too dangerous to release."
Project Glasswing puts Anthropic at the same table as Apple, Google, Microsoft, Amazon, and NVIDIA — not as a vendor, but as a strategic partner in national security. That is a perception jump no traditional marketing campaign could have bought. One hundred million dollars in usage credits sounds impressive, but it is marketing: you hand out free access to a product you cannot sell on the open market yet, and in exchange you collect testimonials from the biggest companies in the world.
The fact that there are no independent benchmarks is convenient. "Dramatically higher" than Opus 4.6 can mean anything — from a genuine generational leap to a few percentage points on specific tests.
I am not ignoring the context either: Anthropic was recently labeled a "supply chain risk" by the Pentagon for refusing to allow Claude to be used in autonomous weapons targeting or mass surveillance. Project Glasswing is also a political message: "We are responsible, we are a trusted partner, we are not opposed to security; we define it."
What bothers me
What bothers me is the narrative inconsistency. The same company that accidentally dropped 3,000 internal files into a public cache, that published the complete source code of its flagship tool on NPM, and whose dependency (Axios) was compromised with malware is now asking us to take its word that it has built the most capable cybersecurity model in the world.
I'm not saying it isn't true. I'm saying that Anthropic's recent operational track record does not inspire the level of trust its narrative demands.
And I will add one more thing: in an industry where every lab calls every model "the most powerful yet" roughly every two months, the language of superlatives has lost its weight.
What actually matters
If the Mythos cybersecurity results are real — the 27-year-old OpenBSD vulnerability, the 16-year-old FFmpeg bug — then we have concrete proof that AI can do things even the best human teams have failed to do for decades. And that is remarkable, regardless of the marketing wrapper.
But it matters enormously who decides what gets done with these capabilities. Today, the decision belongs to Anthropic and a select group of corporate partners. Tomorrow, models with similar capabilities will be more widely available. The question is not if, but when.
Project Glasswing offers an answer: imperfect, but pragmatic. You give defenders a time advantage. That is better than nothing. But it is not a solution; it is a band-aid on a problem that is only just beginning.
Disclaimer: I believe AI is the future, and that today's problems (security, trust, governance) will be solved over time. The tone of this article is not anti-AI; it is anti-uncontested-narrative. An engineer's job is not to applaud, but to verify.