The Safety Company Keeps Leaking
I use Claude Code every day. Anthropic builds genuinely good products, and its safety research is serious work. That's what makes this pattern worth paying attention to.
In 2026 alone: Claude Code's source code leaked for the second time, and the next model tier (apparently codenamed Mythos or Capybara, positioned above Opus) leaked through Anthropic's own CMS. Meanwhile, the company is signing AI safety partnerships with governments, including a formal arrangement with Australia, and positioning itself as the careful, responsible actor in a reckless industry.
Security incidents happen to everyone; sophisticated attackers, careless vendors, and unlucky timing don't discriminate by company values. But the gap between Anthropic's public positioning and its operational track record is instructive: if the company most publicly committed to careful AI deployment keeps accidentally exposing its own source code and product roadmap, it's worth asking what that says about the industry's readiness for the agent-everywhere future.
The hard part of AI safety isn’t writing the principles. It’s the boring, unglamorous work of securing systems against human error, insider risk, and infrastructure gaps — the kind of work that doesn’t make for compelling government briefings.
Every AI company is selling a future where autonomous agents operate inside your infrastructure. The safety-first company is still figuring out how to secure a CMS.