Claude Code's source code leaks and reveals KAIROS: the persistent agent that could operate in the background


The recent incident at Anthropic, in which part of the source code of its programming assistant Claude Code became publicly available, is a reminder that even the companies most focused on artificial intelligence are not immune to human errors with serious consequences. According to the company itself, the release resulted from a packaging failure and not from an external attack, and it says that no credentials or sensitive customer data were exposed. However, the scope of the leaked material and the speed with which it went viral made it clear that the technical and reputational risks are real and significant. Those who want to review media coverage and official statements can start with recognized sources such as CNBC and Fortune.

What was mistakenly published was an npm package that included a source map capable of revealing most of the source code: thousands of TypeScript files and hundreds of thousands of lines of code. The technical community quickly shared and cloned that material, and within hours there was a wave of forks, stars and public analyses. Those who want to browse the repositories can explore related search results on GitHub, and the social media conversation can be followed on X (formerly Twitter).
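
To illustrate why a single source map can expose so much, here is a minimal Python sketch. It assumes the map embeds the original files in its `sourcesContent` field, which the Source Map format allows; whether the leaked map was structured this way is not confirmed by the reports cited here. The function name `list_embedded_sources` and the sample data are hypothetical.

```python
import json


def list_embedded_sources(source_map_text: str) -> dict:
    """Return {source path: character count} for each file embedded in a source map."""
    source_map = json.loads(source_map_text)
    sources = source_map.get("sources", [])
    # "sourcesContent" is optional; entries may be null when a file is not embedded.
    contents = source_map.get("sourcesContent") or []
    embedded = {}
    for path, content in zip(sources, contents):
        if content is not None:
            embedded[path] = len(content)
    return embedded


# Hypothetical, hand-written source map used only for illustration.
example_map = json.dumps({
    "version": 3,
    "file": "cli.js",
    "sources": ["../src/main.ts", "../src/util.ts"],
    "sourcesContent": ["export const answer: number = 42;\n", None],
    "mappings": "",
})

print(list_embedded_sources(example_map))
```

Running a check like this on build artifacts before publishing is one way a team might notice that original sources are about to ship inside a package.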

Beyond morbid curiosity, the interest is not accidental: the leaked code offers a detailed map of Claude Code's internal design. Developers and competitors can study how Anthropic manages session memory to mitigate context limitations, how it orchestrates calls to models and auxiliary tools, and even how the company envisions persistent agents that perform tasks in the background. Some fragments, already publicly described by people who examined the files, mention modules for executing bash commands, bidirectional communication with IDE extensions, and mechanisms to launch "sub-agents" that collaborate on complex tasks.

Among the most striking pieces is a feature that, going by its name in the files, has been described as KAIROS: a persistent agent capable of acting proactively, automatically correcting errors and notifying users without direct intervention. In addition, references were found to a "dream" mode that would allow the system to keep generating ideas in the background. Also notable was a mode labeled Undercover Mode, designed so that the assistant could make discreet contributions to public repositories without revealing the company's internal information: the goal, according to the recovered text, was for commit messages and pull requests not to disclose their origin.

While some of these capabilities are ingenious from an automation standpoint, the public exposure of the design and of the data flows these components work with opens up attack vectors far more sophisticated than the usual trial-and-error jailbreak attempts. Security experts have warned that, with the code in plain sight, bad actors can study exactly how data is compacted and persisted in context management and, from there, design malicious payloads capable of surviving these processes and maintaining an unauthorized presence throughout long sessions. A technical discussion of these risks and examples of theoretical exploitation can be found in linked analyses and on the blogs of security companies.

The problem was not simply access to the code: the event was complicated because, according to public reports, there was a time window in which anyone who installed or updated the package from npm could have pulled in dependencies affected by a supply chain attack involving a malicious version of a popular HTTP library. In view of this, the immediate security recommendations were clear: roll back to known-safe versions, rotate secrets and check systems for indicators of compromise. For general guidance on how to manage supply chain incidents and protect dependencies, resources such as the CISA supply chain security guide and the GitHub and npm security documentation are good starting points: CISA - Supply Chain, GitHub - Code Security and npm Docs.
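
As a rough illustration of the "check what you actually installed" step, the following Python sketch scans the `packages` section of an npm `package-lock.json` (lockfile versions 2 and 3) for versions on a deny list. The package name `example-http-lib`, the version `9.9.9` and the function name `find_flagged_packages` are hypothetical placeholders; the affected library and versions should be taken from the official advisories, not from this example.

```python
import json


def find_flagged_packages(lockfile_text: str, flagged: dict) -> list:
    """Return (lock path, package name, version) for installed versions on the deny list.

    flagged maps a package name to the set of versions considered unsafe.
    """
    lock = json.loads(lockfile_text)
    hits = []
    for path, meta in lock.get("packages", {}).items():
        if not path:
            continue  # the empty key describes the root project itself
        # Nested installs look like "node_modules/a/node_modules/b"; keep the last name.
        name = path.split("node_modules/")[-1]
        version = meta.get("version")
        if version in flagged.get(name, set()):
            hits.append((path, name, version))
    return sorted(hits)


# Hypothetical lockfile fragment and deny list, for illustration only.
example_lock = json.dumps({
    "name": "my-app",
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "my-app", "version": "1.0.0"},
        "node_modules/example-http-lib": {"version": "9.9.9"},
        "node_modules/other-lib": {"version": "2.1.0"},
    },
})
deny_list = {"example-http-lib": {"9.9.9"}}

print(find_flagged_packages(example_lock, deny_list))
```

A hit only shows that a flagged version was resolved into the lockfile; it does not replace rotating secrets or checking hosts for indicators of compromise.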

Another immediate consequence was the malicious exploitation of the commotion: actors who publish packages with names almost identical to the internal ones (typosquatting) and wait for developers eager to compile the leaked code to install those bogus dependencies. In fact, packages were identified that had been published by users who copied internal names and that were, at first, empty stubs; the classic tactic is to accumulate downloads and then publish a malicious update that infects dependent projects, as in a dependency confusion attack. To understand the risk of this technique and the recommended defenses, see the technical documentation and security analyses on typosquatting and dependency confusion, for example on specialized blogs such as Snyk's: Snyk - Typosquatting and Dependency Confusion.
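
One simple defensive heuristic against lookalike names can be sketched in Python with the standard library's `difflib`. This is only an illustration under stated assumptions: the trusted names, the 0.85 similarity threshold and the function name `near_misses` are hypothetical, and real registry tooling uses more elaborate checks.

```python
from difflib import SequenceMatcher


def near_misses(candidate: str, trusted_names: list, threshold: float = 0.85) -> list:
    """Return trusted names that the candidate closely resembles without matching exactly."""
    if candidate in trusted_names:
        return []  # an exact match is the trusted package itself
    return [
        trusted
        for trusted in trusted_names
        if SequenceMatcher(None, candidate, trusted).ratio() >= threshold
    ]


# Hypothetical allow list of package names a team actually depends on.
trusted = ["acme-internal-utils", "left-pad"]

for name in ["acme-internal-utlis", "acme-internal-utils", "requests"]:
    suspects = near_misses(name, trusted)
    print(name, "->", suspects if suspects else "no lookalike found")
```

A name that is flagged here deserves a manual look before installation; a name that is not flagged is not thereby proven safe.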

Beyond the immediate technical damage, the episode raises ethical and strategic questions for the industry: what level of secrecy is acceptable when developing tools that can write and alter code in public repositories? How can innovation be balanced with technical controls that prevent abuse, such as the insertion of functions that seek to poison third-party training data if they detect output scraping? The leaked files showed signs of defenses designed to hinder the distillation of models by competitors, such as the injection of false tool definitions into responses, a controversial strategy that further complicates the debate on transparency and active defense.

For development teams using third-party programming assistants and packages, the lesson is twofold: on the one hand, maintain good hygiene around dependencies and secrets (lockfiles, package scanning, pinned-version policies and monitoring of registry changes); on the other hand, increase monitoring of the runtime behavior of automated tools that run commands or interact with the file system. The security best-practice guides of platforms such as npm and GitHub offer concrete steps to harden workflows and mitigate attacks originating in package ecosystems.
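
A pinned-version policy of the kind mentioned above can be checked mechanically. The Python sketch below flags entries in a `package.json` whose version specifier is a range or tag and not an exact version. The function name `find_unpinned`, the sample manifest and the choice to inspect only `dependencies` and `devDependencies` are assumptions made for illustration.

```python
import json
import re

# Matches an exact semantic version such as 1.2.3, 1.2.3-beta.1 or 1.2.3+build.5.
EXACT_VERSION = re.compile(r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$")


def find_unpinned(package_json_text: str) -> dict:
    """Return {package name: specifier} for dependencies not pinned to an exact version."""
    manifest = json.loads(package_json_text)
    unpinned = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT_VERSION.match(spec):
                unpinned[name] = spec
    return unpinned


# Hypothetical manifest, for illustration only.
example_manifest = json.dumps({
    "name": "my-app",
    "dependencies": {"example-http-lib": "^1.4.0", "other-lib": "2.1.0"},
    "devDependencies": {"some-linter": "latest"},
})

print(find_unpinned(example_manifest))
```

Pinning in the manifest complements a committed lockfile and does not replace it, since transitive dependencies are only fixed by the lockfile.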

Anthropic has already said that it will implement internal measures to prevent a packaging mistake from happening again. But the reality is that, in the modern software ecosystem, human failures and process errors turn very quickly into exploitation vectors. The community, infrastructure providers and regulators have to work in parallel: improve internal audits, provide early reporting channels and, above all, educate users to react quickly to possible compromises in dependencies. For sources that discuss incident management and good practice for packages, the official npm documentation and specialized security analyses are recommended reading: npm Blog and incident response resources from security firms.

In short, Anthropic's error is a useful reminder that software, however powerful, is not infallible and that operational security matters as much as the capability of the model. The world of artificial intelligence and collaborative software is moving fast; the question is whether security and responsibility can move at the same speed. Meanwhile, for those who use tools such as Claude Code, the immediate priority is to assume that the incident may have side effects and to take containment measures: review dependencies, rotate secrets and monitor repositories and systems for suspicious activity.
