RAMPART and Clarity redefine the safety of IA agents with reproducible testing and governance from the start

Microsoft has presented two open source tools, RAMPART and Clarity, aimed at changing the way the safety of IA agents is tested: one that automates and standardizes technical tests and the other that requires documentation and discussion of design decisions from the earliest stage. In the field of agents, which make decisions and use external tools, each access to data, APIs or documents extends the attack surface; therefore, it is not enough to audit at the end of the project: it is necessary to test adverse scenarios while the system can still be redesigned.

RAMPART is presented as an integrated test framework with Pytest, which facilitates its adoption in continuous integration pipelines. Its central idea is to allow engineers and security equipment to write cases that simulate from covert injections of instructions (cross- prompt injection) to data leaks and behavioral returns, and that these cases are enforceable and repeatable within the development cycle. To get real value out of RAMPART it is key to build adapters that connect the specific agent to the battery of tests, and to convert the findings into verifiable mitigations that are part of the code and the automatic tests.

RAMPART and Clarity redefine the safety of IA agents with reproducible testing and governance from the start — Image generated with IA.

Clarity works at another level: it does not run attacks, but acts as a structured thinking partner which requires you to record assumptions, explore alternatives and draw decisions before writing a single line. This conceptual "pressure-testing" exercise reduces the risk of introducing dangerous architectures - for example, providing unrestricted access to external tools or sources without controls - and produces live devices that equipment can review and update throughout the product's life cycle.

The combination of both tools reflects a higher trend in safety of IA: move security to the left, convert team findings into reproducible tests and keep track of why certain decisions were made. In the long term this facilitates traceability and accountability, but also challenges: effectiveness depends on the quality of the tests, the coverage of modeled threats and the discipline to keep these devices up to date with new vectors and models.

For teams that want to transform these ideas into practice I recommend to start by integrating early tests into the pipelines: use frames compatible with your CI / CD infrastructure, add adapters to your agents and convert safety tests into automated acceptance conditions. In parallel, use requirements and assumptions clarification exercises at the start of each project and preserve these decisions as live documentation accompanying the code. These practices should be complemented by governance on data access, comprehensive records of agent activities and real-time monitoring of unusual activity.

However, these tools are not a silver bullet. There are risks of false comfort if teams rely only on automated testing suites without qualified human validation, or if test cases do not reflect real threats or emerging model capabilities. It is essential to combine technical evidence with external adversarial reviews, threat analysis and organizational controls that include privileges minimization and incident response policies. The NIST AI Risk Management Framework ( https: / / www.nist.gov / artificially -intelligence / ai-risk-management-framework) and emerging safety standards for models and agents, for example the efforts of the OWASP community on LLM ( https: / / owasp.org / www-project-top-ten-for-large-language-models /).

If you decide to evaluate these capabilities, also consider the underlying test infrastructure: relying on standard testing tools such as Pytest facilitates integration ( https: / / docs.pytest.org / en / stable /), while adhering to AI and continuous audit practices helps to convert point findings into systematic mitigation ( https: / / www.microsoft.com / en-us / ai / responsible-ai). In short, RAMPART and Clarity are useful steps towards a more responsible development of IA agents, but their impact will depend on the quality of threat modelling, the discipline to incorporate them into daily development and their combination with governance and independent reviews.

Coverage

More news on the same subject.

18-year-old Ukrainian youth leads a network of infostealers that violated 28,000 accounts and left $250,000 in losses

May 20, 2026 4 min de lectura 13

Explore RadarBytes

RAMPART and Clarity redefine the safety of IA agents with reproducible testing and governance from the start

Disable your ad blocker

RAMPART and Clarity redefine the safety of IA agents with reproducible testing and governance from the start

Related

18-year-old Ukrainian youth leads a network of infostealers that violated 28,000 accounts and left $250,000 in losses

A single GitHub workflow token opened the door to the software supply chain

WebWorm 2025: the malware that is hidden in Discord and Microsoft Graphh to evade detection

Identity is no longer enough: continuous verification of the device for real-time security

The dark matter of identity is changing the rules of corporate security

PinTheft the public explosion that could give you root on Arch Linux

YellowKey The BitLocker failure that could allow an attacker to unlock your unit with only physical access

Manage your cookies