Bleeding Flame: the critical Ollama vulnerability (CVE-2026-7482) that exposes memory and secrets

Cybersecurity researchers have disclosed a critical vulnerability in Ollama, the framework for running large language models (LLMs) locally, that can expose the full memory of the server process and, with it, sensitive secrets. Tracked as CVE-2026-7482 and nicknamed "Bleeding Flame," the flaw is an out-of-bounds read in the GGUF model loader. It has received a CVSS score of 9.1, which points to a real and exploitable risk wherever instances are exposed.

In technical terms, the problem arises when the server accepts a malformed GGUF file through the model creation endpoint. While processing it, a function that relies on Go's unsafe package reads beyond the allocated buffer, which lets an attacker leak arbitrary content from the process memory. In practice, this can mean the disclosure of environment variables, API keys, system prompts and the conversations of concurrent users. The attacker can also turn that read into actual exfiltration by pushing the resulting artifact to a registry they control through the server's upload endpoint.
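The class of check that prevents this kind of out-of-bounds read can be sketched in a few lines. The example below is a minimal illustration in Python, not Ollama's actual loader: it assumes the fixed GGUF header layout (magic, version, tensor count, metadata count), and the names `parse_header`, `checked_slice` and `MalformedModel` are hypothetical. Python slicing would silently truncate rather than read neighboring memory, so the explicit range check stands in for what code using Go's unsafe package has to do by hand before touching a buffer.

```python
import struct

GGUF_MAGIC = b"GGUF"
# Little-endian fixed header: magic, version, tensor_count, metadata_kv_count.
HEADER = struct.Struct("<4sIQQ")


class MalformedModel(ValueError):
    """Raised when a declared size or offset does not fit inside the file."""


def parse_header(data: bytes) -> dict:
    """Parse the fixed header, rejecting files too short to contain it."""
    if len(data) < HEADER.size:
        raise MalformedModel("file is shorter than the fixed header")
    magic, version, tensor_count, kv_count = HEADER.unpack_from(data, 0)
    if magic != GGUF_MAGIC:
        raise MalformedModel("unexpected magic bytes")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


def checked_slice(data: bytes, offset: int, length: int) -> bytes:
    """Return data[offset:offset+length] only if the whole range is in bounds.

    The offset and length come from the untrusted file, so they are
    validated against the real buffer size before any read happens.
    """
    if offset < 0 or length < 0 or offset + length > len(data):
        raise MalformedModel(
            f"declared range [{offset}, {offset + length}) exceeds "
            f"buffer of {len(data)} bytes"
        )
    return data[offset:offset + length]
```

A file that declares a tensor or string larger than the bytes actually present is rejected with `MalformedModel` instead of being read past the end of the buffer.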


Ollama's size and importance as a local alternative to the cloud make this flaw particularly worrying: the project has a wide footprint among developers and organizations and, according to reports, the vulnerability could affect hundreds of thousands of servers. The official project repository can be consulted to confirm the versions and updates published by the developers: https://github.com/ollama/ollama. For the record and formal details of the CVE, see the entry in the National Vulnerability Database: https://nvd.nist.gov/vuln/detail/CVE-2026-7482.

The case is further complicated because, in parallel, researchers have found two flaws in the update mechanism of the Ollama application for Windows that, combined, allow persistent code execution at login. These vulnerabilities are the lack of signature verification of the update binary and a path traversal that can write executables to the Windows Startup folder if an attacker controls the update process. The result can be silent persistence and execution with the privileges of the user running Ollama.
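The path traversal half of that chain comes down to trusting a file name supplied by the update source. As a rough sketch of the usual defense, written in Python and not taken from the Ollama updater, the hypothetical helper `safe_join` below resolves the final path and refuses anything that lands outside the intended download directory.

```python
import os


def safe_join(base_dir: str, untrusted_name: str) -> str:
    """Join base_dir and an untrusted file name, refusing any escape.

    Both paths are resolved first, so "..", absolute paths and symlinks
    are taken into account before the containment check.
    """
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, untrusted_name))
    # commonpath equals base only when candidate is inside base (or is base).
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"refusing to write outside {base!r}: {untrusted_name!r}")
    return candidate
```

With this check, a name such as `../../Startup/evil.exe` raises `ValueError` instead of resolving to a location outside the update directory. It does not replace signature verification of the downloaded binary, which is the other missing control described above.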

What should administrators and users do now? First of all, apply the official patches and releases as soon as the project maintainers publish them. If no update is immediately available, consider disconnecting Ollama instances from public networks and audit every exposed REST endpoint. Protect the instances with an authentication proxy or an API gateway in front of the service, since the Ollama REST API does not include authentication by default. Limit network access to trusted IPs and subnets and place the machines behind a firewall. In Windows environments, while you assess or apply the patch, disable the client's automatic updates and remove any shortcuts from the user's Startup folder to prevent silent execution at login.
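The two access controls mentioned here, a network allowlist and an authentication layer in front of the API, can be sketched as the decision logic a reverse proxy would run before forwarding a request. This is a minimal Python illustration under stated assumptions: the trusted networks, the token value and the function names are all hypothetical, and a real deployment would load the secret from a secret store and terminate TLS in the proxy.

```python
import hmac
import ipaddress

# Hypothetical values for illustration only.
TRUSTED_NETWORKS = [
    ipaddress.ip_network(n) for n in ("127.0.0.0/8", "10.20.0.0/16")
]
EXPECTED_TOKEN = "change-me"


def is_trusted_ip(client_ip: str) -> bool:
    """True when the client address falls inside an allowlisted network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # not a valid IP literal
    return any(addr in net for net in TRUSTED_NETWORKS)


def has_valid_token(authorization_header, expected: str = EXPECTED_TOKEN) -> bool:
    """Check an 'Authorization: Bearer <token>' header in constant time."""
    scheme, _, token = (authorization_header or "").partition(" ")
    if scheme.lower() != "bearer":
        return False
    return hmac.compare_digest(token.strip().encode(), expected.encode())


def should_forward(client_ip: str, authorization_header) -> bool:
    """Forward to the backend only if both checks pass."""
    return is_trusted_ip(client_ip) and has_valid_token(authorization_header)
```

Requiring both conditions means that a leaked token is not enough from an untrusted network, and a trusted network position is not enough without the token.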

Do not overlook impact mitigation: rotate keys and any credentials potentially stored on the affected machines, review logs and uploaded artifacts (including models stored in registries) and look for unusual files in the Startup folder on Windows. Consider running Ollama in containers or least-privilege environments, and limit connections to other automated tools (for example, toolchain integrators) that can send sensitive data to the process and thus expand the attack surface.
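The Startup folder review can be partly automated. The following Python sketch lists recently modified executable-type files in a folder so they can be inspected by hand. The extension list, the 30-day window and the function names are assumptions chosen for illustration, and the default path is the usual per-user Startup location under %APPDATA%. A hit is only a candidate for review, not proof of compromise.

```python
import os
import time
from pathlib import Path

# File types that can run or launch code at login (illustrative list).
EXECUTABLE_SUFFIXES = {
    ".exe", ".dll", ".bat", ".cmd", ".ps1", ".vbs", ".js", ".lnk", ".scr",
}


def default_startup_dir() -> Path:
    """Usual per-user Startup folder on Windows, derived from %APPDATA%."""
    appdata = os.environ.get("APPDATA", "")
    return Path(appdata) / "Microsoft" / "Windows" / "Start Menu" / "Programs" / "Startup"


def recent_executables(folder, max_age_days: int = 30, now=None) -> list:
    """Return executable-type files in folder modified within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    folder = Path(folder)
    if not folder.is_dir():
        return []
    hits = []
    for entry in sorted(folder.iterdir()):
        if (
            entry.is_file()
            and entry.suffix.lower() in EXECUTABLE_SUFFIXES
            and entry.stat().st_mtime >= cutoff
        ):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    for path in recent_executables(default_startup_dir()):
        print(path)
```

Run on an affected workstation, the script prints one path per line. Each entry should be matched against software the user knowingly installed.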

Finally, this incident highlights two broader trends. On the one hand, running LLMs locally reduces cloud dependence but increases responsibility for host security. On the other hand, the use of unsafe code paths within "safe by design" languages such as Go (for example, the unsafe package) can introduce critical vulnerabilities if strict controls are not applied. Organizations that depend on local model deployments should add specific security reviews for model payloads (GGUF or others) and actively monitor which services are exposed. Stay informed through the official project notices and CVE sources, and prioritize containment and auditing if you have network-accessible instances.
