Cybersecurity researchers have disclosed a critical security vulnerability in Ollama that, if successfully exploited, could allow a remote, unauthenticated attacker to leak the server's entire process memory.
The out-of-bounds read flaw, which potentially affects more than 300,000 servers globally, is tracked as CVE-2026-7482 (CVSS score: 9.1). It has been codenamed Bleeding Llama by Cyera.
Ollama is a popular open-source framework that allows large language models (LLMs) to run locally rather than in the cloud. On GitHub, the project has over 171,000 stars and has been forked over 16,100 times.
According to the description of the flaw at CVE.org, “Ollama before 0.17.1 contains a heap out-of-bounds read vulnerability in the GGUF model loader. The /api/create endpoint accepts an attacker-supplied GGUF file in which the declared tensor offset and size exceed the actual length of the file; during quantization in fs/ggml/gguf.go and server/quantization.go (WriteTo()), the server reads past the allocated heap buffer.”
GGUF, short for GPT-Generated Unified Format, is a file format used to store large language models so that they can be easily loaded and executed locally.
The problem, at its core, stems from Ollama's use of the unsafe package when creating a model from a GGUF file, in particular in a function called “WriteTo()”, which makes it possible to perform operations that bypass the memory safety guarantees of the programming language.
In a hypothetical attack scenario, a bad actor could send a specially crafted GGUF file to an exposed Ollama server, setting the tensor size to a very large number to trigger an out-of-bounds heap read during model creation through the /api/create endpoint. Successful exploitation of the vulnerability could leak sensitive data from the Ollama process memory.
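The missing validation can be illustrated with a short sketch. The function below is hypothetical and is not Ollama's code (Ollama is written in Go); it only shows the kind of bounds check that rejects a tensor whose declared offset and size extend past the end of the file. The names `offset`, `size`, and `file_len` are assumptions made for illustration.

```python
def tensor_in_bounds(offset: int, size: int, file_len: int) -> bool:
    """Return True only if the byte range [offset, offset + size) fits in the file.

    Illustrative names only; none of them come from Ollama's source.
    """
    if offset < 0 or size < 0 or file_len < 0:
        return False
    # Compare size against the space left after offset rather than computing
    # offset + size, which could overflow in a fixed-width integer language.
    return offset <= file_len and size <= file_len - offset


# A 1 KiB file: a tensor claiming 1 MiB at offset 512 must be rejected.
print(tensor_in_bounds(512, 1 << 20, 1024))  # False
print(tensor_in_bounds(512, 512, 1024))      # True
```

A loader that performs a check of this kind before reading tensor data would refuse the crafted file instead of reading beyond its buffer.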
This may include environment variables, API keys, system prompts, and conversation data from concurrent users. The data can then be exfiltrated by uploading the resulting model artifact via the /api/push endpoint to an attacker-controlled registry.
The exploitation chain unfolds in three steps:
- Upload a crafted GGUF file with an inflated tensor shape to a network-accessible Ollama server using an HTTP POST request.
- Use the /api/create endpoint to start model creation, which triggers the out-of-bounds read.
- Use the /api/push endpoint to send the leaked heap memory contents to an external server.
“An attacker can learn basically anything about your organization from your AI inference – API keys, proprietary code, customer contracts, and more,” said Cyera security researcher Dor Attias.
“On top of that, engineers often connect Ollama to tools like Claude Code. In those cases, the impact is even greater – all tool output flows to the Ollama server, is saved in the heap, and potentially ends up in the hands of an attacker.”
Users are advised to apply the latest fixes, limit network access, audit running instances for internet exposure, and isolate and secure them behind a firewall. It is also recommended to deploy an authentication proxy or API gateway in front of all Ollama instances, as the REST API does not provide authentication out of the box.
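As a rough aid for the exposure audit, the sketch below classifies a bind address as local-only or potentially reachable from the network. It is a minimal illustration under stated assumptions: the helper `is_loopback_only` is hypothetical, the addresses are assumed to be `host:port` strings collected however the operator inventories their instances, and the check says nothing about firewalls or proxies sitting in front of the server.

```python
import ipaddress


def is_loopback_only(bind_addr: str) -> bool:
    """Return True if a 'host:port' bind address is reachable only locally.

    Hypothetical helper for illustration; unknown host names are treated as
    exposed so that they get reviewed by a person.
    """
    host = bind_addr.strip()
    try:
        if host.startswith("["):          # bracketed IPv6, e.g. [::1]:11434
            host = host[1:host.index("]")]
        elif host.count(":") == 1:        # IPv4 or host name followed by a port
            host = host.rsplit(":", 1)[0]
        if host == "localhost":
            return True
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False


for addr in ["127.0.0.1:11434", "0.0.0.0:11434", "[::1]:11434"]:
    status = "local only" if is_loopback_only(addr) else "review: may be exposed"
    print(addr, "->", status)
```

Any address flagged for review would then need the network-level controls described above, since the API itself does not authenticate callers.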
Two Unpatched Flaws in Ollama Lead to Persistent Code Execution
The development comes after Striga researchers detailed two vulnerabilities in Ollama’s Windows update mechanism that could be chained to achieve persistent code execution. The flaws remain unpatched following disclosure on January 27, 2026, and have been made public now that the 90-day disclosure period has elapsed.
According to Striga co-founder Bartolomej “Bartek” Dmytruk, the Windows desktop client auto-starts at login from the Windows Startup folder, listens on 127.0.0[.]1:11434, and periodically polls for updates in the background via the /api/update endpoint, running any pending update on the next app start.
The identified vulnerabilities relate to a path traversal and a missing signature check, which, when combined with on-login routines, could allow an attacker with the ability to influence update responses to execute arbitrary code upon each login. The flaws are listed below –
- CVE-2026-42248 (CVSS score: 7.7) – A missing signature verification vulnerability in which the Windows updater does not verify the update binary before installation, unlike the macOS version.
- CVE-2026-42249 (CVSS score: 7.7) – A path traversal vulnerability that arises because the Windows updater builds the local path for the installer staging directory directly from HTTP response headers without sanitizing it.
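The path traversal issue can be illustrated with a small sketch of the sanitization step that is described as missing. This is not the Ollama updater's code; the function `safe_staging_path`, the staging directory, and the file names are all hypothetical, and the approach shown (accept only a bare file name, reject anything else) is one common defense rather than the project's actual fix.

```python
from pathlib import PureWindowsPath


def safe_staging_path(staging_dir: str, header_name: str) -> PureWindowsPath:
    """Build a path inside staging_dir from an untrusted file name.

    Hypothetical helper: header_name stands for a value taken from an HTTP
    response header. Anything other than a bare file name is rejected.
    """
    # .name keeps only the final component, dropping directories and drives.
    name = PureWindowsPath(header_name).name
    if name in ("", ".", "..") or name != header_name:
        raise ValueError(f"rejected untrusted file name: {header_name!r}")
    return PureWindowsPath(staging_dir) / name


# A bare file name is accepted and stays inside the staging directory.
print(safe_staging_path(r"C:\staging", "OllamaSetup.exe"))

# A traversal attempt is rejected instead of escaping the directory.
try:
    safe_staging_path(r"C:\staging", r"..\..\Startup\evil.exe")
except ValueError as err:
    print(err)
```

`PureWindowsPath` is used so the sketch applies Windows path rules on any operating system.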
To exploit the flaws, an attacker needs to control an update server that is reachable by the victim’s Ollama client. In that situation, an arbitrary executable can be supplied as part of the update process and written to the Windows Startup folder without any signature check.
One way to control the update response is to override OLLAMA_UPDATE_URL so that the client points to a local server over plain HTTP. The attack chain also assumes that AutoUpdateEnabled is on, which is the default setting.
Furthermore, the missing integrity check can lead to code execution on its own, without exploiting the path traversal vulnerability. In this case, the installer is dropped into the expected staging directory. During the next launch from the Startup folder, the update process is initiated without re-verifying the signature, causing the attacker’s code to be executed instead.
That said, code execution achieved this way is not persistent, as the next valid update overwrites the staged file. By adding the path traversal to the mix, a bad actor can redirect the executable to be written outside the normal path and achieve persistent code execution.
According to CERT Polska, which handled the coordinated disclosure process, Ollama for Windows versions 0.12.10 to 0.17.5 are vulnerable to the two flaws. In the interim, users are advised to turn off automatic updates and delete any existing Ollama shortcut from the Startup folder (“%APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup”) to disable the silent on-login execution path.
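For administrators who want to check the Startup folder programmatically, the following is a minimal sketch. The helper `find_ollama_startup_entries` is hypothetical, and matching on the substring "ollama" is an assumption about how the shortcut is named; it would not catch an attacker-dropped file with an unrelated name, so any unexpected executable in the folder deserves review.

```python
from pathlib import Path


def find_ollama_startup_entries(startup_dir: str) -> list[str]:
    """List file names in startup_dir that mention 'ollama' (case-insensitive).

    Hypothetical helper; returns an empty list if the folder does not exist.
    """
    folder = Path(startup_dir)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir() if "ollama" in p.name.lower())
```

On Windows, the folder path quoted above could be resolved with `os.path.expandvars(r"%APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup")` and passed to the function. The sketch only reports matches and deletes nothing.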
“Any Ollama for Windows installation running versions 0.12.10 to 0.22.0 is vulnerable,” Dmytruk said. “Path traversal writes executables chosen by the attacker to the Windows Startup folder. Missing signature verification keeps them there: post-write cleanup that would delete unsigned files on a working updater is a no-op on Windows. On the next login, Windows runs whatever was left behind.”
“The chain produces persistent, silent code execution at the privilege level of the user running Ollama. Realistic payloads include reverse shells, infostealers that harvest browser secrets and SSH keys, or droppers that lead to additional persistence mechanisms. Anything that runs as the current user. Deleting the dropped binary from the Startup folder eliminates persistence, but the underlying flaws remain.”