A newly disclosed vulnerability dubbed ‘LeftoverLocals’ affecting graphics processing units (GPUs) from AMD, Apple, Qualcomm, and Imagination Technologies allows data to be retrieved from the local memory space.
Tracked as CVE-2023-4969, the security issue enables data recovery from vulnerable GPUs, particularly in the context of large language models (LLMs) and machine learning (ML) workloads.
LeftoverLocals was discovered by Trail of Bits researchers Tyler Sorensen and Heidy Khlaaf, who reported it privately to the vendors before publishing a technical overview.
LeftoverLocals details
The security flaw stems from the fact that some GPU frameworks do not isolate memory completely, so one kernel running on the device can read values in local memory that were written by another kernel.
Trail of Bits researchers Tyler Sorensen and Heidy Khlaaf, who discovered and reported the vulnerability, explain that an adversary only needs to run a GPU compute application (e.g., OpenCL, Vulkan, Metal) to read data a user left in GPU local memory.
“Using these, the attacker can read data that the victim has left in the GPU local memory simply by writing a GPU kernel that dumps uninitialized local memory” – Trail of Bits
LeftoverLocals lets attackers launch a ‘listener’ – a GPU kernel that reads from uninitialized local memory and can dump the data to a persistent location, such as global memory.
If the local memory is not cleared, the attacker can use the listener to read values left behind by the ‘writer’ – a program that stores values in local memory.
The animation below shows how the writer and listener programs interact and how the latter can retrieve data from the former on affected GPUs.

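For readers who cannot view the animation, the minimal sketch below illustrates the same writer/listener interaction. It uses CUDA syntax purely for readability (CUDA’s __shared__ memory is the analogue of OpenCL local or Metal threadgroup memory; NVIDIA reports its GPUs are not affected), and the kernel and buffer names are illustrative, not taken from the researchers’ PoC.

#include <cstdio>
#include <cuda_runtime.h>

#define LM_WORDS 1024  // 32-bit words of on-chip local/shared memory to probe

// "Writer": a victim-style kernel that leaves intermediate values in on-chip
// shared memory (the CUDA analogue of GPU local memory) and never clears them.
__global__ void writer(unsigned int secret)
{
    __shared__ unsigned int scratch[LM_WORDS];
    scratch[threadIdx.x] = secret + threadIdx.x;  // "sensitive" intermediate data
    __syncthreads();
    // The kernel returns without zeroing scratch.
}

// "Listener": an attacker-style kernel that declares the same amount of shared
// memory, reads it WITHOUT initializing it, and dumps whatever it finds to a
// persistent location in global memory.
__global__ void listener(unsigned int *dump)
{
    __shared__ unsigned int scratch[LM_WORDS];
    dump[threadIdx.x] = scratch[threadIdx.x];  // deliberately uninitialized read
}

int main()
{
    unsigned int *dump = nullptr;
    cudaMallocManaged(&dump, LM_WORDS * sizeof(unsigned int));

    writer<<<1, LM_WORDS>>>(0xC0FFEE00u);  // victim leaves data behind
    cudaDeviceSynchronize();

    listener<<<1, LM_WORDS>>>(dump);       // attacker dumps leftover local memory
    cudaDeviceSynchronize();

    // On a vulnerable GPU the dump would contain the writer's leftover values;
    // on patched or unaffected hardware it is typically zeroed or undefined.
    printf("dump[0] = 0x%08x\n", dump[0]);

    cudaFree(dump);
    return 0;
}

On affected platforms the same pattern applies to any API that exposes local or threadgroup memory, which is why the researchers stress that the attack only requires the ability to run an ordinary GPU compute application.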
The recovered data can reveal sensitive information about the victim’s computations, including model inputs, outputs, weights, and intermediate computations.
In a multi-tenant GPU environment running LLMs, LeftoverLocals can be used to listen in on other users’ interactive sessions and recover from the GPU’s local memory the data left behind by the victim’s “writer” process.
The Trail of Bits researchers created a proof of concept (PoC) to demonstrate LeftoverLocals and showed that an adversary can recover about 5.5MB of data per GPU invocation, depending on the GPU framework.
On an AMD Radeon RX 7900 XT running the open-source LLM llama.cpp, an attacker can obtain as much as 181MB per query, which is sufficient to reconstruct the LLM’s responses with high accuracy.
Impact and remediation
Trail of Bits researchers discovered CVE-2023-4969 in September 2023 and notified CERT/CC to help coordinate the disclosure and patching efforts.
Mitigation efforts are underway: some vendors have already fixed the issue, while others are still working to develop and roll out a defense mechanism.
In Apple’s case, the latest iPhone 15 is unaffected and fixes became available for A17 and M3 processors, but the issue persists on M2-powered computers.
AMD said that a number of its GPU models remain vulnerable while its engineers investigate effective mitigation strategies.
Qualcomm has released a patch through firmware v2.0.7 that fixes LeftoverLocals on some chips, but others remain vulnerable.
Imagination released a fix in DDK v23.3 in December 2023. However, Google warned in January 2024 that some of the vendor’s GPUs are still impacted.
Intel, NVIDIA, and ARM have reported that the data leak problem does not affect their devices.
Trail of Bits recommends that GPU vendors implement an automatic local memory clearing mechanism between kernel calls, ensuring that sensitive data written by one process stays isolated.
While this approach might introduce some performance overhead, the researchers argue the trade-off is justified given the severity of the security implications.
Other potential mitigations include avoiding multi-tenant GPU environments in security-critical scenarios and implementing user-level mitigations.
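As an example of such a user-level mitigation, a developer can scrub local memory before each kernel returns, so nothing sensitive is left for a later “listener” to dump. The sketch below shows the idea under the same assumptions as the earlier example (CUDA syntax and illustrative names, with __shared__ standing in for local/threadgroup memory):

#include <cuda_runtime.h>

#define LM_WORDS 1024  // launch with one block of LM_WORDS threads

// A compute kernel that uses on-chip shared memory for intermediate values and
// zeroes it before returning, so no sensitive data is left behind on hardware
// that does not clear local memory between kernel calls.
__global__ void compute_and_scrub(const float *in, float *out)
{
    __shared__ float scratch[LM_WORDS];

    // Normal work: stage inputs in local memory and combine neighboring values.
    scratch[threadIdx.x] = in[threadIdx.x];
    __syncthreads();
    out[threadIdx.x] = scratch[threadIdx.x] + scratch[(threadIdx.x + 1) % LM_WORDS];

    // User-level mitigation: wait until every thread has finished reading, then
    // overwrite the scratch space. A volatile pointer keeps the compiler from
    // discarding the final store as dead code.
    __syncthreads();
    volatile float *vscratch = scratch;
    vscratch[threadIdx.x] = 0.0f;
}

This mirrors, at the application level and at the cost of a little extra work per launch, the automatic clearing between kernel calls that Trail of Bits wants vendors to perform in their drivers.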
