Here We Go Again: Intel “Downfall” Vulnerability Affects Billions of Processors Made Since 2014

johnwalker · 11 August 2023 16:07

This Patch Tuesday brings a new and potentially painful processor speculative execution vulnerability: Downfall, or as Intel prefers to call it, GDS (Gather Data Sampling). GDS/Downfall affects the gather instructions on AVX2- and AVX-512-enabled processors. The latest-generation Intel CPUs are not affected, but Tiger Lake / Ice Lake back to Skylake are confirmed to be impacted. A microcode mitigation is available, but it will be costly for AVX2/AVX-512 workloads with gather instructions in hot code paths, which means widespread software exposure, particularly for HPC and other compute-intensive workloads that have relied on AVX2/AVX-512 for better performance.

Downfall is characterized as a vulnerability due to a memory optimization feature that unintentionally reveals internal hardware registers to software. With Downfall, untrusted software can access data stored by other programs that typically should be off-limits: the AVX GATHER instruction can leak the contents of the internal vector register file during speculative execution. Downfall was discovered by security researcher Daniel Moghimi of Google. Moghimi has written demo code for Downfall to show 128-bit and 256-bit AES keys being stolen from other users on the local system as well as the ability to steal arbitrary data from the Linux kernel.
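For readers unfamiliar with the instruction at the centre of this, here is a minimal scalar sketch in Python of what a vector gather does architecturally: each lane loads from its own index into memory. The function name `gather` and the sample data are purely illustrative; this models only the visible semantics and says nothing about the speculative leak itself.

```python
def gather(memory, base, indices, mask=None):
    """Scalar model of a vector gather: lane i loads memory[base + indices[i]].

    Lanes whose mask entry is False are skipped and yield None in this model
    (real hardware would leave the destination lane unchanged instead).
    """
    if mask is None:
        mask = [True] * len(indices)
    return [memory[base + i] if m else None for i, m in zip(indices, mask)]


memory = list(range(100, 132))          # stand-in for a region of memory
print(gather(memory, 4, [0, 3, 9, 1]))  # [104, 107, 113, 105]
```

The point is that one instruction performs several scattered loads at once, which is why it is attractive in vectorised code and why slowing it down hurts such code.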

Processors from Skylake through Tiger Lake on the client side, and through Xeon Scalable Ice Lake on the server side, are confirmed to be affected. At least the latest Intel Alder Lake / Raptor Lake and Intel Xeon Scalable Sapphire Rapids are not vulnerable to Downfall. But for all the affected generations, CPU microcode is being released today to address this issue.

Intel acknowledges that its microcode mitigation for Downfall has the potential to impact performance where gather instructions are in an application's hot path. Given the AVX2/AVX-512 involvement, vectorization-heavy HPC workloads are likely to be most impacted, but we've also seen a lot of AVX use in video encoding/transcoding, AI, and other areas. Intel has not relayed any estimated performance impact from this mitigation. Well, not to the press. To other partners Intel has reportedly communicated a performance impact of up to 50% for workloads that make heavy use of gather instructions as part of AVX2/AVX-512. Intel is being quite proactive in letting customers know they can disable the microcode change if they feel they will not be affected by Downfall. Intel also believes pulling off a Downfall attack in the real world would be a very difficult undertaking. However, those matters are subject to debate.
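On Linux, kernels that carry the GDS patches report the mitigation state through a sysfs file, and the mitigation can be turned off with the `gather_data_sampling=off` boot parameter. The following is a minimal sketch for checking that state; the helper names `gds_status` and `classify` are my own, and on older kernels or other operating systems the file will simply be absent.

```python
from pathlib import Path

# sysfs entry added by the Linux GDS patches; absent on kernels without them.
GDS_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/gather_data_sampling")


def gds_status(path=GDS_SYSFS):
    """Return the kernel's GDS status string, or None if it is unavailable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def classify(status):
    """Reduce the kernel's status string to a coarse category."""
    if status is None:
        return "unknown"
    for prefix, label in (("Not affected", "not affected"),
                          ("Mitigation", "mitigated"),
                          ("Vulnerable", "vulnerable")):
        if status.startswith(prefix):
            return label
    return "unknown"


status = gds_status()
print(status, "->", classify(status))
```

A result of "unknown" here only means this script could not tell; it is not evidence that the machine is safe.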

The discoverer of the vulnerability, Daniel Moghimi, who reported the problem to Intel in August 2022, has put up a Web site devoted to the flaw, downfall.page.

He notes, in response to Intel’s pooh-poohing the severity of the risk,

GDS [Gather Data Sampling] is highly practical. It took me 2 weeks to develop an end-to-end attack stealing encryption keys from OpenSSL. It only requires the attacker and victim to share the same physical processor core, which frequently happens on modern-day computers, implementing preemptive multitasking and simultaneous multithreading.

As to the cost of Intel’s microcode patch to workloads that use the vector gather facility heavily:

This depends on whether Gather is in the critical execution path of a program. According to Intel, some workloads may experience up to 50% overhead.

Here is the research paper describing the exploit, “Downfall: Exploiting Speculative Data Gathering” [PDF]. This is the abstract.

We introduce Downfall attacks, new transient execution attacks that undermine the security of computers running everywhere across the internet. We exploit the gather instruction on high-performance x86 CPUs to leak data across boundaries of user-kernel, processes, virtual machines, and trusted execution environments. We also develop practical and end-to-end attacks to steal cryptographic keys, program’s runtime data, and even data at rest (arbitrary data). Our findings, exploitation techniques, and demonstrated attacks defeat all previous defenses, calling for critical hardware fixes and security updates for widely-used client and server computers.

Source code of the proof of concept of the exploit is posted on GitHub.

This is a silent video demonstrating the exploit in action.

gms · 11 August 2023 20:20

This reads like a massive vulnerability in terms of potential impact. The timing of the fix was clearly chosen to come right after Intel’s Q2 earnings announcement at the end of July.

Sobering points for me in John’s summary:

Can steal SSL keys
The fix comes with a potential overhead of up to 50% in certain use cases

This is not over yet in terms of implications. Expect class action lawsuits to start, if they have not already.

johnwalker · 11 August 2023 21:34

The severity of the problem and consequences of the fix depend upon the environment in which the CPU is used. For some users it may not be a substantial risk or one worth mitigating in ways other than the microcode fix.

First of all, the flaw allows leakage of information between two processes running on the same core of the same physical CPU. In environments where there is no possibility of a hostile program running simultaneously on the CPU (for example in a personal computer which is only used by one person), an attacker does not have an opportunity to run an exploit program simultaneously with the targeted user. The user may be concerned about software running on the computer having been corrupted, but if this is the case and the corrupted software runs under the user’s permissions, there are many easier ways for it to steal information than via this vulnerability.

The problem mainly affects CPUs whose workload runs programs for multiple users simultaneously on the same processor, as is the case for a cloud services provider whose customers do not receive a dedicated CPU but rather share a single processor. In this case, the microcode fix is required to prevent a hostile program running on one user’s account from stealing data from another. But since the vulnerability only affects the Gather instruction of the vector processing instruction set of the x86 CPU, the overhead imposed by the fix will be negligible unless the workload relies heavily on this feature, which is usually the case only for specialised applications such as machine learning, high performance graphics rendering (CGI), and some scientific calculations. Customers with such workloads may prefer to pay for a dedicated CPU and run them without the microcode fix, but they represent a small minority of cloud customers.
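To put rough numbers on "negligible unless the workload relies heavily on this feature", here is a back-of-the-envelope sketch in Python. It assumes a simple Amdahl-style model in which only the gather-bound fraction of run time slows down; the fractions and the 2× slowdown factor are hypothetical inputs chosen for illustration, not measurements.

```python
def mitigation_overhead(gather_fraction, gather_slowdown):
    """Fractional increase in total run time under a simple Amdahl-style model.

    gather_fraction: share of unmitigated run time spent in gather-bound code.
    gather_slowdown: factor by which that share slows down once mitigated.
    Both values are hypothetical inputs, not figures published by Intel.
    """
    if not 0.0 <= gather_fraction <= 1.0:
        raise ValueError("gather_fraction must be between 0 and 1")
    # New run time is (1 - f) + f * s, so the increase over 1.0 is f * (s - 1).
    return gather_fraction * (gather_slowdown - 1.0)


# A typical server workload that barely touches gather:
print(f"{mitigation_overhead(0.02, 2.0):.1%}")  # 2.0%
# A vector-heavy kernel that spends half its time in gather-bound loops:
print(f"{mitigation_overhead(0.50, 2.0):.1%}")  # 50.0%
```

Under this model the same microcode change costs almost nothing for the first workload and a great deal for the second, which is why the decision depends so much on what the machine actually runs.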

The affected vector instructions are heavily used in many scientific supercomputer calculations. But supercomputers are often dedicated to one task at a time, and hence there is no threat of leakage to a hostile program running concurrently.

It would be far better, of course, if users and system managers did not have to worry about such matters resulting from a flawed design by Intel, but in practice, in many cases, it isn’t as bad as some of the media coverage implies.