Microsoft’s compiler-level Spectre fix shows how hard this problem will be to solve

The Meltdown and Spectre attacks that use processor speculative execution to leak sensitive information have resulted in a wide range of software changes to try to limit the scope for harm. Many of these are operating system-level fixes, some of which depend on processor microcode updates.

But Spectre isn't a simple attack to solve; operating system changes help a great deal, but application-level changes are also needed. Apple has talked about some of the updates it has made to the WebKit rendering engine, used in its Safari browser, but this is only a single application.

Microsoft is offering a compiler-level change for Spectre. The "Spectre" label actually covers two different attacks. The one that Microsoft's compiler is addressing, known as "variant 1," concerns checking the size of an array: before accessing the Nth element of an array, code should check that the array has at least N elements in it. Programmers using languages like C and C++ often have to write these checks explicitly. Other languages, like JavaScript and Java, perform them automatically. Either way, the test has to be done; attempts to access array members that don't exist are a whole class of bugs all on their own.

Speculative information leaks

The Spectre problem is that the processor doesn't always wait to see if the Nth element exists before it tries to access it. It can speculatively try to access the Nth element while waiting for the check to finish. This access is still "safe," insofar as it doesn't introduce any programming bugs. But as security researchers found, it can leak information. The processor will try to load the Nth element (regardless of what it is, or if it even exists), and this can change the data stored in the processor's cache. The change can be detected and can be used to leak secret information.

This kind of speculative execution is important in modern processors. Current Intel chips can run just shy of 200 instructions speculatively. They'll essentially guess how comparisons will be evaluated and what the path through the code will be, rolling back everything that they guessed if those guesses turn out to be wrong. This is all meant to be transparent to running programs; the security issue arises because it isn't, thanks to those cache changes.

The generally agreed-on fix for this problem is to make the processor wait: to tell the processor not to access the array until the test to see if the element exists has completed. The difficulty with this—and the reason that Microsoft is investigating a compiler-level change—is that identifying exactly which accesses are risky and ensuring that they're fixed requires careful line-by-line inspection of program source code. The goal of Microsoft's change is to avoid that and insert instructions to make the processor stop speculating in the right places automatically.

As a confounding issue, there aren't really any truly good ways of making the processor stop speculating and wait. Because the speculative execution is designed to be transparent, a hidden implementation detail of the way the processor works, processors don't give a lot of control over how it works. There's no explicit instruction—for now—that tells the processor "don't speculate beyond this instruction" without having any other effects.

What we do have, however, are instructions that just happen to act as a speculation block. The oldest and most widely used of these is an instruction called cpuid. cpuid actually has nothing to do with speculation. The processor contains various tables of information describing, for example, which extensions it supports (things like SSE, AVX, 64-bit, and so on) or what its cache topology looks like, and cpuid is used to read those tables.

cpuid is a very slow instruction (it takes hundreds of cycles to run), and it has always had an unusual extra property apart from reading the processor's data tables: it's documented as a "serializing instruction" that acts as a block to speculative execution. Any instruction before a cpuid must be completely executed before the cpuid starts running, and no instruction following a cpuid can start to run until the cpuid is finished.
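For a sense of what cpuid is ordinarily used for, here is a minimal sketch that reads the processor's vendor string. It assumes GCC or Clang on x86, whose <cpuid.h> header provides the __get_cpuid helper; the cpu_vendor function and the HAVE_CPUID macro are names invented for this illustration. Note that the instruction writes its results into four registers, which is part of what makes it a clumsy choice when all you want is the serializing side effect.

```c
#include <string.h>

/* Assumption: GCC or Clang targeting x86, where <cpuid.h> supplies __get_cpuid. */
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#define HAVE_CPUID 1
#else
#define HAVE_CPUID 0
#endif

/* Hypothetical helper: copies the 12-character vendor string
 * (for example "GenuineIntel") into out, which must hold 13 bytes.
 * Returns 1 on success, or 0 if cpuid is unavailable in this build. */
static int cpu_vendor(char out[13]) {
    out[0] = '\0';
#if HAVE_CPUID
    unsigned int eax, ebx, ecx, edx;
    /* Leaf 0 returns the vendor string spread across ebx, edx, ecx,
     * in that order. All four registers are overwritten. */
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;
    memcpy(out + 0, &ebx, 4);
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
    return 1;
#else
    return 0;
#endif
}
```

Calling cpu_vendor with a 13-byte buffer fills it with the vendor name on x86 builds; on other targets the sketch simply reports failure and leaves an empty string.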

In response to Spectre, ARM is introducing an instruction called CSDB, the sole purpose of which is to be a speculative execution barrier. But on x86, at least for the time being, no such instructions are planned; we have instead only instructions like cpuid, where the speculation blocking is a side effect.

Here’s what it’s meant to do

Microsoft's new compiler feature will insert an instruction to block speculation in code that the compiler detects as being vulnerable to Spectre. Specifically, whenever code like this is detected:

if(untrusted_index < array1_length) {
    // speculative access to an array
    unsigned char value = array1[untrusted_index];
    // use that speculatively-accessed data
    // in such a way as to disturb the cache
    unsigned char value2 = array2[value * 64];
}

It's transformed into something closer to this:

if(untrusted_index < array1_length) {
    // make sure that the processor knows the result
    // of the comparison
    speculation_barrier();
    // this access is no longer speculative
    unsigned char value = array1[untrusted_index];
    unsigned char value2 = array2[value * 64];
}

Microsoft's chosen instruction to block speculation is called lfence, which means "load fence." lfence is another instruction that doesn't really have anything to do with speculative execution. In principle, it is used to ensure that the processor has finished all outstanding attempts to load data from memory before it starts any new load attempts. The exact value of this instruction in x86 isn't entirely clear (x86 already has strict rules about the order in which it attempts to perform loads from memory), but with the discovery of Spectre, lfence has taken on a new role: it, too, is a speculation block.
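Written by hand, the speculation_barrier() placeholder from the example above might look something like the following. This is a sketch, not the code Microsoft's compiler emits: it assumes a 64-bit x86 target, where the _mm_lfence intrinsic is available through <immintrin.h>, and read_checked is a hypothetical wrapper added purely for illustration. On any other architecture the barrier here is an empty stub that offers no protection.

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
/* Emit an lfence: later instructions wait until it has completed. */
static inline void speculation_barrier(void) { _mm_lfence(); }
#else
/* Assumption: placeholder only. This stub does NOT block speculation. */
static inline void speculation_barrier(void) { }
#endif

/* Hypothetical wrapper: returns array[index], or -1 if index is out of bounds. */
static int read_checked(const unsigned char *array, size_t length, size_t index) {
    if (index < length) {
        /* Make the processor resolve the comparison before it loads. */
        speculation_barrier();
        return array[index];
    }
    return -1;
}
```

The visible behavior of read_checked is the same with or without the barrier; the only difference is whether the load can be issued before the bounds check has been resolved.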

lfence is more convenient than cpuid because it doesn't alter any registers, but lfence's use as a speculation block is slightly awkward. For its processors, Intel has always documented lfence as having semi-serializing behavior. In principle, instructions performing stores could be reordered and executed speculatively, but those depending on loads could not. As with cpuid, this was largely a side effect and not the explicit purpose of the instruction. For AMD, however, lfence hasn't always been serializing. It is on some AMD architectures; it isn't on others. That variation was permissible because speculative execution behavior has always been treated as an implementation detail, not as a documented part of the architecture. The behavior in terms of the processor architecture is the same whether lfence serializes or not.

As part of their responses to Spectre, Intel and AMD have both changed lfence. For Intel, the change appears to be purely one of documentation: lfence is now a full serializing instruction, but that appears not to have required any hardware changes, so it seems that the instruction always had this behavior in Intel's implementations. AMD has adopted Intel's convention; going forward, lfence will always be a serializing instruction that blocks speculative execution. For existing processors, AMD says that an MSR (a "model specific register," a special vendor and model-specific processor register that can be used to apply low-level configuration) can be used to change non-serializing lfence into serializing lfence. Only operating systems (and virtualization hypervisors) are able to change MSRs, so operating system updates will be needed to ensure that this is enabled.

Update: A previous version of this article said that some AMD processors would need a microcode update to enable this serializing lfence behavior. That turns out to not be the case; while some AMD processors do not include the MSR, AMD says that those processors already have the serializing behavior anyway.

In the future, lfence will likely be x86's closest equivalent to ARM's CSDB.

The Microsoft compiler change injects the lfence instructions at the correct spot to prevent Spectre attacks on this kind of code. Microsoft's compiler does work, and the code change is effective. But this is where things get tricky.

A complex problem to solve

Speculative execution is important. We want our processors to execute speculatively almost all of the time, because their performance depends on it. As such, we don't want an lfence inserted every single time an array is accessed. As an example, lots of programs do something like this:

for(size_t i = 0; i < array.size(); ++i) {
    unsigned char value = array[i];
}

This kind of code, which accesses every element of the array in order, is always going to be safe; the program simply has no way of generating a value of i that's larger than the array's size. It doesn't need lfence instructions. Accordingly, Microsoft's compiler doesn't just blindly insert lfence instructions every single time. Most of the time, in fact, it doesn't add them. Instead, it uses unspecified heuristics to determine where they should be inserted.

This approach preserves performance, but unfortunately, Microsoft's heuristics are tightly constrained. They detect some Spectre-vulnerable code patterns, but not all of them. Even small changes to a vulnerable piece of code can defeat Microsoft's heuristics—the code will be vulnerable to Spectre, but the compiler won't add lfence instructions to protect it.

Paul Kocher, one of the researchers who wrote the Spectre paper, has taken a closer look at what the compiler is doing. He has discovered that Microsoft's Spectre mitigation is much narrower than one might expect from reading the company's description of it. Code has to follow the vulnerable structure very closely if it's to get the lfence inserted. If it deviates a little (for example, if the test of the array index is in one function, but the actual array access is in another function), then the compiler assumes the code to be not vulnerable. So while Microsoft's change does indeed protect code from the exact Spectre attack outlined in the original paper, its protection is narrow.
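To make that deviation concrete, here is a sketch of the same vulnerable pattern with the bounds test moved into its own function. It is a hypothetical illustration of the structure described, not one of Kocher's actual test cases, and it makes no claim about how any particular compiler version treats this exact code. The array names follow the earlier example; index_in_bounds and lookup are invented for the sketch.

```c
#include <stddef.h>

static unsigned char array1[16] = {1, 2, 3, 4};  /* remaining elements are zero */
static size_t array1_length = 16;
static unsigned char array2[256 * 64];

/* The bounds test lives in one function... */
static int index_in_bounds(size_t untrusted_index) {
    return untrusted_index < array1_length;
}

/* ...and the accesses that depend on it live in another. */
static unsigned char lookup(size_t untrusted_index) {
    if (index_in_bounds(untrusted_index)) {
        /* May run speculatively before the test above has resolved. */
        unsigned char value = array1[untrusted_index];
        /* Value-dependent access: which cache line it touches encodes value. */
        volatile unsigned char sink = array2[value * 64];
        (void)sink;
        return value;
    }
    return 0;
}
```

Functionally this is identical to the single-function version: lookup(1) returns 2 and an out-of-range index returns 0. The speculative behavior is what matters, and a pattern matcher looking for a comparison immediately guarding an array access in the same function has less to go on here.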

This is a problem because it may well leave developers thinking that their code is safe—they built their code with Microsoft's Spectre protection turned on—when it's just as vulnerable as it always was. As Kocher writes, "Speculation barriers are only an effective defense if they are applied to all vulnerable code patterns in a process, so compiler-level mitigations need to instrument all potentially vulnerable code patterns." Microsoft's compiler change isn't doing that.

“No guarantee”

In fairness, Microsoft does warn that "there is no guarantee that all possible instances of variant 1 will be instrumented," but as Kocher's examination shows, it's not simply that some Spectre-vulnerable code will escape the compiler's fixes. Much—and perhaps even most—Spectre-vulnerable code will escape. And even if it were only a few instances, bad guys would be able to locate the unprotected routines and focus their attacks accordingly.

Fundamentally, the only code that needs lfence instructions is that where an attacker can control the array index being used. Without that control, an attacker can't influence which information is leaked by speculative execution. But detecting exactly which array accesses are derived from user input and which are not is far too complex for the compiler. In a language like C or C++, the only way to reliably make that determination is to run the program.

Kocher suggests that Microsoft should offer a more pessimistic mode that protects every conditional access. But this will come with a heavy cost: in sample code he wrote to compute SHA-256 hashes, the version with lfence instructions after every branch had only 40 percent of the performance of the unmodified version. This poses a security-performance trade-off that's decidedly uncomfortable; even if the compiler offered such an option, few people are likely to be willing to accept that kind of performance penalty in general. But for smaller pieces of code that are known to be at risk, such an option may be useful.

Microsoft's much more restricted protection does have the virtue of having much lower impact; the company says that it has built Windows with the Spectre protection and found no real performance regression.

The work done on the compiler and the limitations faced underscore what a complex problem Spectre poses for the computing industry. The processors are working as they're supposed to. We can't do without speculative execution of this kind—we need the performance it offers—but equally, we have no good way of systematically addressing the security concerns it creates. Compiler changes of the kind Microsoft has made are well-meaning, but as Kocher's investigation has shown, they're a long way short of offering a complete solution.
