This has been extremely worrying. What's more worrying is the number of 'security researchers' regurgitating Intel's bullshit verbatim. We have yet to fully see the fallout from this.
He's also dead right in that Intel has been mixing up the two issues, Meltdown and Spectre, deliberately, so they could tell everyone that it wasn't just Intel that was affected, and they also gave the impression that Spectre had been fixed when it was Meltdown that had been mitigated - with a patch that creates unacceptable performance problems, to a lesser or a greater extent.
Yes, all processor manufacturers are affected by Spectre, but it is Intel that is mostly affected because they implemented speculative loads badly without much attempt at segregation. They've also attempted to pass this off as 'historical architectural decisions we can do nothing about, but it is working as designed'.
He's also dead right in that Intel has been mixing up the two issues, Meltdown and Spectre, deliberately, so they could tell everyone that it wasn't just Intel that was affected, and they also gave the impression that Spectre had been fixed when it was Meltdown that had been mitigated - with a patch that creates unacceptable performance problems, to a lesser or a greater extent.
This, in spades. While Theo de Raadt is not my favorite IT personality, the mixing together of the issues (actually 3 of them!) has made it exceedingly hard for someone who isn't familiar with the inner workings of modern CPU architectures to get the story straight, and Mr. de Raadt gets kudos for calling them out on it.
The following is what I could infer from what I found online. I'm almost certain a good portion of it is WRONG, and I hope the more knowledgeable part of the /. crowd will help me out by correcting it. (No, I'm not being lazy - just stretched to the limit of my understanding of the primary sources [blogspot.be], yet desperate to gain some working understanding beyond the "it's hard to explain but you should apply patches" advice found everywhere on the internet.)
There are three separate but somewhat related issues:
Variant 3 is a true bug by any definition. It was named "Meltdown" and is an Intel exclusive - AMD and ARM are not affected. If an attacker succeeds in running a malicious binary on an affected system, they can read kernel memory, including juicy secrets like passwords and decryption keys. To put this into perspective, this is very nearly as bad as a local privilege escalation. And to put that into perspective, local privilege escalations are so common that there's a mantra in security: if a sufficiently skilled adversary gains "arbitrary code execution", it's virtually "game over" and you can go scrub your HDD. Nevertheless, the aforementioned "sufficiently skilled" bar lies quite high and may not be met by a lot of common threats (especially the automated ones). So, from a defense-in-depth perspective, the only sane advice is "patch your system now". The big news is that patching will come with a performance impact that is proportional to how frequently a process calls the kernel. A process that simply allocates a big chunk of memory, loads data into it, and starts chewing on that (think stuff like compression, crypto mining, scientific computation, ...) will not feel much impact, while databases generally will.
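To make the "proportional to how frequently a process calls the kernel" point concrete, here is a back-of-the-envelope sketch in Python. The function name, the syscall rates and the 500 ns extra cost per kernel entry are all made-up assumptions for illustration, not measurements of any real patch.

```python
def patch_overhead_fraction(syscalls_per_second: float, extra_ns_per_syscall: float) -> float:
    """Fraction of each CPU-second spent on the added kernel-entry cost."""
    extra_ns_per_second = syscalls_per_second * extra_ns_per_syscall
    return extra_ns_per_second / 1e9  # 1e9 ns in one second


# Hypothetical figures: neither the rates nor the 500 ns cost are measured values.
EXTRA_NS_PER_SYSCALL = 500.0
number_cruncher = patch_overhead_fraction(1_000, EXTRA_NS_PER_SYSCALL)
database = patch_overhead_fraction(400_000, EXTRA_NS_PER_SYSCALL)

print(f"number cruncher: {number_cruncher:.2%} of CPU time")
print(f"database:        {database:.2%} of CPU time")
```

With these assumed numbers the compute-bound process loses a negligible share of its time, while the syscall-heavy one loses a noticeable share. The real ratio depends entirely on the workload and the hardware.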
Variant 1, IF I understand correctly, allows an attacker to feed a non-buggy process carefully crafted input that tricks it into leaking data into memory space that is owned by the process in question, but not in use by it. The bad news here is that all CPUs (including AMD and ARM) are vulnerable and there's no way to patch it system-wide. One could argue that this is not a huge deal in and of itself because if the process and the system have no other bugs, the data could never be retrieved. However, it is apparently possible on certain browsers to make JavaScript read data from the "not-in-use" memory locations (which would be a feature for a "system" language like C, but I would classify it as a bug for a high-level interpreted language such as JavaScript). Given that a browser handles sensitive data (passwords), this is potentially devastating. Fortunately, it is easily mitigated by the fact that the leaked data doesn't live long by virtue of it physically only residing in the CPU cache and not the actual memory. The attack therefore relies on precise timing, and by decreasing the precision of the timing mechanisms that are available in JavaScript, browser manufacturers can put a stopgap into th
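The timer-precision stopgap described above can be sketched as a toy model in Python. The cache hit and miss latencies and the two timer resolutions are hypothetical values chosen for illustration, and `coarsen` / `measured_duration` are made-up helper names, not any browser's API.

```python
import math


def coarsen(timestamp_ns: float, resolution_ns: float) -> float:
    """Round a timestamp down to a multiple of the timer resolution."""
    return math.floor(timestamp_ns / resolution_ns) * resolution_ns


def measured_duration(start_ns: float, true_duration_ns: float, resolution_ns: float) -> float:
    """Duration as seen by code that can only read the coarsened clock."""
    end_ns = start_ns + true_duration_ns
    return coarsen(end_ns, resolution_ns) - coarsen(start_ns, resolution_ns)


# Hypothetical latencies, for illustration only.
CACHE_HIT_NS = 40.0
CACHE_MISS_NS = 200.0

# With a 1 ns clock the attacker can tell a hit from a miss.
fine_hit = measured_duration(0.0, CACHE_HIT_NS, 1.0)
fine_miss = measured_duration(0.0, CACHE_MISS_NS, 1.0)

# With a 20 microsecond clock both accesses land in the same tick.
coarse_hit = measured_duration(0.0, CACHE_HIT_NS, 20_000.0)
coarse_miss = measured_duration(0.0, CACHE_MISS_NS, 20_000.0)

print(fine_hit, fine_miss, coarse_hit, coarse_miss)
```

Even in this toy model, a measurement that happens to straddle a tick boundary still differs between the two cases, which fits the description of this as a stopgap rather than a fix.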
Thank you for noting that you're not 100% sure it's right, and for the excellent summary. There's a ton of misinformation going around, especially with 0100010001010011 dude on Slashdot repeatedly posting that Meltdown is INTEL ONLY, which is false, as some ARM products are affected. What is true is that Meltdown does not affect AMD and affects only a few of ARM's processors.
Here are two corrections to make:
1) Meltdown:
One of your bold statements "AMD and ARM are not affected" is untrue. See here, from ARM directly:
https://developer.arm.com/supp... [arm.com]
ARM has confirmed that A75 is vulnerable to Meltdown. In addition, A15, A57, and A72 are vulnerable to a variant of Meltdown (Variant 3a) which ARM has added. ARM has stated that they believe this variant is NOT exploitable, however, there is already userspace code out there that can do some limited exploits:
https://github.com/lgeek/spec_... [github.com]
2) Variant 1: While other vendors may require application changes to address this issue, AMD appears to be able to address this with an OS update, based on their post:
https://www.amd.com/en/corpora... [amd.com]
Summary:
Variant 1: Some manufacturers (ARM) appear to not be able to fix it and are recommending compiler changes, but AMD will fix this in OS updates. Unclear how Intel is addressing this vulnerability.
Variant 2: Correct, from what I can tell.
Variant 3 (Meltdown): Affects nearly all Intel (within the last 10 years) and ARM A75 chips. AMD not affected.
Variant 3a (Modified Meltdown): Affects a larger set of high performance ARM chips
Finally, Intel has done a terrible job (intentionally?) at conflating the two issues, which is unfair. These are 3 separate security issues, with their own priorities and impacts. If you read Intel's official press release for this issue, there's no differentiation between variants 1-3, like there is for AMD and ARM:
https://www.intel.com/content/... [intel.com]
Thanks a lot for the reviewing and corrections! I hope not only I but the /. community at large will get some benefit out of this exercise.
I guess I did get a bit lazy with verifying statements about ARM. Also, I'm surprised on a somewhat more fundamental level that it's so badly affected, so some confirmation bias might have been at play. If a lot of the same issues turn up in a completely different architecture, then Intel's "hard-to-foresee consequences of logical design decisions" attitude might have so
That's good to note. I wasn't aware that no A75 hardware has yet shipped.
That said, the design was finalized and released in May of last year... so something will have to be done by anyone implementing it if that hardware is running untrusted code.
Snapdragon 845 is scheduled to release this year. It will have the Meltdown flaw and it won't be Intel. I know you have an emotional desire to repeat that Meltdown is Intel only, but it's not. Grow up.
I skimmed both papers, and that seems to about sum it up. Though I would add that all three attacks cause speculative execution of a construction like "x = array[ *pointer ];" to push memory from an array into or out of cache based on the data loaded from the victim pointer. So combining the announcement does make some sense, as the details of any of those variants might point people to rediscovering the others.
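As a rough illustration of why a speculatively executed "x = array[ *pointer ];" leaks anything at all, here is a toy Python model. It is not an exploit: `ToyCache`, `speculative_gadget` and the 64-byte line size are invented for this sketch, and the "attacker" inspects the cache directly where a real one would have to infer the same thing from access timing.

```python
LINE_STRIDE = 64  # hypothetical cache-line size in bytes


class ToyCache:
    """Tracks which cache lines are present; nothing else is modelled."""

    def __init__(self) -> None:
        self.lines: set[int] = set()

    def flush(self) -> None:
        self.lines.clear()

    def touch(self, address: int) -> None:
        self.lines.add(address // LINE_STRIDE)

    def is_cached(self, address: int) -> bool:
        return (address // LINE_STRIDE) in self.lines


def speculative_gadget(cache: ToyCache, secret_byte: int) -> None:
    # Models the side effect of x = array[*pointer]: the result is thrown
    # away when the speculation is squashed, but the line stays cached.
    cache.touch(secret_byte * LINE_STRIDE)


def recover_byte(cache: ToyCache):
    # Probe one line per possible byte value; the single cached line
    # reveals the value that indexed the array.
    hits = [value for value in range(256) if cache.is_cached(value * LINE_STRIDE)]
    return hits[0] if len(hits) == 1 else None


cache = ToyCache()
cache.flush()
nothing_leaked = recover_byte(cache)  # no line cached yet
speculative_gadget(cache, ord("K"))
recovered = recover_byte(cache)
print(chr(recovered))
```

The point of the sketch is only that the secret ends up encoded in *which* line is cached, which is the common element across the variants mentioned above.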
I was impressed with the work put into variant 2. Tricking the CPU branch predictor into running ROP-like gadgets within a higher privileged process, then using cache access timing to work out what happened. It almost sounds like bad sci-fi dialog, yet they actually did it. And yes, the attack complexity sounds comparable to similar ROP stack smashing exploits.
Variant 2 is being patched in compilers. Both gcc and clang [llvm.org] are working on patches (that might already be released?) that avoid any speculative execution of indirect branching, using a trick documented by Google to patch the stack with the destination address, and then return. So now we just have to recompile *everything* that has access to privileged / sensitive memory contents to hopefully prevent attackers doing anything useful with branch poisoning. Of course there will be a performance hit, as no indirect branches can be correctly predicted.
Personally I would say that the problem with variant 2 is sharing the branch predictor between domains. Branches taken in one process influence how branches in other processes are predicted. I can understand that in a modern OS, multiple processes end up running the same library code, so this may have been a deliberate decision. But, if these tables were stored per-thread and context switched, this problem would probably have never been exploitable.
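The shared-versus-per-domain predictor argument can be sketched with two toy lookup tables in Python. Everything here is hypothetical (the class names, the addresses, the idea that a predictor is a plain dictionary). Real branch predictors are far more complicated, so treat this as an illustration of the argument, not of any actual CPU.

```python
class SharedPredictor:
    """Branch target table indexed by branch address only."""

    def __init__(self) -> None:
        self.table: dict[int, int] = {}

    def train(self, domain: str, branch_addr: int, target: int) -> None:
        self.table[branch_addr] = target  # the domain is ignored

    def predict(self, domain: str, branch_addr: int):
        return self.table.get(branch_addr)


class PerDomainPredictor:
    """Same table, but every entry is tagged with the owning domain."""

    def __init__(self) -> None:
        self.table: dict[tuple[str, int], int] = {}

    def train(self, domain: str, branch_addr: int, target: int) -> None:
        self.table[(domain, branch_addr)] = target

    def predict(self, domain: str, branch_addr: int):
        return self.table.get((domain, branch_addr))


def poison_and_predict(predictor):
    """Attacker trains a branch; return what the victim gets predicted."""
    branch_addr = 0x4000     # hypothetical address of the victim's indirect branch
    gadget_addr = 0xDEAD     # hypothetical address of the attacker's chosen gadget
    predictor.train("attacker", branch_addr, gadget_addr)
    return predictor.predict("victim", branch_addr)


shared_result = poison_and_predict(SharedPredictor())
isolated_result = poison_and_predict(PerDomainPredictor())
print(shared_result, isolated_result)
```

In the shared table the victim inherits the attacker's training, while in the tagged table the victim simply has no prediction for that branch.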
The Spectre paper did suggest that they had found some evidence of something like variant 2 on an AMD CPU. But I believe that the inner workings of AMD's branch predictor are not as easily deduced as Intel's. So the researchers took the easiest route and attacked 3 different Intel cores instead. That doesn't mean that nobody will ever work out how to pull off an attack though.
Variant 2 is being patched in compilers. Both gcc and clang [llvm.org] are working on patches (that might already be released?) that avoid any speculative execution of indirect branching, using a trick documented by Google to patch the stack with the destination address, and then return. So now we just have to recompile *everything* that has access to privileged / sensitive memory contents to hopefully prevent attackers doing anything useful with branch poisoning. Of course there will be a performance hit, as no indirect branches can be correctly predicted.
Browsing llvm/clang's mailing list, I also found that variant 1 is getting a compiler workaround [llvm.org]. Though this requires opt-in from each vulnerable call site.
Note that AMD is far from innocent in this respect - just remember how they skilfully downplayed their infamous TLB errata.
I might suspect that their Phenom TLB problem had a connection with not speculating invalid loads but apparently their earlier processors did not suffer from this problem either.
He and Linus are Spot On (Score:5, Insightful)
Re:He and Linus are Spot On (Score:5, Informative)
As you state, it's important to rely on the original sources. Here is each CPU vendor's response to the security issues:
https://www.amd.com/en/corpora... [amd.com]
https://www.intel.com/content/... [intel.com]
https://developer.arm.com/supp... [arm.com]
AMD is not affected by Meltdown, in any form. From AMD's press release:
https://www.amd.com/en/corpora... [amd.com]
Re: (Score:2)
Variant 2 is being patched in compilers. Both gcc and clang [llvm.org] are working on patches (that might already be released?) that avoid any speculative execution of indirect branching, using a trick documented by Google to patch the stack with the destination address, and then return. So now we just have to recompile *everything* that has access to privileged / sensitive memory contents to hopefully prevent attackers doing anything useful with branch poisoning. Of course there will be a performance hit, as no indirect branches can be correctly predicted.
Interesting! Thank you for the additional info.