Discussion:
[Cerowrt-devel] KASLR: Do we have to worry about other arches than x86?
Dave Taht
2018-01-01 23:08:48 UTC
or is this primarily a virtualization bug?

http://hn.premii.com/#/article/16046636

"Bad news: the software mitigation is expensive

The primary reason for the old Linux behaviour of mapping kernel
memory in the same page tables as user memory is so that when the
user’s code triggers a system call, fault, or an interrupt fires, it
is not necessary to change the virtual memory layout of the running
process.

Since it is unnecessary to change the virtual memory layout, it is
further unnecessary to flush highly performance-sensitive CPU caches
that are dependent on that layout, primarily the Translation Lookaside
Buffer.

With the page table splitting patches merged, it becomes necessary for
the kernel to flush these caches every time the kernel begins
executing, and every time user code resumes executing. For some
workloads, the effective total loss of the TLB lead around every
system call leads to highly visible slowdowns: @grsecurity measured a
simple case where Linux “du -s” suffered a 50% slowdown on a recent
AMD CPU."
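To get a rough feel for the per-syscall cost the quoted article describes, a minimal timing sketch follows. It assumes Python 3.7+ and uses os.getppid() as a stand-in for "a cheap system call"; the function name mean_syscall_ns is made up for illustration, and interpreter overhead is included in the figure, so only the difference between two runs on the same machine (with and without the page table splitting patches) means anything.

```python
import os
import time


def mean_syscall_ns(iterations: int = 200_000) -> float:
    """Return the mean wall-clock cost, in nanoseconds, of one cheap system call."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        os.getppid()  # trivial call that still enters and leaves the kernel
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


if __name__ == "__main__":
    print(f"~{mean_syscall_ns():.0f} ns per getppid() call (interpreter overhead included)")
```

A syscall-heavy workload such as "du -s" spends much of its time in this enter/leave path, which is why it shows the slowdown far more than plain computation would.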
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Jonathan Morton
2018-01-01 23:27:45 UTC
It looks nasty, but if it's a hardware bug then it's likely applicable only
to specific CPUs.

- Jonathan Morton
Jonathan Morton
2018-01-02 19:06:11 UTC
As I thought:

https://lkml.org/lkml/2017/12/27/2

"AMD processors are not subject to the types of attacks that the kernel
page table isolation feature protects against. The AMD microarchitecture
does not allow memory references, including speculative references, that
access higher privileged data when running in a lesser privileged mode
when that access would result in a page fault."

So it only affects *Intel* CPUs, though it's not yet clear to me how widespread the bug is in Intel-land. Therefore ARM, PPC, etc are unaffected, and AMD might just get even more of a leg up in the server biz than previously anticipated.

Reading between the lines, I get the definite impression that this is a hardware exploit which uses *speculative* memory accesses to perform Rowhammer attacks in privileged memory areas. So we probably shouldn't worry about it too much on consumer PCs or routers, even if they do use Intel x86 CPUs, except for the performance impact we might see where the mitigation is in place. The performance impact would primarily affect system calls and context switches, I think, with much less impact on general computation.

- Jonathan Morton
Jonathan Morton
2018-01-04 12:09:30 UTC
Okay, it's a little bit more nuanced than I thought. In fact there are *three* different CPU hardware vulnerabilities just disclosed. I've summarised the impact in this Reddit post:

https://www.reddit.com/r/Amd/comments/7o2i91/technical_analysis_of_spectre_meltdown/

The TL;DR version is:

- Spectre v1 affects pretty much any modern out-of-order CPU, but is relatively low impact. It could potentially be exploited using JIT compilation of untrusted eBPF or Javascript, but can only exfiltrate data from the local process.

- Spectre v2 affects most recent Intel CPUs and some recent, high-performance ARM CPU cores, but not AMD to any significant degree. On vulnerable CPUs, it allows a local attacker to exfiltrate data from privileged address space.

- Meltdown is the nasty one which Linux kernel devs have been scrambling to mitigate. So far, it is known to affect only Intel x86 CPUs, due to their unusually aggressive speculative behaviour regarding L1 cache hits. On vulnerable CPUs, it allows a local attacker to exfiltrate data from privileged address space.

I don't think we need to worry about it too much in a router context. Virtual server folks, OTOH...
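
For checking a given box against the three variants, here is a small sketch. It assumes a Linux kernel new enough to expose /sys/devices/system/cpu/vulnerabilities (kernels from before these patches will not have that directory), and the helper names read_vulnerabilities and is_affected are hypothetical. It only reports what the kernel itself claims.

```python
from pathlib import Path

# Directory where newer Linux kernels publish one status file per vulnerability.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_vulnerabilities(directory: Path = VULN_DIR) -> dict[str, str]:
    """Map each vulnerability name to the kernel's one-line status string."""
    if not directory.is_dir():
        return {}  # older kernel, or not Linux
    report = {}
    for entry in sorted(directory.iterdir()):
        if entry.is_file():
            report[entry.name] = entry.read_text().strip()
    return report


def is_affected(status: str) -> bool:
    """Treat anything other than 'Not affected' as affected (possibly mitigated)."""
    return not status.startswith("Not affected")


if __name__ == "__main__":
    for name, status in read_vulnerabilities().items():
        print(f"{name:12} {status}")
```

Note that a status beginning with "Mitigation:" still means the hardware is affected; it just says the kernel is doing something about it.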

- Jonathan Morton
Dave Taht
2018-01-04 13:38:27 UTC
Post by Jonathan Morton
https://www.reddit.com/r/Amd/comments/7o2i91/technical_analysis_of_spectre_meltdown/
On top of that, potential attacks are cpu-intensive as hell, at least
in their early stages.

I can't help but reflect on my favorite (sadly, still slideware)
alternate CPU's characteristics, the Mill.

It's a single address space in the first place, protection of memory
is to the byte, not the page (and done in a separate unit than the
TLB).
There are no syscalls, per se; instead an explicit capability gaining
(or dropping) portal call almost exactly like a subroutine.
The stack is protected from ROP. Stack and Registers have no rubble
(there are no registers, as we know them, either) that can be peered
at on call or return. Speculative execution is an intrinsic, well
documented part of the exposed processor pipeline: an explicit value
(NAR = not a result) is dropped anywhere there is the equivalent of
speculative execution. There isn't a conventional BTB, either (branch
exits are predicted via an undefined mechanism). In short, I think the
mill, as the closest thing to a pure capabilities arch that exists
today, would have (at least on paper) been invulnerable to this string
of attacks.

But the only way we'll ever find out is to build it. I keep calling
for more open processor designs (the risc-v is gaining some traction).
We really need more diversity in the infrastructure. I started
reminiscing fondly of the days when I used to use an old DEC alpha as
a firewall merely because I had more confidence it would be hard to
exploit than anything else.

I lay awake last night trying to figure out what the impact of these
bugs would be on the market. What I think will happen is everybody's
stock is going to go up as there is a mad rush to replace now 10-30%
slower hardware just to meet existing loads. Maybe power8 will regain
some traction. Cloud costs will jump, also, to compensate. (and it's
not just the cloud, anybody using a JIT on a desktop looks to be in
trouble - that includes java). And then, of course, would be a flood
of exploits over the next several years, attacking everything that
hasn't been patched.

And I'm hating the benchmarks thus far, because latencies
are going to jump once again on servicing interrupts, and
that breaks a lot of assumptions in (for example) the virtualized
networking space, and context switches were already orders of
magnitude too slow for my taste and favorite applications (like
ardour.org).
Post by Jonathan Morton
- Spectre v1 affects pretty much any modern out-of-order CPU, but is relatively low impact. It could potentially be exploited using JIT compilation of untrusted eBPF or Javascript, but can only exfiltrate data from the local process.
- Spectre v2 affects most recent Intel CPUs and some recent, high-performance ARM CPU cores, but not AMD to any significant degree. On vulnerable CPUs, it allows a local attacker to exfiltrate data from privileged address space.
- Meltdown is the nasty one which Linux kernel devs have been scrambling to mitigate. So far, it is known to affect only Intel x86 CPUs, due to their unusually aggressive speculative behaviour regarding L1 cache hits. On vulnerable CPUs, it allows a local attacker to exfiltrate data from privileged address space.
I don't think we need to worry about it too much in a router context. Virtual server folks, OTOH...
- Jonathan Morton
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Jonathan Morton
2018-01-04 13:48:06 UTC
Post by Dave Taht
I lay awake last night trying to figure out what the impact of these
bugs would be on the market. What I think will happen is everybody's
stock is going to go up as there is a mad rush to replace now 10-30%
slower hardware just to meet existing loads.
I think *AMD's* stock is going to go up a lot more than Intel's, because they have pretty good server/workstation hardware now, and no vulnerability to the most serious variants of this attack (which means no performance impact from mitigation). Apparently there are also mitigations for Spectre v1 and v2 which have minimal performance impact; Meltdown is the one which has a big performance cost to deal with.

Probably some ARM and PowerPC server vendors could get a boost too, but only after their exposure to these attacks has been properly assessed.

- Jonathan Morton
Dave Taht
2018-01-04 13:59:26 UTC
Alan Cox has been doing a good job of finding the good stuff. Power
and the IBM z-series are also affected.

https://plus.google.com/u/0/+AlanCoxLinux
Post by Jonathan Morton
Post by Dave Taht
I lay awake last night trying to figure out what the impact of these
bugs would be on the market. What I think will happen is everybody's
stock is going to go up as there is a mad rush to replace now 10-30%
slower hardware just to meet existing loads.
I think *AMD's* stock is going to go up a lot more than Intel's, because they have pretty good server/workstation hardware now, and no vulnerability to the most serious variants of this attack (which means no performance impact from mitigation). Apparently there are also mitigations for Spectre v1 and v2 which have minimal performance impact; Meltdown is the one which has a big performance cost to deal with.
Probably some ARM and PowerPC server vendors could get a boost too, but only after their exposure to these attacks has been properly assessed.
- Jonathan Morton
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Jonathan Morton
2018-01-04 14:49:21 UTC
Post by Dave Taht
Alan Cox has been doing a good job of finding the good stuff. Power
and the IBM z-series are also affected.
Conversely, the ARM-1176, Cortex-A7 and Cortex-A53 cores used by various iterations of the Raspberry Pi are not affected. These are all in-order execution CPUs with short pipelines, and I think they're representative of what you'd want in CPE.
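
As a quick way to tell whether a board uses one of those three cores, the sketch below matches the "CPU part" field that ARM Linux kernels print in /proc/cpuinfo. The part numbers are ARM's published values for these cores; the table deliberately lists only the cores named above, the function name is invented for this example, and it skips checking the "CPU implementer" field, so treat it as an illustration and not a complete detector.

```python
import re

# Part numbers as they appear in the "CPU part" field of /proc/cpuinfo.
# Only the three cores named in this thread are listed.
IN_ORDER_PARTS = {
    0xB76: "ARM1176",
    0xC07: "Cortex-A7",
    0xD03: "Cortex-A53",
}


def known_in_order_cores(cpuinfo_text: str) -> set[str]:
    """Return the names of the listed in-order cores found in /proc/cpuinfo text."""
    found = set()
    for match in re.finditer(r"^CPU part\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text, re.M):
        part = int(match.group(1), 16)
        if part in IN_ORDER_PARTS:
            found.add(IN_ORDER_PARTS[part])
    return found


if __name__ == "__main__":
    with open("/proc/cpuinfo") as handle:
        print(known_in_order_cores(handle.read()) or "no listed in-order core found")
```

An empty result does not mean the CPU is vulnerable, only that it is not one of the three cores in the table.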

- Jonathan Morton
Dave Taht
2018-01-04 14:53:50 UTC
Post by Jonathan Morton
Post by Dave Taht
Alan Cox has been doing a good job of finding the good stuff. Power
and the IBM z-series are also affected.
Conversely, the ARM-1176, Cortex-A7 and Cortex-A53 cores used by various iterations of the Raspberry Pi are not affected. These are all in-order execution CPUs with short pipelines, and I think they're representative of what you'd want in CPE.
Well, I'd hope that this string of bugs stalls deployment of more
advanced arches in this space until the speculative execution bugs are
fully resolved.

(and I *vastly* prefer short pipelines)
Post by Jonathan Morton
- Jonathan Morton
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Jonathan Morton
2018-01-04 21:20:45 UTC
The really core issue with Meltdown at the highest level is that the kernel is addressable from userspace, except for the "privilege level" in the page table entries. That's a couple of bits between userspace and data that userspace isn't supposed to ever see. And those bits are ignored during speculative execution's memory accesses.
...on Intel CPUs since Nehalem and Silvermont, and on a very small number of ARM's highest-performance cores (which you're unlikely to find in CPE).

But not on most ARM cores, nor on AMD CPUs. These all do their security checks more promptly, so the rogue data never reaches either a shadow register or an execution unit, even under speculative execution.
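
The difference between "checks late" and "checks promptly" can be shown with a toy model. This is a plain simulation and not exploit code: ToyCPU, check_before_forwarding and recover_byte are names invented here, the "cache" is just a Python set, and real hardware timing is not modelled at all.

```python
class ToyCPU:
    """A toy model: one privileged byte, a 256-line probe array, and a cache."""

    def __init__(self, secret: int, check_before_forwarding: bool):
        self.secret = secret & 0xFF
        self.check_before_forwarding = check_before_forwarding
        self.cached_lines = set()  # indices of probe-array lines now in the cache

    def user_reads_kernel_byte(self) -> None:
        """Model a user-mode load of the privileged byte plus a dependent load."""
        if not self.check_before_forwarding:
            # The value is forwarded to the dependent load before the
            # privilege check completes, so a secret-indexed line gets cached.
            self.cached_lines.add(self.secret)
        # Either way the access faults architecturally: no register ever
        # holds the secret, but the cache state above is not rolled back.
        raise PermissionError("page fault: supervisor-only page")


def recover_byte(cpu: ToyCPU):
    """Attacker side: trigger the fault, then see which probe line is cached."""
    try:
        cpu.user_reads_kernel_byte()
    except PermissionError:
        pass  # a real attack would catch or suppress the fault
    hits = [line for line in range(256) if line in cpu.cached_lines]
    return hits[0] if hits else None


if __name__ == "__main__":
    print("late check  :", recover_byte(ToyCPU(0x5A, check_before_forwarding=False)))
    print("prompt check:", recover_byte(ToyCPU(0x5A, check_before_forwarding=True)))
```

With the late check the attacker learns the byte from the cache footprint alone; with the prompt check there is no footprint to observe.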

The conceptually simplest mitigation turns out to be switching off branch prediction.

- Jonathan Morton
Dave Taht
2018-01-04 21:40:28 UTC
Depending on how you set up your "home router", you might allow "infected"
or "trojan" programs to run in userspace there. I wouldn't do that, because
hardware is cheap. But some people like to throw all kinds of server code
into their router setups - even stuff like node.js servers.
I do not know if lua-jit is used in lede or openwrt these days, but
since so far as I recall the web server runs as root anyway, once you
have any control of that you are nearly home free in the first place.
The really core issue with Meltdown at the highest level is that the kernel
is addressable from userspace, except for the "privilege level" in the page
table entries. That's a couple of bits between userspace and data that
userspace isn't supposed to ever see. And those bits are ignored during
speculative execution's memory accesses.
It is really bad news for cloudy multi-tenant devices, but to a huge
extent that market can more rapidly adapt than anywhere else.

A fear is that millions of formerly high end and insecure chips are in
the pipeline and that they will get dumped into any market that will
take them, which certainly includes IoT. It's hard to imagine
shipments of any of 'em actually stopping for any reason, or being
dumped in the ocean on entrance to the country, like some form of
TwEAk party.

And despite the patches ongoing, it's not clear to me if the door can
ever be completely shut on this past generation of hardware still
deployed. I'm still looking over the interrupt-related portions and
scratching my head. Significantly limit, yes; close, no.

I guess I'm hoping for simple patches to the microcode to arrive next
week, even simple stuff to disable the branch predictor or speculative
execution, something simple, slow, and sane.
-----Original Message-----
Sent: Thursday, January 4, 2018 9:53am
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches
than x86?
Post by Jonathan Morton
Post by Dave Taht
Alan Cox has been doing a good job of finding the good stuff. Power
and the IBM z-series are also affected.
Conversely, the ARM-1176, Cortex-A7 and Cortex-A53 cores used by various
iterations of the Raspberry Pi are not affected. These are all in-order
execution CPUs with short pipelines, and I think they're representative of
what you'd want in CPE.
Well, I'd hope that this string of bugs stalls deployment of more
advanced arches in this space until the speculative execution bugs are
fully resolved.
(and I *vastly* prefer short pipelines)
Post by Jonathan Morton
- Jonathan Morton
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
_______________________________________________
Cerowrt-devel mailing list
https://lists.bufferbloat.net/listinfo/cerowrt-devel
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
v***@vt.edu
2018-01-04 21:51:08 UTC
Post by Dave Taht
I guess I'm hoping for simple patches to the microcode to arrive next
week, even simple stuff to disable the branch predictor or speculative
execution, something simple, slow, and sane.
In my inbox this morning. I have *no* idea why Intel is allegedly shipping a
microcode fix for something believed to not be fixable via microcode. It
may be this is a fix for only this one variant of the attack, and the other
two require kernel hacks.

Summary:

An update for microcode_ctl is now available for Red Hat Enterprise Linux 7.

Red Hat Product Security has rated this update as having a security impact of
Important. A Common Vulnerability Scoring System (CVSS) base score, which gives
a detailed severity rating, is available for each vulnerability from the CVE
link(s) in the References section.

The microcode_ctl packages provide microcode updates for Intel and AMD processors.

Security Fix(es):

* An industry-wide issue was found in the way many modern microprocessor
designs have implemented speculative execution of instructions (a commonly used
performance optimization). There are three primary variants of the issue which
differ in the way the speculative execution can be exploited. Variant
CVE-2017-5715 triggers the speculative execution by utilizing branch target
injection. It relies on the presence of a precisely-defined instruction sequence
in the privileged code as well as the fact that memory accesses may cause
allocation into the microprocessor's data cache even for speculatively executed
instructions that never actually commit (retire). As a result, an unprivileged
attacker could use this flaw to cross the syscall and guest/host boundaries and
read privileged memory by conducting targeted cache side-channel attacks.
(CVE-2017-5715)

Note: This is the microcode counterpart of the CVE-2017-5715 kernel mitigation.
Joel Wirāmu Pauling
2018-01-04 21:44:04 UTC
Post by Jonathan Morton
I don't think we need to worry about it too much in a router context.
Virtual server folks, OTOH...
- Jonathan Morton
Disagree - The Router is pretty much synonymous with NFV; I run my lede
instances at home on hypervisors - and this is definitely the norm in
Datacentres now. We need to work through this quite carefully.
Dave Taht
2018-01-04 21:47:51 UTC
Post by Jonathan Morton
Post by Jonathan Morton
I don't think we need to worry about it too much in a router context.
Virtual server folks, OTOH...
- Jonathan Morton
Disagree - The Router is pretty much synonymous with NFV
; I run my lede instances at home on hypervisors - and this is definitely
the norm in Datacentres now. We need to work through this quite carefully.
Yes, the NFV case is serious and what I concluded we had most to worry
about - before starting to worry about the lower end router chips
themselves. But I wasn't aware that people were actually trying to run
lede in that, I'd kind of expected
a more server-like distro to be used there. Why lede in a NFV? Ease of
configuration? Reduced attack surface? (hah)

The only x86 chip I use (aside from simulations) is the AMD one in the
apu2, which I don't know enough about as per speculation...
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Joel Wirāmu Pauling
2018-01-04 21:52:21 UTC
Well as I've argued before Lede ideally should be using Kernel
Namespaces (poor man's containers) for at a minimum the firewall and
per-interface routing instances.

The stuff I am running at home is mostly on a cheap Atom board, so it's a
matter of squeezing out unneeded cruft on the platform. Also I don't want
to be admining centos/rhel servers at home.
Post by Joel Wirāmu Pauling
Post by Jonathan Morton
Post by Jonathan Morton
I don't think we need to worry about it too much in a router context.
Virtual server folks, OTOH...
- Jonathan Morton
Disagree - The Router is pretty much synonymous with NFV
; I run my lede instances at home on hypervisors - and this is definitely
the norm in Datacentres now. We need to work through this quite
carefully.
Yes, the NFV case is serious and what I concluded we had most to worry
about - before starting to worry about the lower end router chips
themselves. But I wasn't aware that people were actually trying to run
lede in that, I'd kind of expected
a more server-like distro to be used there. Why lede in a NFV? Ease of
configuration? Reduced attack surface? (hah)
The only x86 chip I use (aside from simulations) is the AMD one in the
apu2, which I don't know enough about as per speculation...
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Dave Taht
2018-01-04 21:54:39 UTC
Well as I've argued before Lede ideally should be using Kernel Namespaces
(poor man's containers) for at a minimum the firewall and per-interface
routing instances.
Enough stuff landed in the last kernel for me to finally consider that feasible.
The stuff I am running at home is mostly on a cheap Atom board, so it's a
matter of squeezing out unneeded cruft on the platform. Also I don't want to
be admining centos/rhel servers at home.
OK, so currently shipped gear is a big unknown then.
Post by Dave Taht
Post by Jonathan Morton
Post by Jonathan Morton
I don't think we need to worry about it too much in a router context.
Virtual server folks, OTOH...
- Jonathan Morton
Disagree - The Router is pretty much synonymous with NFV
; I run my lede instances at home on hypervisors - and this is definitely
the norm in Datacentres now. We need to work through this quite carefully.
Yes, the NFV case is serious and what I concluded we had most to worry
about - before starting to worry about the lower end router chips
themselves. But I wasn't aware that people were actually trying to run
lede in that, I'd kind of expected
a more server-like distro to be used there. Why lede in a NFV? Ease of
configuration? Reduced attack surface? (hah)
The only x86 chip I use (aside from simulations) is the AMD one in the
apu2, which I don't know enough about as per speculation...
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Joel Wirāmu Pauling
2018-01-04 21:57:40 UTC
Yup - and I know of more than one SDN ISP that is using Lede as their CPE
VNF - straight off the x86 build servers.

Whilst it's more a hypervisor mitigation, there are certainly things guests
can do to improve the situation.

But yes we should look at both cases in detail.
Post by Joel Wirāmu Pauling
Well as I've argued before Lede ideally should be using Kernel Namespaces
(poor man's containers) for at a minimum the firewall and per-interface
routing instances.
Enough stuff landed in the last kernel for me to finally consider that feasible.
Post by Joel Wirāmu Pauling
The stuff I am running at home is mostly on a cheap Atom board, so it's a
matter of squeezing out unneeded cruft on the platform. Also I don't want to
be admining centos/rhel servers at home.
OK, so currently shipped gear is a big unknown then.
Post by Joel Wirāmu Pauling
Post by Dave Taht
Post by Jonathan Morton
Post by Jonathan Morton
I don't think we need to worry about it too much in a router context.
Virtual server folks, OTOH...
- Jonathan Morton
Disagree - The Router is pretty much synonymous with NFV
; I run my lede instances at home on hypervisors - and this is definitely
the norm in Datacentres now. We need to work through this quite carefully.
Yes, the NFV case is serious and what I concluded we had most to worry
about - before starting to worry about the lower end router chips
themselves. But I wasn't aware that people were actually trying to run
lede in that, I'd kind of expected
a more server-like distro to be used there. Why lede in a NFV? Ease of
configuration? Reduced attack surface? (hah)
The only x86 chip I use (aside from simulations) is the AMD one in the
apu2, which I don't know enough about as per speculation...
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Joel Wirāmu Pauling
2018-01-04 22:02:06 UTC
If you are using name-spaces to provide a level of context separation
between your processes ... it's a problem.
Containers and kernel namespaces, and so forth are MEANINGLESS against the
Meltdown and Spectre problems. It's a hardware bug that lets any userspace
process access anything the kernel can address.
-----Original Message-----
Sent: Thursday, January 4, 2018 4:52pm
bufferbloat.net
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches
than x86?
Well as I've argued before Lede ideally should be using Kernel
Namespaces (poor man's containers) for at a minimum the firewall and
per-interface routing instances.
The stuff I am running at home is mostly on a cheap Atom board, so it's a
matter of squeezing out unneeded cruft on the platform. Also I don't want
to be admining centos/rhel servers at home.
Post by Joel Wirāmu Pauling
Post by Jonathan Morton
Post by Jonathan Morton
I don't think we need to worry about it too much in a router context.
Virtual server folks, OTOH...
- Jonathan Morton
Disagree - The Router is pretty much synonymous with NFV
; I run my lede instances at home on hypervisors - and this is
definitely
Post by Jonathan Morton
the norm in Datacentres now. We need to work through this quite
carefully.
Yes, the NFV case is serious and what I concluded we had most to worry
about - before starting to worry about the lower end router chips
themselves. But I wasn't aware that people were actually trying to run
lede in that, I'd kind of expected
a more server-like distro to be used there. Why lede in a NFV? Ease of
configuration? Reduced attack surface? (hah)
The only x86 chip I use (aside from simulations) is the AMD one in the
apu2, which I don't know enough about as per speculation...
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Joel Wirāmu Pauling
2018-01-04 22:00:36 UTC
SRIOV ports and Vendor NIC optimizations wrt Latencies.

Whilst these heavy hitting NFV appliances tend to be large and span
multiple compute hosts (and therefore are the only tenants on those
computes) - this isn't always the case.

It's a problem in that if you can get onto the hypervisor even as an
unprivileged user you can read out guest stores. .... Big Problem.
Hmm... protection datacentres tend to require lower latencies than can be
achieved running on hypervisors.
Which doesn't mean that some datacenters don't do that.
As far as NFV is concerned, Meltdown only breaks security if a userspace
application is running on a machine where another user has data running
through kernel address space. NFV environments don't tend to run NFV in
userspace under an OS that has kernel data in the page tables that are
reachable from CR3.
The key issue in Meltdown is that CR3 is not changed between userspace and
kernelspace. Which means that the memory access pipeline in userspace can
use a kernelspace address (what Intel calls a "linear" address) without a
check that the pagetables enable userspace access. The check happens after
the speculative execution of the memory access.
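
The CR3 point can be sketched as a toy model as well. Everything here is made up for illustration (page names, the ToyAddressSpace class, the isolation flag); it ignores the small entry/exit stub a real kernel must keep mapped, and only shows the trade: with separate page tables, user mode has no translation for kernel pages at all, at the cost of two CR3 switches per system call.

```python
from dataclasses import dataclass

KERNEL_PAGES = frozenset({"kernel_text", "kernel_data"})
USER_PAGES = frozenset({"app_code", "app_heap"})


@dataclass
class ToyAddressSpace:
    isolation: bool        # True models split user/kernel page tables
    mode: str = "user"
    cr3_switches: int = 0  # how many times the page-table base was reloaded

    def translatable(self) -> frozenset:
        """Pages that have a translation in the currently active page table."""
        if self.isolation and self.mode == "user":
            return USER_PAGES
        return USER_PAGES | KERNEL_PAGES

    def syscall(self) -> None:
        """Enter and leave the kernel, switching page tables only if isolated."""
        for new_mode in ("kernel", "user"):
            self.mode = new_mode
            if self.isolation:
                self.cr3_switches += 1


if __name__ == "__main__":
    for isolation in (False, True):
        space = ToyAddressSpace(isolation=isolation)
        visible = "kernel_data" in space.translatable()
        space.syscall()
        print(f"isolation={isolation}: kernel mapped in user mode={visible}, "
              f"CR3 switches per syscall={space.cr3_switches}")
```

Without isolation the only thing between user mode and kernel_data is the permission check discussed above; with isolation there is simply nothing for a speculative load to translate.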
I repeat this, because many pseudo-experts who love to be quoted in the
press as saying "be afraid, be very afraid" are saying a lot of nonsense
about Meltdown and Spectre. It seems to be an echo chamber effect - the
papers were released yesterday afternoon, but in a rush to get "quoted",
all the wannabe-quoted people are saying things that are just plain NOT
TRUE.
-----Original Message-----
Sent: Thursday, January 4, 2018 4:44pm
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches
than x86?
Post by Jonathan Morton
I don't think we need to worry about it too much in a router context.
Virtual server folks, OTOH...
- Jonathan Morton
Disagree - The Router is pretty much synonymous with NFV; I run my lede
instances at home on hypervisors - and this is definitely the norm in
Datacentres now. We need to work through this quite carefully.
d***@deepplum.com
2018-01-04 22:09:19 UTC
I don't disagree that anyone who can run code in the hypervisor itself can attack the guest instances.

But that has nothing to do with KASLR or Meltdown or Spectre. That's just bad security design - the rule is "the principle of least privilege", which comes from the 1970's work on secure operating systems.

I should point out here that I was one of the researchers that helped develop the original multi-level security systems then. Those "colored books" come from us.

-----Original Message-----
From: "Joel Wirāmu Pauling" <***@aenertia.net>
Sent: Thursday, January 4, 2018 5:00pm
To: "***@deepplum.com" <***@deepplum.com>
Cc: "Jonathan Morton" <***@gmail.com>, cerowrt-***@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches than x86?

SRIOV ports and Vendor NIC optimizations wrt Latencies.


Whilst these heavy hitting NFV appliances tend to be large and span multiple compute hosts (and therefore are the only tenants on those computes) - this isn't always the case.


It's a problem in that if you can get onto the hypervisor even as an unprivileged user you can read out guest stores. .... Big Problem.


On 5 January 2018 at 10:57, ***@deepplum.com wrote:

Hmm... protection datacentres tend to require lower latencies than can be achieved running on hypervisors.

Which doesn't mean that some datacenters don't do that.

As far as NFV is concerned, Meltdown only breaks security if a userspace application is running on a machine where another user has data running through kernel address space. NFV environments don't tend to run NFV in userspace under an OS that has kernel data in the page tables that are reachable from CR3.

The key issue in Meltdown is that CR3 is not changed between userspace and kernelspace. Which means that the memory access pipeline in userspace can use a kernelspace address (what Intel calls a "linear" address) without a check that the pagetables enable userspace access. The check happens after the speculative execution of the memory access.

I repeat this, because many pseudo-experts who love to be quoted in the press as saying "be afraid, be very afraid" are saying a lot of nonsense about Meltdown and Spectre. It seems to be an echo chamber effect - the papers were released yesterday afternoon, but in a rush to get "quoted", all the wannabe-quoted people are saying things that are just plain NOT TRUE.


-----Original Message-----
From: "Joel Wirāmu Pauling" <***@aenertia.net>
Sent: Thursday, January 4, 2018 4:44pm
To: "Jonathan Morton" <***@gmail.com>
Cc: cerowrt-***@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches than x86?

On 5 January 2018 at 01:09, Jonathan Morton <***@gmail.com> wrote:


I don't think we need to worry about it too much in a router context. Virtual server folks, OTOH...



- Jonathan Morton

Disagree - The Router is pretty much synonymous with NFV; I run my lede instances at home on hypervisors - and this is definitely the norm in Datacentres now. We need to work through this quite carefully.
Joel Wirāmu Pauling
2018-01-04 22:13:32 UTC
Talking cross purposes here ; I am merely pointing out WHY it's a problem
in the routing world.

I also have coloured books from my past, they mostly involve bad 80's
Children's TV series tie ins and 'between the lines' style instructions.

-Joel
Post by d***@deepplum.com
I don't disagree that anyone who can run code in the hypervisor itself can
attack the guest instances.
But that has nothing to do with KASLR or Meltdown or Spectre. That's just
bad security design - the rule is "the principle of least privilege", which
comes from the 1970's work on secure operating systems.
I should point out here that I was one of the researchers that helped
develop the original multi-level security systems then. Those "colored
books" come from us.
-----Original Message-----
Sent: Thursday, January 4, 2018 5:00pm
bufferbloat.net
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches than x86?
SRIOV ports and Vendor NIC optimizations wrt Latencies.
Whilst these heavy hitting NFV appliances tend to be large and span
multiple compute hosts (and therefore are the only tenants on those
computes) - this isn't always the case.
It's a problem in that if you can get onto the hypervisor even as an
unprivileged user you can read out guest stores. .... Big Problem.
Hmm... protection datacentres tend to require lower latencies than can be
achieved running on hypervisors.
Which doesn't mean that some datacenters don't do that.
As far as NFV is concerned, Meltdown only breaks security if a userspace
application is running on a machine where another user has data running
through kernel address space. NFV environments don't tend to run NFV in
userspace under an OS that has kernel data in the page tables that are
reachable from CR3.
The key issue in Meltdown is that CR3 is not changed between userspace
and kernelspace. Which means that the memory access pipeline in userspace
can use a kernelspace address (what Intel calls a "linear" address) without
a check that the pagetables enable userspace access. The check happens
after the speculative execution of the memory access.
I repeat this, because many pseudo-experts who love to be quoted in the
press as saying "be afraid, be very afraid" are saying a lot of nonsense
about Meltdown and Spectre. It seems to be an echo chamber effect - the
papers were released yesterday afternoon, but in a rush to get "quoted",
all the wannabe-quoted people are saying things that are just plain NOT
TRUE.
-----Original Message-----
Sent: Thursday, January 4, 2018 4:44pm
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches than x86?
Post by Jonathan Morton
I don't think we need to worry about it too much in a router context.
Virtual server folks, OTOH...
- Jonathan Morton
Disagree - The Router is pretty much synonymous with NFV; I run my lede
instances at home on hypervisors - and this is definitely the norm in
Datacentres now. We need to work through this quite carefully.
Dave Taht
2018-01-04 22:15:21 UTC
Permalink
Post by d***@deepplum.com
I don't disagree that anyone who can run code in the hypervisor itself can
attack the guest instances.
But that has nothing to do with KASLR or Meltdown or Spectre. That's just
bad security design - the rule is "the principle of least privilege", which
comes from the 1970s work on secure operating systems.
I should point out here that I was one of the researchers that helped
develop the original multi-level security systems then. Those "colored
books" come from us.
You are one of the few remaining that have written those. Back when I
read those (in 1990 or so, SCO was trying for at least a C1 rating), I
felt they were impossible to implement without hardware support, which
led to my early interest in capability-based architectures. Sadly, the
need for speed trumped all security concerns in the decades since.

There are undoubtedly sordid tales we both could tell here.

https://en.wikipedia.org/wiki/Trusted_Computer_System_Evaluation_Criteria
Post by d***@deepplum.com
-----Original Message-----
Sent: Thursday, January 4, 2018 5:00pm
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches than x86?
SRIOV ports and Vendor NIC optimizations wrt Latencies.
Whilst these heavy-hitting NFV appliances tend to be large and span multiple
compute hosts (and therefore are the only tenants on those computes) - this
isn't always the case.
It's a problem in that if you can get onto the hypervisor even as an
unprivileged user you can read out guest stores. .... Big Problem.
Hmm... production datacentres tend to require lower latencies than can be
achieved running on hypervisors.
Which doesn't mean that some datacenters don't do that.
As far as NFV is concerned, Meltdown only breaks security if a userspace
application is running on a machine where another user has data running
through kernel address space. NFV environments don't tend to run NFV in
userspace under an OS that has kernel data in the page tables that are
reachable from CR3.
The key issue in Meltdown is that CR3 is not changed between userspace and
kernelspace. Which means that the memory access pipeline in userspace can
use a kernelspace address (what Intel calls a "linear" address) without a
check that the pagetables enable userspace access. The check happens after
the speculative execution of the memory access.
I repeat this, because many pseudo-experts who love to be quoted in the
press as saying "be afraid, be very afraid" are saying a lot of nonsense
about Meltdown and Spectre. It seems to be an echo chamber effect - the
papers were released yesterday afternoon, but in a rush to get "quoted", all
the wannabe-quoted people are saying things that are just plain NOT TRUE.
-----Original Message-----
Sent: Thursday, January 4, 2018 4:44pm
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches than x86?
Post by Jonathan Morton
I don't think we need to worry about it too much in a router context.
Virtual server folks, OTOH...
- Jonathan Morton
Disagree - The Router is pretty much synonymous with NFV; I run my lede
instances at home on hypervisors - and this is definitely the norm in
Datacentres now. We need to work through this quite carefully.
_______________________________________________
Cerowrt-devel mailing list
https://lists.bufferbloat.net/listinfo/cerowrt-devel
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Jonathan Morton
2018-01-04 22:26:21 UTC
Permalink
Post by d***@deepplum.com
I should point out here that I was one of the researchers that helped develop the original multi-level security systems then. Those "colored books" come from us.
Obligatory: http://youtu.be/4U9MI0u2VIE

- Jonathan Morton
Joel Wirāmu Pauling
2018-01-04 22:35:37 UTC
Permalink
I was too busy skateboarding, holding on to the bumper of my limo, to take
notice, obviously.
Post by d***@deepplum.com
Post by d***@deepplum.com
I should point out here that I was one of the researchers that helped
develop the original multi-level security systems then. Those "colored
books" come from us.
Obligatory: http://youtu.be/4U9MI0u2VIE
- Jonathan Morton
d***@deepplum.com
2018-01-04 22:58:48 UTC
Permalink
As I continue to study the Spectre bug, I read the Project Zero post about the PoCs they developed for Spectre.

https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html

(The Meltdown and Spectre papers, linked from that page, are far better at explaining the mechanics of the issue. Read Meltdown first; it's simpler.)

It was very enlightening to see that one exploit used the "in-kernel eBPF JIT interpreter". This one also looks very practical to exploit, and it is "network related" so of some interest here. The Project Zero post doesn't describe the exploit itself, but reading the Spectre paper gives one context on how Spectre works.

Unlike Meltdown, Spectre really depends on the attacker being able to force certain instruction sequences to get executed in some address space, such that the effect can be observed in another address space where the attacker's observer resides.

What this means is that to read data from the kernel, Spectre needs to force a specific code sequence to be executed, with a branch mispredicted, *in the kernel*.

That's where the eBPF JIT comes into play, apparently, because the eBPF JIT allows *kernel code* to be constructed from the attacker's userspace code. Hmmm... sounds like the user can change the kernel binary code and get it executed!

So this is a relatively practical thing to do, and it gives full access to anything in the kernel address space, from a userspace program.

Now it's easy to disable the JIT feature. Just a packet processing performance hit.
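
For reference, on Linux the JIT is controlled by the `net.core.bpf_jit_enable` sysctl, exposed as `/proc/sys/net/core/bpf_jit_enable` (0 = off, 1 = on, 2 = on with debug output). The post above does not name the knob, so treat the path as an addition here. A minimal Python sketch that only reports the current setting follows; the function name and the `path` parameter are hypothetical, and actually writing `0` to the file needs root.

```python
from pathlib import Path

# Standard Linux sysctl location (an addition; not named in the post above).
BPF_JIT_SYSCTL = Path("/proc/sys/net/core/bpf_jit_enable")

# Meanings of the values accepted by net.core.bpf_jit_enable.
JIT_MODES = {
    0: "disabled (interpreter only)",
    1: "enabled",
    2: "enabled, debug output",
}


def bpf_jit_state(path=BPF_JIT_SYSCTL):
    """Return (value, description) for the BPF JIT sysctl, or None if unreadable."""
    try:
        value = int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None  # not Linux, no BPF JIT in this kernel, or no permission
    return value, JIT_MODES.get(value, "unknown")


if __name__ == "__main__":
    print(bpf_jit_state())
```

A result of `None` means the file could not be read, which is what one would expect on a kernel built without the JIT.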

But I bet designing the JIT so it won't generate Spectre-exploitable code would be tricky indeed.

Especially since the Spectre-exploitable code is highly processor architecture specific, unlike Meltdown, which appears to be Intel-only.
Dave Taht
2018-01-05 04:53:55 UTC
Permalink
It took me a long while to digest that one. The branch predictor
analysis of Haswell was easiest to understand (and AMD claims to have
an AI-based one), and perhaps scrambling that at random intervals
would help? (This stuff is now way above my pay grade.)
Jonathan Morton
2018-01-05 14:07:03 UTC
Permalink
Post by Dave Taht
It took me a long while to digest that one. The branch predictor
analysis of Haswell was easiest to understand (and AMD claims to have
an AI-based one), and perhaps scrambling that at random intervals
would help? (This stuff is now way above my pay grade.)
Software mitigations for all three attacks have been developed during the "responsible disclosure" period.

Spectre v1: adding an LFENCE instruction (memory load fence) to JIT code performing a bounds-checked array read. This is basically a userspace fix for a userspace attack. Firefox just got this, Chrome undoubtedly will too, if it hasn't already.
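
To make the Spectre v1 pattern concrete, here is a toy Python model of a bounds-checked read with and without a speculation barrier. It is a simulation only, not an exploit and not the Firefox or kernel patch: `ToyCPU`, `leak_byte` and the `fence` flag (standing in for LFENCE) are hypothetical names, and the `cached_lines` set stands in for what a real attacker would infer from cache timing.

```python
class ToyCPU:
    """Deliberately simplified model of a core with a mistrained branch predictor."""

    def __init__(self, memory, array_len):
        self.memory = memory        # flat byte memory; the array occupies [0, array_len)
        self.array_len = array_len
        self.cached_lines = set()   # probe-array lines touched, even speculatively

    def bounds_checked_read(self, index, fence=False):
        in_bounds = index < self.array_len
        # Mistrained predictor: the core guesses "in bounds" and runs ahead,
        # unless a fence makes it wait for the comparison to resolve.
        speculate = not in_bounds and not fence
        if in_bounds or speculate:
            value = self.memory[index]
            # The dependent load probe[value * stride] warms one cache line.
            self.cached_lines.add(value)
        # Architecturally, an out-of-bounds read is squashed and returns nothing.
        return self.memory[index] if in_bounds else None


def leak_byte(cpu, index, fence):
    """Return the byte visible through the cache side channel, or None."""
    cpu.cached_lines.clear()                    # stands in for flushing the probe array
    cpu.bounds_checked_read(index, fence=fence)
    return next(iter(cpu.cached_lines), None)   # stands in for timing each probe line


if __name__ == "__main__":
    cpu = ToyCPU(bytes([1, 2, 3, 4]) + b"S", array_len=4)
    print(leak_byte(cpu, 4, fence=False))
    print(leak_byte(cpu, 4, fence=True))
```

In this model the out-of-bounds byte shows up in the cache footprint only when the fence is absent, which is the whole point of the mitigation.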

Spectre v2: three different mitigations are appropriate for different families of CPU:

https://lkml.org/lkml/2018/1/4/742

On AMD CPUs, the small risk that actually exists (AMD's BTB is much less prone to poisoning than Intel's) is eliminated by adding LFENCE to privileged indirect branches. This has only a very small cost.

On Intel CPUs up to and including Broadwell (and Silvermont onwards), a "retpoline" structure is necessary and sufficient. This has a bigger cost than LFENCE and is pretty ugly to look at, but it's still relatively minor.

On Skylake, Kaby Lake and Coffee Lake, something more exotic is required - I think it involves temporarily disabling the BTB during privileged indirect branches. That's *really* ugly, and involves tweaking poorly-documented MSRs.

Something similar in nature to the above should also work for affected ARM cores.

Meltdown: nothing is required for AMD CPUs. Unmapping the privileged addresses when returning to userspace is sufficient for Intel, but incurs a big performance overhead for syscalls. The same is likely true for any other affected CPUs.
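
As a practical aside, kernels that carry these mitigations report what they applied under `/sys/devices/system/cpu/vulnerabilities/`, one file per issue (for example `meltdown`, `spectre_v1`, `spectre_v2`). That directory is not mentioned in the post above; the sketch below assumes it and simply reads whatever files are present. The function name and the `vuln_dir` parameter are hypothetical.

```python
from pathlib import Path

# Linux sysfs directory with one status file per known CPU vulnerability
# (an addition; the post above does not refer to it).
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def mitigation_status(vuln_dir=VULN_DIR):
    """Map each vulnerability file name to the kernel's status string."""
    vuln_dir = Path(vuln_dir)
    if not vuln_dir.is_dir():
        return {}  # older kernel or a non-Linux system
    status = {}
    for entry in sorted(vuln_dir.iterdir()):
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            status[entry.name] = "unreadable"
    return status


if __name__ == "__main__":
    for name, state in mitigation_status().items():
        print(f"{name}: {state}")
```

An empty result only says the kernel does not expose the directory; it says nothing about whether the CPU is affected.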

- Jonathan Morton
d***@deepplum.com
2018-01-05 15:35:45 UTC
Permalink
One of the most troubling "overreactions" is due to the fact that the POC by Google Project Zero describes an attack on the hypervisor host memory under KVM.
Stated only in fine print, and not very explicitly, in the Project Zero description is that the version of KVM that was hacked depended on the hypervisor being mapped into the linear address space of the guest kernel.
In a hypervisor that uses VMX extensions, the EPT during guest execution doesn't even provide addressability to the hypervisor code and data. (I haven't inspected KVM's accelerated mode, but I can't see why it would have the EPT map non-guest memory. I know VMWare does not.)
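
The addressability argument can be sketched as a two-stage lookup. This toy Python model assumes page-granular dictionaries for the guest page table and the EPT, with made-up page numbers; the function names are hypothetical. It models addressability only, not predictor or cache state shared between guest and host.

```python
def translate(guest_page_table, ept, guest_virtual_page):
    """Two-stage lookup: guest-virtual -> guest-physical -> host-physical page."""
    guest_physical = guest_page_table.get(guest_virtual_page)
    if guest_physical is None:
        return None                   # models a guest page fault
    return ept.get(guest_physical)    # None models an EPT violation (VM exit)


def reachable_host_pages(guest_page_table, ept):
    """Every host page that some guest-virtual page can resolve to."""
    pages = set()
    for guest_virtual_page in guest_page_table:
        host_page = translate(guest_page_table, ept, guest_virtual_page)
        if host_page is not None:
            pages.add(host_page)
    return pages


if __name__ == "__main__":
    hypervisor_pages = {200, 201}      # host pages holding hypervisor code and data
    ept = {0: 100, 1: 101}             # the EPT maps only this guest's memory
    # The guest kernel controls its own page table and may point anywhere in
    # guest-physical space, including at page 7, which the EPT does not map.
    guest_page_table = {0x10: 0, 0x11: 1, 0x12: 7}
    reachable = reachable_host_pages(guest_page_table, ept)
    print(reachable, reachable & hypervisor_pages)
```

Whatever the guest writes into its own page table, the set of reachable host pages is bounded by the EPT, which is why a hypervisor that keeps itself out of the EPT is not addressable from the guest.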

This is validated by a posting from QEMU re KVM, https://www.qemu.org/2018/01/04/spectre/ , again a little hard to understand if you don't know how VMX and EPTs work.

What this means is that older cloud VMs based on techniques used in paravirtualization (Xen, ancient QEMU, older VMware) may be susceptible to accessing hypervisor state via Spectre v1.

But newer so-called hardware-accelerated VMs based on VMX extensions and using the EPT are isolated to a much larger extent, making Spectre v1 pretty useless.

Thus, the "overreaction" is the claim that ALL VMs are problematic. This is very far from true. Hypervisors for hardware-accelerated VMs are not vulnerable to Meltdown or Spectre v2, and probably not to Spectre v1.

Of course, *within* a particular VM, the guest kernel and other processes are vulnerable. But there is no inter-VM path that has been demonstrated, nor do any of the discussions explain any means for using speculative execution and branch misprediction between VMs running under different EPTs.

So for the cloud, and also for NFVs that are run on accelerated HVMs, the problem is either non-existent or yet to be discovered.

Of course the "press" wants everyone to be superafraid, so if they can say "KVM is affected" that causes the mob to start running for the exits!

Summary: hardware virtualization appears to be a pragmatic form of isolation that works. And thus many cloud providers are fine.



-----Original Message-----
From: "Jonathan Morton" <***@gmail.com>
Sent: Friday, January 5, 2018 9:07am
To: "Dave Taht" <***@gmail.com>
Cc: "***@deepplum.com" <***@deepplum.com>, "Joel Wirāmu Pauling" <***@aenertia.net>, cerowrt-***@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] Spectre and EBPF JIT
Post by Dave Taht
It took me a long while to digest that one. The branch predictor
analysis of Haswell was easiest to understand (and AMD claims to have
an AI-based one), and perhaps scrambling that at random intervals
would help? (This stuff is now way above my pay grade.)
Software mitigations for all three attacks have been developed during the "responsible disclosure" period.

Spectre v1: adding an LFENCE instruction (memory load fence) to JIT code performing a bounds-checked array read. This is basically a userspace fix for a userspace attack. Firefox just got this, Chrome undoubtedly will too, if it hasn't already.

Spectre v2: three different mitigations are appropriate for different families of CPU:

https://lkml.org/lkml/2018/1/4/742

On AMD CPUs, the small risk actually existing (because AMD's BTB is much less prone to poisoning than Intel's) is erased by adding LFENCE to privileged indirect branches. This has only a very small cost.

On Intel CPUs until Broadwell inclusive (and Silvermont onwards), a "retpoline" structure is necessary and sufficient. This has a bigger cost than LFENCE and is pretty ugly to look at, but it's still relatively minor.

On Skylake, Kaby Lake and Coffee Lake, something more exotic is required - I think it involves temporarily disabling the BTB during privileged indirect branches. That's *really* ugly, and involves tweaking poorly-documented MSRs.

Something similar in nature to the above should also work for affected ARM cores.

Meltdown: nothing is required for AMD CPUs. Unmapping the privileged addresses when returning to userspace is sufficient for Intel, but incurs a big performance overhead for syscalls. The same is likely true for any other affected CPUs.

- Jonathan Morton
Jonathan Morton
2018-01-05 19:18:33 UTC
Permalink
Post by d***@deepplum.com
Of course the "press" wants everyone to be superafraid, so if they can say "KVM is affected" that causes the mob to start running for the exits!
Meanwhile, in XKCD land...

https://xkcd.com/1938/

- Jonathan Morton
David Lang
2018-01-05 20:15:58 UTC
Permalink
He does a good job of explaining these high-profile vulnerabilities.
Post by Jonathan Morton
Post by d***@deepplum.com
Of course the "press" wants everyone to be superafraid, so if they can say "KVM is affected" that causes the mob to start running for the exits!
Meanwhile, in XKCD land...
https://xkcd.com/1938/
d***@deepplum.com
2018-01-04 22:02:46 UTC
Permalink
Hmm... production datacentres tend to require lower latencies than can be achieved running on hypervisors.

Which doesn't mean that some datacenters don't do that.

As far as NFV is concerned, Meltdown only breaks security if a userspace application is running on a machine where another user has data running through kernel address space. NFV environments don't tend to run NFV in userspace under an OS that has kernel data in the page tables that are reachable from CR3.

The key issue in Meltdown is that CR3 is not changed between userspace and kernelspace. Which means that the memory access pipeline in userspace can use a kernelspace address (what Intel calls a "linear" address) without a check that the pagetables enable userspace access. The check happens after the speculative execution of the memory access.
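
The ordering problem described above can be shown with a toy Python model. It assumes a per-process page table stored as a dictionary of `address -> (value, user_accessible)` and a set standing in for the cache footprint; `PageFault`, `user_load` and `attempt_leak` are hypothetical names, and nothing here touches real hardware.

```python
class PageFault(Exception):
    """Raised when a userspace load is not permitted or not mapped."""


def user_load(page_table, address, cache_footprint):
    """Toy model of an affected CPU: data is forwarded before the check retires."""
    entry = page_table.get(address)
    if entry is None:
        raise PageFault("not mapped")        # nothing to forward speculatively
    value, user_accessible = entry
    cache_footprint.add(value)               # a dependent speculative load touched a line
    if not user_accessible:
        raise PageFault("supervisor page")   # the permission check arrives too late
    return value


def attempt_leak(page_table, address):
    """Return the cache footprint left behind by one attempted load."""
    footprint = set()
    try:
        user_load(page_table, address, footprint)
    except PageFault:
        pass  # the attacker catches or suppresses the fault
    return footprint


KERNEL_ADDR = 0xFFFF8000
# Same page tables in user and kernel mode: kernel page mapped, supervisor-only.
shared_tables = {0x1000: (7, True), KERNEL_ADDR: (0x42, False)}
# Split page tables: the kernel page is simply absent while userspace runs.
split_tables = {0x1000: (7, True)}

if __name__ == "__main__":
    print(attempt_leak(shared_tables, KERNEL_ADDR))
    print(attempt_leak(split_tables, KERNEL_ADDR))
```

With shared tables the kernel value leaves a footprint even though the load faults; with split tables (the CR3 switch on kernel entry and exit) there is no translation, so there is nothing to leak.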

I repeat this, because many pseudo-experts who love to be quoted in the press as saying "be afraid, be very afraid" are saying a lot of nonsense about Meltdown and Spectre. It seems to be an echo chamber effect - the papers were released yesterday afternoon, but in a rush to get "quoted", all the wannabe-quoted people are saying things that are just plain NOT TRUE.


-----Original Message-----
From: "Joel Wirāmu Pauling" <***@aenertia.net>
Sent: Thursday, January 4, 2018 4:44pm
To: "Jonathan Morton" <***@gmail.com>
Cc: cerowrt-***@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches than x86?

On 5 January 2018 at 01:09, Jonathan Morton <***@gmail.com> wrote:

I don't think we need to worry about it too much in a router context. Virtual server folks, OTOH...

- Jonathan Morton

Disagree - The Router is pretty much synonymous with NFV; I run my lede instances at home on hypervisors - and this is definitely the norm in Datacentres now. We need to work through this quite carefully.
d***@deepplum.com
2018-01-04 22:02:56 UTC
Permalink
Containers, kernel namespaces, and so forth are MEANINGLESS against the Meltdown and Spectre problems. It's a hardware bug that lets any userspace process access anything the kernel can address.

-----Original Message-----
From: "Joel Wirāmu Pauling" <***@aenertia.net>
Sent: Thursday, January 4, 2018 4:52pm
To: "Dave Taht" <***@gmail.com>
Cc: "Jonathan Morton" <***@gmail.com>, cerowrt-***@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches than x86?




Well, as I've argued before, Lede ideally should be using kernel namespaces (poor man's containers) for, at a minimum, the firewall and per-interface routing instances.

The stuff I am running at home is mostly on cheap Atom boards, so it's a matter of squeezing out unneeded cruft on the platform. Also, I don't want to be administering centos/rhel servers at home.
Post by Jonathan Morton
Post by Jonathan Morton
I don't think we need to worry about it too much in a router context.
Virtual server folks, OTOH...
- Jonathan Morton
Disagree - The Router is pretty much synonymous with NFV; I run my lede
instances at home on hypervisors - and this is definitely the norm in
Datacentres now. We need to work through this quite carefully.
Yes, the NFV case is serious and what I concluded we had most to worry
about - before starting to worry about the lower end router chips
themselves. But I wasn't aware that people were actually trying to run
lede in that; I'd kind of expected a more server-like distro to be used
there. Why lede in an NFV? Ease of configuration? Reduced attack
surface? (hah)

The only x86 chip I use (aside from simulations) is the AMD one in the
apu2, which I don't know enough about as per speculation...



--

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Dave Taht
2018-01-04 22:04:46 UTC
Permalink
Containers, kernel namespaces, and so forth are MEANINGLESS against the
Meltdown and Spectre problems. It's a hardware bug that lets any userspace
process access anything the kernel can address.
Just to be clear, I was merely agreeing with Joel that containers had
matured enough to be potentially usable for some level of process
isolation and firewalling, not that it applied to coping with MeltRe.
-----Original Message-----
Sent: Thursday, January 4, 2018 4:52pm
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches than x86?
Well, as I've argued before, Lede ideally should be using kernel namespaces
(poor man's containers) for, at a minimum, the firewall and per-interface
routing instances.
The stuff I am running at home is mostly on cheap Atom boards, so it's a
matter of squeezing out unneeded cruft on the platform. Also, I don't want to
be administering centos/rhel servers at home.
Post by Dave Taht
Post by Jonathan Morton
Post by Jonathan Morton
I don't think we need to worry about it too much in a router context.
Virtual server folks, OTOH...
- Jonathan Morton
Disagree - The Router is pretty much synonymous with NFV; I run my lede
instances at home on hypervisors - and this is definitely the norm in
Datacentres now. We need to work through this quite carefully.
Yes, the NFV case is serious and what I concluded we had most to worry
about - before starting to worry about the lower end router chips
themselves. But I wasn't aware that people were actually trying to run
lede in that; I'd kind of expected a more server-like distro to be used
there. Why lede in an NFV? Ease of configuration? Reduced attack
surface? (hah)
The only x86 chip I use (aside from simulations) is the AMD one in the
apu2, which I don't know enough about as per speculation...
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
d***@deepplum.com
2018-01-04 22:12:38 UTC
Permalink
I don't disagree about using containers being useful as one of many security mechanisms. They are useful against certain attack vectors, but depend on two things: 1) kernel correctness, and 2) putting all functionality in separate userspace processes to satisfy the "principle of least privilege".

-----Original Message-----
From: "Dave Taht" <***@gmail.com>
Sent: Thursday, January 4, 2018 5:04pm
To: "***@deepplum.com" <***@deepplum.com>
Cc: "Joel Wirāmu Pauling" <***@aenertia.net>, "Jonathan Morton" <***@gmail.com>, cerowrt-***@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches than x86?
Containers, kernel namespaces, and so forth are MEANINGLESS against the
Meltdown and Spectre problems. It's a hardware bug that lets any userspace
process access anything the kernel can address.
Just to be clear, I was merely agreeing with Joel that containers had
matured enough to be potentially usable for some level of process
isolation and firewalling, not that it applied to coping with MeltRe.
-----Original Message-----
Sent: Thursday, January 4, 2018 4:52pm
Subject: Re: [Cerowrt-devel] KASLR: Do we have to worry about other arches than x86?
Well, as I've argued before, Lede ideally should be using kernel namespaces
(poor man's containers) for, at a minimum, the firewall and per-interface
routing instances.
The stuff I am running at home is mostly on cheap Atom boards, so it's a
matter of squeezing out unneeded cruft on the platform. Also, I don't want to
be administering centos/rhel servers at home.
Post by Dave Taht
Post by Jonathan Morton
Post by Jonathan Morton
I don't think we need to worry about it too much in a router context.
Virtual server folks, OTOH...
- Jonathan Morton
Disagree - The Router is pretty much synonymous with NFV; I run my lede
instances at home on hypervisors - and this is definitely the norm in
Datacentres now. We need to work through this quite carefully.
Yes, the NFV case is serious and what I concluded we had most to worry
about - before starting to worry about the lower end router chips
themselves. But I wasn't aware that people were actually trying to run
lede in that; I'd kind of expected a more server-like distro to be used
there. Why lede in an NFV? Ease of configuration? Reduced attack
surface? (hah)
The only x86 chip I use (aside from simulations) is the AMD one in the
apu2, which I don't know enough about as per speculation...
--
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619