On Mon, Apr 29, 2019 at 12:02 PM Linus Torvalds torvalds@linux-foundation.org wrote:
If nmi were to break it, it would be a cpu bug. I'm pretty sure I've seen the "shadow stops even nmi" documented for some uarch, but as mentioned it's not necessarily the only way to guarantee the shadow.
In fact, the documentation is simply the official Intel instruction docs for "STI":
The IF flag and the STI and CLI instructions do not prohibit the generation of exceptions and NMI interrupts. NMI interrupts (and SMIs) may be blocked for one macroinstruction following an STI.
note the "may be blocked". As mentioned, that's just one option for not having NMI break the STI shadow guarantee, but it's clearly one that Intel has done at times, and clearly even documents as having done so.
There is absolutely no question that the sti shadow is real, and that people have depended on it for _decades_. It would be a horrible errata if the shadow can just be made to go away by randomly getting an NMI or SMI.
Linus
On Mon, Apr 29, 2019 at 01:16:10PM -0700, Linus Torvalds wrote:
On Mon, Apr 29, 2019 at 12:02 PM Linus Torvalds torvalds@linux-foundation.org wrote:
If nmi were to break it, it would be a cpu bug. I'm pretty sure I've seen the "shadow stops even nmi" documented for some uarch, but as mentioned it's not necessarily the only way to guarantee the shadow.
In fact, the documentation is simply the official Intel instruction docs for "STI":
The IF flag and the STI and CLI instructions do not prohibit the generation of exceptions and NMI interrupts. NMI interrupts (and SMIs) may be blocked for one macroinstruction following an STI.
note the "may be blocked". As mentioned, that's just one option for not having NMI break the STI shadow guarantee, but it's clearly one that Intel has done at times, and clearly even documents as having done so.
There is absolutely no question that the sti shadow is real, and that people have depended on it for _decades_. It would be a horrible errata if the shadow can just be made to go away by randomly getting an NMI or SMI.
FWIW, Lakemont (Quark) doesn't block NMI/SMI in the STI shadow, but I'm not sure that counters the "horrible errata" statement ;-). SMI+RSM saves and restores STI blocking in that case, but AFAICT NMI has no such protection and will effectively break the shadow on its IRET.
All other (modern) Intel uArchs block NMI in the shadow and also enforce STI_BLOCKING==0 when injecting an NMI via VM-Enter, i.e. prevent a VMM from breaking the shadow so long as the VMM preserves the shadow info.
KVM is generally ok with respect to STI blocking, but ancient versions didn't migrate STI blocking and there's currently a hole where single-stepping a guest (from host userspace) could drop STI_BLOCKING if a different VM-Exit occurs between the single-step #DB VM-Exit and the instruction in the shadow. Though "don't do that" may be a reasonable answer in that case.
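To make the Lakemont/P5 case above concrete, here is a minimal toy model in C. It is only a sketch under stated assumptions: `struct toy_cpu`, `toy_irq_breaks_shadow()` and the event ordering are hypothetical illustrations, not hardware documentation or kernel code. It assumes one NMI and one maskable IRQ are both pending when STI executes, and that the NMI handler's IRET returns with IF=1 and no remembered shadow.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of the STI shadow; not real hardware behavior. */
struct toy_cpu {
    bool iflag;                 /* EFLAGS.IF */
    bool sti_shadow;            /* IRQs inhibited until the next instruction retires */
    bool nmi_blocked_in_shadow; /* true: P6-style, false: P5/Lakemont-style */
    bool nmi_pending;
    bool irq_pending;
};

/*
 * Returns true if the maskable IRQ is delivered before the instruction
 * that follows STI has executed, i.e. the shadow was effectively broken.
 */
bool toy_irq_breaks_shadow(bool nmi_blocked_in_shadow)
{
    struct toy_cpu cpu = {
        .iflag = false,
        .sti_shadow = false,
        .nmi_blocked_in_shadow = nmi_blocked_in_shadow,
        .nmi_pending = true,
        .irq_pending = true,
    };
    bool irq_before_insn = false;

    /* STI: set IF and open the one-instruction shadow. */
    cpu.iflag = true;
    cpu.sti_shadow = true;

    /* Instruction boundary right after STI: may the NMI be taken here? */
    if (cpu.nmi_pending && !(cpu.sti_shadow && cpu.nmi_blocked_in_shadow)) {
        cpu.nmi_pending = false;
        /* Assumption: IRET from the NMI restores IF=1 but not the shadow. */
        cpu.sti_shadow = false;
    }

    /* Same boundary, after any IRET: is the maskable IRQ taken now? */
    if (cpu.irq_pending && cpu.iflag && !cpu.sti_shadow) {
        cpu.irq_pending = false;
        irq_before_insn = true;
    }

    /* The instruction after STI executes; the shadow ends. */
    cpu.sti_shadow = false;

    /* Next boundary: anything still pending is delivered normally. */
    if (cpu.nmi_pending)
        cpu.nmi_pending = false;
    if (cpu.irq_pending && cpu.iflag)
        cpu.irq_pending = false;

    /* Both events are delivered eventually in either model. */
    assert(!cpu.nmi_pending && !cpu.irq_pending);
    return irq_before_insn;
}
```

In this model, calling `toy_irq_breaks_shadow(false)` (no NMI blocking in the shadow) reports the IRQ arriving ahead of the shadowed instruction, while `toy_irq_breaks_shadow(true)` holds both events until that instruction has run.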
On Mon, Apr 29, 2019 at 3:08 PM Sean Christopherson sean.j.christopherson@intel.com wrote:
FWIW, Lakemont (Quark) doesn't block NMI/SMI in the STI shadow, but I'm not sure that counters the "horrible errata" statement ;-). SMI+RSM saves and restores STI blocking in that case, but AFAICT NMI has no such protection and will effectively break the shadow on its IRET.
Ugh. I can't say I care deeply about Quark (ie never seemed to go anywhere), but it's odd. I thought it was based on a Pentium core (or i486+?). Are you saying those didn't do it either?
I have this dim memory about talking about this with some (AMD?) engineer, and having an alternative approach for the sti shadow wrt NMI - basically not checking interrupts in the instruction you return to with 'iret'. I don't think it was even conditional on the "iret from NMI", I think it was basically any iret also did the sti shadow thing.
But I can find no actual paper to back that up, so this may be me just making sh*t up.
KVM is generally ok with respect to STI blocking, but ancient versions didn't migrate STI blocking and there's currently a hole where single-stepping a guest (from host userspace) could drop STI_BLOCKING if a different VM-Exit occurs between the single-step #DB VM-Exit and the instruction in the shadow. Though "don't do that" may be a reasonable answer in that case.
I thought the sti shadow blocked the single-step exception too? I know "mov->ss" does block debug interrupts too.
Or are you saying that it's some "single step by emulation" that just misses setting the STI_BLOCKING flag?
Linus
On Mon, Apr 29, 2019 at 03:22:09PM -0700, Linus Torvalds wrote:
On Mon, Apr 29, 2019 at 3:08 PM Sean Christopherson sean.j.christopherson@intel.com wrote:
FWIW, Lakemont (Quark) doesn't block NMI/SMI in the STI shadow, but I'm not sure that counters the "horrible errata" statement ;-). SMI+RSM saves and restores STI blocking in that case, but AFAICT NMI has no such protection and will effectively break the shadow on its IRET.
Ugh. I can't say I care deeply about Quark (ie never seemed to go anywhere), but it's odd. I thought it was based on a Pentium core (or i486+?). Are you saying those didn't do it either?
It's 486 based, but either way I suspect the answer is "yes". IIRC, Knights Corner, a.k.a. Larrabee, also had funkiness around SMM and that was based on P54C, though I'm struggling to recall exactly what the Larrabee weirdness was.
I have this dim memory about talking about this with some (AMD?) engineer, and having an alternative approach for the sti shadow wrt NMI - basically not checking interrupts in the instruction you return to with 'iret'. I don't think it was even conditional on the "iret from NMI", I think it was basically any iret also did the sti shadow thing.
But I can find no actual paper to back that up, so this may be me just making sh*t up.
If Intel CPUs ever did anything like that on IRET it's long gone.
KVM is generally ok with respect to STI blocking, but ancient versions didn't migrate STI blocking and there's currently a hole where single-stepping a guest (from host userspace) could drop STI_BLOCKING if a different VM-Exit occurs between the single-step #DB VM-Exit and the instruction in the shadow. Though "don't do that" may be a reasonable answer in that case.
I thought the sti shadow blocked the single-step exception too? I know "mov->ss" does block debug interrupts too.
{MOV,POP}SS blocks #DBs, STI does not.
Or are you saying that it's some "single step by emulation" that just misses setting the STI_BLOCKING flag?
This is the case I was talking about for KVM. KVM supports single-stepping the guest from userspace and uses EFLAGS.TF to do so (since it works on both Intel and AMD). VMX has a consistency check that fails VM-Entry if STI_BLOCKING=1, EFLAGS.TF=1, IA32_DEBUGCTL.BTF=0 and there isn't a pending single-step #DB, so KVM clears STI_BLOCKING immediately before entering the guest when single-stepping. If a VM-Exit occurs immediately after VM-Entry, e.g. due to a hardware interrupt, then KVM will see STI_BLOCKING=0 when processing guest events in its run loop and will inject any pending interrupts.
I *think* the KVM behavior can be fixed, e.g. I'm not entirely sure why KVM takes this approach instead of setting PENDING_DBG.BS, but that's probably a moot point.
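The consistency check and the two ways of satisfying it can be sketched as a small C model. This is an illustration only: `struct toy_guest_state` and the `toy_*` functions are hypothetical names, not KVM or VMX definitions, and the predicate models just the single failing combination described in this message; the real VM-Entry checks cover more cases.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the guest state relevant to the check. */
struct toy_guest_state {
    bool sti_blocking;    /* interruptibility state: blocking by STI */
    bool eflags_tf;       /* EFLAGS.TF, set for single-stepping */
    bool debugctl_btf;    /* IA32_DEBUGCTL.BTF */
    bool pending_dbg_bs;  /* pending debug exceptions: BS (single-step) */
};

/* Models only the failing combination quoted in the message above. */
bool toy_vmentry_check_passes(const struct toy_guest_state *g)
{
    if (g->sti_blocking && g->eflags_tf &&
        !g->debugctl_btf && !g->pending_dbg_bs)
        return false;
    return true;
}

/*
 * Approach described for KVM: drop STI blocking before VM-Entry.
 * The check passes, but the shadow information is lost.
 */
void toy_fixup_clear_sti_blocking(struct toy_guest_state *g)
{
    g->sti_blocking = false;
    assert(toy_vmentry_check_passes(g));
}

/*
 * Alternative floated in the message: record a pending single-step #DB
 * instead, which keeps STI blocking intact. Untested idea, shown only
 * to contrast the two options.
 */
void toy_fixup_set_pending_bs(struct toy_guest_state *g)
{
    g->pending_dbg_bs = true;
    assert(toy_vmentry_check_passes(g));
}
```

The contrast is in what survives: after `toy_fixup_clear_sti_blocking()` a later pass through the run loop sees `sti_blocking == false` and would be free to inject an interrupt, whereas `toy_fixup_set_pending_bs()` leaves `sti_blocking` set.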
On Mon, Apr 29, 2019 at 05:08:46PM -0700, Sean Christopherson wrote:
On Mon, Apr 29, 2019 at 03:22:09PM -0700, Linus Torvalds wrote:
On Mon, Apr 29, 2019 at 3:08 PM Sean Christopherson sean.j.christopherson@intel.com wrote:
FWIW, Lakemont (Quark) doesn't block NMI/SMI in the STI shadow, but I'm not sure that counters the "horrible errata" statement ;-). SMI+RSM saves and restores STI blocking in that case, but AFAICT NMI has no such protection and will effectively break the shadow on its IRET.
Ugh. I can't say I care deeply about Quark (ie never seemed to go anywhere), but it's odd. I thought it was based on a Pentium core (or i486+?). Are you saying those didn't do it either?
It's 486 based, but either way I suspect the answer is "yes". IIRC, Knights Corner, a.k.a. Larrabee, also had funkiness around SMM and that was based on P54C, though I'm struggling to recall exactly what the Larrabee weirdness was.
Aha! Found an ancient comment that explicitly states P5 does not block NMI/SMI in the STI shadow, while P6 does block NMI/SMI.
On Mon, Apr 29, 2019 at 5:45 PM Sean Christopherson sean.j.christopherson@intel.com wrote:
On Mon, Apr 29, 2019 at 05:08:46PM -0700, Sean Christopherson wrote:
It's 486 based, but either way I suspect the answer is "yes". IIRC, Knights Corner, a.k.a. Larrabee, also had funkiness around SMM and that was based on P54C, though I'm struggling to recall exactly what the Larrabee weirdness was.
Aha! Found an ancient comment that explicitly states P5 does not block NMI/SMI in the STI shadow, while P6 does block NMI/SMI.
Ok, so the STI shadow really wouldn't be reliable on those machines. Scary.
Of course, the good news is that hopefully nobody has them any more, and if they do, they presumably don't use fancy NMI profiling etc, so any actual NMI's are probably relegated purely to largely rare and effectively fatal errors anyway (ie memory parity errors).
Linus
On Mon, Apr 29, 2019 at 07:26:02PM -0700, Linus Torvalds wrote:
On Mon, Apr 29, 2019 at 5:45 PM Sean Christopherson sean.j.christopherson@intel.com wrote:
On Mon, Apr 29, 2019 at 05:08:46PM -0700, Sean Christopherson wrote:
It's 486 based, but either way I suspect the answer is "yes". IIRC, Knights Corner, a.k.a. Larrabee, also had funkiness around SMM and that was based on P54C, though I'm struggling to recall exactly what the Larrabee weirdness was.
Aha! Found an ancient comment that explicitly states P5 does not block NMI/SMI in the STI shadow, while P6 does block NMI/SMI.
Ok, so the STI shadow really wouldn't be reliable on those machines. Scary.
Of course, the good news is that hopefully nobody has them any more, and if they do, they presumably don't use fancy NMI profiling etc, so any actual NMI's are probably relegated purely to largely rare and effectively fatal errors anyway (ie memory parity errors).
We do have KNC perf support, if that chip has 'issues'...
Outside of that, we only do perf from P6 onwards, with P4 support being in dubious shape because it's massively weird and 'nobody' still has those machines.
On Mon, 29 Apr 2019, Linus Torvalds wrote:
It's 486 based, but either way I suspect the answer is "yes". IIRC, Knights Corner, a.k.a. Larrabee, also had funkiness around SMM and that was based on P54C, though I'm struggling to recall exactly what the Larrabee weirdness was.
Aha! Found an ancient comment that explicitly states P5 does not block NMI/SMI in the STI shadow, while P6 does block NMI/SMI.
Ok, so the STI shadow really wouldn't be reliable on those machines. Scary.
Of course, the good news is that hopefully nobody has them any more, and if they do, they presumably don't use fancy NMI profiling etc, so any actual NMI's are probably relegated purely to largely rare and effectively fatal errors anyway (ie memory parity errors).
FWIW, if that thing has a local APIC (I have no idea, I've never seen Quark myself), then NMIs are used to trigger all-CPU backtraces as well. That can still happen in situations where the kernel is then expected to continue running undisrupted (softlockup, sysrq, hung task detector, ...).
Nothing to really worry about in the particular case of this HW perhaps, sure.
linux-kselftest-mirror@lists.linaro.org