On Mon, Mar 9, 2020 at 2:09 AM Thomas Gleixner <tglx@linutronix.de> wrote:
Paolo Bonzini <pbonzini@redhat.com> writes:
On 09/03/20 07:57, Thomas Gleixner wrote:
Thomas Gleixner <tglx@linutronix.de> writes:
guest side:

    nmi()/mce() ...
        stash_crs();
        stash_and_clear_apf_reason();
        ....
        restore_apf_reason();
        restore_cr2();
Too obvious, isn't it?
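For readers following along, the stash/restore sequence above can be modelled in plain userspace C. This is only a sketch under stated assumptions: `struct vcpu_model`, `nmi_model()` and `body_taking_fault()` are hypothetical names invented for illustration, and the two fields stand in for CR2 and the shared async page fault reason word. It is not the kernel's entry code.

```c
#include <stdint.h>

/* Hypothetical model of the state an NMI/MCE handler could clobber. */
struct vcpu_model {
    uint64_t cr2;        /* stands in for the CR2 register */
    uint32_t apf_reason; /* stands in for the shared APF reason word */
};

typedef void (*nmi_body_fn)(struct vcpu_model *);

/*
 * Stash both values on entry, clear the reason so a nested #PF inside
 * the handler is not mistaken for an async page fault, run the handler
 * body, then restore in reverse order.
 */
static void nmi_model(struct vcpu_model *v, nmi_body_fn body)
{
    uint64_t saved_cr2 = v->cr2;
    uint32_t saved_reason = v->apf_reason;

    v->apf_reason = 0;

    body(v);

    v->apf_reason = saved_reason;
    v->cr2 = saved_cr2;
}

/* Hypothetical handler body that takes a nested fault and clobbers state. */
static void body_taking_fault(struct vcpu_model *v)
{
    v->cr2 = 0xdead000;  /* nested fault address overwrites CR2 */
    v->apf_reason = 0;   /* nested fault handler consumed the reason */
}
```

Calling `nmi_model(&v, body_taking_fault)` with `v.cr2 = 0x1000` and `v.apf_reason = 1` leaves both fields unchanged afterwards, which is the property the interrupted #PF handler depends on.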
Yes, this works but Andy was not happy about adding more save-and-restore to NMIs. If you do not want to do that, I'm okay with disabling async page fault support for now.
I'm fine with doing that save/restore dance, but I have no strong opinion either.
Storing the page fault reason in memory was not a good idea. Better options would be to co-opt the page fault error code (e.g. store the reason in bits 31:16, mark bits 15:0 with the invalid error code RSVD=1/P=0), or to use the virtualization exception area.
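A minimal sketch of the error-code idea, assuming the layout described above: the reason lives in bits 31:16, and the low half carries RSVD=1 with P=0, a combination the hardware does not generate because a reserved-bit violation implies a present mapping. The bit positions of P (bit 0) and RSVD (bit 3) are the architectural #PF error code bits; every `apf_` and `APF_` name below is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 #PF error code bits. */
#define PF_EC_P     (1u << 0)  /* fault on a present page */
#define PF_EC_RSVD  (1u << 3)  /* reserved bit set in a paging entry */

/* Hypothetical layout: reason in bits 31:16. */
#define APF_EC_REASON_SHIFT 16

/* Build an error code that marks the fault as an async page fault. */
static inline uint32_t apf_encode_error_code(uint16_t reason)
{
    return ((uint32_t)reason << APF_EC_REASON_SHIFT) | PF_EC_RSVD;
}

/* RSVD=1 together with P=0 identifies the software-defined encoding. */
static inline bool apf_error_code_is_async(uint32_t error_code)
{
    return (error_code & (PF_EC_RSVD | PF_EC_P)) == PF_EC_RSVD;
}

/* Extract the reason from bits 31:16. */
static inline uint16_t apf_error_code_reason(uint32_t error_code)
{
    return (uint16_t)(error_code >> APF_EC_REASON_SHIFT);
}
```

With this encoding the reason travels with the exception itself, so there is no separate memory word for an NMI or MCE to clobber. A genuine reserved-bit fault (RSVD=1, P=1) is still classified as an ordinary page fault.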
Memory store is not the problem. The real problem is hijacking #PF.
If you had just used a separate VECTOR_ASYNC_PF then none of these problems would exist at all.
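To make the contrast concrete, here is a rough sketch of the two dispatch schemes. Vector 14 is the architectural #PF vector; the numeric value given to `VECTOR_ASYNC_PF` is an arbitrary placeholder, and `classify_hijacked()` and `classify_separate()` are hypothetical helpers, not kernel functions.

```c
#include <stdint.h>

enum {
    VECTOR_PF       = 14,   /* architectural page fault vector */
    VECTOR_ASYNC_PF = 0xec, /* placeholder value, assumption only */
};

enum event {
    EVENT_OTHER,
    EVENT_PAGE_FAULT,
    EVENT_ASYNC_PF,
};

/*
 * Hijacked #PF: the handler has to consult side-band state (the reason
 * word) to tell the two events apart, so that state must survive any
 * nested NMI/MCE.
 */
static enum event classify_hijacked(uint8_t vector, uint32_t apf_reason)
{
    if (vector != VECTOR_PF)
        return EVENT_OTHER;
    return apf_reason ? EVENT_ASYNC_PF : EVENT_PAGE_FAULT;
}

/* Separate vector: the vector number alone identifies the event. */
static enum event classify_separate(uint8_t vector)
{
    switch (vector) {
    case VECTOR_PF:
        return EVENT_PAGE_FAULT;
    case VECTOR_ASYNC_PF:
        return EVENT_ASYNC_PF;
    default:
        return EVENT_OTHER;
    }
}
```

In the second scheme the #PF path never inspects the reason word, so a nested exception cannot make a real page fault look like an async one.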
I'm okay with the save/restore dance, I guess. It's just yet more entry crud to deal with architecture nastiness, except that this nastiness is 100% software and isn't Intel/AMD's fault.
At least until we get an async page fault due to apf_reason being paged out...