On Thu, Dec 12, 2024 at 11:42:37AM -0800, Sean Christopherson wrote:
Gah, I got my enums mixed up. I conflated RET_PF_WRITE_PROTECTED with EMULTYPE_WRITE_PF_TO_SP. Ignore the above.
FWIW, KVM _can't_ unprotect and retry in the EMULTYPE_WRITE_PF_TO_SP case. From kvm_unprotect_and_retry_on_failure():
Ah alright, I guess I get it now :) So, EMULTYPE_ALLOW_RETRY_PF is set whenever there is a PF when accessing write-protected page, and EMULTYPE_WRITE_PF_TO_SP is set when this access touches SPTE used to translate the write itself. If both are set, we can't unprotect and retry, and the latter can be set only if the former is set.
Unprotect and re-execute is fine, what I'm worried about is *successfully* emulating the instruction. E.g.
- CPU executes instruction X and hits a #GP.
- While vectoring the #GP, a shadow #PF is taken.
- On VM-Exit, KVM re-injects the #GP (see __vmx_complete_interrupts()).
- KVM emulates because of the write-protected page.
- KVM "successfully" emulates and also detects the #GP
- KVM synthesizes a #GP, and because the vCPU already has injected #GP, incorrectly escalates to a #DF.
The above is a bit contrived, but I think it could happen if the guest reused a page that _was_ a page table, for a vCPU's kernel stack.
Does it work like that only for contributory exceptions / page faults? In case if it's not #GP but (for instance) #UD, (as far as I understand) KVM will queue only one of them without causing #DF so it's gonna be valid?
However, I'm not sure what happens if vectoring is caused by external interrupt: if we unprotect the page and re-execute the instruction, will IRQ be delivered nonetheless, or it will be lost as irq is already in ISR? Do we need to re-inject it in such a case?
In all cases, the event that was being vectored is re-injected. Restarting from scratch would be a bug. E.g. if the cause of initial exception was "fixed", say because the initial exception was #BP, and the guest finished patching out the INT3, then restarting would execute the _new_ instruction, and the INT3 would be lost.
Cool, that is what I was concerned about, glad that it is already implemented :)
As far as unprotect+retry being viable, I think we're on the same page. What I'm getting at is that I think KVM should never allow emulating on #PF when the #PF occurred while vectoring. E.g. this:
static inline bool kvm_can_emulate_event_vectoring(int emul_type) { return !(emul_type & EMULTYPE_PF); }
Yeah, I agree. I'll post a V3 with suggested fixes (after running all of the selftests to be sure that it doesn't break anything).
and then I believe this? Where this diff can be a separate prep patch (though I'm pretty sure it's technically pointless without the vectoring angle, because shadow #PF can't coincide with any of the failure paths for kvm_check_emulate_insn()).
Looks good. If you don't mind, I could add this patch to the series with `Suggested-by` tag since it's neccessary to allow unprotect+retry in case of shadow #PF during vectoring.
-- Kind regards, Ivan Orlov
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 07c6f1d5323d..63361b2da450 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9107,6 +9107,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, if (r == X86EMUL_RETRY_INSTR || r == X86EMUL_PROPAGATE_FAULT) return 1;
if (kvm_unprotect_and_retry_on_failure(vcpu, cr2_or_gpa,
emulation_type))
return 1;
if (r == X86EMUL_UNHANDLEABLE_VECTORING_IO) { kvm_prepare_event_vectoring_exit(vcpu, cr2_or_gpa); return 0;