On Thu, Feb 27, 2025, Manali Shukla wrote:
On 2/26/2025 3:37 AM, Sean Christopherson wrote:
@@ -5225,6 +5230,8 @@ static __init void svm_set_cpu_caps(void) if (vnmi) kvm_cpu_cap_set(X86_FEATURE_VNMI);
kvm_cpu_cap_check_and_set(X86_FEATURE_IDLE_HLT);
I am 99% certain this is wrong. Or at the very least, severly lacking an explanation of why it's correct. If L1 enables Idle HLT but not HLT interception, then it is KVM's responsibility to NOT exit to L1 if there is a pending V_IRQ or V_NMI.
Yeah, it's buggy. But, it's buggy in part because *existing* KVM support is buggy. If L1 disables HLT exiting, but it's enabled in KVM, then KVM will run L2 with HLT exiting and so it becomes KVM's responsibility to check for valid L2 wake events prior to scheduling out the vCPU if L2 executes HLT. E.g. nVMX handles this by reading vmcs02.GUEST_INTERRUPT_STATUS.RVI as part of vmx_has_nested_events(). I don't see the equivalent in nSVM.
Amusingly, that means Idle HLT is actually a bug fix to some extent. E.g. if there is a pending V_IRQ/V_NMI in vmcb02, then running with Idle HLT will naturally do the right thing, i.e. not hang the vCPU.
Anyways, for now, I think the easiest and best option is to simply skip full nested support for the moment.
Got it, I see the issue you're talking about. I'll need to look into it a bit more to fully understand it. So yeah, we can hold off on full nested support for idle HLT intercept for now.
Since we are planning to disable Idle HLT support on nested guests, should we do something like this ?
@@ -167,10 +167,15 @@ void recalc_intercepts(struct vcpu_svm *svm) if (!nested_svm_l2_tlb_flush_enabled(&svm->vcpu)) vmcb_clr_intercept(c, INTERCEPT_VMMCALL);
if (!guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_IDLE_HLT))
vmcb_clr_intercept(c, INTERCEPT_IDLE_HLT);
When recalc_intercepts copies the intercept values from vmc01 to vmcb02, it also copies the IDLE HLT intercept bit, which is set to 1 in vmcb01. Normally, this isn't a problem because the HLT intercept takes priority when it's on. But if the HLT intercept gets turned off for some reason, the IDLE HLT intercept will stay on, which is not what we want.
Why don't we want that?