On 8/28/25 12:08 AM, Paolo Bonzini wrote:
On Wed, Aug 27, 2025 at 6:01 PM Sean Christopherson seanjc@google.com wrote:
On Wed, Aug 27, 2025, Fei Li wrote:
Commit ff90afa75573 ("KVM: x86: Evaluate latched_init in KVM_SET_VCPU_EVENTS when vCPU not in SMM") changes the KVM_SET_VCPU_EVENTS handler to set the pending LAPIC INIT event regardless of whether the vCPU is in SMM.
However, latching INIT without checking the CPU state introduces a race condition that can cause the INIT event to be lost. This is fatal during VM startup because some APs then never switch to non-root mode. As commit f4ef19108608 ("KVM: X86: Fix loss of pending INIT due to race") described:

      BSP                          AP
                     kvm_vcpu_ioctl_x86_get_vcpu_events
                       events->smi.latched_init = 0

                     kvm_vcpu_block
                       kvm_vcpu_check_block
                         schedule

send INIT to AP
                     kvm_vcpu_ioctl_x86_set_vcpu_events
                     (e.g. `info registers -a` when VM starts/reboots)
                       if (events->smi.latched_init == 0)
                         clear INIT in pending_events
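
The interleaving above is a lost update: a snapshot taken before the INIT arrives is written back after it. A minimal sketch of that sequence in plain C follows. Everything prefixed with `model_` or `MODEL_` is a hypothetical stand-in invented for illustration, not a KVM or QEMU symbol; only the set-or-clear behaviour on `latched_init` is taken from the description in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the KVM_APIC_INIT bit in pending_events. */
#define MODEL_APIC_INIT (1u << 0)

/* Hypothetical model of the AP's LAPIC state. */
struct model_vcpu {
    unsigned int pending_events;
};

/* Hypothetical model of the snapshot userspace holds (events->smi.latched_init). */
struct model_events {
    bool latched_init;
};

/* Models the "get events" step: userspace snapshots whether INIT is latched. */
static void model_get_events(const struct model_vcpu *v, struct model_events *e)
{
    e->latched_init = (v->pending_events & MODEL_APIC_INIT) != 0;
}

/* Models the "set events" step: pending INIT is overwritten from the snapshot. */
static void model_set_events(struct model_vcpu *v, const struct model_events *e)
{
    if (e->latched_init)
        v->pending_events |= MODEL_APIC_INIT;
    else
        v->pending_events &= ~MODEL_APIC_INIT;
}

/* Models the BSP sending INIT to the AP. */
static void model_send_init(struct model_vcpu *v)
{
    v->pending_events |= MODEL_APIC_INIT;
}

/* Replays the interleaving; returns true if INIT is still pending at the end. */
static bool model_replay_race(void)
{
    struct model_vcpu ap = { .pending_events = 0 };
    struct model_events snapshot;

    model_get_events(&ap, &snapshot);   /* snapshot: latched_init = 0 */
    model_send_init(&ap);               /* BSP sends INIT while the AP is blocked */
    assert(ap.pending_events & MODEL_APIC_INIT);
    model_set_events(&ap, &snapshot);   /* stale write-back clears INIT */

    return (ap.pending_events & MODEL_APIC_INIT) != 0;
}
```

Calling `model_replay_race()` returns false: the INIT sent between the two steps is gone, which in the real system would leave the AP waiting for an INIT that never arrives.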
This is a QEMU bug, no?
I think I agree.
Actually, this bug is triggered by a monitoring tool in our production environment. The monitor executes the HMP command 'info registers -a' at a fixed frequency, even during VM startup, which leaves some APs in KVM_MP_STATE_UNINITIALIZED forever. But this race only occurs with extremely low probability, about 1-2 VM hangs per week.
Considering that other emulators, like cloud-hypervisor and firecracker, may also have similar potential race issues, I think KVM had better do some handling. But anyway, I will check the QEMU code to avoid such a race. Thanks for both of your comments. 🙂
Have a nice day, thanks
Fei
IIUC, it's invoking kvm_vcpu_ioctl_x86_set_vcpu_events() with stale data.
More precisely, it's not expecting other vCPUs to change the pending events asynchronously.
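
One hypothetical way for a userspace VMM to avoid writing back stale events after a read-only inspection is to track whether it modified its cached copy at all. The sketch below is only an illustration of that idea under my own assumptions; the `sketch_` and `SKETCH_` names are invented here, and this is not QEMU's actual code or a proposed fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pending INIT bit. */
#define SKETCH_APIC_INIT (1u << 0)

/* Hypothetical model of the vCPU state owned by the kernel. */
struct sketch_vcpu {
    unsigned int pending_events;
};

/* Hypothetical userspace cache of the vCPU events. */
struct sketch_cache {
    bool latched_init;  /* cached copy of the latched INIT state */
    bool dirty;         /* true only if userspace itself changed the cache */
};

/* Read-only inspection: refresh the cache and mark it clean. */
static void sketch_read_for_inspection(const struct sketch_vcpu *v,
                                       struct sketch_cache *c)
{
    c->latched_init = (v->pending_events & SKETCH_APIC_INIT) != 0;
    c->dirty = false;
}

/* Write back only when userspace actually modified the cached events. */
static void sketch_write_back_if_dirty(struct sketch_vcpu *v,
                                       const struct sketch_cache *c)
{
    if (!c->dirty)
        return; /* nothing changed, so nothing can be clobbered */

    if (c->latched_init)
        v->pending_events |= SKETCH_APIC_INIT;
    else
        v->pending_events &= ~SKETCH_APIC_INIT;
}

/* Same interleaving as the race; returns true if INIT survives. */
static bool sketch_replay_with_dirty_flag(void)
{
    struct sketch_vcpu ap = { .pending_events = 0 };
    struct sketch_cache cache;

    sketch_read_for_inspection(&ap, &cache);  /* inspection only */
    ap.pending_events |= SKETCH_APIC_INIT;    /* another vCPU sends INIT */
    sketch_write_back_if_dirty(&ap, &cache);  /* skipped: cache is clean */

    return (ap.pending_events & SKETCH_APIC_INIT) != 0;
}
```

This only helps the inspection case. If userspace really does modify the events, the same window exists between its read and its write.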
Yes, I will work out a more complete call sequence later.
I'm also a bit confused as to how QEMU is even gaining control of the vCPU to emit KVM_SET_VCPU_EVENTS if the vCPU is in kvm_vcpu_block().
With a signal. :)
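
For readers unfamiliar with the mechanism: a thread blocked in a system call returns to userspace with EINTR when a signal whose handler was installed without SA_RESTART is delivered to it. The sketch below shows that general POSIX behaviour using a blocking read() on an empty pipe and a timer signal. Both are stand-ins chosen for a self-contained example, not what QEMU or KVM do; the function and handler names are hypothetical.

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Set by the handler so the caller can confirm the signal was delivered. */
static volatile sig_atomic_t kick_seen;

static void kick_handler(int signo)
{
    (void)signo;
    kick_seen = 1;
}

/*
 * Blocks in read() on an empty pipe until a timer signal arrives.
 * Returns true if read() failed with EINTR after the handler ran.
 */
static bool blocked_call_interrupted_by_signal(void)
{
    int fds[2];
    char byte;
    struct sigaction sa;
    struct itimerval timer;
    ssize_t n;
    int saved_errno;

    if (pipe(fds) != 0)
        return false;

    /* Install the handler without SA_RESTART so read() is not restarted. */
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = kick_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, NULL) != 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }

    /* One-shot timer: deliver SIGALRM after 50 ms. */
    memset(&timer, 0, sizeof(timer));
    timer.it_value.tv_usec = 50000;
    if (setitimer(ITIMER_REAL, &timer, NULL) != 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }

    /* Nothing is ever written to the pipe, so this blocks until the signal. */
    n = read(fds[0], &byte, 1);
    saved_errno = errno;

    close(fds[0]);
    close(fds[1]);

    return n == -1 && saved_errno == EINTR && kick_seen == 1;
}
```

In a single-threaded program, `blocked_call_interrupted_by_signal()` is expected to return true: the blocked call is cut short by the signal and control returns to userspace, which is then free to issue further calls.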
Paolo