On Thu, Aug 28, 2025 at 5:13 PM Fei Li <lifei.shirley@bytedance.com> wrote:
> Actually, this is a bug triggered by a monitoring tool in our production environment. The monitor executes the HMP command 'info registers -a' at a fixed frequency, even during VM startup, which leaves some APs stuck in KVM_MP_STATE_UNINITIALIZED forever. The race occurs with extremely low probability, though: about 1~2 VM hangs per week.
> Considering that other emulators, such as cloud-hypervisor and firecracker, may also have similar potential race issues, I think KVM had better do some handling. But anyway, I will check the QEMU code to avoid such a race. Thanks to both of you for your comments. 🙂
If you can check whether other emulators invoke KVM_SET_VCPU_EVENTS in similar cases, that would of course help in understanding the situation better.
In QEMU, it is possible to delay KVM_GET_VCPU_EVENTS until after all vCPUs have halted.
Paolo