On Thu, 02 May 2024 14:39:36 -0700, Zide Chen wrote:
Currently, the migration worker delays 1-10 us, assuming that one KVM_RUN iteration takes only a few microseconds. But if the CPU's low-power wakeup latency is large enough, for example the hundreds or even thousands of microseconds of deep C-state exit latency on x86 server CPUs, the target CPU may not be woken up before the migration worker starts to migrate the vCPU thread to the next CPU.
[...]
Applied to kvm-x86 selftests, thanks! I tweaked the changelog slightly to call out the new comment and assert message. I also added an extra newline so that the "help" part of the assert message is isolated from the primary explanation of why the assert fired. E.g. the output looks like:
  ==== Test Assertion Failure ====
    rseq_test.c:290: skip_sanity_check || i > (NR_TASK_MIGRATIONS * 2002)
    pid=20283 tid=20283 errno=4 - Interrupted system call
       1  0x000000000040210a: main at rseq_test.c:286
       2  0x00007f07fa821c86: ?? ??:0
       3  0x0000000000402209: _start at ??:?
    Only performed 11162 KVM_RUNs, task stalled too much?

    Try disabling deep sleep states to reduce CPU wakeup latency,
    e.g. via cpuidle.off=1 or setting /dev/cpu_dma_latency to '0',
    or run with -u to disable this sanity check.
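
For anyone unfamiliar with the /dev/cpu_dma_latency knob mentioned in the help text, here is a rough sketch of how a wrapper could hold a 0 us latency request while the test runs. The helper name hold_cpu_dma_latency() and its path parameter are made up for illustration and are not part of rseq_test.c. The sketch relies on the PM QoS behaviour that the request stays active only while the file descriptor remains open, and writing to the real device normally requires root.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the selftest: write a 32-bit latency
 * target (in microseconds) to a PM QoS device node such as
 * /dev/cpu_dma_latency.  The kernel honours the request only while the
 * returned fd stays open, so the caller must keep it open for the whole
 * test run and close() it afterwards.
 *
 * Returns the open fd on success, -1 on failure.
 */
static int hold_cpu_dma_latency(const char *path, int32_t usec)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;

	/* The device expects the value as a raw 32-bit integer. */
	if (write(fd, &usec, sizeof(usec)) != (ssize_t)sizeof(usec)) {
		close(fd);
		return -1;
	}

	return fd;
}
```

A caller would do something like fd = hold_cpu_dma_latency("/dev/cpu_dma_latency", 0), run the test, then close(fd) to drop the request and let the CPUs enter deep C-states again.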
[1/1] KVM: selftests: Add a new option to rseq_test
      https://github.com/kvm-x86/linux/commit/20ecf595b513