Re: [PATCH v2 4/5] KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs

27 Aug 2021

On Fri, Aug 27, 2021, Mathieu Desnoyers wrote:
...
...
> > So there are effectively three reasons we want a delay:
> >
> >   1. To allow sched_setaffinity() to coincide with ioctl(KVM_RUN) before KVM can
> >      enter the guest so that the guest doesn't need an arch-specific VM-Exit source.
> >
> >   2. To let ioctl(KVM_RUN) make its way back to the test before the next round
> >      of migration.
> >
> >   3. To ensure the read-side can make forward progress, e.g. if sched_getcpu()
> >      involves a syscall.
> >
> > After looking at KVM for arm64 and s390, #1 is a bit tenuous because x86 is the
> > only arch that currently uses xfer_to_guest_mode_work(), i.e. the test could be
> > tweaked to be overtly x86-specific.  But since a delay is needed for #2 and #3,
> > I'd prefer to rely on it for #1 as well in the hopes that this test provides
> > coverage for arm64 and/or s390 if they're ever converted to use the common
> > xfer_to_guest_mode_work().
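
For readers following along, the migration side that the three reasons refer
to could be sketched roughly as below.  This is only an illustration under
stated assumptions, not the code from the patch: pin_to_cpu(),
first_allowed_cpu() and migrate_between() are hypothetical names, and the
1-10us delay is just one candidate value.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* needed for sched_setaffinity() and CPU_SET() */
#endif
#include <assert.h>
#include <sched.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: pin thread 'tid' (0 == caller) to a single CPU. */
static int pin_to_cpu(pid_t tid, int cpu)
{
	cpu_set_t set;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(tid, sizeof(set), &set);
}

/* Hypothetical helper: first CPU in the caller's affinity mask, or -1. */
static int first_allowed_cpu(void)
{
	cpu_set_t set;
	int cpu;

	if (sched_getaffinity(0, sizeof(set), &set))
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (CPU_ISSET(cpu, &set))
			return cpu;
	}
	return -1;
}

/*
 * Hypothetical migration loop: bounce 'tid' between cpu_a and cpu_b.  The
 * short sleep after each move is the delay under discussion; it gives the
 * migrated task time to reach KVM_RUN, return to userspace, and make
 * progress on its read side before the next migration.
 */
static int migrate_between(pid_t tid, int cpu_a, int cpu_b, unsigned int rounds)
{
	unsigned int i;

	for (i = 0; i < rounds; i++) {
		if (pin_to_cpu(tid, (i & 1) ? cpu_b : cpu_a))
			return -1;
		usleep((i % 10) + 1);	/* assumed 1-10us delay */
	}
	return 0;
}
```

In a real test the loop would presumably run in a separate thread and target
the thread that issues ioctl(KVM_RUN); passing 0 as the tid, as a quick smoke
test might, only migrates the caller itself.
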
> Now that we have this understanding of why we need the delay, it would be good to
> write this down in a comment within the test.

Ya, I'll get a new version out next week.

...
> Does it reproduce if we randomize the delay to have it picked randomly from 0us
> to 100us (with 1us step)? It would remove much of the need for an arch-specific
> magic delay value.

My less-than-scientific testing shows that it can reproduce at delays up to ~500us,
but above ~10us the reproducibility starts to drop.  The bug still reproduces
reliably; it just takes more iterations, and obviously the test runs a bit slower.

Any objection to using a 1-10us delay, e.g. a simple usleep((i % 10) + 1)?
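
To make the proposed expression concrete, here is a minimal sketch of how
(i % 10) + 1 cycles through delays of 1us to 10us as the iteration counter
advances.  The helper names migration_delay_us() and migration_delay() are
made up for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: map iteration 'i' to a delay of 1..10 microseconds. */
static unsigned int migration_delay_us(unsigned int i)
{
	unsigned int delay = (i % 10) + 1;

	/* i % 10 is in 0..9, so the delay is never zero and never above 10. */
	assert(delay >= 1 && delay <= 10);
	return delay;
}

/* Hypothetical helper: sleep for the delay that belongs to iteration 'i'. */
static void migration_delay(unsigned int i)
{
	usleep(migration_delay_us(i));
}
```

Iterations 0 through 9 yield 1us through 10us, and iteration 10 wraps back to
1us, so the delay is deterministic rather than truly random but still varies
from one migration to the next.
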
