In rseq_test, there are two threads, which are thread group leader and migration worker. The migration worker relies on sched_setaffinity() to force migration on the thread group leader. Unfortunately, we have wrong parameter (0) passed to sched_getaffinity(). It's actually forcing migration on the migration worker instead of the thread group leader. It also means migration can happen on the thread group leader at any time, which eventually leads to failure as the following logs show.
host# uname -r 5.19.0-rc6-gavin+ host# # cat /proc/cpuinfo | grep processor | tail -n 1 processor : 223 host# pwd /home/gavin/sandbox/linux.main/tools/testing/selftests/kvm host# for i in `seq 1 100`; \ do echo "--------> $i"; ./rseq_test; done --------> 1 --------> 2 --------> 3 --------> 4 --------> 5 --------> 6 ==== Test Assertion Failure ==== rseq_test.c:265: rseq_cpu == cpu pid=3925 tid=3925 errno=4 - Interrupted system call 1 0x0000000000401963: main at rseq_test.c:265 (discriminator 2) 2 0x0000ffffb044affb: ?? ??:0 3 0x0000ffffb044b0c7: ?? ??:0 4 0x0000000000401a6f: _start at ??:? rseq CPU = 4, sched CPU = 27
This fixes the issue by passing correct parameter, tid of the group thread leader, to sched_setaffinity().
Fixes: 61e52f1630f5 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs") Signed-off-by: Gavin Shan gshan@redhat.com --- tools/testing/selftests/kvm/rseq_test.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/rseq_test.c b/tools/testing/selftests/kvm/rseq_test.c index 4158da0da2bb..c83ac7b467f8 100644 --- a/tools/testing/selftests/kvm/rseq_test.c +++ b/tools/testing/selftests/kvm/rseq_test.c @@ -38,6 +38,7 @@ static __thread volatile struct rseq __rseq = { */ #define NR_TASK_MIGRATIONS 100000
+static pid_t rseq_tid; static pthread_t migration_thread; static cpu_set_t possible_mask; static int min_cpu, max_cpu; @@ -106,7 +107,8 @@ static void *migration_worker(void *ign) * stable, i.e. while changing affinity is in-progress. */ smp_wmb(); - r = sched_setaffinity(0, sizeof(allowed_mask), &allowed_mask); + r = sched_setaffinity(rseq_tid, sizeof(allowed_mask), + &allowed_mask); TEST_ASSERT(!r, "sched_setaffinity failed, errno = %d (%s)", errno, strerror(errno)); smp_wmb(); @@ -231,6 +233,7 @@ int main(int argc, char *argv[]) vm = vm_create_default(VCPU_ID, 0, guest_code); ucall_init(vm, NULL);
+ rseq_tid = gettid(); pthread_create(&migration_thread, NULL, migration_worker, 0);
for (i = 0; !done; i++) {