On 8/9/22 10:57 AM, Mathieu Desnoyers wrote:
----- Gavin Shan gshan@redhat.com wrote:
Hi Florian,
On 8/9/22 2:01 AM, Florian Weimer wrote:
It has come to my attention that the KVM rseq test apparently needs to be ported to glibc 2.35. The background is that on aarch64, rseq is the only way to get a practically useful sched_getcpu. (There's no hidden per-task CPU state the vDSO could reveal as the CPU ID.)
Yes, kvm/selftests/rseq needs to support glibc 2.35. The question is whether this is about glibc 2.34 or 2.35, because kvm/selftests/rseq fails on glibc 2.34.
I would guess that an upstream glibc-2.35 feature is enabled in the downstream glibc-2.34?
# ./rseq_test
==== Test Assertion Failure ====
  rseq_test.c:60: !r
  pid=112043 tid=112043 errno=22 - Invalid argument
     1  0x0000000000401973: main at rseq_test.c:226
     2  0x0000ffff84b6c79b: ?? ??:0
     3  0x0000ffff84b6c86b: ?? ??:0
     4  0x0000000000401b6f: _start at ??:?
  rseq failed, errno = 22 (Invalid argument)
# rpm -aq | grep glibc-2
glibc-2.34-39.el9.aarch64
The main rseq tests have already been adjusted via:
commit 233e667e1ae3e348686bd9dd0172e62a09d852e1
Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Date:   Mon Jan 24 12:12:45 2022 -0500

    selftests/rseq: Uplift rseq selftests for compatibility with glibc-2.35

    glibc-2.35 (upcoming release date 2022-02-01) exposes the rseq per-thread
    data in the TCB, accessible at an offset from the thread pointer, rather
    than through an actual Thread-Local Storage (TLS) variable, as the
    Linux kernel selftests initially expected.

    The __rseq_abi TLS and glibc-2.35's ABI for per-thread data cannot
    actively coexist in a process, because the kernel supports only a
    single rseq registration per thread.

    Here is the scheme introduced to ensure selftests can work both with an
    older glibc and with glibc-2.35+:

    - librseq exposes its own "rseq_offset, rseq_size, rseq_flags" ABI.

    - librseq queries for glibc rseq ABI (__rseq_offset, __rseq_size,
      __rseq_flags) using dlsym() in a librseq library constructor. If those
      are found, copy their values into rseq_offset, rseq_size, and
      rseq_flags.

    - Else, if those glibc symbols are not found, handle rseq registration
      from librseq and use its own IE-model TLS to implement the rseq ABI
      per-thread storage.

    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/20220124171253.22072-8-mathieu.desnoyers@efficios.com
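As a rough illustration of the dlsym() step described in the commit message, the C sketch below looks up the three glibc symbols and copies their values when all of them are present. The names struct rseq_abi_info and probe_libc_rseq() are invented for this example and are not part of librseq or the kernel selftests. Using RTLD_DEFAULT as the lookup scope is also an assumption. The fallback path (registering rseq from the library itself) is deliberately left out.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical container for the values a library would expose itself. */
struct rseq_abi_info {
	ptrdiff_t offset;   /* offset of the rseq area from the thread pointer */
	unsigned int size;  /* size of the registered rseq area */
	unsigned int flags; /* flags used for the registration */
	int from_libc;      /* 1 if the values were copied from libc symbols */
};

/*
 * Look up the glibc rseq ABI symbols at run time. Returns 1 and fills
 * *info if all three symbols are found, otherwise zeroes *info and
 * returns 0 so the caller can fall back to its own registration.
 */
static int probe_libc_rseq(struct rseq_abi_info *info)
{
	assert(info != NULL);

	const ptrdiff_t *off = dlsym(RTLD_DEFAULT, "__rseq_offset");
	const unsigned int *size = dlsym(RTLD_DEFAULT, "__rseq_size");
	const unsigned int *flags = dlsym(RTLD_DEFAULT, "__rseq_flags");

	if (off && size && flags) {
		info->offset = *off;
		info->size = *size;
		info->flags = *flags;
		info->from_libc = 1;
		return 1;
	}

	/* Symbols missing: older libc, or a libc without the rseq ABI. */
	info->offset = 0;
	info->size = 0;
	info->flags = 0;
	info->from_libc = 0;
	return 0;
}
```

A caller would invoke probe_libc_rseq() once at startup (for example from a constructor) and only attempt its own rseq registration when it returns 0.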
But I don't see a similar adjustment for tools/testing/selftests/kvm/rseq_test.c. As an additional wrinkle, you'd have to start calling getcpu (glibc function or system call) because comparing rseq.cpu_id against sched_getcpu won't test anything anymore once glibc implements sched_getcpu using rseq.
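To make the getcpu point concrete, here is a minimal sketch of reading the CPU number through the raw system call, so that the reference value does not come from an rseq-backed sched_getcpu. The helper name getcpu_via_syscall() is hypothetical and is not taken from rseq_test.c. A real test would still have to tolerate the thread migrating between this call and the read of rseq.cpu_id.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Ask the kernel for the current CPU via the getcpu system call,
 * bypassing any libc fast path. Returns 0 on success and stores the
 * CPU number in *cpu; returns -1 with errno set on failure.
 */
static int getcpu_via_syscall(unsigned int *cpu)
{
	assert(cpu != NULL);

	/* The NUMA node and the legacy cache argument are not needed here. */
	return (int)syscall(SYS_getcpu, cpu, NULL, NULL);
}
```

The value stored in *cpu can then be compared against the cpu_id field of the thread's rseq area.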
We noticed this because our downstream glibc version, while based on 2.34, enables rseq registration by default. To facilitate coordination with rseq application usage, we also backported the __rseq_* ABI symbols, so the selftests could use that even in our downstream version. (We enable the glibc tunables downstream, but they are an optional glibc feature, so it's probably better in the long run to fix the kernel selftests rather than using the tunables as a workaround.)
Thanks for the pointer. It makes sense. So it means rseq registration has been done by glibc TLS? In this case, kvm/selftests/rseq is unable to register again.
The registration is done by glibc initialization and thread startup code.
I will come up with something similar for kvm/selftests/rseq.
Make sure to check the rseq selftests fixes recently pulled in the current merge window as well. One is relevant:
https://github.com/torvalds/linux/commit/d1a997ba4c1bf65497d956aea90de42a639...
We may want to find a way to remove this duplicated rseq.c code eventually.
Thanks, Mathieu. The check for 'rseq-size' will be included as well. I almost have something working. I will post the fixes after some tests.
Thanks, Gavin