Hi Christophe and Dimitry,
Le 27/03/2021 à 18:43, Dmitry Safonov a écrit :
Hi Christophe,
On 3/27/21 5:19 PM, Christophe Leroy wrote: [..]
I opportunistically Cc stable on it: I understand that usually such stuff isn't a stable material, but that will allow us in CRIU have one workaround less that is needed just for one release (v5.11) on one platform (ppc64), which we otherwise have to maintain.
Why is that a workaround, and why for one release only ? I think the solution proposed by Laurentto use the aux vector AT_SYSINFO_EHDR should work with any past and future release.
Yeah, I guess. Previously, (before v5.11/power) all kernels had ELF start at "[vdso]" VMA start, now we'll have to carry the offset in the VMA. Probably, not the worst thing, but as it will be only for v5.11 release it can break, so needs separate testing. Kinda life was a bit easier without this additional code.
The assumption that ELF header is at the start of "[vdso]" is perhaps not a good one, but using a "[vvar]" section looks more conventional and allows to clearly identify the data part. I'd argue for this option.
I wouldn't go as far as to say that the commit 511157ab641e is ABI regression as no other userspace got broken, but I'd really appreciate if it gets backported to v5.11 after v5.12 is released, so as not to complicate already non-simple CRIU-vdso code. Thanks!
Cc: Andrei Vagin avagin@gmail.com Cc: Andy Lutomirski luto@kernel.org Cc: Benjamin Herrenschmidt benh@kernel.crashing.org Cc: Christophe Leroy christophe.leroy@csgroup.eu Cc: Laurent Dufour ldufour@linux.ibm.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: Paul Mackerras paulus@samba.org Cc: linuxppc-dev@lists.ozlabs.org Cc: stable@vger.kernel.org # v5.11 [1]: https://github.com/checkpoint-restore/criu/issues/1417 Signed-off-by: Dmitry Safonov dima@arista.com Tested-by: Christophe Leroy christophe.leroy@csgroup.eu
I tested it with sifreturn_vdso selftest and it worked, because that selftest doesn't involve VDSO data.
Thanks again on helping with testing it, I appreciate it!
But if I do a mremap() on the VDSO text vma without remapping VVAR to keep the same distance between the two vmas, gettimeofday() crashes. The reason is that the code obtains the address of the data by calculating a fix difference from its own address with the below macro, the delta being resolved at link time:
.macro get_datapage ptr bcl 20, 31, .+4 999: mflr \ptr #if CONFIG_PPC_PAGE_SHIFT > 14 addis \ptr, \ptr, (_vdso_datapage - 999b)@ha #endif addi \ptr, \ptr, (_vdso_datapage - 999b)@l .endm
So the datapage needs to remain at the same distance from the code at all time.
Wondering how the other architectures do to have two independent VMAs and be able to move one independently of the other.
It's alright as far as I know. If userspace remaps vdso/vvar it should be aware of this (CRIU keeps this in mind, also old vdso image is dumped to compare on restore with the one that the host has).
I do agree, playing with the VDSO mapping needs the application to be aware of the mapping details, and prior to 83d3f0e90c6c "powerpc/mm: tracking vDSO remap", remapping the VDSO was not working on PowerPC and nobody complained...
Laurent.