On Wed, Apr 21, 2021 at 06:27PM +0200, Marco Elver wrote:
On Wed, Apr 21, 2021 at 05:11PM +0200, Marco Elver wrote:
+Cc linux-arm-kernel
[...]
I've managed to reproduce this issue with a public Raspberry Pi OS Lite rootfs image, even without deploying kernel modules:
https://downloads.raspberrypi.org/raspios_lite_armhf/images/raspios_lite_arm...
# qemu-system-arm -M virt -smp 2 -m 512 -kernel zImage -append "earlycon console=ttyAMA0 root=/dev/vda2 rw rootwait" -serial stdio -display none -monitor null -device virtio-blk-device,drive=virtio-blk -drive file=/tmp/2021-03-04-raspios-buster-armhf-lite.img,id=virtio-blk,if=none,format=raw -netdev user,id=user -device virtio-net-device,netdev=user
The above doesn't boot if zImage is compiled from commit fb6cc127e0b6, and boots if compiled from 2e498d0a74e5. In both cases I've used the default arm/multi_v7_defconfig and the gcc-linaro-6.4.1-2017.11-x86_64_arm-linux-gnueabi toolchain.
Yup, I've narrowed it down to the addition of "__u64 _perf" to siginfo_t. My guess is the __u64 causes a different alignment for a bunch of adjacent fields. It seems that x86 and m68k are the only ones that have compile-time tests for the offsets. Arm should probably add those -- I have added a bucket of static_assert() in arch/arm/kernel/signal.c and see that something's off.
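For illustration, compile-time offset checks of the kind mentioned above could look roughly like this. It is only a sketch: struct demo_siginfo and its members are hypothetical stand-ins built from fixed-width types, not the kernel's siginfo_t, and the expected offsets apply to this mock layout only.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* offsetof */
#include <stdint.h>

/* Hypothetical stand-in for siginfo_t; NOT the kernel's definition. */
struct demo_siginfo {
	int32_t si_signo;
	int32_t si_errno;
	int32_t si_code;
	union {
		struct {
			uint32_t addr;	/* fault address, 32-bit for this demo */
			uint32_t perf;	/* word-sized payload */
		} sigfault;
		uint32_t pad[29];	/* pads the whole structure to 128 bytes */
	} sifields;
};

/* Any layout change that moves a field breaks the build right here. */
static_assert(offsetof(struct demo_siginfo, si_code) == 8, "si_code moved");
static_assert(offsetof(struct demo_siginfo, sifields) == 12, "union moved");
static_assert(offsetof(struct demo_siginfo, sifields.sigfault.perf) == 16,
	      "perf moved");
static_assert(sizeof(struct demo_siginfo) == 128, "size changed");
```

With checks like these in place, an ABI-breaking layout change fails at build time rather than showing up as a boot failure.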
I'll hopefully have a fix in a day or so.
Arm and compiler folks: is there some special alignment requirement for __u64 on 32-bit arm? (And if there is for arm64, please shout as well.)
With the static-asserts below, the only thing that I can do to fix it is to completely remove the __u64. Padding it before or after with __u32 just does not work. It seems that the use of __u64 shifts everything in __sifields by 4 bytes.
diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index d0bb9125c853..b02a4ac55938 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -92,7 +92,10 @@ union __sifields {
 				__u32 _pkey;
 			} _addr_pkey;
 			/* used when si_code=TRAP_PERF */
-			__u64 _perf;
+			struct {
+				__u32 _perf1;
+				__u32 _perf2;
+			} _perf;
 		};
 	} _sigfault;
^^ works, but I'd hate to have to split this into 2 __u32 because it makes the whole design worse.
What alignment trick do we have to do here to fix it for __u64?
So I think we just have to settle on 'unsigned long' here. On many architectures, like 32-bit Arm, the alignment of a structure is that of its largest member. This means that there is no portable way to add 64-bit integers to siginfo_t on 32-bit architectures.
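A minimal sketch of the effect, assuming a mock layout of three 32-bit header fields followed by a union (struct with_u32, struct with_u64 and the helper functions are hypothetical names, not kernel code). On an ABI where a 64-bit integer is 8-byte aligned, as on 32-bit Arm EABI, the union in the second structure starts at offset 16 instead of 12, which matches the 4-byte shift seen above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts: three 32-bit header fields, then a union. */
struct with_u32 {
	int32_t signo, err, code;
	union { uint32_t perf; } fields;
};

struct with_u64 {
	int32_t signo, err, code;
	union { uint64_t perf; } fields;	/* raises the union's alignment */
};

/* Round offset up to the next multiple of align (a power of two). */
static size_t round_up(size_t offset, size_t align)
{
	return (offset + align - 1) & ~(align - 1);
}

/* The union follows the 12-byte header directly: always offset 12. */
size_t u32_union_offset(void)
{
	return offsetof(struct with_u32, fields);
}

/* The union is pushed out to the next uint64_t-aligned offset. */
size_t u64_union_offset(void)
{
	return offsetof(struct with_u64, fields);
}

/* What the ABI's alignment rule predicts for the 64-bit variant. */
size_t expected_u64_union_offset(void)
{
	return round_up(12, _Alignof(uint64_t));
}
```

Padding with a __u32 before or after the 64-bit member cannot undo this, because the padding sits inside the union and the union as a whole is what gets moved.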
In the case of the si_perf field, word size is sufficient since the data it contains is user-defined. On 32-bit architectures, any excess bits of perf_event_attr::sig_data will therefore be truncated when copying into si_perf.
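As a rough illustration of that truncation (demo_to_si_perf is a hypothetical helper, not the kernel's copy routine): converting a 64-bit value to unsigned long keeps only the low-order bits that fit in a word, so nothing is lost on 64-bit and the upper 32 bits are dropped on 32-bit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: mimics storing a 64-bit sig_data value into a
 * word-sized si_perf field.
 */
unsigned long demo_to_si_perf(uint64_t sig_data)
{
	/* Conversion to an unsigned type is modulo 2^N: low bits survive. */
	return (unsigned long)sig_data;
}
```

Users who need the value to survive unchanged on every architecture would have to keep sig_data within 32 bits.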
Feel free to test the patch below if you have time; it lets me boot 32-bit arm, which previously timed out. It also passes all the static_assert()s I added (will send those as separate patches).
Once I'm convinced this passes all other tests too, I'll send a patch.
Thanks,
-- Marco
diff --git a/include/linux/compat.h b/include/linux/compat.h
index c8821d966812..f0d2dd35d408 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -237,7 +237,7 @@ typedef struct compat_siginfo {
 				u32 _pkey;
 			} _addr_pkey;
 			/* used when si_code=TRAP_PERF */
-			compat_u64 _perf;
+			compat_ulong_t _perf;
 		};
 	} _sigfault;
diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index d0bb9125c853..03d6f6d2c1fe 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -92,7 +92,7 @@ union __sifields {
 				__u32 _pkey;
 			} _addr_pkey;
 			/* used when si_code=TRAP_PERF */
-			__u64 _perf;
+			unsigned long _perf;
 		};
 	} _sigfault;