Thank you for the replies,
Kees Cook wrote on Wed, Nov 20, 2024 at 10:08:04AM -0800:
> It seems like the correct first step is to revert the brk change. It's still not clear to me why the change is causing a problem -- I assume it is colliding with some other program area.
> Is the problem strictly with qemu-user-static? (i.e. it was GCC running in qemu-user-static so the crash is qemu, not GCC) That should help me narrow down the issue. There must be some built-in assumption. And it's aarch64 within x86_64 where it happens (is the program within qemu also aarch64 or is it arm32)?
As far as I'm aware I've only seen qemu-user-static fail for aarch64 (e.g. the arm32 variant works fine, didn't try other arches) in this particular configuration (the debian bookworm package has been built as a non-pie static binary, which is known to cause problems)
It's also fixed in newer versions of qemu, even for non-pie static builds, because they reworked the way they detect program mappings in qemu commit dd55885516 ("linux-user: Rewrite non-fixed probe_guest_base"), which is in qemu 8.1.0
So, in short there are many fixes available; it's a qemu bug that assumed something about the memory layout and broke with this kaslr patch (and for some reason only happened on non-pie static build)
mjt will at the very least rebuild the package with pie enabled, because building without pie is known to cause other issues with aarch64 and that was an oversight in the first place, so this issue will go away for debian without any further work.
This is the background behind me saying that this probably should be reverted in stable branches (to avoid other surprises with old userspace), but master can probably keep this commit if it brings tangible security benefits (and I think it does)
See the debian qemu-side of the bug for details: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1087822
> Is there a simple reproducer?
Unfortunately I couldn't get it to reproduce easily, but I think it's just a matter of finding the right binary that does problematic mappings (e.g. running true in a loop didn't work but running gcc in a loop did)
The most reliable reproducer I've been using is building a reasonably large program, we were working on modemmanager at the time so that's what I used; it usually fails somewhere between steps 100-300 (out of 500) of the build:
----
$ docker run -ti --rm --platform linux/arm64/v8 docker.io/arm64v8/alpine:3.20 sh
/ # apk add bash-completion-dev dbus-dev elogind-dev gobject-introspection-dev gtk-doc libgudev-dev libmbim-dev libqmi-dev linux-headers meson vala clang abuild alpine-sdk curl
/ # curl -O https://gitlab.freedesktop.org/mobile-broadband/ModemManager/-/archive/1.22....
/ # tar xf ModemManager-1.22.0.tar.gz
/ # cd ModemManager-1.22.0/
/ # abuild-meson \
        -Db_lto=true \
        -Dsystemdsystemunitdir=no \
        -Ddbus_policy_dir=/usr/share/dbus-1/system.d \
        -Dgtk_doc=true \
        -Dsystemd_journal=false \
        -Dsystemd_suspend_resume=true \
        -Dvapi=true \
        -Dpolkit=no \
        . output
/ # meson compile -C output
----
If you take any of the commands that failed from this build and run it in a loop, it'll also eventually fail after a couple hundred invocations, but if your loop doesn't involve any parallelism that'll be slower to reproduce.
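For the loop variant, a rough sketch of what I mean is below. stress_cmd is a made-up helper name, not something from the build, and it assumes an xargs that supports -P (GNU findutils does; busybox only if built with it). It runs the given command a number of times with several copies in flight and prints how many runs exited non-zero:

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): run CMD repeatedly,
# several copies at once, and print the number of failed runs.
# usage: stress_cmd RUNS JOBS CMD [ARGS...]
stress_cmd() {
    nruns=$1
    njobs=$2
    shift 2
    # seq feeds one line per run; -I{} consumes that line without
    # appending it to the command, -P keeps up to $njobs runs in flight.
    # Each failing run prints one "fail" line, which wc -l then counts.
    seq "$nruns" |
        xargs -P "$njobs" -I{} sh -c '"$@" >/dev/null 2>&1 || echo fail' sh "$@" |
        wc -l
}
```

e.g. "stress_cmd 500 8 <one of the failing compiler command lines>" inside the emulated container; anything other than 0 means it reproduced.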
Note it requires qemu to be broken as well, so you'll have the best chances on debian bookworm (a VM is fine; a chroot or podman instead of docker is also fine; in the chroot case ninja requires at least /dev (and possibly /proc) mounted)
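If you want to quickly rule out a host whose qemu already has the probe_guest_base rewrite, a version comparison along these lines should be enough. This is only a sketch: both function names are made up, it relies on "sort -V" being available (GNU coreutils has it), and you have to feed it the version string yourself. An old version is necessary but not sufficient, the binary also has to be a non-pie static build:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is strictly older than $2.
# Assumes a sort(1) with -V (version sort) support.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Succeeds if the given qemu version is older than 8.1.0, the release
# mentioned above as containing the probe_guest_base rewrite.
qemu_predates_fix() {
    version_lt "$1" 8.1.0
}
```

e.g. "qemu_predates_fix 7.2.0 && echo may reproduce" (7.2.0 is just an example value here).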
Thanks,