Re: Re: [PATCH 6.1 175/321] x86: Increase brk randomness entropy for 64-bit systems

20 Nov 2024


Thank you for the replies,
Kees Cook wrote on Wed, Nov 20, 2024 at 10:08:04AM -0800:
...
> It seems like the correct first step is to revert the brk change. It's
> still not clear to me why the change is causing a problem -- I assume it
> is colliding with some other program area.
>
> Is the problem strictly with qemu-user-static? (i.e. it was GCC running
> in qemu-user-static so the crash is qemu, not GCC) That should help me
> narrow down the issue. There must be some built-in assumption. And it's
> aarch64 within x86_64 where it happens (is the program within qemu also
> aarch64 or is it arm32)?

As far as I'm aware I've only seen qemu-user-static fail for aarch64
(e.g. the arm32 variant works fine, didn't try other arches) in this
particular configuration (the debian bookworm package has been built
with --static-pie which is known to cause problems)
It's also fixed in newer versions of qemu, even with --static-pie,
because qemu reworked the way it detects program mappings in commit
dd55885516 ("linux-user: Rewrite non-fixed probe_guest_base"), which
is part of qemu 8.1.0.
So, in short, there are many fixes available; it's a qemu bug that
assumed something about the memory layout and broke with this brk
randomization patch (and for some reason only on the non-pie static build).
mjt will at the very least rebuild the package with pie enabled, because
the non-pie build is known to cause other issues with aarch64 and was an
oversight in the first place, so this issue will go away for debian
without any further work.
This is the background behind me saying that this probably should be
reverted in stable branches (to avoid other surprises with old
userspace), but master can probably keep this commit if it brings
tangible security benefits (and I think it does)
See the debian qemu-side of the bug for details:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1087822
...
> Is there a simple reproducer?

Unfortunately I couldn't get it to reproduce easily, but I think it's
just a matter of finding the right binary that does problematic mappings
(e.g. running true in a loop didn't work but running gcc in a loop did)
The most reliable reproducer I've found is building a reasonably large
program. We were working on ModemManager at the time, so that's what I
used; it usually fails somewhere between steps 100 and 300 (out of 500)
of the build:
----
$ docker run -ti --rm --platform linux/arm64/v8 docker.io/arm64v8/alpine:3.20 sh
/ # apk add bash-completion-dev dbus-dev elogind-dev gobject-introspection-dev gtk-doc libgudev-dev libmbim-dev libqmi-dev linux-headers meson vala clang abuild alpine-sdk curl
/ # curl -O https://gitlab.freedesktop.org/mobile-broadband/ModemManager/-/archive/1.22....
/ # tar xf ModemManager-1.22.0.tar.gz
/ # cd ModemManager-1.22.0/
/ # abuild-meson \
        -Db_lto=true \
        -Dsystemdsystemunitdir=no \
        -Ddbus_policy_dir=/usr/share/dbus-1/system.d \
        -Dgtk_doc=true \
        -Dsystemd_journal=false \
        -Dsystemd_suspend_resume=true \
        -Dvapi=true \
        -Dpolkit=no \
        . output
/ # meson compile -C output
----
If you take any of the commands that failed from this build and run it
in a loop, it'll also eventually fail after a couple hundred
invocations, but if your loop doesn't involve any parallelism it'll be
slower to reproduce.
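A minimal sketch of such a loop with some parallelism, assuming a POSIX
shell; `repro_loop` and its arguments are hypothetical names for
illustration, not something taken from the build:

```shell
# Hypothetical helper: run a command TOTAL times, JOBS at a time,
# and print how many invocations exited non-zero.
# usage: repro_loop TOTAL JOBS command [args...]
repro_loop() {
    total=$1
    jobs=$2
    shift 2
    fails=0
    i=0
    while [ "$i" -lt "$total" ]; do
        pids=""
        k=0
        # Start one batch of up to JOBS background invocations.
        while [ "$k" -lt "$jobs" ] && [ "$i" -lt "$total" ]; do
            "$@" >/dev/null 2>&1 &
            pids="$pids $!"
            k=$((k + 1))
            i=$((i + 1))
        done
        # Collect the exit status of every invocation in the batch.
        for pid in $pids; do
            wait "$pid" || fails=$((fails + 1))
        done
    done
    echo "$fails"
}
```

Paste one of the failing compiler command lines from the build log after
the two numbers, e.g. 500 runs with 8 in parallel; a non-zero count
means at least one invocation crashed.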
Note it requires qemu to be broken as well, so you'll have the best
chances on debian bookworm (a VM is fine; a chroot or podman instead of
docker is also fine; in the chroot case ninja requires at least /dev
(and possibly /proc) mounted).
Thanks,
-- 
Dominique Martinet | Asmadeus
