On Tue, 2019-04-30 at 19:37 -0300, Guilherme G. Piccoli wrote:
Commit 37f9579f4c31 ("blk-mq: Avoid that submitting a bio concurrently with device removal triggers a crash") introduced a NULL pointer dereference in generic_make_request(). The patch sets q to NULL and enter_succeeded to false; right after, there's an 'if (enter_succeeded)' which is not taken, and then the 'else' will dereference q in blk_queue_dying(q).
This patch just moves the 'q = NULL' to a point in which it won't trigger the oops, although the semantics of this NULLification remains untouched.
A simple test case/reproducer is as follows: a) Build kernel v5.1-rc7 with CONFIG_BLK_CGROUP=n.
b) Create a raid0 md array with 2 NVMe devices as members, and mount it with an ext4 filesystem.
c) Run the following oneliner (supposing the raid0 is mounted in /mnt): (dd of=/mnt/tmp if=/dev/zero bs=1M count=999 &); sleep 0.3; echo 1 > /sys/block/nvme0n1/device/device/remove (whereas nvme0n1 is the 2nd array member)
Reviewed-by: Bart Van Assche bvanassche@acm.org