6.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yang Erkun yangerkun@huawei.com
[ Upstream commit 13017b427118f4311471ee47df74872372ca8482 ]
Our testcase trigger panic:
BUG: kernel NULL pointer dereference, address: 00000000000000e0 ... Oops: Oops: 0000 [#1] SMP NOPTI CPU: 2 UID: 0 PID: 85 Comm: kworker/2:1 Not tainted 6.16.0+ #94 PREEMPT(none) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 Workqueue: md_misc md_start_sync RIP: 0010:rdev_addable+0x4d/0xf0 ... Call Trace: <TASK> md_start_sync+0x329/0x480 process_one_work+0x226/0x6d0 worker_thread+0x19e/0x340 kthread+0x10f/0x250 ret_from_fork+0x14d/0x180 ret_from_fork_asm+0x1a/0x30 </TASK> Modules linked in: raid10 CR2: 00000000000000e0 ---[ end trace 0000000000000000 ]--- RIP: 0010:rdev_addable+0x4d/0xf0
md_spares_need_change in md_start_sync will call rdev_addable which protected by rcu_read_lock/rcu_read_unlock. This rcu context will help protect rdev won't be released, but rdev->mddev will be set to NULL before we call synchronize_rcu in md_kick_rdev_from_array. Fix this by using READ_ONCE and check does rdev->mddev still alive.
Fixes: bc08041b32ab ("md: suspend array in md_start_sync() if array need reconfiguration") Fixes: 570b9147deb6 ("md: use RCU lock to protect traversal in md_spares_need_change()") Signed-off-by: Yang Erkun yangerkun@huawei.com Link: https://lore.kernel.org/linux-raid/20250731114530.776670-1-yangerkun@huawei.... Signed-off-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/md.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c index e78c42c381db..10670c62b09e 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -9418,6 +9418,12 @@ static bool rdev_is_spare(struct md_rdev *rdev)
static bool rdev_addable(struct md_rdev *rdev) { + struct mddev *mddev; + + mddev = READ_ONCE(rdev->mddev); + if (!mddev) + return false; + /* rdev is already used, don't add it again. */ if (test_bit(Candidate, &rdev->flags) || rdev->raid_disk >= 0 || test_bit(Faulty, &rdev->flags)) @@ -9428,7 +9434,7 @@ static bool rdev_addable(struct md_rdev *rdev) return true;
/* Allow to add if array is read-write. */ - if (md_is_rdwr(rdev->mddev)) + if (md_is_rdwr(mddev)) return true;
/*