6.16-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zheng Qixing zhengqixing@huawei.com
[ Upstream commit b7ee30f0efd12f42735ae233071015389407966c ]
During raid resync, if a disk becomes faulty, the operation is briefly interrupted. The MD_RECOVERY_RECOVER flag triggered by the disk failure causes sync_action to incorrectly show "recover" instead of "resync". The same issue affects reshape operations.
Reproduction steps: mdadm -Cv /dev/md1 -l1 -n4 -e1.2 /dev/sd{a..d} // -> resync happened mdadm -f /dev/md1 /dev/sda // -> resync interrupted cat sync_action -> recover
Add progress checks in md_sync_action() for resync/recover/reshape to ensure the interface correctly reports the actual operation type.
Fixes: 4b10a3bc67c1 ("md: ensure resync is prioritized over recovery") Signed-off-by: Zheng Qixing zhengqixing@huawei.com Link: https://lore.kernel.org/linux-raid/20250816002534.1754356-3-zhengqixing@huaw... Signed-off-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/md.c | 37 +++++++++++++++++++++++++++++++++++-- 1 file changed, 35 insertions(+), 2 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c index 0348b5f3adc5..8746b22060a7 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -4831,9 +4831,33 @@ static bool rdev_needs_recovery(struct md_rdev *rdev, sector_t sectors) rdev->recovery_offset < sectors; }
+static enum sync_action md_get_active_sync_action(struct mddev *mddev) +{ + struct md_rdev *rdev; + bool is_recover = false; + + if (mddev->resync_offset < MaxSector) + return ACTION_RESYNC; + + if (mddev->reshape_position != MaxSector) + return ACTION_RESHAPE; + + rcu_read_lock(); + rdev_for_each_rcu(rdev, mddev) { + if (rdev_needs_recovery(rdev, MaxSector)) { + is_recover = true; + break; + } + } + rcu_read_unlock(); + + return is_recover ? ACTION_RECOVER : ACTION_IDLE; +} + enum sync_action md_sync_action(struct mddev *mddev) { unsigned long recovery = mddev->recovery; + enum sync_action active_action;
/* * frozen has the highest priority, means running sync_thread will be @@ -4857,8 +4881,17 @@ enum sync_action md_sync_action(struct mddev *mddev) !test_bit(MD_RECOVERY_NEEDED, &recovery)) return ACTION_IDLE;
- if (test_bit(MD_RECOVERY_RESHAPE, &recovery) || - mddev->reshape_position != MaxSector) + /* + * Check if any sync operation (resync/recover/reshape) is + * currently active. This ensures that only one sync operation + * can run at a time. Returns the type of active operation, or + * ACTION_IDLE if none are active. + */ + active_action = md_get_active_sync_action(mddev); + if (active_action != ACTION_IDLE) + return active_action; + + if (test_bit(MD_RECOVERY_RESHAPE, &recovery)) return ACTION_RESHAPE;
if (test_bit(MD_RECOVERY_RECOVER, &recovery))