On Tue, 30 Jan 2024 20:55:39 -0800 Song Liu <song@kernel.org> wrote:
On Tue, Jan 30, 2024 at 6:41 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
Can you test the following patch?
diff --git a/drivers/md/md.c b/drivers/md/md.c
index e3a56a958b47..a8db84c200fe 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -578,8 +578,12 @@ static void submit_flushes(struct work_struct *ws)
 			rcu_read_lock();
 		}
 	rcu_read_unlock();
-	if (atomic_dec_and_test(&mddev->flush_pending))
+	if (atomic_dec_and_test(&mddev->flush_pending)) {
+		/* The pair is percpu_ref_get() from
+		   md_flush_request() */
+		percpu_ref_put(&mddev->active_io);
 		queue_work(md_wq, &mddev->flush_work);
+	}
 }
 
 static void md_submit_flush_data(struct work_struct *ws)
This fixes the issue in my tests. Please submit the official patch. Also, we should add a test in mdadm/tests to cover this case.
Thanks, Song
Hi Kuai,
On my hardware, the issue also stopped reproducing with this fix.
I applied the fix on top of the current HEAD of the master branch in the kernel/git/torvalds/linux.git repo.
Thanks, Blazej
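
For readers following the thread, the reference pairing that the patch restores can be sketched in user space. This is only a simplified model, not kernel code: it assumes, as the patch comment states, that md_flush_request() takes a reference with percpu_ref_get() which the flush path must later drop. The struct model_mddev type and all model_* functions are hypothetical names invented for illustration, plain C11 atomics stand in for percpu_ref and atomic_t, and the real flush_pending accounting in md.c is more involved than the single count used here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct mddev; not the kernel type. */
struct model_mddev {
	atomic_int active_io;     /* models the percpu_ref mddev->active_io */
	atomic_int flush_pending; /* models the atomic_t mddev->flush_pending */
	bool flush_work_queued;   /* models queue_work(md_wq, &mddev->flush_work) */
};

/* Models md_flush_request(): take the reference that the flush path owes back. */
static void model_flush_request(struct model_mddev *m)
{
	atomic_fetch_add(&m->active_io, 1); /* stands in for percpu_ref_get() */
	atomic_store(&m->flush_pending, 1); /* assumption: one pending unit */
	m->flush_work_queued = false;
}

/* Models the tail of submit_flushes(), with or without the patch applied. */
static void model_submit_flushes_tail(struct model_mddev *m, bool with_fix)
{
	/* Like atomic_dec_and_test(): true only when the count drops to zero. */
	if (atomic_fetch_sub(&m->flush_pending, 1) == 1) {
		if (with_fix) {
			/* A put without a matching get would be a bug in the model. */
			assert(atomic_load(&m->active_io) > 0);
			atomic_fetch_sub(&m->active_io, 1); /* percpu_ref_put() */
		}
		m->flush_work_queued = true;
	}
}

/* Runs one modelled flush and returns the references still outstanding. */
static int model_run_one_flush(bool with_fix)
{
	struct model_mddev m;

	atomic_init(&m.active_io, 0);
	atomic_init(&m.flush_pending, 0);
	m.flush_work_queued = false;

	model_flush_request(&m);
	model_submit_flushes_tail(&m, with_fix);
	return atomic_load(&m.active_io);
}
```

In this model, model_run_one_flush(false) leaves one reference outstanding, while model_run_one_flush(true) returns to zero. A leaked reference of this kind would keep anything that waits for active_io to drain from making progress, which is one plausible way to read the symptom discussed above.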