On 8/25/22 10:47 AM, Song Liu wrote:
On Tue, Aug 23, 2022 at 10:13 AM Song Liu <song@kernel.org> wrote:
On Mon, Aug 22, 2022 at 8:15 PM Thomas Deutschmann <whissi@whissi.de> wrote:
On 2022-08-23 03:37, Song Liu wrote:
Thomas, have you tried to bisect with the fio repro?
Yes, just finished:
d32d3d0b47f7e34560ae3c55ddfcf68694813501 is the first bad commit
commit d32d3d0b47f7e34560ae3c55ddfcf68694813501
Author: Christoph Hellwig
Date:   Mon Jun 14 13:17:34 2021 +0200

    nvme-multipath: set QUEUE_FLAG_NOWAIT

    The nvme multipathing code just dispatches bios to one of the blk-mq
    based paths and never blocks on its own, so set QUEUE_FLAG_NOWAIT to
    support REQ_NOWAIT bios.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
So another NOWAIT issue -- similar to the bad commit which is causing the mdraid issue I already found (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...).
Reverting the commit, i.e. deleting
blk_queue_flag_set(QUEUE_FLAG_NOWAIT, head->disk->queue);
fixes the problem for me. Well, sort of. Looks like this will disable io_uring. fio reproducer fails with
My system doesn't have multipath enabled. I guess bisect will point to something else here.
I am afraid we won't get more information from bisect.
OK, I am able to pinpoint the issue, and Jens found the proper fix for it (see below, also available in [1]). It survived 100 runs of the repro fio job.
Thomas, please give it a try.
Thanks, Song
diff --git c/fs/io_uring.c w/fs/io_uring.c
index 3f8a79a4affa..72a39f5ec5a5 100644
--- c/fs/io_uring.c
+++ w/fs/io_uring.c
@@ -4551,7 +4551,12 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 copy_iov:
 		iov_iter_restore(&s->iter, &s->iter_state);
 		ret = io_setup_async_rw(req, iovec, s, false);
-		return ret ?: -EAGAIN;
+		if (!ret) {
+			if (kiocb->ki_flags & IOCB_WRITE)
+				kiocb_end_write(req);
+			return -EAGAIN;
+		}
+		return 0;
This should be 'return ret;' for that last line. I had to double check the ones I did, but they did get it right. But I did a double take when I saw this one :-)
It'll work fine for testing as we won't hit errors here unless we run out of memory, so...
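
To make the return-value point concrete, here is a minimal user-space sketch of the three variants of that retry path. It is not kernel code. The function names and the `write_ended` flag are hypothetical stand-ins, `ret` plays the role of the value returned by io_setup_async_rw(), and the sketch assumes IOCB_WRITE is set, so the kiocb_end_write() call is modelled as setting a flag.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical model: `ret` stands in for the result of
 * io_setup_async_rw(): 0 on success, a negative errno on failure.
 */

/* Before the patch: "return ret ?: -EAGAIN;" with the GNU ?: spelled out. */
int retry_path_old(int ret)
{
	return ret ? ret : -EAGAIN;
}

/* Patch as posted: the last line is "return 0;". */
int retry_path_posted(int ret, int *write_ended)
{
	if (!ret) {
		*write_ended = 1;	/* models kiocb_end_write(req) */
		return -EAGAIN;
	}
	return 0;			/* a setup error is dropped here */
}

/* With the suggested correction: the last line is "return ret;". */
int retry_path_fixed(int ret, int *write_ended)
{
	if (!ret) {
		*write_ended = 1;	/* models kiocb_end_write(req) */
		return -EAGAIN;
	}
	return ret;			/* the setup error is propagated */
}
```

All three agree when setup succeeds (they return -EAGAIN). They differ only when setup fails, for example with -ENOMEM: the posted version would report 0, while the old and corrected versions pass the error through.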