On Wed, Feb 03, 2021 at 04:37:29AM -0800, Andres Freund wrote:
Hi,
On 2020-06-01 19:54:21 +0200, Greg Kroah-Hartman wrote:
From: Jens Axboe <axboe@kernel.dk>
[ Upstream commit b0beb28097fa04177b3769f4bb7a0d0d9c4ae76e ]
This reverts commit c58c1f83436b501d45d4050fd1296d71a9760bcb.
io_uring does do the right thing for this case, and we're still returning -EAGAIN to userspace for the cases we don't support. Revert this change to avoid doing endless spins of resubmits.
Cc: stable@vger.kernel.org # v5.6
Reported-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>

block/blk-core.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
This broke io_uring direct-io on ext4 over md.
fallocate -l $((1024*1024*1024)) /srv/part1
fallocate -l $((1024*1024*1024)) /srv/part2
losetup -f /srv/part1
losetup -f /srv/part2
losetup -a # assuming these were loop0/1
mdadm --create -n2 -l stripe -N fast-striped /dev/md/fast-striped /dev/loop0 /dev/loop1
mkfs.ext4 /dev/md/fast-striped
mount /dev/md/fast-striped /mnt/t2
fio --directory=/mnt/t2 --ioengine io_uring --rw write --filesize 1MB --overwrite=1 --name=test --direct=1 --bs=4k
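For anyone re-running the steps above, a teardown along these lines should undo them. This is only a sketch and not part of the original report: the `run` and `teardown` helpers and the `DRY_RUN` variable are made up for illustration, and it assumes the loop devices really were loop0 and loop1 and that nothing else is using them.

```shell
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical teardown, reversing the setup in the opposite order.
# Assumes /dev/loop0 and /dev/loop1 back the array; needs root.
teardown() {
    run umount /mnt/t2
    run mdadm --stop /dev/md/fast-striped
    run losetup -d /dev/loop0
    run losetup -d /dev/loop1
    run rm -f /srv/part1 /srv/part2
}
```

Calling `DRY_RUN=1 teardown` first prints the commands without touching anything, which is a cheap way to check the device names before running `teardown` for real.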
On v5.4.43-101-gbba91cdba612 this fails with
fio: io_u error on file /mnt/t2/test.0.0: Input/output error: write offset=0, buflen=4096
fio: pid=734, err=5/file:io_u.c:1834, func=io_u error, error=Input/output error
whereas previously it worked. libaio still works...
I haven't checked which major kernel version fixed this again, but I did verify that it's still broken in 5.4.94 and that 5.10.9 works.
I would suspect it's
commit 4503b7676a2e0abe69c2f2c0d8b03aec53f2f048
Author: Jens Axboe <axboe@kernel.dk>
Date: 2020-06-01 10:00:27 -0600

    io_uring: catch -EIO from buffered issue request failure

    -EIO bubbles up like -EAGAIN if we fail to allocate a request at the
    lower level. Play it safe and treat it like -EAGAIN in terms of sync
    retry, to avoid passing back an errant -EIO.

    Catch some of these early for block based file, as non-mq devices
    generally do not support NOWAIT. That saves us some overhead by not
    first trying, then retrying from async context. We can go straight to
    async punt instead.

    Signed-off-by: Jens Axboe <axboe@kernel.dk>
which isn't in stable/linux-5.4.y
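One way to double-check that claim locally is to search the branch's commit messages for the upstream hash, since stable backports normally quote it (as the "[ Upstream commit ... ]" line earlier in this thread does). The sketch below assumes a kernel clone with a remote named `stable`; the `has_backport` function name is made up for illustration.

```shell
# has_backport REPO REF SHA
# Hypothetical helper: succeeds if any commit reachable from REF mentions
# SHA somewhere in its commit message.
has_backport() {
    repo=$1
    ref=$2
    sha=$3
    [ -n "$(git -C "$repo" log --format=%H --grep="$sha" -n 1 "$ref")" ]
}
```

For example, `has_backport . stable/linux-5.4.y 4503b7676a2e0abe69c2f2c0d8b03aec53f2f048 || echo "not backported"`. Note that this matches any mention of the hash, so a revert of the backport would also count as a hit.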
Can you test that if the above commit is added, all works well again?
thanks,
greg k-h