On 9/19/22 2:17 PM, Avadhut Naik wrote:
From: Pavel Begunkov <asml.silence@gmail.com>
We have a couple of problems. The first is reports of unexpected link breakage for reads when cqe->res indicates that the IO was done in full. The reason is partial IO with retries.
TL;DR: we compare the result in __io_complete_rw_common() against req->cqe.res, but req->cqe.res doesn't store the full length, only the length left to be done. So when we pass the full corrected result via kiocb_done() -> __io_complete_rw_common(), the comparison fails.
The second problem is that we don't try to correct res in io_complete_rw(), which might be a problem, for instance, for O_DIRECT when a prefix of the data was cached in the page cache. We also definitely don't want to pass a corrected result into io_rw_done().
The fix is to leave __io_complete_rw_common() alone, always pass the uncorrected result into it, and fix it up as the last step just before actually finishing the I/O.
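For readers following along, the ordering described above can be modeled in userspace roughly as follows. This is only an illustrative sketch, not the kernel code: `struct model_req`, `model_result_mismatch()` and `model_fixup_res()` are hypothetical names, `expected_left` stands in for req->cqe.res, and the signed `long` result type is an assumption made so that the negative-result check is meaningful.

```c
#include <stdbool.h>

/* Hypothetical userspace model; these are not the kernel's structures. */
struct model_req {
    long expected_left; /* stands in for req->cqe.res: bytes still to be done */
    long bytes_done;    /* bytes completed by earlier partial attempts */
};

/*
 * Models the comparison done in the common completion path: the raw
 * (uncorrected) result is checked against what was left to do.
 */
static bool model_result_mismatch(const struct model_req *req, long raw_res)
{
    return raw_res != req->expected_left;
}

/*
 * Models the fix-up applied as the last step before finishing the I/O:
 * add the bytes done by earlier attempts. If the retry failed but earlier
 * attempts made progress, report that progress instead of the error.
 */
static long model_fixup_res(const struct model_req *req, long raw_res)
{
    if (req->bytes_done > 0) {
        if (raw_res < 0)
            return req->bytes_done;
        return raw_res + req->bytes_done;
    }
    return raw_res;
}
```

In this model, a 4096-byte read that already completed 1024 bytes has 3072 left. Checking the raw retry result of 3072 reports no mismatch, and the fix-up then yields 4096. Checking the already corrected value 4096 against the remaining 3072 would report a mismatch, which corresponds to the spurious link breakage described above.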
I'm confused by this email, why is it being sent? And what are the 2-3/3 patches?
And while this one should certainly go to stable, also note that:
commit 62bb0647b14646fa6c9aa25ecdf67ad18f13523c
Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Sep 13 13:21:23 2022 +0100

    io_uring/rw: fix error'ed retry return values

exists in Linus's tree and should go in alongside the parent as it fixes the parameter type.