On 2/12/24 11:28, Bart Van Assche wrote:
On 2/9/24 10:12, Jens Axboe wrote:
Greg, can you elaborate on how useful cancel is for gadgets? Is it one of those things that was wired up "just because", or does it have actually useful cases?
I found two use cases in the Android Open Source Project and have submitted CLs that request to remove the io_cancel() calls from that code. Although I think I understand why these calls were added, the race conditions that these io_cancel() calls try to address cannot be addressed completely by calling io_cancel().
(replying to my own e-mail) The adb daemon (adbd) maintainers asked me to preserve the I/O cancellation code in adbd because it was introduced recently in that code to fix an important bug. Does everyone agree with the approach of the untested patches below?
Thanks,
Bart.
----------------------------------------------------------------------------- [PATCH 1/2] fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted by libaio
If kiocb_set_cancel_fn() is called for I/O submitted by io_uring, the following kernel warning appears:
WARNING: CPU: 3 PID: 368 at fs/aio.c:598 kiocb_set_cancel_fn+0x9c/0xa8 Call trace: kiocb_set_cancel_fn+0x9c/0xa8 ffs_epfile_read_iter+0x144/0x1d0 io_read+0x19c/0x498 io_issue_sqe+0x118/0x27c io_submit_sqes+0x25c/0x5fc __arm64_sys_io_uring_enter+0x104/0xab0 invoke_syscall+0x58/0x11c el0_svc_common+0xb4/0xf4 do_el0_svc+0x2c/0xb0 el0_svc+0x2c/0xa4 el0t_64_sync_handler+0x68/0xb4 el0t_64_sync+0x1a4/0x1a8
Fix this by setting the IOCB_AIO_RW flag for read and write I/O that is submitted by libaio.
Suggested-by: Jens Axboe axboe@kernel.dk --- fs/aio.c | 9 ++++++++- include/linux/fs.h | 2 ++ 2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/aio.c b/fs/aio.c index bb2ff48991f3..da18dbcfcb22 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -593,6 +593,13 @@ void kiocb_set_cancel_fn(struct kiocb *iocb, kiocb_cancel_fn *cancel) struct kioctx *ctx = req->ki_ctx; unsigned long flags;
+ /* + * kiocb didn't come from aio or is neither a read nor a write, hence + * ignore it. + */ + if (!(iocb->ki_flags & IOCB_AIO_RW)) + return; + if (WARN_ON_ONCE(!list_empty(&req->ki_list))) return;
@@ -1509,7 +1516,7 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb) req->ki_complete = aio_complete_rw; req->private = NULL; req->ki_pos = iocb->aio_offset; - req->ki_flags = req->ki_filp->f_iocb_flags; + req->ki_flags = req->ki_filp->f_iocb_flags | IOCB_AIO_RW; if (iocb->aio_flags & IOCB_FLAG_RESFD) req->ki_flags |= IOCB_EVENTFD; if (iocb->aio_flags & IOCB_FLAG_IOPRIO) { diff --git a/include/linux/fs.h b/include/linux/fs.h index ed5966a70495..c2dcc98cb4c8 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -352,6 +352,8 @@ enum rw_hint { * unrelated IO (like cache flushing, new IO generation, etc). */ #define IOCB_DIO_CALLER_COMP (1 << 22) +/* kiocb is a read or write operation submitted by fs/aio.c. */ +#define IOCB_AIO_RW (1 << 23)
/* for use in trace events */ #define TRACE_IOCB_STRINGS \
-----------------------------------------------------------------------------
[PATCH 2/2] fs/aio: Make io_cancel() generate completions again
The following patch accidentally removed the code for delivering completions for cancelled reads and writes to user space: "[PATCH 04/33] aio: remove retry-based AIO" (https://lore.kernel.org/all/1363883754-27966-5-git-send-email-koverstreet@go...) From that patch:
- if (kiocbIsCancelled(iocb)) { - ret = -EINTR; - aio_complete(iocb, ret, 0); - /* must not access the iocb after this */ - goto out; - }
This leads to a leak in user space of a struct iocb. Hence this patch that restores the code that reports to user space that a read or write has been cancelled successfully. --- fs/aio.c | 27 +++++++++++---------------- 1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c index da18dbcfcb22..28223f511931 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -2165,14 +2165,11 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id, #endif
/* sys_io_cancel: - * Attempts to cancel an iocb previously passed to io_submit. If - * the operation is successfully cancelled, the resulting event is - * copied into the memory pointed to by result without being placed - * into the completion queue and 0 is returned. May fail with - * -EFAULT if any of the data structures pointed to are invalid. - * May fail with -EINVAL if aio_context specified by ctx_id is - * invalid. May fail with -EAGAIN if the iocb specified was not - * cancelled. Will fail with -ENOSYS if not implemented. + * Attempts to cancel an iocb previously passed to io_submit(). If the + * operation is successfully cancelled 0 is returned. May fail with + * -EFAULT if any of the data structures pointed to are invalid. May + * fail with -EINVAL if aio_context specified by ctx_id is invalid. Will + * fail with -ENOSYS if not implemented. */ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb, struct io_event __user *, result) @@ -2203,14 +2200,12 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb, } spin_unlock_irq(&ctx->ctx_lock);
- if (!ret) { - /* - * The result argument is no longer used - the io_event is - * always delivered via the ring buffer. -EINPROGRESS indicates - * cancellation is progress: - */ - ret = -EINPROGRESS; - } + /* + * The result argument is no longer used - the io_event is always + * delivered via the ring buffer. + */ + if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW) + aio_complete_rw(&kiocb->rw, -EINTR);
percpu_ref_put(&ctx->users);
-----------------------------------------------------------------------------