What 84965ff8a84f0 ("io_uring: if we see flush on exit, cancel related tasks")
really wants is to cancel all relevant REQ_F_INFLIGHT requests reliably.
That can be achieved by io_uring_cancel_files(), but we miss it when
io_uring_flush() calls io_uring_cancel_task_requests(files=NULL),
because that path only goes through __io_uring_cancel_task_requests().

Just always call io_uring_cancel_files() during cancel; it's good enough
for now.
Cc: stable(a)vger.kernel.org # 5.9+
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
---
p.s. fold in, maybe?
fs/io_uring.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 12bf7180c0f1..38c6cbe1ab38 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8976,10 +8976,9 @@ static void io_uring_cancel_task_requests(struct io_ring_ctx *ctx,
 	io_cancel_defer_files(ctx, task, files);
 	io_cqring_overflow_flush(ctx, true, task, files);
 
+	io_uring_cancel_files(ctx, task, files);
 	if (!files)
 		__io_uring_cancel_task_requests(ctx, task);
-	else
-		io_uring_cancel_files(ctx, task, files);
 
 	if ((ctx->flags & IORING_SETUP_SQPOLL) && ctx->sq_data) {
 		atomic_dec(&task->io_uring->in_idle);
--
2.24.0
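
The control-flow change in io_uring_cancel_task_requests() can be modelled
in userspace roughly as below. This is only a sketch: struct cancel_trace,
cancel_task_requests_old(), cancel_task_requests_new() and
check_flush_path() are hypothetical names, and the counters merely stand in
for calls to io_uring_cancel_files() and __io_uring_cancel_task_requests().
It shows that the old dispatch never reached the files-based cancel when
files == NULL (the io_uring_flush() case), while the new one always does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in; not the kernel's struct files_struct. */
struct files_struct { int unused; };

/* Counts which cancel helpers a dispatch variant would have called. */
struct cancel_trace {
	int cancel_files_calls; /* models io_uring_cancel_files() */
	int cancel_task_calls;  /* models __io_uring_cancel_task_requests() */
};

/* Old dispatch: the files-based cancel runs only when files != NULL. */
static void cancel_task_requests_old(struct cancel_trace *t,
				     const struct files_struct *files)
{
	if (!files)
		t->cancel_task_calls++;
	else
		t->cancel_files_calls++;
}

/* New dispatch: the files-based cancel always runs first. */
static void cancel_task_requests_new(struct cancel_trace *t,
				     const struct files_struct *files)
{
	t->cancel_files_calls++;
	if (!files)
		t->cancel_task_calls++;
}

/* Models the io_uring_flush() case, i.e. files == NULL. */
void check_flush_path(void)
{
	struct cancel_trace before = { 0, 0 }, after = { 0, 0 };

	cancel_task_requests_old(&before, NULL);
	cancel_task_requests_new(&after, NULL);
	assert(before.cancel_files_calls == 0); /* old: files cancel skipped */
	assert(after.cancel_files_calls == 1);  /* new: files cancel reached */
	assert(after.cancel_task_calls == 1);   /* task-wide cancel still runs */
}
```

With a non-NULL files pointer both variants reach the files-based cancel
exactly once, so only the files == NULL path changes behaviour.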
On Thu, Jan 28, 2021 at 10:48:34AM -0800, Paul E. McKenney wrote:
> On Thu, Jan 28, 2021 at 06:12:07PM +0100, Frederic Weisbecker wrote:
> > The "nocb_bypass_timer" ends up calling wake_nocb_gp() which deletes
> > the pending "nocb_timer" (note they are not the same timers) for the
> > given rdp without resetting the matching state stored in
> > nocb_defer_wakeup.
> >
> > As a result, a future call_rcu() on that rdp may be fooled and think the
> > timer is armed when it's not, missing a deferred nocb_gp wakeup.
> >
> > Fix this by resetting rdp->nocb_defer_wakeup when we disarm the timer.
> >
> > Fixes: d1b222c6be1f ("rcu/nocb: Add bypass callback queueing")
> > Cc: Stable <stable(a)vger.kernel.org>
> > Cc: Josh Triplett <josh(a)joshtriplett.org>
> > Cc: Lai Jiangshan <jiangshanlai(a)gmail.com>
> > Cc: Joel Fernandes <joel(a)joelfernandes.org>
> > Cc: Neeraj Upadhyay <neeraju(a)codeaurora.org>
> > Cc: Boqun Feng <boqun.feng(a)gmail.com>
> > Signed-off-by: Frederic Weisbecker <frederic(a)kernel.org>
> > ---
> > kernel/rcu/tree_plugin.h | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 7e33dae0e6ee..a44f80d7661b 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -1705,6 +1705,8 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
> > rcu_nocb_unlock_irqrestore(rdp, flags);
> > return false;
> > }
> > +
> > + rdp->nocb_defer_wakeup = RCU_NOCB_WAKE_NOT;
>
> Given this change, does it make sense to remove the
> setting of ->nocb_defer_wakeup to RCU_NOCB_WAKE_NOT from the
> do_nocb_deferred_wakeup_common() function?
I do it later in "[PATCH 09/16] rcu/nocb: Merge nocb_timer to the rdp leader"
> Does the above assignment need
> to be WRITE_ONCE(), in other words, are all reads of ->nocb_defer_wakeup
> done with either ->nocb_lock or ->nocb_gp_lock held? (I do not believe
> that this is the case.)
Ah indeed, it should probably be done with WRITE_ONCE() because it's read
locklessly in many places.
Thanks.
>
> Thanx, Paul
>
> > del_timer(&rdp->nocb_timer);
> > rcu_nocb_unlock_irqrestore(rdp, flags);
> > raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
> > --
> > 2.25.1
> >
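
The stale-state problem described in the quoted commit message can be
modelled in userspace roughly as below. This is only a sketch under stated
assumptions: struct rdp_model, the model_*() helpers and the MODEL_* names
are hypothetical, MODEL_WRITE_ONCE() is a simplified volatile-store stand-in
for the kernel's WRITE_ONCE(), and no locking or real timers are involved.
It shows that if the wakeup path disarms the timer without resetting the
deferred-wakeup state, the next deferred request believes the timer is
still armed and does not re-arm it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for RCU_NOCB_WAKE_NOT / RCU_NOCB_WAKE. */
enum { MODEL_WAKE_NOT = 0, MODEL_WAKE = 1 };

/* Simplified stand-in for WRITE_ONCE(): a volatile store to an int. */
#define MODEL_WRITE_ONCE(x, v) (*(volatile int *)&(x) = (v))

struct rdp_model {
	bool timer_armed; /* models a pending nocb_timer */
	int defer_wakeup; /* models ->nocb_defer_wakeup */
};

/* Deferred request: arm the timer only if the state says it is not armed. */
static void model_defer_wakeup(struct rdp_model *rdp)
{
	if (rdp->defer_wakeup == MODEL_WAKE_NOT)
		rdp->timer_armed = true;
	MODEL_WRITE_ONCE(rdp->defer_wakeup, MODEL_WAKE);
}

/* Wakeup path: disarm the timer, optionally resetting the state first. */
static void model_wake_gp(struct rdp_model *rdp, bool reset_state)
{
	if (reset_state)
		MODEL_WRITE_ONCE(rdp->defer_wakeup, MODEL_WAKE_NOT);
	rdp->timer_armed = false;
}

/* True if a deferred request issued after a wakeup leaves the timer armed. */
bool model_second_request_arms_timer(bool reset_state)
{
	struct rdp_model rdp = { false, MODEL_WAKE_NOT };

	model_defer_wakeup(&rdp);
	model_wake_gp(&rdp, reset_state);
	model_defer_wakeup(&rdp);
	return rdp.timer_armed;
}

void model_self_check(void)
{
	assert(!model_second_request_arms_timer(false)); /* stale state: missed */
	assert(model_second_request_arms_timer(true));   /* reset state: armed */
}
```

The reset_state flag models the single assignment the patch adds; the
volatile store only hints at why lockless readers call for WRITE_ONCE().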