Re: FAILED: patch "[PATCH] rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive" failed to apply to 6.10-stable tree

30 Jul 2024

      On Tue, Jul 30, 2024 at 12:54:52PM +0200, Ilya Dryomov wrote:
...
On Tue, Jul 30, 2024 at 12:25 PM gregkh@linuxfoundation.org wrote:
...
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to stable@vger.kernel.org.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x 2237ceb71f89837ac47c5dce2aaa2c2b3a337a3c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to 'stable@vger.kernel.org' --in-reply-to '2024073021-strut-specimen-8aad@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
2237ceb71f89 ("rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2237ceb71f89837ac47c5dce2aaa2c2b3a337a3c Mon Sep 17 00:00:00 2001
From: Ilya Dryomov idryomov@gmail.com
Date: Tue, 23 Jul 2024 18:07:59 +0200
Subject: [PATCH] rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive
 mappings
Every time a watch is reestablished after getting lost, we need to
update the cookie which involves quiescing exclusive lock.  For this,
we transition from RBD_LOCK_STATE_LOCKED to RBD_LOCK_STATE_QUIESCING
roughly for the duration of rbd_reacquire_lock() call.  If the mapping
is exclusive and I/O happens to arrive in this time window, it's failed
with EROFS (later translated to EIO) based on the wrong assumption in
rbd_img_exclusive_lock() -- "lock got released?" check there stopped
making sense with commit a2b1da09793d ("rbd: lock should be quiesced on
reacquire").
To make it worse, any such I/O is added to the acquiring list before
EROFS is returned and this sets up for violating rbd_lock_del_request()
precondition that the request is either on the running list or not on
any list at all -- see commit ded080c86b3f ("rbd: don't move requests
to the running list on errors").  rbd_lock_del_request() ends up
processing these requests as if they were on the running list which
screws up quiescing_wait completion counter and ultimately leads to
rbd_assert(!completion_done(&rbd_dev->quiescing_wait));

being triggered on the next watch error.
Cc: stable@vger.kernel.org # 06ef84c4e9c4: rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait
Hi Greg,
Please grab commit f5c466a0fdb2 ("rbd: rename RBD_LOCK_STATE_RELEASING
and releasing_wait") as a prerequisite for this one.  I forgot to adjust
the SHA in the tag that specifies it after a rebase, sorry.
This applies to all stable kernels.
Now done, thanks.  I was wondering about that invalid sha1, odd that the
linux-next scripts didn't catch it :(
greg k-h

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: FAILED: patch "[PATCH] rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive" failed to apply to 6.10-stable tree