On Wed, May 17, 2023 at 3:33 PM Xiubo Li xiubli@redhat.com wrote:
On 5/17/23 21:11, Ilya Dryomov wrote:
On Wed, May 17, 2023 at 2:46 PM Xiubo Li xiubli@redhat.com wrote:
On 5/17/23 19:29, Ilya Dryomov wrote:
On Wed, May 17, 2023 at 1:04 PM Xiubo Li xiubli@redhat.com wrote:
On 5/17/23 18:31, Ilya Dryomov wrote:
On Wed, May 17, 2023 at 7:24 AM xiubli@redhat.com wrote: > From: Xiubo Li xiubli@redhat.com > > When the MClientSnap reqeust's op is not CEPH_SNAP_OP_SPLIT the > request may still contain a list of 'split_realms', and we need > to skip it anyway. Or it will be parsed as a corrupt snaptrace. > > Cc: stable@vger.kernel.org > Cc: Frank Schilder frans@dtu.dk > Reported-by: Frank Schilder frans@dtu.dk > URL: https://tracker.ceph.com/issues/61200 > Signed-off-by: Xiubo Li xiubli@redhat.com > --- > fs/ceph/snap.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c > index 0e59e95a96d9..d95dfe16b624 100644 > --- a/fs/ceph/snap.c > +++ b/fs/ceph/snap.c > @@ -1114,6 +1114,9 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc, > continue; > adjust_snap_realm_parent(mdsc, child, realm->ino); > } > + } else { > + p += sizeof(u64) * num_split_inos; > + p += sizeof(u64) * num_split_realms; > } > > /* > -- > 2.40.1 > Hi Xiubo,
This code appears to be very old -- it goes back to the initial commit 963b61eb041e ("ceph: snapshot management") in 2009. Do you have an explanation for why this popped up only now?
As I remembered we hit this before in one cu BZ last year, but I couldn't remember exactly which one. But I am not sure whether @Jeff saw this before I joint ceph team.
Has MDS always been including split_inos and split_realms arrays in !CEPH_SNAP_OP_SPLIT case or is this a recent change? If it's a recent change, I'd argue that this needs to be addressed on the MDS side.
While in MDS side for the _UPDATE op it won't send the 'split_realm' list just before the commit in 2017:
commit 93e7267757508520dfc22cff1ab20558bd4a44d4 Author: Yan, Zheng zyan@redhat.com Date: Fri Jul 21 21:40:46 2017 +0800
mds: send snap related messages centrally during mds recovery sending CEPH_SNAP_OP_SPLIT and CEPH_SNAP_OP_UPDATE messages to clients centrally in MDCache::open_snaprealms() Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Before this commit it will only send the 'split_realm' list for the _SPLIT op.
It sounds like we have the culprit. This should be treated as a regression and fixed on the MDS side. I don't see a justification for putting useless data on the wire.
But we still need this patch to make it to work with the old ceph releases.
This is debatable:
given that no one noticed this for so long, the likelihood of MDS sending a CEPH_SNAP_OP_UPDATE message with bogus split_inos and split_realms arrays is very low
MDS side fix would be backported to supported Ceph releases
people who are running unsupported Ceph releases (i.e. aren't updating) are unlikely to be diligently updating their kernel clients
Just searched the ceph tracker and I found another 3 trackers have the same issue:
https://tracker.ceph.com/issues/57817 https://tracker.ceph.com/issues/57703 https://tracker.ceph.com/issues/57686
So plusing this time and the previous CU case:
https://www.spinics.net/lists/ceph-users/msg77106.html
I have seen at least 5 times.
All this are reproduced when doing MDS failover, and this is the root cause in MDS side.
OK, given that the fixup in the kernel client is small, it seems justified. But, please, add a comment in the new else branch saying that it's there only to work around a bug in the MDS.
Thanks,
Ilya