From: Xiubo Li <xiubli@redhat.com>
When decoding the snaps fails, it may leave 'first_realm' and 'realm' pointing to the same snaprealm memory. It will then be put twice, which could cause random use-after-free, BUG_ON, etc. issues.
Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/57686
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/snap.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
index 9bceed2ebda3..baf17df05107 100644
--- a/fs/ceph/snap.c
+++ b/fs/ceph/snap.c
@@ -849,10 +849,12 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
 	if (realm_to_rebuild && p >= e)
 		rebuild_snap_realms(realm_to_rebuild, &dirty_realms);
 
-	if (!first_realm)
+	if (!first_realm) {
 		first_realm = realm;
-	else
+		realm = NULL;
+	} else {
 		ceph_put_snap_realm(mdsc, realm);
+	}
 
 	if (p < e)
 		goto more;
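As a rough illustration of the double put described in the commit message, the following userspace sketch models a snaprealm as a bare reference count. Everything in it is hypothetical (struct realm, realm_put() and first_iter_then_fail() are not kernel names). It assumes that the error path puts both 'realm' and 'first_realm', and that decoding can fail before 'realm' is reassigned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a snaprealm: just a reference count. */
struct realm {
	int refs;	/* references currently held */
	int bad_puts;	/* puts issued with no reference left */
};

static void realm_put(struct realm *r)
{
	if (!r)
		return;
	if (r->refs == 0)
		r->bad_puts++;	/* the kernel could hit a use-after-free here */
	else
		r->refs--;
}

/*
 * Model the end of the first loop iteration followed by a decode
 * failure. clear_alias selects the patched behaviour (realm = NULL
 * once first_realm owns the reference).
 * Returns the number of puts issued with no reference left.
 */
static int first_iter_then_fail(int clear_alias)
{
	struct realm obj = { .refs = 1, .bad_puts = 0 };
	struct realm *realm = &obj;	/* reference taken by the lookup */
	struct realm *first_realm = NULL;

	if (!first_realm) {
		first_realm = realm;	/* ownership moves to first_realm */
		if (clear_alias)
			realm = NULL;
	} else {
		realm_put(realm);
	}
	assert(first_realm != NULL);

	/* Assumed error path: both pointers are put. */
	realm_put(realm);
	realm_put(first_realm);

	return obj.bad_puts;
}
```

Under these assumptions, first_iter_then_fail(0) reports one put with no reference left, while first_iter_then_fail(1) reports none.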
On Mon, Nov 07, 2022 at 03:17:59PM +0800, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
>
> When decoding the snaps fails, it may leave 'first_realm' and 'realm'
> pointing to the same snaprealm memory. It will then be put twice,
> which could cause random use-after-free, BUG_ON, etc. issues.
>
> Cc: stable@vger.kernel.org
> URL: https://tracker.ceph.com/issues/57686
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>  fs/ceph/snap.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
> index 9bceed2ebda3..baf17df05107 100644
> --- a/fs/ceph/snap.c
> +++ b/fs/ceph/snap.c
> @@ -849,10 +849,12 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
>  	if (realm_to_rebuild && p >= e)
>  		rebuild_snap_realms(realm_to_rebuild, &dirty_realms);
>
> -	if (!first_realm)
> +	if (!first_realm) {
>  		first_realm = realm;
> -	else
> +		realm = NULL;
> +	} else {
>  		ceph_put_snap_realm(mdsc, realm);
> +	}
>
>  	if (p < e)
>  		goto more;
> --
> 2.31.1
This patch looks correct to me. But I wonder if there's a deeper problem there (probably not on the kernel client). Because the other question is: why are we failing to decode the snaps? But I guess this fix is worth it anyway.
Reviewed-by: Luís Henriques <lhenriques@suse.de>

Cheers,
--
Luís
On 07/11/2022 18:39, Luís Henriques wrote:
> On Mon, Nov 07, 2022 at 03:17:59PM +0800, xiubli@redhat.com wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> When decoding the snaps fails, it may leave 'first_realm' and 'realm'
>> pointing to the same snaprealm memory. It will then be put twice,
>> which could cause random use-after-free, BUG_ON, etc. issues.
>>
>> Cc: stable@vger.kernel.org
>> URL: https://tracker.ceph.com/issues/57686
>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>> ---
>>  fs/ceph/snap.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
>> index 9bceed2ebda3..baf17df05107 100644
>> --- a/fs/ceph/snap.c
>> +++ b/fs/ceph/snap.c
>> @@ -849,10 +849,12 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
>>  	if (realm_to_rebuild && p >= e)
>>  		rebuild_snap_realms(realm_to_rebuild, &dirty_realms);
>>
>> -	if (!first_realm)
>> +	if (!first_realm) {
>>  		first_realm = realm;
>> -	else
>> +		realm = NULL;
>> +	} else {
>>  		ceph_put_snap_realm(mdsc, realm);
>> +	}
>>
>>  	if (p < e)
>>  		goto more;
>> --
>> 2.31.1
>
> This patch looks correct to me. But I wonder if there's a deeper
> problem there (probably not on the kernel client). Because the other
> question is: why are we failing to decode the snaps? But I guess this
> fix is worth it anyway.
Yeah, good question.
The MDS also crashed [1][2] just seconds before the kernel crash was triggered, and the metadata in CephFS was corrupted for some reason.
[1] https://tracker.ceph.com/issues/56140
[2] https://tracker.ceph.com/issues/54546
Thanks!
- Xiubo
> Reviewed-by: Luís Henriques <lhenriques@suse.de>
>
> Cheers,
> --
> Luís
On Mon, Nov 7, 2022 at 8:18 AM xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
>
> When decoding the snaps fails, it may leave 'first_realm' and 'realm'
> pointing to the same snaprealm memory. It will then be put twice,
> which could cause random use-after-free, BUG_ON, etc. issues.
>
> Cc: stable@vger.kernel.org
> URL: https://tracker.ceph.com/issues/57686
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>  fs/ceph/snap.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
> index 9bceed2ebda3..baf17df05107 100644
> --- a/fs/ceph/snap.c
> +++ b/fs/ceph/snap.c
> @@ -849,10 +849,12 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
>  	if (realm_to_rebuild && p >= e)
>  		rebuild_snap_realms(realm_to_rebuild, &dirty_realms);
>
> -	if (!first_realm)
> +	if (!first_realm) {
>  		first_realm = realm;
> -	else
> +		realm = NULL;
Hi Xiubo,
I wonder why realm is cleared only in the !first_realm branch. Can't the same issue occur with realm?
- first_realm is already set, ceph_put_snap_realm(realm)
- p < e, goto more
- decoding fails, goto bad
- realm is still set and not IS_ERR, ceph_put_snap_realm(realm)
  <realm is put twice>
Thanks,
Ilya
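The sequence listed above can be replayed in a small userspace sketch. This is only a model under stated assumptions, not the kernel code: struct realm, realm_put() and walk_trace() are hypothetical names, each lookup is assumed to take exactly one reference, and the error path is assumed to put both 'realm' and 'first_realm'. The clear_after_put flag stands for also setting realm = NULL after the put in the else branch.

```c
#include <assert.h>
#include <stddef.h>

#define NREALMS 3	/* records in the hypothetical trace */

/* Hypothetical stand-in for a snaprealm: just a reference count. */
struct realm {
	int refs;	/* references currently held */
	int bad_puts;	/* puts issued with no reference left */
};

static void realm_put(struct realm *r)
{
	if (!r)
		return;
	if (r->refs == 0)
		r->bad_puts++;	/* the kernel could hit a use-after-free here */
	else
		r->refs--;
}

/*
 * Process records until decoding record fail_at fails, then run the
 * assumed error path. Returns the number of puts issued with no
 * reference left, summed over all realms.
 */
static int walk_trace(int fail_at, int clear_after_put)
{
	struct realm realms[NREALMS] = { { 0, 0 } };
	struct realm *realm = NULL, *first_realm = NULL;
	int i, bad_puts = 0;

	assert(fail_at >= 0 && fail_at < NREALMS);

	for (i = 0; i < NREALMS; i++) {
		if (i == fail_at)
			break;			/* decoding fails: "goto bad" */
		realm = &realms[i];
		realm->refs = 1;		/* reference taken by the lookup */
		if (!first_realm) {
			first_realm = realm;	/* reference now owned here */
			realm = NULL;		/* as in the posted patch */
		} else {
			realm_put(realm);
			if (clear_after_put)
				realm = NULL;	/* the additional clearing */
		}
	}

	/* Assumed error path: both pointers are put if still set. */
	realm_put(realm);
	realm_put(first_realm);

	for (i = 0; i < NREALMS; i++)
		bad_puts += realms[i].bad_puts;
	return bad_puts;
}
```

In this model, walk_trace(2, 0) follows the sequence above and reports one put with no reference left, while walk_trace(2, 1) reports none. A failure on the second record, walk_trace(1, 0), is already covered by the posted patch.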
On 07/11/2022 23:21, Ilya Dryomov wrote:
> On Mon, Nov 7, 2022 at 8:18 AM xiubli@redhat.com wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> When decoding the snaps fails, it may leave 'first_realm' and 'realm'
>> pointing to the same snaprealm memory. It will then be put twice,
>> which could cause random use-after-free, BUG_ON, etc. issues.
>>
>> Cc: stable@vger.kernel.org
>> URL: https://tracker.ceph.com/issues/57686
>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>> ---
>>  fs/ceph/snap.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
>> index 9bceed2ebda3..baf17df05107 100644
>> --- a/fs/ceph/snap.c
>> +++ b/fs/ceph/snap.c
>> @@ -849,10 +849,12 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
>>  	if (realm_to_rebuild && p >= e)
>>  		rebuild_snap_realms(realm_to_rebuild, &dirty_realms);
>>
>> -	if (!first_realm)
>> +	if (!first_realm) {
>>  		first_realm = realm;
>> -	else
>> +		realm = NULL;
>
> Hi Xiubo,
>
> I wonder why realm is cleared only in the !first_realm branch. Can't
> the same issue occur with realm?
>
> - first_realm is already set, ceph_put_snap_realm(realm)
> - p < e, goto more
> - decoding fails, goto bad
> - realm is still set and not IS_ERR, ceph_put_snap_realm(realm)
>   <realm is put twice>
Yeah, makes sense.
I will fix this.
Thanks Ilya!
> Thanks,
>
> Ilya
linux-stable-mirror@lists.linaro.org