 
            On Tue, 08 Oct 2024, Olga Kornievskaia wrote:
On Sat, Oct 5, 2024 at 5:09 PM NeilBrown neilb@suse.de wrote:
On Sun, 06 Oct 2024, Chuck Lever wrote:
On Sat, Oct 05, 2024 at 12:20:48PM -0400, Jeff Layton wrote:
On Sat, 2024-10-05 at 10:53 -0400, Chuck Lever wrote:
On Fri, Oct 04, 2024 at 06:04:03PM -0400, Olga Kornievskaia wrote:
When multiple FREE_STATEIDs are sent for the same delegation stateid, it can lead to a possible either use-after-tree or counter refcount underflow errors.
In nfsd4_free_stateid() under the client lock we find a delegation stateid, however the code drops the lock before calling nfs4_put_stid(), that allows another FREE_STATE to find the stateid again. The first one will proceed to then free the stateid which leads to either use-after-free or decrementing already zerod counter.
CC: stable@vger.kernel.org
I assume that the broken commit is pretty old, but this fix does not apply before v6.9 (where sc_status is introduced). I can add "# v6.9+" to the Cc: stable tag.
I don't know. It looks like nfsd4_free_stateid always returned NFS4ERR_LOCKS_HELD on a delegation stateid until 3f29cc82a84c.
But what do folks think about a Fixes: tag?
Could be e1ca12dfb1be ("NFSD: added FREE_STATEID operation"), but that doesn't have the switch statement, which was added by 2da1cec713bc ("nfsd4: simplify free_stateid").
Maybe this one?
3f29cc82a84c nfsd: split sc_status out of sc_typeThat particular bit of the code (and the SC_STATUS_CLOSED flag) was added in that patch, and I don't think you'd want to apply this patch to anything that didn't have it.
OK, if we believe that 3f29cc82 is where the misbehavior started, then I can replace the "Cc: stable@" with "Fixes: 3f29cc82a84c".
I believe the misbehaviour started with Commit: b0fc29d6fcd0 ("nfsd: Ensure stateids remain unique until they are freed") in v3.18.
The bug in the current code is that it assumes that
list_del_init(&dp->dl_recall_lru);actually removes from the the dl_recall_lru, and so a reference must be dropped. But if it wasn't on the list, then that is wrong.
I've actually been chasing a different problem (a UAF) and Ben noticed a problem with doing a double free (by double free_stateid) which this patch addresses. But, this particular line list_del_init() in nfsd4_free_stateid() has been bothering me as I thought this access should be guarded by the "state_lock"? Though I have to admit I've tried that and it doesn't seem to help my UAF problem. Anyway where I'm going with it perhaps the guard "if (!list_empty(&dp->dl_recall_lru))" is still needed (not for double free_stateid by from other possibilities)?
dl_recall_lru is a bit confusing.
Sometimes it is on nn->del_recall_lru in which case it is protected by state_lock. Sometimes it is on clp->cl_revoked in which case it is protected by clp->cl_lock. And sometimes it is on reaplist which doesn't need protection.
So it is important to update the state of the delegation when it is moved between lists or removed from a list. I think we now do thanks to your patch, but it wouldn't hurt to audit all accesses again...
NeilBrown
I was wondering if the nfsd4_free_stateid() somehow could steal the entries from the list while the laundromat is going thru the revocation process. The problem I believe is that the laundromat thread marks the delegation "revoked" but somehow never ends up calling destroy_delegation() (destoy_delegation is the place that frees the lease -- but instead we are left with a lease on the file which causes a new open to call into break_lease() which ends up doing a UAF on a freed delegation stateid -- which was freed by the free_stateid).
So a "if (!list_empty(&dp->dl_recall_lru))" guard might also fix the bug (though adding SC_STATUS_CLOSED is a better fix I think).
Prior to the above 3.17 commit, the relevant code was
static void destroy_revoked_delegation(struct nfs4_delegation *dp) { list_del_init(&dp->dl_recall_lru); remove_stid(&dp->dl_stid); nfs4_put_delegation(dp); }
so the revoked delegation would be removed (remove_stid) from the idr and a subsequent FREE_STATEID request would not find it. The commit removed the remove_stid() call but didn't do anything to prevent the free_stateid being repeated. In that kernel it might have been appropriate to set dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID; was done to unhash_delegation() in that patch.
So I think we should declare Fixes: b0fc29d6fcd0 ("nfsd: Ensure stateids remain unique until they are freed")
and be prepared to provide alternate patches for older kernels.
NeilBrown
Signed-off-by: Olga Kornievskaia okorniev@redhat.com
fs/nfsd/nfs4state.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index ac1859c7cc9d..56b261608af4 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -7154,6 +7154,7 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, switch (s->sc_type) { case SC_TYPE_DELEG: if (s->sc_status & SC_STATUS_REVOKED) {
s->sc_status |= SC_STATUS_CLOSED; spin_unlock(&s->sc_lock); dp = delegstateid(s); list_del_init(&dp->dl_recall_lru);-- 2.43.5
-- Jeff Layton jlayton@kernel.org
-- Chuck Lever