 
            From: Allison Henderson allison.henderson@oracle.com
[ Upstream commit f103df763563ad6849307ed5985d1513acc586dd ]
With parent pointers enabled, a rename operation can update up to 5 inodes: src_dp, target_dp, src_ip, target_ip and wip. This causes their dquots to a be attached to the transaction chain, so we need to increase XFS_QM_TRANS_MAXDQS. This patch also add a helper function xfs_dqlockn to lock an arbitrary number of dquots.
Signed-off-by: Allison Henderson allison.henderson@oracle.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de
[amir: backport to kernels prior to parent pointers to fix an old bug]
A rename operation of a directory (i.e. mv A/C/ B/) may end up changing three different dquot accounts under the following conditions: 1. user (or group) quotas are enabled 2. A/ B/ and C/ have different owner uids (or gids) 3. A/ blocks shrinks after remove of entry C/ 4. B/ blocks grows before adding of entry C/ 5. A/ ino <= XFS_DIR2_MAX_SHORT_INUM 6. B/ ino > XFS_DIR2_MAX_SHORT_INUM 7. C/ is converted from sf to block format, because its parent entry needs to be stored as 8 bytes (see xfs_dir2_sf_replace_needblock)
When all conditions are met (observed in the wild) we get this assertion:
XFS: Assertion failed: qtrx, file: fs/xfs/xfs_trans_dquot.c, line: 207
The upstream commit fixed this bug as a side effect, so decided to apply it as is rather than changing XFS_QM_TRANS_MAXDQS to 3 in stable kernels.
The Fixes commit below is NOT the commit that introduced the bug, but for some reason, which is not explained in the commit message, it fixes the comment to state that highest number of dquots of one type is 3 and not 2 (which leads to the assertion), without actually fixing it.
The change of wording from "usr, grp OR prj" to "usr, grp and prj" suggests that there may have been a confusion between "the number of dquote of one type" and "the number of dquot types" (which is also 3), so the comment change was only accidentally correct.
Fixes: 10f73d27c8e9 ("xfs: fix the comment explaining xfs_trans_dqlockedjoin") Cc: stable@vger.kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com ---
Christoph,
This is a cognitive challenge. can you say what you where thinking in 2013 when making the comment change in the Fixes commit? Is my speculation above correct?
Catherine and Leah,
I decided that cherry-pick this upstream commit as is with a commit message addendum was the best stable tree strategy. The commit applies cleanly to 5.15.y, so I assume it does for 6.6 and 6.1 as well. I ran my tests on 5.15.y and nothing fell out, but did not try to reproduce these complex assertion in a test.
Could you take this candidate backport patch to a spin on your test branch?
What do you all think about this?
Thanks, Amir.
fs/xfs/xfs_dquot.c | 41 ++++++++++++++++++++++++++++++++++++++++ fs/xfs/xfs_dquot.h | 1 + fs/xfs/xfs_qm.h | 2 +- fs/xfs/xfs_trans_dquot.c | 15 ++++++++++----- 4 files changed, 53 insertions(+), 6 deletions(-)
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c index c15d61d47a06..6b05d47aa19b 100644 --- a/fs/xfs/xfs_dquot.c +++ b/fs/xfs/xfs_dquot.c @@ -1360,6 +1360,47 @@ xfs_dqlock2( } }
+static int +xfs_dqtrx_cmp( + const void *a, + const void *b) +{ + const struct xfs_dqtrx *qa = a; + const struct xfs_dqtrx *qb = b; + + if (qa->qt_dquot->q_id > qb->qt_dquot->q_id) + return 1; + if (qa->qt_dquot->q_id < qb->qt_dquot->q_id) + return -1; + return 0; +} + +void +xfs_dqlockn( + struct xfs_dqtrx *q) +{ + unsigned int i; + + BUILD_BUG_ON(XFS_QM_TRANS_MAXDQS > MAX_LOCKDEP_SUBCLASSES); + + /* Sort in order of dquot id, do not allow duplicates */ + for (i = 0; i < XFS_QM_TRANS_MAXDQS && q[i].qt_dquot != NULL; i++) { + unsigned int j; + + for (j = 0; j < i; j++) + ASSERT(q[i].qt_dquot != q[j].qt_dquot); + } + if (i == 0) + return; + + sort(q, i, sizeof(struct xfs_dqtrx), xfs_dqtrx_cmp, NULL); + + mutex_lock(&q[0].qt_dquot->q_qlock); + for (i = 1; i < XFS_QM_TRANS_MAXDQS && q[i].qt_dquot != NULL; i++) + mutex_lock_nested(&q[i].qt_dquot->q_qlock, + XFS_QLOCK_NESTED + i - 1); +} + int __init xfs_qm_init(void) { diff --git a/fs/xfs/xfs_dquot.h b/fs/xfs/xfs_dquot.h index 6b5e3cf40c8b..0e954f88811f 100644 --- a/fs/xfs/xfs_dquot.h +++ b/fs/xfs/xfs_dquot.h @@ -231,6 +231,7 @@ int xfs_qm_dqget_uncached(struct xfs_mount *mp, void xfs_qm_dqput(struct xfs_dquot *dqp);
void xfs_dqlock2(struct xfs_dquot *, struct xfs_dquot *); +void xfs_dqlockn(struct xfs_dqtrx *q);
void xfs_dquot_set_prealloc_limits(struct xfs_dquot *);
diff --git a/fs/xfs/xfs_qm.h b/fs/xfs/xfs_qm.h index 442a0f97a9d4..f75c12c4c6a0 100644 --- a/fs/xfs/xfs_qm.h +++ b/fs/xfs/xfs_qm.h @@ -121,7 +121,7 @@ enum { XFS_QM_TRANS_PRJ, XFS_QM_TRANS_DQTYPES }; -#define XFS_QM_TRANS_MAXDQS 2 +#define XFS_QM_TRANS_MAXDQS 5 struct xfs_dquot_acct { struct xfs_dqtrx dqs[XFS_QM_TRANS_DQTYPES][XFS_QM_TRANS_MAXDQS]; }; diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c index 955c457e585a..99a03acd4488 100644 --- a/fs/xfs/xfs_trans_dquot.c +++ b/fs/xfs/xfs_trans_dquot.c @@ -268,24 +268,29 @@ xfs_trans_mod_dquot(
/* * Given an array of dqtrx structures, lock all the dquots associated and join - * them to the transaction, provided they have been modified. We know that the - * highest number of dquots of one type - usr, grp and prj - involved in a - * transaction is 3 so we don't need to make this very generic. + * them to the transaction, provided they have been modified. */ STATIC void xfs_trans_dqlockedjoin( struct xfs_trans *tp, struct xfs_dqtrx *q) { + unsigned int i; ASSERT(q[0].qt_dquot != NULL); if (q[1].qt_dquot == NULL) { xfs_dqlock(q[0].qt_dquot); xfs_trans_dqjoin(tp, q[0].qt_dquot); - } else { - ASSERT(XFS_QM_TRANS_MAXDQS == 2); + } else if (q[2].qt_dquot == NULL) { xfs_dqlock2(q[0].qt_dquot, q[1].qt_dquot); xfs_trans_dqjoin(tp, q[0].qt_dquot); xfs_trans_dqjoin(tp, q[1].qt_dquot); + } else { + xfs_dqlockn(q); + for (i = 0; i < XFS_QM_TRANS_MAXDQS; i++) { + if (q[i].qt_dquot == NULL) + break; + xfs_trans_dqjoin(tp, q[i].qt_dquot); + } } }