When a directory is moved to a different directory, some filesystems
(udf, ext4, ocfs2, f2fs, and likely gfs2, reiserfs, and others) need to
update their pointer to the parent and this must not race with other
operations on the directory. Lock the directories when they are moved.
Although not all filesystems need this locking, we perform it in
vfs_rename() because getting the lock ordering right is really difficult
and we don't want to expose these locking details to filesystems.
CC: stable(a)vger.kernel.org
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
.../filesystems/directory-locking.rst | 26 ++++++++++---------
fs/namei.c | 22 ++++++++++------
2 files changed, 28 insertions(+), 20 deletions(-)
diff --git a/Documentation/filesystems/directory-locking.rst b/Documentation/filesystems/directory-locking.rst
index 504ba940c36c..dccd61c7c5c3 100644
--- a/Documentation/filesystems/directory-locking.rst
+++ b/Documentation/filesystems/directory-locking.rst
@@ -22,12 +22,11 @@ exclusive.
3) object removal. Locking rules: caller locks parent, finds victim,
locks victim and calls the method. Locks are exclusive.
-4) rename() that is _not_ cross-directory. Locking rules: caller locks
-the parent and finds source and target. In case of exchange (with
-RENAME_EXCHANGE in flags argument) lock both. In any case,
-if the target already exists, lock it. If the source is a non-directory,
-lock it. If we need to lock both, lock them in inode pointer order.
-Then call the method. All locks are exclusive.
+4) rename() that is _not_ cross-directory. Locking rules: caller locks the
+parent and finds source and target. We lock both (provided they exist). If we
+need to lock two inodes of different type (dir vs non-dir), we lock directory
+first. If we need to lock two inodes of the same type, lock them in inode
+pointer order. Then call the method. All locks are exclusive.
NB: we might get away with locking the source (and target in exchange
case) shared.
@@ -44,15 +43,17 @@ All locks are exclusive.
rules:
* lock the filesystem
- * lock parents in "ancestors first" order.
+ * lock parents in "ancestors first" order. If one is not ancestor of
+ the other, lock them in inode pointer order.
* find source and target.
* if old parent is equal to or is a descendent of target
fail with -ENOTEMPTY
* if new parent is equal to or is a descendent of source
fail with -ELOOP
- * If it's an exchange, lock both the source and the target.
- * If the target exists, lock it. If the source is a non-directory,
- lock it. If we need to lock both, do so in inode pointer order.
+ * Lock both the source and the target provided they exist. If we
+ need to lock two inodes of different type (dir vs non-dir), we lock
+ the directory first. If we need to lock two inodes of the same type,
+ lock them in inode pointer order.
* call the method.
All ->i_rwsem are taken exclusive. Again, we might get away with locking
@@ -66,8 +67,9 @@ If no directory is its own ancestor, the scheme above is deadlock-free.
Proof:
- First of all, at any moment we have a partial ordering of the
- objects - A < B iff A is an ancestor of B.
+ First of all, at any moment we have a linear ordering of the
+ objects - A < B iff (A is an ancestor of B) or (B is not an ancestor
+ of A and ptr(A) < ptr(B)).
That ordering can change. However, the following is true:
diff --git a/fs/namei.c b/fs/namei.c
index 148570aabe74..6a5e26a529e1 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4731,7 +4731,7 @@ SYSCALL_DEFINE2(link, const char __user *, oldname, const char __user *, newname
* sb->s_vfs_rename_mutex. We might be more accurate, but that's another
* story.
* c) we have to lock _four_ objects - parents and victim (if it exists),
- * and source (if it is not a directory).
+ * and source.
* And that - after we got ->i_mutex on parents (until then we don't know
* whether the target exists). Solution: try to be smart with locking
* order for inodes. We rely on the fact that tree topology may change
@@ -4815,10 +4815,16 @@ int vfs_rename(struct renamedata *rd)
take_dentry_name_snapshot(&old_name, old_dentry);
dget(new_dentry);
- if (!is_dir || (flags & RENAME_EXCHANGE))
- lock_two_nondirectories(source, target);
- else if (target)
- inode_lock(target);
+ /*
+ * Lock all moved children. Moved directories may need to change parent
+ * pointer so they need the lock to prevent against concurrent
+ * directory changes moving parent pointer. For regular files we've
+ * historically always done this. The lockdep locking subclasses are
+ * somewhat arbitrary but RENAME_EXCHANGE in particular can swap
+ * regular files and directories so it's difficult to tell which
+ * subclasses to use.
+ */
+ lock_two_inodes(source, target, I_MUTEX_NORMAL, I_MUTEX_NONDIR2);
error = -EPERM;
if (IS_SWAPFILE(source) || (target && IS_SWAPFILE(target)))
@@ -4866,9 +4872,9 @@ int vfs_rename(struct renamedata *rd)
d_exchange(old_dentry, new_dentry);
}
out:
- if (!is_dir || (flags & RENAME_EXCHANGE))
- unlock_two_nondirectories(source, target);
- else if (target)
+ if (source)
+ inode_unlock(source);
+ if (target)
inode_unlock(target);
dput(new_dentry);
if (!error) {
--
2.35.3
Please cherry-pick 9b7c68b3911aef84afa4cbfc31bce20f10570d51
("netfilter: ctnetlink: Support offloaded conntrack entry deletion") to
all supported stable trees except for 4.14. The lack of it makes the
flowtables feature much more difficult (if not impossible) to use in
environments where connection tracking entries must be removed to
terminate flows. The diffstat is -8,+0 and the commit only removes
code that was not necessary to begin with.
--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab
From: Pierre Gondois <pierre.gondois(a)arm.com>
commit 3522340199cc060b70f0094e3039bdb43c3f6ee1 upstream
fetch_cache_info() tries to get the number of cache leaves/levels
for each CPU in order to pre-allocate memory for cacheinfo struct.
Allocating this memory later triggers a:
'BUG: sleeping function called from invalid context'
in PREEMPT_RT kernels.
If there is no cache related information available in DT or ACPI,
fetch_cache_info() fails and an error message is printed:
'Early cacheinfo failed, ret = ...'
Not having cache information should be a valid configuration.
Remove the error message if fetch_cache_info() fails with -ENOENT.
Suggested-by: Conor Dooley <conor.dooley(a)microchip.com>
Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
Signed-off-by: Pierre Gondois <pierre.gondois(a)arm.com>
Reviewed-by: Conor Dooley <conor.dooley(a)microchip.com>
Link: https://lore.kernel.org/r/20230414081453.244787-4-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla(a)arm.com>
Signed-off-by: Florian Fainelli <florian.fainelli(a)broadcom.com>
---
Changes in v2:
- Added missing upstream commit reference
- Added missing S-o-b
drivers/base/arch_topology.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 147fb7d4af96..b741b5ba82bd 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -843,10 +843,11 @@ void __init init_cpu_topology(void)
for_each_possible_cpu(cpu) {
ret = fetch_cache_info(cpu);
- if (ret) {
+ if (!ret)
+ continue;
+ else if (ret != -ENOENT)
pr_err("Early cacheinfo failed, ret = %d\n", ret);
- break;
- }
+ return;
}
}
--
2.25.1
The patch below does not apply to the 6.3-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.3.y
git checkout FETCH_HEAD
git cherry-pick -x 4badf2eb1e986bdbf34dd2f5d4c979553a86fe54
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052858-danger-kilowatt-29cc@gregkh' --subject-prefix 'PATCH 6.3.y' HEAD^..
Possible dependencies:
4badf2eb1e98 ("cpufreq: amd-pstate: Add ->fast_switch() callback")
2dd6d0ebf740 ("cpufreq: amd-pstate: Add guided autonomous mode")
3e6e07805764 ("Documentation: cpufreq: amd-pstate: Move amd_pstate param to alphabetical order")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 4badf2eb1e986bdbf34dd2f5d4c979553a86fe54 Mon Sep 17 00:00:00 2001
From: "Gautham R. Shenoy" <gautham.shenoy(a)amd.com>
Date: Wed, 17 May 2023 16:28:15 +0000
Subject: [PATCH] cpufreq: amd-pstate: Add ->fast_switch() callback
Schedutil normally calls the adjust_perf callback for drivers with
adjust_perf callback available and fast_switch_possible flag set.
However, when frequency invariance is disabled and schedutil tries to
invoke fast_switch. So, there is a chance of kernel crash if this
function pointer is not set. To protect against this scenario add
fast_switch callback to amd_pstate driver.
Fixes: 1d215f0319c2 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy(a)amd.com>
Signed-off-by: Wyes Karny <wyes.karny(a)amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 5a3d4aa0f45a..45711fc0a856 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -444,9 +444,8 @@ static int amd_pstate_verify(struct cpufreq_policy_data *policy)
return 0;
}
-static int amd_pstate_target(struct cpufreq_policy *policy,
- unsigned int target_freq,
- unsigned int relation)
+static int amd_pstate_update_freq(struct cpufreq_policy *policy,
+ unsigned int target_freq, bool fast_switch)
{
struct cpufreq_freqs freqs;
struct amd_cpudata *cpudata = policy->driver_data;
@@ -465,14 +464,37 @@ static int amd_pstate_target(struct cpufreq_policy *policy,
des_perf = DIV_ROUND_CLOSEST(target_freq * cap_perf,
cpudata->max_freq);
- cpufreq_freq_transition_begin(policy, &freqs);
+ WARN_ON(fast_switch && !policy->fast_switch_enabled);
+ /*
+ * If fast_switch is desired, then there aren't any registered
+ * transition notifiers. See comment for
+ * cpufreq_enable_fast_switch().
+ */
+ if (!fast_switch)
+ cpufreq_freq_transition_begin(policy, &freqs);
+
amd_pstate_update(cpudata, min_perf, des_perf,
- max_perf, false, policy->governor->flags);
- cpufreq_freq_transition_end(policy, &freqs, false);
+ max_perf, fast_switch, policy->governor->flags);
+
+ if (!fast_switch)
+ cpufreq_freq_transition_end(policy, &freqs, false);
return 0;
}
+static int amd_pstate_target(struct cpufreq_policy *policy,
+ unsigned int target_freq,
+ unsigned int relation)
+{
+ return amd_pstate_update_freq(policy, target_freq, false);
+}
+
+static unsigned int amd_pstate_fast_switch(struct cpufreq_policy *policy,
+ unsigned int target_freq)
+{
+ return amd_pstate_update_freq(policy, target_freq, true);
+}
+
static void amd_pstate_adjust_perf(unsigned int cpu,
unsigned long _min_perf,
unsigned long target_perf,
@@ -715,6 +737,7 @@ static int amd_pstate_cpu_exit(struct cpufreq_policy *policy)
freq_qos_remove_request(&cpudata->req[1]);
freq_qos_remove_request(&cpudata->req[0]);
+ policy->fast_switch_possible = false;
kfree(cpudata);
return 0;
@@ -1309,6 +1332,7 @@ static struct cpufreq_driver amd_pstate_driver = {
.flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
.verify = amd_pstate_verify,
.target = amd_pstate_target,
+ .fast_switch = amd_pstate_fast_switch,
.init = amd_pstate_cpu_init,
.exit = amd_pstate_cpu_exit,
.suspend = amd_pstate_cpu_suspend,
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 46930b7cc7727271c9c27aac1fdc97a8645e2d00
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023052844-splatter-emphasize-8de2@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
46930b7cc772 ("block: fix bio-cache for passthru IO")
7e2e355dd9c9 ("block: extend bio-cache for non-polled requests")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 46930b7cc7727271c9c27aac1fdc97a8645e2d00 Mon Sep 17 00:00:00 2001
From: Anuj Gupta <anuj20.g(a)samsung.com>
Date: Tue, 23 May 2023 16:47:09 +0530
Subject: [PATCH] block: fix bio-cache for passthru IO
commit <8af870aa5b847> ("block: enable bio caching use for passthru IO")
introduced bio-cache for passthru IO. In case when nr_vecs are greater
than BIO_INLINE_VECS, bio and bvecs are allocated from mempool (instead
of percpu cache) and REQ_ALLOC_CACHE is cleared. This causes the side
effect of not freeing bio/bvecs into mempool on completion.
This patch lets the passthru IO fallback to allocation using bio_kmalloc
when nr_vecs are greater than BIO_INLINE_VECS. The corresponding bio
is freed during call to blk_mq_map_bio_put during completion.
Cc: stable(a)vger.kernel.org # 6.1
fixes <8af870aa5b847> ("block: enable bio caching use for passthru IO")
Signed-off-by: Anuj Gupta <anuj20.g(a)samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k(a)samsung.com>
Link: https://lore.kernel.org/r/20230523111709.145676-1-anuj20.g@samsung.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/block/blk-map.c b/block/blk-map.c
index 04c55f1c492e..46eed2e627c3 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -248,7 +248,7 @@ static struct bio *blk_rq_map_bio_alloc(struct request *rq,
{
struct bio *bio;
- if (rq->cmd_flags & REQ_ALLOC_CACHE) {
+ if (rq->cmd_flags & REQ_ALLOC_CACHE && (nr_vecs <= BIO_INLINE_VECS)) {
bio = bio_alloc_bioset(NULL, nr_vecs, rq->cmd_flags, gfp_mask,
&fs_bio_set);
if (!bio)
On Tue, May 30, 2023 at 07:58:01AM +0000, Christian Loehle wrote:
> commit 003fb0a51162d940f25fc35e70b0996a12c9e08a upstream.
<snip>
You sent this, and the other commit, in html format, which made it
impossible to apply (and the mailing lists rejected your change as
well.)
Please fix up your email client and resend in non-html format and I will
be glad to queue this up.
thanks,
greg k-h