There is a race condition generic_shutdown_super() and
__btrfs_run_defrag_inode().
Consider the following scenario:
umount thread: btrfs-cleaner thread:
btrfs_run_delayed_iputs()
->run_delayed_iput_locked()
->iput(inode)
// Here the inode (ie ino 261) will be cleared and freed
btrfs_kill_super()
->generic_shutdown_super()
btrfs_run_defrag_inodes()
->__btrfs_run_defrag_inode()
->btrfs_iget(ino)
// The inode 261 was recreated with i_count=1
// and added to the sb list
->evict_inodes(sb) // After some work
// inode 261 was added ->iput(inode)
// to the dispose list ->iput_funal()
->evict(inode) ->evict(inode)
Now, we have two threads simultaneously evicting
the same inode, which led to a bug.
The above behavior can be confirmed by the log I added for debugging
and the log printed when BUG was triggered.
Due to space limitations, I cannot paste the full diff and here is a
brief describtion.
First, within __btrfs_run_defrag_inode(), set inode->i_state |= (1<<19)
just before calling iput().
Within the dispose_list(), check the flag, if the flag was set, then
pr_info("bug! double evict! crash will happen! state is 0x%lx\n", inode->i_state);
Here is the printed log when the BUG was triggered:
[ 190.686726][ T2336] bug! double evict! crash will happen! state is 0x80020
[ 190.687647][ T2336] ------------[ cut here ]------------
[ 190.688294][ T2336] kernel BUG at fs/inode.c:626!
[ 190.688939][ T2336] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 190.689792][ T2336] CPU: 1 PID: 2336 Comm: a.out Not tainted 6.10.0-rc2-00223-g0c529ab65ef8-dirty #109
[ 190.690894][ T2336] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 190.692111][ T2336] RIP: 0010:clear_inode+0x15b/0x190
// some logs...
[ 190.704501][ T2336] btrfs_evict_inode+0x529/0xe80
[ 190.706966][ T2336] evict+0x2ed/0x6c0
[ 190.707209][ T2336] dispose_list+0x62/0x260
[ 190.707490][ T2336] evict_inodes+0x34e/0x450
To prevent this behavior, we need to set BTRFS_FS_CLOSING_START
before kill_anon_super() to ensure that
btrfs_run_defrag_inodes() doesn't continue working after unmount.
Reported-and-tested-by: syzbot+67ba3c42bcbb4665d3ad(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=67ba3c42bcbb4665d3ad
CC: stable(a)vger.kernel.org
Fixes: c146afad2c7f ("Btrfs: mount ro and remount support")
Signed-off-by: Julian Sun <sunjunchao2870(a)gmail.com>
---
fs/btrfs/super.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index f05cce7c8b8d..f7e87fe583ab 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2093,6 +2093,7 @@ static int btrfs_get_tree(struct fs_context *fc)
static void btrfs_kill_super(struct super_block *sb)
{
struct btrfs_fs_info *fs_info = btrfs_sb(sb);
+ set_bit(BTRFS_FS_CLOSING_START, &fs_info->flags);
kill_anon_super(sb);
btrfs_free_fs_info(fs_info);
}
--
2.39.2
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x c0a1ef9c5be72ff28a5413deb1b3e1a066593c13
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024082600-trodden-majesty-7be0@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
c0a1ef9c5be7 ("thermal: of: Fix OF node leak in of_thermal_zone_find() error paths")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c0a1ef9c5be72ff28a5413deb1b3e1a066593c13 Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Date: Wed, 14 Aug 2024 21:58:23 +0200
Subject: [PATCH] thermal: of: Fix OF node leak in of_thermal_zone_find() error
paths
Terminating for_each_available_child_of_node() loop requires dropping OF
node reference, so bailing out on errors misses this. Solve the OF node
reference leak with scoped for_each_available_child_of_node_scoped().
Fixes: 3fd6d6e2b4e8 ("thermal/of: Rework the thermal device tree initialization")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Reviewed-by: Chen-Yu Tsai <wenst(a)chromium.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
Link: https://patch.msgid.link/20240814195823.437597-3-krzysztof.kozlowski@linaro…
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index b08a9b64718d..1f252692815a 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -184,14 +184,14 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
* Search for each thermal zone, a defined sensor
* corresponding to the one passed as parameter
*/
- for_each_available_child_of_node(np, tz) {
+ for_each_available_child_of_node_scoped(np, child) {
int count, i;
- count = of_count_phandle_with_args(tz, "thermal-sensors",
+ count = of_count_phandle_with_args(child, "thermal-sensors",
"#thermal-sensor-cells");
if (count <= 0) {
- pr_err("%pOFn: missing thermal sensor\n", tz);
+ pr_err("%pOFn: missing thermal sensor\n", child);
tz = ERR_PTR(-EINVAL);
goto out;
}
@@ -200,18 +200,19 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
int ret;
- ret = of_parse_phandle_with_args(tz, "thermal-sensors",
+ ret = of_parse_phandle_with_args(child, "thermal-sensors",
"#thermal-sensor-cells",
i, &sensor_specs);
if (ret < 0) {
- pr_err("%pOFn: Failed to read thermal-sensors cells: %d\n", tz, ret);
+ pr_err("%pOFn: Failed to read thermal-sensors cells: %d\n", child, ret);
tz = ERR_PTR(ret);
goto out;
}
if ((sensor == sensor_specs.np) && id == (sensor_specs.args_count ?
sensor_specs.args[0] : 0)) {
- pr_debug("sensor %pOFn id=%d belongs to %pOFn\n", sensor, id, tz);
+ pr_debug("sensor %pOFn id=%d belongs to %pOFn\n", sensor, id, child);
+ tz = no_free_ptr(child);
goto out;
}
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x c0a1ef9c5be72ff28a5413deb1b3e1a066593c13
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024082659-veto-ladies-d1b3@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
c0a1ef9c5be7 ("thermal: of: Fix OF node leak in of_thermal_zone_find() error paths")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c0a1ef9c5be72ff28a5413deb1b3e1a066593c13 Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Date: Wed, 14 Aug 2024 21:58:23 +0200
Subject: [PATCH] thermal: of: Fix OF node leak in of_thermal_zone_find() error
paths
Terminating for_each_available_child_of_node() loop requires dropping OF
node reference, so bailing out on errors misses this. Solve the OF node
reference leak with scoped for_each_available_child_of_node_scoped().
Fixes: 3fd6d6e2b4e8 ("thermal/of: Rework the thermal device tree initialization")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Reviewed-by: Chen-Yu Tsai <wenst(a)chromium.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
Link: https://patch.msgid.link/20240814195823.437597-3-krzysztof.kozlowski@linaro…
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index b08a9b64718d..1f252692815a 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -184,14 +184,14 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
* Search for each thermal zone, a defined sensor
* corresponding to the one passed as parameter
*/
- for_each_available_child_of_node(np, tz) {
+ for_each_available_child_of_node_scoped(np, child) {
int count, i;
- count = of_count_phandle_with_args(tz, "thermal-sensors",
+ count = of_count_phandle_with_args(child, "thermal-sensors",
"#thermal-sensor-cells");
if (count <= 0) {
- pr_err("%pOFn: missing thermal sensor\n", tz);
+ pr_err("%pOFn: missing thermal sensor\n", child);
tz = ERR_PTR(-EINVAL);
goto out;
}
@@ -200,18 +200,19 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
int ret;
- ret = of_parse_phandle_with_args(tz, "thermal-sensors",
+ ret = of_parse_phandle_with_args(child, "thermal-sensors",
"#thermal-sensor-cells",
i, &sensor_specs);
if (ret < 0) {
- pr_err("%pOFn: Failed to read thermal-sensors cells: %d\n", tz, ret);
+ pr_err("%pOFn: Failed to read thermal-sensors cells: %d\n", child, ret);
tz = ERR_PTR(ret);
goto out;
}
if ((sensor == sensor_specs.np) && id == (sensor_specs.args_count ?
sensor_specs.args[0] : 0)) {
- pr_debug("sensor %pOFn id=%d belongs to %pOFn\n", sensor, id, tz);
+ pr_debug("sensor %pOFn id=%d belongs to %pOFn\n", sensor, id, child);
+ tz = no_free_ptr(child);
goto out;
}
}
Most firmware names are hardcoded strings, or are constructed from fairly
constrained format strings where the dynamic parts are just some hex
numbers or such.
However, there are a couple codepaths in the kernel where firmware file
names contain string components that are passed through from a device or
semi-privileged userspace; the ones I could find (not counting interfaces
that require root privileges) are:
- lpfc_sli4_request_firmware_update() seems to construct the firmware
filename from "ModelName", a string that was previously parsed out of
some descriptor ("Vital Product Data") in lpfc_fill_vpd()
- nfp_net_fw_find() seems to construct a firmware filename from a model
name coming from nfp_hwinfo_lookup(pf->hwinfo, "nffw.partno"), which I
think parses some descriptor that was read from the device.
(But this case likely isn't exploitable because the format string looks
like "netronome/nic_%s", and there shouldn't be any *folders* starting
with "netronome/nic_". The previous case was different because there,
the "%s" is *at the start* of the format string.)
- module_flash_fw_schedule() is reachable from the
ETHTOOL_MSG_MODULE_FW_FLASH_ACT netlink command, which is marked as
GENL_UNS_ADMIN_PERM (meaning CAP_NET_ADMIN inside a user namespace is
enough to pass the privilege check), and takes a userspace-provided
firmware name.
(But I think to reach this case, you need to have CAP_NET_ADMIN over a
network namespace that a special kind of ethernet device is mapped into,
so I think this is not a viable attack path in practice.)
Fix it by rejecting any firmware names containing ".." path components.
For what it's worth, I went looking and haven't found any USB device
drivers that use the firmware loader dangerously.
Cc: stable(a)vger.kernel.org
Fixes: abb139e75c2c ("firmware: teach the kernel to load firmware files directly from the filesystem")
Signed-off-by: Jann Horn <jannh(a)google.com>
---
Changes in v2:
- describe fix in commit message (dakr)
- write check more clearly and with comment in separate helper (dakr)
- document new restriction in comment above request_firmware() (dakr)
- warn when new restriction is triggered
- Link to v1: https://lore.kernel.org/r/20240820-firmware-traversal-v1-1-8699ffaa9276@goo…
---
drivers/base/firmware_loader/main.c | 41 +++++++++++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)
diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c
index a03ee4b11134..dd47ce9a761f 100644
--- a/drivers/base/firmware_loader/main.c
+++ b/drivers/base/firmware_loader/main.c
@@ -849,6 +849,37 @@ static void fw_log_firmware_info(const struct firmware *fw, const char *name,
{}
#endif
+/*
+ * Reject firmware file names with ".." path components.
+ * There are drivers that construct firmware file names from device-supplied
+ * strings, and we don't want some device to be able to tell us "I would like to
+ * be sent my firmware from ../../../etc/shadow, please".
+ *
+ * Search for ".." surrounded by either '/' or start/end of string.
+ *
+ * This intentionally only looks at the firmware name, not at the firmware base
+ * directory or at symlink contents.
+ */
+static bool name_contains_dotdot(const char *name)
+{
+ size_t name_len = strlen(name);
+ size_t i;
+
+ if (name_len < 2)
+ return false;
+ for (i = 0; i < name_len - 1; i++) {
+ /* do we see a ".." sequence? */
+ if (name[i] != '.' || name[i+1] != '.')
+ continue;
+
+ /* is it a path component? */
+ if ((i == 0 || name[i-1] == '/') &&
+ (i == name_len - 2 || name[i+2] == '/'))
+ return true;
+ }
+ return false;
+}
+
/* called from request_firmware() and request_firmware_work_func() */
static int
_request_firmware(const struct firmware **firmware_p, const char *name,
@@ -869,6 +900,14 @@ _request_firmware(const struct firmware **firmware_p, const char *name,
goto out;
}
+ if (name_contains_dotdot(name)) {
+ dev_warn(device,
+ "Firmware load for '%s' refused, path contains '..' component",
+ name);
+ ret = -EINVAL;
+ goto out;
+ }
+
ret = _request_firmware_prepare(&fw, name, device, buf, size,
offset, opt_flags);
if (ret <= 0) /* error or already assigned */
@@ -946,6 +985,8 @@ _request_firmware(const struct firmware **firmware_p, const char *name,
* @name will be used as $FIRMWARE in the uevent environment and
* should be distinctive enough not to be confused with any other
* firmware image for this or any other device.
+ * It must not contain any ".." path components - "foo/bar..bin" is
+ * allowed, but "foo/../bar.bin" is not.
*
* Caller must hold the reference count of @device.
*
---
base-commit: b0da640826ba3b6506b4996a6b23a429235e6923
change-id: 20240820-firmware-traversal-6df8501b0fe4
--
Jann Horn <jannh(a)google.com>
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x afc954fd223ded70b1fa000767e2531db55cce58
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024082655-virtuous-reggae-54f4@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
afc954fd223d ("thermal: of: Fix OF node leak in thermal_of_trips_init() error path")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From afc954fd223ded70b1fa000767e2531db55cce58 Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Date: Wed, 14 Aug 2024 21:58:21 +0200
Subject: [PATCH] thermal: of: Fix OF node leak in thermal_of_trips_init()
error path
Terminating for_each_child_of_node() loop requires dropping OF node
reference, so bailing out after thermal_of_populate_trip() error misses
this. Solve the OF node reference leak with scoped
for_each_child_of_node_scoped().
Fixes: d0c75fa2c17f ("thermal/of: Initialize trip points separately")
Cc: All applicable <stable(a)vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Reviewed-by: Chen-Yu Tsai <wenst(a)chromium.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
Link: https://patch.msgid.link/20240814195823.437597-1-krzysztof.kozlowski@linaro…
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index aa34b6e82e26..30f8d6e70484 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -125,7 +125,7 @@ static int thermal_of_populate_trip(struct device_node *np,
static struct thermal_trip *thermal_of_trips_init(struct device_node *np, int *ntrips)
{
struct thermal_trip *tt;
- struct device_node *trips, *trip;
+ struct device_node *trips;
int ret, count;
trips = of_get_child_by_name(np, "trips");
@@ -150,7 +150,7 @@ static struct thermal_trip *thermal_of_trips_init(struct device_node *np, int *n
*ntrips = count;
count = 0;
- for_each_child_of_node(trips, trip) {
+ for_each_child_of_node_scoped(trips, trip) {
ret = thermal_of_populate_trip(trip, &tt[count++]);
if (ret)
goto out_kfree;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x afc954fd223ded70b1fa000767e2531db55cce58
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024082654-puppy-crying-cf89@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
afc954fd223d ("thermal: of: Fix OF node leak in thermal_of_trips_init() error path")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From afc954fd223ded70b1fa000767e2531db55cce58 Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Date: Wed, 14 Aug 2024 21:58:21 +0200
Subject: [PATCH] thermal: of: Fix OF node leak in thermal_of_trips_init()
error path
Terminating for_each_child_of_node() loop requires dropping OF node
reference, so bailing out after thermal_of_populate_trip() error misses
this. Solve the OF node reference leak with scoped
for_each_child_of_node_scoped().
Fixes: d0c75fa2c17f ("thermal/of: Initialize trip points separately")
Cc: All applicable <stable(a)vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Reviewed-by: Chen-Yu Tsai <wenst(a)chromium.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
Link: https://patch.msgid.link/20240814195823.437597-1-krzysztof.kozlowski@linaro…
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index aa34b6e82e26..30f8d6e70484 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -125,7 +125,7 @@ static int thermal_of_populate_trip(struct device_node *np,
static struct thermal_trip *thermal_of_trips_init(struct device_node *np, int *ntrips)
{
struct thermal_trip *tt;
- struct device_node *trips, *trip;
+ struct device_node *trips;
int ret, count;
trips = of_get_child_by_name(np, "trips");
@@ -150,7 +150,7 @@ static struct thermal_trip *thermal_of_trips_init(struct device_node *np, int *n
*ntrips = count;
count = 0;
- for_each_child_of_node(trips, trip) {
+ for_each_child_of_node_scoped(trips, trip) {
ret = thermal_of_populate_trip(trip, &tt[count++]);
if (ret)
goto out_kfree;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x e3e4bf58bad1576ac732a1429f53e3d4bfb82b4b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024082650-decompose-customer-0b61@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
e3e4bf58bad1 ("drm/amdgpu/sdma5.2: limit wptr workaround to sdma 5.2.1")
a03ebf116303 ("drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell")
94b1e028e15c ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e3e4bf58bad1576ac732a1429f53e3d4bfb82b4b Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher(a)amd.com>
Date: Wed, 14 Aug 2024 10:28:24 -0400
Subject: [PATCH] drm/amdgpu/sdma5.2: limit wptr workaround to sdma 5.2.1
The workaround seems to cause stability issues on other
SDMA 5.2.x IPs.
Fixes: a03ebf116303 ("drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3556
Acked-by: Ruijing Dong <ruijing.dong(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
(cherry picked from commit 2dc3851ef7d9c5439ea8e9623fc36878f3b40649)
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index af1e90159ce3..2e72d445415f 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -176,14 +176,16 @@ static void sdma_v5_2_ring_set_wptr(struct amdgpu_ring *ring)
DRM_DEBUG("calling WDOORBELL64(0x%08x, 0x%016llx)\n",
ring->doorbell_index, ring->wptr << 2);
WDOORBELL64(ring->doorbell_index, ring->wptr << 2);
- /* SDMA seems to miss doorbells sometimes when powergating kicks in.
- * Updating the wptr directly will wake it. This is only safe because
- * we disallow gfxoff in begin_use() and then allow it again in end_use().
- */
- WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR),
- lower_32_bits(ring->wptr << 2));
- WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR_HI),
- upper_32_bits(ring->wptr << 2));
+ if (amdgpu_ip_version(adev, SDMA0_HWIP, 0) == IP_VERSION(5, 2, 1)) {
+ /* SDMA seems to miss doorbells sometimes when powergating kicks in.
+ * Updating the wptr directly will wake it. This is only safe because
+ * we disallow gfxoff in begin_use() and then allow it again in end_use().
+ */
+ WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR),
+ lower_32_bits(ring->wptr << 2));
+ WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR_HI),
+ upper_32_bits(ring->wptr << 2));
+ }
} else {
DRM_DEBUG("Not using doorbell -- "
"mmSDMA%i_GFX_RB_WPTR == 0x%08x "
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x e3e4bf58bad1576ac732a1429f53e3d4bfb82b4b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024082649-shading-anyhow-2d3e@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
e3e4bf58bad1 ("drm/amdgpu/sdma5.2: limit wptr workaround to sdma 5.2.1")
a03ebf116303 ("drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell")
94b1e028e15c ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e3e4bf58bad1576ac732a1429f53e3d4bfb82b4b Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher(a)amd.com>
Date: Wed, 14 Aug 2024 10:28:24 -0400
Subject: [PATCH] drm/amdgpu/sdma5.2: limit wptr workaround to sdma 5.2.1
The workaround seems to cause stability issues on other
SDMA 5.2.x IPs.
Fixes: a03ebf116303 ("drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3556
Acked-by: Ruijing Dong <ruijing.dong(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
(cherry picked from commit 2dc3851ef7d9c5439ea8e9623fc36878f3b40649)
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index af1e90159ce3..2e72d445415f 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -176,14 +176,16 @@ static void sdma_v5_2_ring_set_wptr(struct amdgpu_ring *ring)
DRM_DEBUG("calling WDOORBELL64(0x%08x, 0x%016llx)\n",
ring->doorbell_index, ring->wptr << 2);
WDOORBELL64(ring->doorbell_index, ring->wptr << 2);
- /* SDMA seems to miss doorbells sometimes when powergating kicks in.
- * Updating the wptr directly will wake it. This is only safe because
- * we disallow gfxoff in begin_use() and then allow it again in end_use().
- */
- WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR),
- lower_32_bits(ring->wptr << 2));
- WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR_HI),
- upper_32_bits(ring->wptr << 2));
+ if (amdgpu_ip_version(adev, SDMA0_HWIP, 0) == IP_VERSION(5, 2, 1)) {
+ /* SDMA seems to miss doorbells sometimes when powergating kicks in.
+ * Updating the wptr directly will wake it. This is only safe because
+ * we disallow gfxoff in begin_use() and then allow it again in end_use().
+ */
+ WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR),
+ lower_32_bits(ring->wptr << 2));
+ WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR_HI),
+ upper_32_bits(ring->wptr << 2));
+ }
} else {
DRM_DEBUG("Not using doorbell -- "
"mmSDMA%i_GFX_RB_WPTR == 0x%08x "
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x e3e4bf58bad1576ac732a1429f53e3d4bfb82b4b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024082648-curry-rack-be6b@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
e3e4bf58bad1 ("drm/amdgpu/sdma5.2: limit wptr workaround to sdma 5.2.1")
a03ebf116303 ("drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell")
94b1e028e15c ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e3e4bf58bad1576ac732a1429f53e3d4bfb82b4b Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher(a)amd.com>
Date: Wed, 14 Aug 2024 10:28:24 -0400
Subject: [PATCH] drm/amdgpu/sdma5.2: limit wptr workaround to sdma 5.2.1
The workaround seems to cause stability issues on other
SDMA 5.2.x IPs.
Fixes: a03ebf116303 ("drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3556
Acked-by: Ruijing Dong <ruijing.dong(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
(cherry picked from commit 2dc3851ef7d9c5439ea8e9623fc36878f3b40649)
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index af1e90159ce3..2e72d445415f 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -176,14 +176,16 @@ static void sdma_v5_2_ring_set_wptr(struct amdgpu_ring *ring)
DRM_DEBUG("calling WDOORBELL64(0x%08x, 0x%016llx)\n",
ring->doorbell_index, ring->wptr << 2);
WDOORBELL64(ring->doorbell_index, ring->wptr << 2);
- /* SDMA seems to miss doorbells sometimes when powergating kicks in.
- * Updating the wptr directly will wake it. This is only safe because
- * we disallow gfxoff in begin_use() and then allow it again in end_use().
- */
- WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR),
- lower_32_bits(ring->wptr << 2));
- WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR_HI),
- upper_32_bits(ring->wptr << 2));
+ if (amdgpu_ip_version(adev, SDMA0_HWIP, 0) == IP_VERSION(5, 2, 1)) {
+ /* SDMA seems to miss doorbells sometimes when powergating kicks in.
+ * Updating the wptr directly will wake it. This is only safe because
+ * we disallow gfxoff in begin_use() and then allow it again in end_use().
+ */
+ WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR),
+ lower_32_bits(ring->wptr << 2));
+ WREG32(sdma_v5_2_get_reg_offset(adev, ring->me, mmSDMA0_GFX_RB_WPTR_HI),
+ upper_32_bits(ring->wptr << 2));
+ }
} else {
DRM_DEBUG("Not using doorbell -- "
"mmSDMA%i_GFX_RB_WPTR == 0x%08x "