Hi Bjorn et al.
this series addresses a few issues that have come up with the helper
function that enables Atomic Op Requests to be initiated by PCI
enpoints:
A. Most in-tree users of this helper use it incorrectly [0].
B. On s390, Atomic Op Requests are enabled, although the helper
cannot know whether the root port is really supporting them.
C. Loop control in the helper function does not guarantee that a root
port's capabilities are ever checked against those requested by the
caller.
Address these issue with the following patches:
Patch 1: Make it harder to mis-use the enablement function,
Patch 2: Addresses issues B. and C.
I did test that issue B is fixed with these patches. Also, I verified
that Atomic Ops enablement on a Mellanox/Nvidia ConnectX-6 adapter
plugged straight into the root port of a x86 system still gets AtomicOp
Requests enabled. However, I did not test this with any PCIe switches
between root port and endpoint.
Ideally, both patches would be incorporated immediately, so we could
start correcting the mis-uses in the device drivers. I don't know of any
complaints when using Atomic Ops on devices where the driver is
mis-using the helper. Patch 2 however, is fixing an obseved issue.
[0]: https://lore.kernel.org/all/fbe34de16f5c0bf25a16f9819a57fdd81e5bb08c.camel@…
[1]: https://lore.kernel.org/all/20251105-mlxatomics-v1-0-10c71649e08d@linux.ibm…
Signed-off-by: Gerd Bayer <gbayer(a)linux.ibm.com>
---
Changes in v2:
- rebase to 6.19-rc1
- otherwise unchanged to v1
- Link to v1: https://lore.kernel.org/r/20251110-fix_pciatops-v1-0-edc58a57b62e@linux.ibm…
---
Gerd Bayer (2):
PCI: AtomicOps: Define valid root port capabilities
PCI: AtomicOps: Fix logic in enable function
drivers/pci/pci.c | 43 +++++++++++++++++++++----------------------
include/uapi/linux/pci_regs.h | 8 ++++++++
2 files changed, 29 insertions(+), 22 deletions(-)
---
base-commit: 40fbbd64bba6c6e7a72885d2f59b6a3be9991eeb
change-id: 20251106-fix_pciatops-7e8608eccb03
Best regards,
--
Gerd Bayer <gbayer(a)linux.ibm.com>
When fsl_edma_alloc_chan_resources() fails after clk_prepare_enable(),
the error paths only free IRQs and destroy the TCD pool, but forget to
call clk_disable_unprepare(). This causes the channel clock to remain
enabled, leaking power and resources.
Fix it by disabling the channel clock in the error unwind path.
Fixes: d8d4355861d8 ("dmaengine: fsl-edma: add i.MX8ULP edma support")
Cc: stable(a)vger.kernel.org
Suggested-by: Frank Li <Frank.Li(a)nxp.com>
Signed-off-by: Zhen Ni <zhen.ni(a)easystack.cn>
---
Changes in v2:
- Remove FSL_EDMA_DRV_HAS_CHCLK check
Changes in v3:
- Remove cleanup
Changes in v4:
- Re-send as a new thread
---
drivers/dma/fsl-edma-common.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
index 4976d7dde080..11655dcc4d6c 100644
--- a/drivers/dma/fsl-edma-common.c
+++ b/drivers/dma/fsl-edma-common.c
@@ -852,6 +852,7 @@ int fsl_edma_alloc_chan_resources(struct dma_chan *chan)
free_irq(fsl_chan->txirq, fsl_chan);
err_txirq:
dma_pool_destroy(fsl_chan->tcd_pool);
+ clk_disable_unprepare(fsl_chan->clk);
return ret;
}
--
2.20.1
After an innocuous optimization change in clang-22, allmodconfig (which
enables CONFIG_KASAN and CONFIG_WERROR) breaks with:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1724:6: error: stack frame size (3144) exceeds limit (3072) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
1724 | void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
| ^
With clang-21, this function was already pretty close to the existing
limit of 3072 bytes.
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1724:6: error: stack frame size (2904) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
1724 | void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
| ^
A similar situation occurred in dml2, which was resolved by
commit e4479aecf658 ("drm/amd/display: Increase sanitizer frame larger
than limit when compile testing with clang") by increasing the limit for
clang when compile testing with certain sanitizer enabled, so that
allmodconfig (an easy testing target) continues to work.
Apply that same change to the dml folder to clear up the warning for
allmodconfig, unbreaking the build.
Cc: stable(a)vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2135
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
drivers/gpu/drm/amd/display/dc/dml/Makefile | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index b357683b4255..268b5fbdb48b 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -30,7 +30,11 @@ dml_rcflags := $(CC_FLAGS_NO_FPU)
ifneq ($(CONFIG_FRAME_WARN),0)
ifeq ($(filter y,$(CONFIG_KASAN)$(CONFIG_KCSAN)),y)
- frame_warn_limit := 3072
+ ifeq ($(CONFIG_CC_IS_CLANG)$(CONFIG_COMPILE_TEST),yy)
+ frame_warn_limit := 4096
+ else
+ frame_warn_limit := 3072
+ endif
else
frame_warn_limit := 2048
endif
---
base-commit: f24e96d69f5b9eb0f3b9c49e53c385c50729edfd
change-id: 20251213-dml-bump-frame-warn-clang-sanitizers-0a34fc916aec
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
The recent refactoring of where runtime PM is enabled done in commit
f1eb4e792bb1 ("spi: spi-cadence-quadspi: Enable pm runtime earlier to
avoid imbalance") made the fact that when we do a pm_runtime_disable()
in the error paths of probe() we can trigger a runtime disable which in
turn results in duplicate clock disables. This is particularly likely
to happen when there is missing or broken DT description for the flashes
attached to the controller.
Early on in the probe function we do a pm_runtime_get_noresume() since
the probe function leaves the device in a powered up state but in the
error path we can't assume that PM is enabled so we also manually
disable everything, including clocks. This means that when runtime PM is
active both it and the probe function release the same reference to the
main clock for the IP, triggering warnings from the clock subsystem:
[ 8.693719] clk:75:7 already disabled
[ 8.693791] WARNING: CPU: 1 PID: 185 at /usr/src/kernel/drivers/clk/clk.c:1188 clk_core_disable+0xa0/0xb
...
[ 8.694261] clk_core_disable+0xa0/0xb4 (P)
[ 8.694272] clk_disable+0x38/0x60
[ 8.694283] cqspi_probe+0x7c8/0xc5c [spi_cadence_quadspi]
[ 8.694309] platform_probe+0x5c/0xa4
Dealing with this issue properly is complicated by the fact that we
don't know if runtime PM is active so can't tell if it will disable the
clocks or not. We can, however, sidestep the issue for the flash
descriptions by moving their parsing to when we parse the controller
properties which also save us doing a bunch of setup which can never be
used so let's do that.
Reported-by: Francesco Dolcini <francesco(a)dolcini.it>
Closes: https://lore.kernel.org/r/20251201072844.GA6785@francesco-nb
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
Changes in v2:
- Switch to moving the DT parsing earlier so we avoid triggering the
clock referencing problems.
- Link to v1: https://patch.msgid.link/20251202-spi-cadence-qspi-runtime-pm-imbalance-v1-…
---
drivers/spi/spi-cadence-quadspi.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index af6d050da1c8..bdbeef05cd72 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -1845,6 +1845,12 @@ static int cqspi_probe(struct platform_device *pdev)
return -ENODEV;
}
+ ret = cqspi_setup_flash(cqspi);
+ if (ret) {
+ dev_err(dev, "failed to setup flash parameters %d\n", ret);
+ return ret;
+ }
+
/* Obtain QSPI clock. */
cqspi->clk = devm_clk_get(dev, NULL);
if (IS_ERR(cqspi->clk)) {
@@ -1988,12 +1994,6 @@ static int cqspi_probe(struct platform_device *pdev)
pm_runtime_get_noresume(dev);
}
- ret = cqspi_setup_flash(cqspi);
- if (ret) {
- dev_err(dev, "failed to setup flash parameters %d\n", ret);
- goto probe_setup_failed;
- }
-
host->num_chipselect = cqspi->num_chipselect;
if (ddata && (ddata->quirks & CQSPI_SUPPORT_DEVICE_RESET))
---
base-commit: cebdea5fc60642a39a76c237257a7e6662336006
change-id: 20251202-spi-cadence-qspi-runtime-pm-imbalance-657740cf7eae
Best regards,
--
Mark Brown <broonie(a)kernel.org>
This reverts commit 25decf0469d4c91d90aa2e28d996aed276bfc622.
This software node change doesn't actually fix any current issues
with the kernel, it is an improvement to the lookup process rather
than fixing a live bug. It also causes a couple of regressions with
shipping laptops, which relied on the label based lookup.
There is a fix for the regressions in mainline, the first 5 patches
of [1]. However, those patches are fairly substantial changes and
given the patch causing the regression doesn't actually fix a bug
it seems better to just revert it in stable.
CC: stable(a)vger.kernel.org # 6.18
Link: https://lore.kernel.org/linux-sound/20251120-reset-gpios-swnodes-v7-0-a1004… [1]
Link: https://lore.kernel.org/stable/20251125102924.3612459-1-ckeepax@opensource.… [2]
Closes: https://github.com/thesofproject/linux/issues/5599
Closes: https://github.com/thesofproject/linux/issues/5603
Acked-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
Signed-off-by: Charles Keepax <ckeepax(a)opensource.cirrus.com>
---
This fix for the software node lookups is also required on 6.18 stable,
see the discussion for 6.12/6.17 in [2] for why we are doing a revert
rather than backporting the other fixes. The "full" fixes are merged in
6.19 so this should be the last kernel we need to push this revert onto.
Thanks,
Charles
drivers/gpio/gpiolib-swnode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpio/gpiolib-swnode.c b/drivers/gpio/gpiolib-swnode.c
index e3806db1c0e07..f21dbc28cf2c8 100644
--- a/drivers/gpio/gpiolib-swnode.c
+++ b/drivers/gpio/gpiolib-swnode.c
@@ -41,7 +41,7 @@ static struct gpio_device *swnode_get_gpio_device(struct fwnode_handle *fwnode)
!strcmp(gdev_node->name, GPIOLIB_SWNODE_UNDEFINED_NAME))
return ERR_PTR(-ENOENT);
- gdev = gpio_device_find_by_fwnode(fwnode);
+ gdev = gpio_device_find_by_label(gdev_node->name);
return gdev ?: ERR_PTR(-EPROBE_DEFER);
}
--
2.47.3
The driver_override_show() function reads the driver_override string
without holding the device_lock. However, driver_override_store() uses
driver_set_override(), which modifies and frees the string while holding
the device_lock.
This can result in a concurrent use-after-free if the string is freed
by the store function while being read by the show function.
Fix this by holding the device_lock around the read operation.
Fixes: 1f86a00c1159 ("bus/fsl-mc: add support for 'driver_override' in the mc-bus")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gui-Dong Han <hanguidong02(a)gmail.com>
---
I verified this with a stress test that continuously writes/reads the
attribute. It triggered KASAN and leaked bytes like a0 f4 81 9f a3 ff ff
(likely kernel pointers). Since driver_override is world-readable (0644),
this allows unprivileged users to leak kernel pointers and bypass KASLR.
Similar races were fixed in other buses (e.g., commits 9561475db680 and
91d44c1afc61). Currently, 9 of 11 buses handle this correctly; this patch
fixes one of the remaining two.
---
drivers/bus/fsl-mc/fsl-mc-bus.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 25845c04e562..a97baf2cbcdd 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -202,8 +202,12 @@ static ssize_t driver_override_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
+ ssize_t len;
- return sysfs_emit(buf, "%s\n", mc_dev->driver_override);
+ device_lock(dev);
+ len = sysfs_emit(buf, "%s\n", mc_dev->driver_override);
+ device_unlock(dev);
+ return len;
}
static DEVICE_ATTR_RW(driver_override);
--
2.43.0
In scmi_devm_notifier_unregister(), the notifier-block argument was ignored
and never passed to devres_release(). As a result, the function always
returned -ENOENT and failed to unregister the notifier.
Drivers that depend on this helper for teardown could therefore hit
unexpected failures, including kernel panics.
Commit 264a2c520628 ("firmware: arm_scmi: Simplify scmi_devm_notifier_unregister")
removed the faulty code path during refactoring and hence this fix is not
required upstream.
Cc: <stable(a)vger.kernel.org> # 5.15.x, 6.1.x, and 6.6.x
Fixes: 5ad3d1cf7d34 ("firmware: arm_scmi: Introduce new devres notification ops")
Reviewed-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Reviewed-by: Cristian Marussi <cristian.marussi(a)arm.com>
Signed-off-by: Amitai Gottlieb <amitaig(a)hailo.ai>
---
v2:
* changed the wording of commit-message after suggestions made
by Sudeep Holla <sudeep.holla(a)arm.com>
drivers/firmware/arm_scmi/notify.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/firmware/arm_scmi/notify.c b/drivers/firmware/arm_scmi/notify.c
index 0efd20cd9d69..4782b115e6ec 100644
--- a/drivers/firmware/arm_scmi/notify.c
+++ b/drivers/firmware/arm_scmi/notify.c
@@ -1539,6 +1539,7 @@ static int scmi_devm_notifier_unregister(struct scmi_device *sdev,
dres.handle = sdev->handle;
dres.proto_id = proto_id;
dres.evt_id = evt_id;
+ dres.nb = nb;
if (src_id) {
dres.__src_id = *src_id;
dres.src_id = &dres.__src_id;
--
2.34.1
The sysfs buffer passed to alarms_store() is allocated with 'size + 1'
bytes and a NUL terminator is appended. However, the 'size' argument
does not account for this extra byte. The original code then allocated
'size' bytes and used strcpy() to copy 'buf', which always writes one
byte past the allocated buffer since strcpy() copies until the NUL
terminator at index 'size'.
Fix this by parsing the 'buf' parameter directly using simple_strtoll()
without allocating any intermediate memory or string copying. This
removes the overflow while simplifying the code.
Cc: stable(a)vger.kernel.org
Fixes: e2c94d6f5720 ("w1_therm: adding alarm sysfs entry")
Signed-off-by: Thorsten Blum <thorsten.blum(a)linux.dev>
---
Compile-tested only.
Changes in v4:
- Use simple_strtoll because kstrtoint also parses long long internally
- Return -ERANGE in addition to -EINVAL to match kstrtoint's behavior
- Remove any changes unrelated to fixing the buffer overflow (Krzysztof)
while maintaining the same behavior and return values as before
- Link to v3: https://lore.kernel.org/lkml/20251030155614.447905-1-thorsten.blum@linux.de…
Changes in v3:
- Add integer range check for 'temp' to match kstrtoint() behavior
- Explicitly cast 'temp' to int when calling int_to_short()
- Link to v2: https://lore.kernel.org/lkml/20251029130045.70127-2-thorsten.blum@linux.dev/
Changes in v2:
- Fix buffer overflow instead of truncating the copy using strscpy()
- Parse buffer directly using simple_strtol() as suggested by David
- Update patch subject and description
- Link to v1: https://lore.kernel.org/lkml/20251017170047.114224-2-thorsten.blum@linux.de…
---
drivers/w1/slaves/w1_therm.c | 64 ++++++++++++------------------------
1 file changed, 21 insertions(+), 43 deletions(-)
diff --git a/drivers/w1/slaves/w1_therm.c b/drivers/w1/slaves/w1_therm.c
index 9ccedb3264fb..5707fa34e804 100644
--- a/drivers/w1/slaves/w1_therm.c
+++ b/drivers/w1/slaves/w1_therm.c
@@ -1836,55 +1836,36 @@ static ssize_t alarms_store(struct device *device,
struct w1_slave *sl = dev_to_w1_slave(device);
struct therm_info info;
u8 new_config_register[3]; /* array of data to be written */
- int temp, ret;
- char *token = NULL;
+ long long temp;
+ int ret = 0;
s8 tl, th; /* 1 byte per value + temp ring order */
- char *p_args, *orig;
-
- p_args = orig = kmalloc(size, GFP_KERNEL);
- /* Safe string copys as buf is const */
- if (!p_args) {
- dev_warn(device,
- "%s: error unable to allocate memory %d\n",
- __func__, -ENOMEM);
- return size;
- }
- strcpy(p_args, buf);
-
- /* Split string using space char */
- token = strsep(&p_args, " ");
-
- if (!token) {
- dev_info(device,
- "%s: error parsing args %d\n", __func__, -EINVAL);
- goto free_m;
- }
-
- /* Convert 1st entry to int */
- ret = kstrtoint (token, 10, &temp);
+ const char *p = buf;
+ char *endp;
+
+ temp = simple_strtoll(p, &endp, 10);
+ if (p == endp || *endp != ' ')
+ ret = -EINVAL;
+ else if (temp < INT_MIN || temp > INT_MAX)
+ ret = -ERANGE;
if (ret) {
dev_info(device,
"%s: error parsing args %d\n", __func__, ret);
- goto free_m;
+ goto err;
}
tl = int_to_short(temp);
- /* Split string using space char */
- token = strsep(&p_args, " ");
- if (!token) {
- dev_info(device,
- "%s: error parsing args %d\n", __func__, -EINVAL);
- goto free_m;
- }
- /* Convert 2nd entry to int */
- ret = kstrtoint (token, 10, &temp);
+ p = endp + 1;
+ temp = simple_strtoll(p, &endp, 10);
+ if (p == endp)
+ ret = -EINVAL;
+ else if (temp < INT_MIN || temp > INT_MAX)
+ ret = -ERANGE;
if (ret) {
dev_info(device,
"%s: error parsing args %d\n", __func__, ret);
- goto free_m;
+ goto err;
}
-
/* Prepare to cast to short by eliminating out of range values */
th = int_to_short(temp);
@@ -1905,7 +1886,7 @@ static ssize_t alarms_store(struct device *device,
dev_info(device,
"%s: error reading from the slave device %d\n",
__func__, ret);
- goto free_m;
+ goto err;
}
/* Write data in the device RAM */
@@ -1913,7 +1894,7 @@ static ssize_t alarms_store(struct device *device,
dev_info(device,
"%s: Device not supported by the driver %d\n",
__func__, -ENODEV);
- goto free_m;
+ goto err;
}
ret = SLAVE_SPECIFIC_FUNC(sl)->write_data(sl, new_config_register);
@@ -1922,10 +1903,7 @@ static ssize_t alarms_store(struct device *device,
"%s: error writing to the slave device %d\n",
__func__, ret);
-free_m:
- /* free allocated memory */
- kfree(orig);
-
+err:
return size;
}
--
2.51.1