The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x b04f0d89e880bc2cca6a5c73cf287082c91878da
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025052402-pantomime-relative-1605@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b04f0d89e880bc2cca6a5c73cf287082c91878da Mon Sep 17 00:00:00 2001
From: Gabor Juhos <j4g8y7(a)gmail.com>
Date: Fri, 9 May 2025 15:48:52 +0200
Subject: [PATCH] arm64: dts: marvell: uDPU: define pinctrl state for alarm
LEDs
The two alarm LEDs of on the uDPU board are stopped working since
commit 78efa53e715e ("leds: Init leds class earlier").
The LEDs are driven by the GPIO{15,16} pins of the North Bridge
GPIO controller. These pins are part of the 'spi_quad' pin group
for which the 'spi' function is selected via the default pinctrl
state of the 'spi' node. This is wrong however, since in order to
allow controlling the LEDs, the pins should use the 'gpio' function.
Before the commit mentined above, the 'spi' function is selected
first by the pinctrl core before probing the spi driver, but then
it gets overridden to 'gpio' implicitly via the
devm_gpiod_get_index_optional() call from the 'leds-gpio' driver.
After the commit, the LED subsystem gets initialized before the
SPI subsystem, so the function of the pin group remains 'spi'
which in turn prevents controlling of the LEDs.
Despite the change of the initialization order, the root cause is
that the pinctrl state definition is wrong since its initial commit
0d45062cfc89 ("arm64: dts: marvell: Add device tree for uDPU board"),
To fix the problem, override the function in the 'spi_quad_pins'
node to 'gpio' and move the pinctrl state definition from the
'spi' node into the 'leds' node.
Cc: stable(a)vger.kernel.org # needs adjustment for < 6.1
Fixes: 0d45062cfc89 ("arm64: dts: marvell: Add device tree for uDPU board")
Signed-off-by: Gabor Juhos <j4g8y7(a)gmail.com>
Signed-off-by: Imre Kaloz <kaloz(a)openwrt.org>
Signed-off-by: Gregory CLEMENT <gregory.clement(a)bootlin.com>
diff --git a/arch/arm64/boot/dts/marvell/armada-3720-uDPU.dtsi b/arch/arm64/boot/dts/marvell/armada-3720-uDPU.dtsi
index 3a9b6907185d..242820845707 100644
--- a/arch/arm64/boot/dts/marvell/armada-3720-uDPU.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-3720-uDPU.dtsi
@@ -26,6 +26,8 @@ memory@0 {
leds {
compatible = "gpio-leds";
+ pinctrl-names = "default";
+ pinctrl-0 = <&spi_quad_pins>;
led-power1 {
label = "udpu:green:power";
@@ -82,8 +84,6 @@ &sdhci0 {
&spi0 {
status = "okay";
- pinctrl-names = "default";
- pinctrl-0 = <&spi_quad_pins>;
flash@0 {
compatible = "jedec,spi-nor";
@@ -108,6 +108,10 @@ partition@180000 {
};
};
+&spi_quad_pins {
+ function = "gpio";
+};
+
&pinctrl_nb {
i2c2_recovery_pins: i2c2-recovery-pins {
groups = "i2c2";
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x b04f0d89e880bc2cca6a5c73cf287082c91878da
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025052401-unbiased-designate-2089@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b04f0d89e880bc2cca6a5c73cf287082c91878da Mon Sep 17 00:00:00 2001
From: Gabor Juhos <j4g8y7(a)gmail.com>
Date: Fri, 9 May 2025 15:48:52 +0200
Subject: [PATCH] arm64: dts: marvell: uDPU: define pinctrl state for alarm
LEDs
The two alarm LEDs of on the uDPU board are stopped working since
commit 78efa53e715e ("leds: Init leds class earlier").
The LEDs are driven by the GPIO{15,16} pins of the North Bridge
GPIO controller. These pins are part of the 'spi_quad' pin group
for which the 'spi' function is selected via the default pinctrl
state of the 'spi' node. This is wrong however, since in order to
allow controlling the LEDs, the pins should use the 'gpio' function.
Before the commit mentined above, the 'spi' function is selected
first by the pinctrl core before probing the spi driver, but then
it gets overridden to 'gpio' implicitly via the
devm_gpiod_get_index_optional() call from the 'leds-gpio' driver.
After the commit, the LED subsystem gets initialized before the
SPI subsystem, so the function of the pin group remains 'spi'
which in turn prevents controlling of the LEDs.
Despite the change of the initialization order, the root cause is
that the pinctrl state definition is wrong since its initial commit
0d45062cfc89 ("arm64: dts: marvell: Add device tree for uDPU board"),
To fix the problem, override the function in the 'spi_quad_pins'
node to 'gpio' and move the pinctrl state definition from the
'spi' node into the 'leds' node.
Cc: stable(a)vger.kernel.org # needs adjustment for < 6.1
Fixes: 0d45062cfc89 ("arm64: dts: marvell: Add device tree for uDPU board")
Signed-off-by: Gabor Juhos <j4g8y7(a)gmail.com>
Signed-off-by: Imre Kaloz <kaloz(a)openwrt.org>
Signed-off-by: Gregory CLEMENT <gregory.clement(a)bootlin.com>
diff --git a/arch/arm64/boot/dts/marvell/armada-3720-uDPU.dtsi b/arch/arm64/boot/dts/marvell/armada-3720-uDPU.dtsi
index 3a9b6907185d..242820845707 100644
--- a/arch/arm64/boot/dts/marvell/armada-3720-uDPU.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-3720-uDPU.dtsi
@@ -26,6 +26,8 @@ memory@0 {
leds {
compatible = "gpio-leds";
+ pinctrl-names = "default";
+ pinctrl-0 = <&spi_quad_pins>;
led-power1 {
label = "udpu:green:power";
@@ -82,8 +84,6 @@ &sdhci0 {
&spi0 {
status = "okay";
- pinctrl-names = "default";
- pinctrl-0 = <&spi_quad_pins>;
flash@0 {
compatible = "jedec,spi-nor";
@@ -108,6 +108,10 @@ partition@180000 {
};
};
+&spi_quad_pins {
+ function = "gpio";
+};
+
&pinctrl_nb {
i2c2_recovery_pins: i2c2-recovery-pins {
groups = "i2c2";
If krealloc_array() fails in iort_rmr_alloc_sids(), the function returns
NULL but does not free the original 'sids' allocation. This results in a
memory leak since the caller overwrites the original pointer with the
NULL return value.
Fixes: 491cf4a6735a ("ACPI/IORT: Add support to retrieve IORT RMR reserved regions")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
This follows the same pattern as the fix in commit 06615967d488
("bpf, verifier: Fix memory leak in array reallocation for stack state").
---
drivers/acpi/arm64/iort.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 98759d6199d3..65f0f56ad753 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -937,8 +937,10 @@ static u32 *iort_rmr_alloc_sids(u32 *sids, u32 count, u32 id_start,
new_sids = krealloc_array(sids, count + new_count,
sizeof(*new_sids), GFP_KERNEL);
- if (!new_sids)
+ if (!new_sids) {
+ kfree(sids);
return NULL;
+ }
for (i = count; i < total_count; i++)
new_sids[i] = id_start++;
--
2.39.5 (Apple Git-154)
commit d34d6feaf4a76833effcec0b148b65946b04cde8 upstream.
Change the AUX DPCD probe address to DP_TRAINING_PATTERN_SET. Using
DP_DPCD_REV for this is not compliant with the DP Standard and it leads
to link training failures at least on a DP2.0 docking station when using
UHBR link rates.
This patch is a revert of commit 944e732be9c3 ("drm/dp: Change AUX DPCD
probe address from DPCD_REV to LANE0_1_STATUS") and the corresponding
fix for commit 05981233cf2e ("Revert "drm/dp: Change AUX DPCD probe
address from DPCD_REV to LANE0_1_STATUS") in the v6.16.y tree.
This change is only meant to be applied in the v6.16.y tree, not in
earlier stable trees.
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Sasha Levin <sashal(a)kernel.org>
Cc: dri-devel(a)lists.freedesktop.org
Cc: stable(a)vger.kernel.org # v6.16
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
---
drivers/gpu/drm/display/drm_dp_helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/display/drm_dp_helper.c b/drivers/gpu/drm/display/drm_dp_helper.c
index f2a6559a27100..ea78c6c8ca7a6 100644
--- a/drivers/gpu/drm/display/drm_dp_helper.c
+++ b/drivers/gpu/drm/display/drm_dp_helper.c
@@ -725,7 +725,7 @@ ssize_t drm_dp_dpcd_read(struct drm_dp_aux *aux, unsigned int offset,
* monitor doesn't power down exactly after the throw away read.
*/
if (!aux->is_remote) {
- ret = drm_dp_dpcd_probe(aux, DP_DPCD_REV);
+ ret = drm_dp_dpcd_probe(aux, DP_TRAINING_PATTERN_SET);
if (ret < 0)
return ret;
}
--
2.49.1
Hi Carlos,
Please pull this branch with changes for xfs.
As usual, I did a test-merge with the main upstream branch as of a few
minutes ago, and didn't see any conflicts. Please let me know if you
encounter any problems.
--D
The following changes since commit b320789d6883cc00ac78ce83bccbfe7ed58afcf0:
Linux 6.17-rc4 (2025-08-31 15:33:07 -0700)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git tags/fix-scrub-reap-calculations_2025-09-05
for you to fetch changes up to 07c34f8cef69cb8eeef69c18d6cf0c04fbee3cb3:
xfs: use deferred reaping for data device cow extents (2025-09-05 08:48:23 -0700)
----------------------------------------------------------------
xfs: improve online repair reap calculations [6.18 v2 1/2]
A few months ago, the multi-fsblock untorn writes patchset added a bunch
of log intent item helper functions to estimate the number of intent
items that could be added to a particular transaction. Those helpers
enabled us to compute a safe upper bound on the number of blocks that
could be written in an untorn fashion with filesystem-provided out of
place writes.
Currently, the online fsck code employs static limits on the number of
intent items that it's willing to accrue to a single transaction when
it's trying to reap what it thinks are the old blocks from a corrupt
structure. There have been no problems reported with this approach
after years of testing, but static limits are scary and gross because
overestimating the intent item limit could result in transaction
overflows and dead filesystems; and underestimating causes unnecessary
overhead.
This series uses the new log intent item size helpers to estimate the
limits dynamically based on worst-case per-block repair work vs. the
size of the scrub transaction. After several months of testing this,
there don't seem to be any problems here either.
v2: rearrange patches, add review tags
This has been running on the djcloud for months with no problems. Enjoy!
Signed-off-by: "Darrick J. Wong" <djwong(a)kernel.org>
----------------------------------------------------------------
Darrick J. Wong (9):
xfs: use deferred intent items for reaping crosslinked blocks
xfs: prepare reaping code for dynamic limits
xfs: convert the ifork reap code to use xreap_state
xfs: compute per-AG extent reap limits dynamically
xfs: compute data device CoW staging extent reap limits dynamically
xfs: compute realtime device CoW staging extent reap limits dynamically
xfs: compute file mapping reap limits dynamically
xfs: remove static reap limits from repair.h
xfs: use deferred reaping for data device cow extents
fs/xfs/scrub/repair.h | 8 -
fs/xfs/scrub/trace.h | 45 ++++
fs/xfs/scrub/newbt.c | 9 +
fs/xfs/scrub/reap.c | 622 ++++++++++++++++++++++++++++++++++++++++----------
fs/xfs/scrub/trace.c | 1 +
5 files changed, 554 insertions(+), 131 deletions(-)
Hi all,
A few months ago, the multi-fsblock untorn writes patchset added a bunch
of log intent item helper functions to estimate the number of intent
items that could be added to a particular transaction. Those helpers
enabled us to compute a safe upper bound on the number of blocks that
could be written in an untorn fashion with filesystem-provided out of
place writes.
Currently, the online fsck code employs static limits on the number of
intent items that it's willing to accrue to a single transaction when
it's trying to reap what it thinks are the old blocks from a corrupt
structure. There have been no problems reported with this approach
after years of testing, but static limits are scary and gross because
overestimating the intent item limit could result in transaction
overflows and dead filesystems; and underestimating causes unnecessary
overhead.
This series uses the new log intent item size helpers to estimate the
limits dynamically based on worst-case per-block repair work vs. the
size of the scrub transaction. After several months of testing this,
there don't seem to be any problems here either.
v2: rearrange patches, add review tags
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
This has been running on the djcloud for months with no problems. Enjoy!
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fi…
---
Commits in this patchset:
* xfs: use deferred intent items for reaping crosslinked blocks
* xfs: prepare reaping code for dynamic limits
* xfs: convert the ifork reap code to use xreap_state
* xfs: compute per-AG extent reap limits dynamically
* xfs: compute data device CoW staging extent reap limits dynamically
* xfs: compute realtime device CoW staging extent reap limits dynamically
* xfs: compute file mapping reap limits dynamically
* xfs: remove static reap limits from repair.h
* xfs: use deferred reaping for data device cow extents
---
fs/xfs/scrub/repair.h | 8 -
fs/xfs/scrub/trace.h | 45 ++++
fs/xfs/scrub/newbt.c | 9 +
fs/xfs/scrub/reap.c | 622 +++++++++++++++++++++++++++++++++++++++----------
fs/xfs/scrub/trace.c | 1
5 files changed, 554 insertions(+), 131 deletions(-)
In commit d0ceea662d45 ("x86/mm: Add _PAGE_NOPTISHADOW bit to avoid
updating userspace page tables") we started setting _PAGE_NOPTISHADOW (bit
58) in identity map PTEs created by kernel_ident_mapping_init. In
configure_5level_paging when transiting from 5-level paging to 4-level
paging we copy the first p4d to become our new (4-level) pgd by
dereferencing the first entry in the current pgd, but we only mask off the
low 12 bits in the pgd entry.
When kexecing, the previous kernel sets up an identity map using
kernel_ident_mapping_init before transiting to the new kernel, which means
the new kernel gets a pgd with entries with bit 58 set. Then we mask off
only the low 12 bits, resulting in a non-canonical address, and try to
copy from it, and fault.
Fixes: d0ceea662d45 ("x86/mm: Add _PAGE_NOPTISHADOW bit to avoid updating userspace page tables")
Signed-off-by: Eric Hagberg <ehagberg(a)janestreet.com>
Cc: stable(a)vger.kernel.org
---
arch/x86/boot/compressed/pgtable_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index bdd26050dff7..c785c5d54a11 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -180,7 +180,7 @@ asmlinkage void configure_5level_paging(struct boot_params *bp, void *pgtable)
* We cannot just point to the page table from trampoline as it
* may be above 4G.
*/
- src = *(unsigned long *)__native_read_cr3() & PAGE_MASK;
+ src = *(unsigned long *)__native_read_cr3() & PTE_PFN_MASK;
memcpy(trampoline_32bit, (void *)src, PAGE_SIZE);
}
--
2.43.7
Hello linux-stable-mirror,
Are you looking for a loan to either increase your activity or to
carry out a project. We offer Crypto Loans at 2-7% interest rate
with or without a credit check. We also pay 1% commission to
brokers, who introduce project owners for finance or other
opportunities.
Please get back to me if you are interested in more details.
Best Regards,
Lucy Seinfield
Assistant Secretary
From: Sumit Kumar <sumk(a)qti.qualcomm.com>
The current implementation of mhi_ep_read_channel, in case of chained
transactions, assumes the End of Transfer(EOT) bit is received with the
doorbell. As a result, it may incorrectly advance mhi_chan->rd_offset
beyond wr_offset during host-to-device transfers when EOT has not yet
arrived. This can lead to access of unmapped host memory, causing
IOMMU faults and processing of stale TREs.
This change modifies the loop condition to ensure mhi_queue is not empty,
allowing the function to process only valid TREs up to the current write
pointer. This prevents premature reads and ensures safe traversal of
chained TREs.
Removed buf_left from the while loop condition to avoid exiting prematurely
before reading the ring completely.
Removed write_offset since it will always be zero because the new cache
buffer is allocated everytime.
Fixes: 5301258899773 ("bus: mhi: ep: Add support for reading from the host")
Cc: stable(a)vger.kernel.org
Co-developed-by: Akhil Vinod <akhvin(a)qti.qualcomm.com>
Signed-off-by: Akhil Vinod <akhvin(a)qti.qualcomm.com>
Signed-off-by: Sumit Kumar <sumk(a)qti.qualcomm.com>
---
Changes in v2:
- Use mhi_ep_queue_is_empty in while loop (Mani).
- Remove do while loop in mhi_ep_process_ch_ring (Mani).
- Remove buf_left, wr_offset, tr_done.
- Haven't added Reviewed-by as there is change in logic.
- Link to v1: https://lore.kernel.org/r/20250709-chained_transfer-v1-1-2326a4605c9c@quici…
---
drivers/bus/mhi/ep/main.c | 37 ++++++++++++-------------------------
1 file changed, 12 insertions(+), 25 deletions(-)
diff --git a/drivers/bus/mhi/ep/main.c b/drivers/bus/mhi/ep/main.c
index b3eafcf2a2c50d95e3efd3afb27038ecf55552a5..cdea24e9291959ae0a92487c1b9698dc8164d2f1 100644
--- a/drivers/bus/mhi/ep/main.c
+++ b/drivers/bus/mhi/ep/main.c
@@ -403,17 +403,13 @@ static int mhi_ep_read_channel(struct mhi_ep_cntrl *mhi_cntrl,
{
struct mhi_ep_chan *mhi_chan = &mhi_cntrl->mhi_chan[ring->ch_id];
struct device *dev = &mhi_cntrl->mhi_dev->dev;
- size_t tr_len, read_offset, write_offset;
+ size_t tr_len, read_offset;
struct mhi_ep_buf_info buf_info = {};
u32 len = MHI_EP_DEFAULT_MTU;
struct mhi_ring_element *el;
- bool tr_done = false;
void *buf_addr;
- u32 buf_left;
int ret;
- buf_left = len;
-
do {
/* Don't process the transfer ring if the channel is not in RUNNING state */
if (mhi_chan->state != MHI_CH_STATE_RUNNING) {
@@ -426,24 +422,23 @@ static int mhi_ep_read_channel(struct mhi_ep_cntrl *mhi_cntrl,
/* Check if there is data pending to be read from previous read operation */
if (mhi_chan->tre_bytes_left) {
dev_dbg(dev, "TRE bytes remaining: %u\n", mhi_chan->tre_bytes_left);
- tr_len = min(buf_left, mhi_chan->tre_bytes_left);
+ tr_len = min(len, mhi_chan->tre_bytes_left);
} else {
mhi_chan->tre_loc = MHI_TRE_DATA_GET_PTR(el);
mhi_chan->tre_size = MHI_TRE_DATA_GET_LEN(el);
mhi_chan->tre_bytes_left = mhi_chan->tre_size;
- tr_len = min(buf_left, mhi_chan->tre_size);
+ tr_len = min(len, mhi_chan->tre_size);
}
read_offset = mhi_chan->tre_size - mhi_chan->tre_bytes_left;
- write_offset = len - buf_left;
buf_addr = kmem_cache_zalloc(mhi_cntrl->tre_buf_cache, GFP_KERNEL);
if (!buf_addr)
return -ENOMEM;
buf_info.host_addr = mhi_chan->tre_loc + read_offset;
- buf_info.dev_addr = buf_addr + write_offset;
+ buf_info.dev_addr = buf_addr;
buf_info.size = tr_len;
buf_info.cb = mhi_ep_read_completion;
buf_info.cb_buf = buf_addr;
@@ -459,16 +454,12 @@ static int mhi_ep_read_channel(struct mhi_ep_cntrl *mhi_cntrl,
goto err_free_buf_addr;
}
- buf_left -= tr_len;
mhi_chan->tre_bytes_left -= tr_len;
- if (!mhi_chan->tre_bytes_left) {
- if (MHI_TRE_DATA_GET_IEOT(el))
- tr_done = true;
-
+ if (!mhi_chan->tre_bytes_left)
mhi_chan->rd_offset = (mhi_chan->rd_offset + 1) % ring->ring_size;
- }
- } while (buf_left && !tr_done);
+ /* Read until the some buffer is left or the ring becomes not empty */
+ } while (!mhi_ep_queue_is_empty(mhi_chan->mhi_dev, DMA_TO_DEVICE));
return 0;
@@ -502,15 +493,11 @@ static int mhi_ep_process_ch_ring(struct mhi_ep_ring *ring)
mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
} else {
/* UL channel */
- do {
- ret = mhi_ep_read_channel(mhi_cntrl, ring);
- if (ret < 0) {
- dev_err(&mhi_chan->mhi_dev->dev, "Failed to read channel\n");
- return ret;
- }
-
- /* Read until the ring becomes empty */
- } while (!mhi_ep_queue_is_empty(mhi_chan->mhi_dev, DMA_TO_DEVICE));
+ ret = mhi_ep_read_channel(mhi_cntrl, ring);
+ if (ret < 0) {
+ dev_err(&mhi_chan->mhi_dev->dev, "Failed to read channel\n");
+ return ret;
+ }
}
return 0;
---
base-commit: 4c06e63b92038fadb566b652ec3ec04e228931e8
change-id: 20250709-chained_transfer-0b95f8afa487
Best regards,
--
Sumit Kumar <quic_sumk(a)quicinc.com>
The ready event list of an epoll object is protected by read-write
semaphore:
- The consumer (waiter) acquires the write lock and takes items.
- the producer (waker) takes the read lock and adds items.
The point of this design is enabling epoll to scale well with large number
of producers, as multiple producers can hold the read lock at the same
time.
Unfortunately, this implementation may cause scheduling priority inversion
problem. Suppose the consumer has higher scheduling priority than the
producer. The consumer needs to acquire the write lock, but may be blocked
by the producer holding the read lock. Since read-write semaphore does not
support priority-boosting for the readers (even with CONFIG_PREEMPT_RT=y),
we have a case of priority inversion: a higher priority consumer is blocked
by a lower priority producer. This problem was reported in [1].
Furthermore, this could also cause stall problem, as described in [2].
Fix this problem by replacing rwlock with spinlock.
This reduces the event bandwidth, as the producers now have to contend with
each other for the spinlock. According to the benchmark from
https://github.com/rouming/test-tools/blob/master/stress-epoll.c:
On 12 x86 CPUs:
Before After Diff
threads events/ms events/ms
8 7162 4956 -31%
16 8733 5383 -38%
32 7968 5572 -30%
64 10652 5739 -46%
128 11236 5931 -47%
On 4 riscv CPUs:
Before After Diff
threads events/ms events/ms
8 2958 2833 -4%
16 3323 3097 -7%
32 3451 3240 -6%
64 3554 3178 -11%
128 3601 3235 -10%
Although the numbers look bad, it should be noted that this benchmark
creates multiple threads who do nothing except constantly generating new
epoll events, thus contention on the spinlock is high. For real workload,
the event rate is likely much lower, and the performance drop is not as
bad.
Using another benchmark (perf bench epoll wait) where spinlock contention
is lower, improvement is even observed on x86:
On 12 x86 CPUs:
Before: Averaged 110279 operations/sec (+- 1.09%), total secs = 8
After: Averaged 114577 operations/sec (+- 2.25%), total secs = 8
On 4 riscv CPUs:
Before: Averaged 175767 operations/sec (+- 0.62%), total secs = 8
After: Averaged 167396 operations/sec (+- 0.23%), total secs = 8
In conclusion, no one is likely to be upset over this change. After all,
spinlock was used originally for years, and the commit which converted to
rwlock didn't mention a real workload, just that the benchmark numbers are
nice.
This patch is not exactly the revert of commit a218cc491420 ("epoll: use
rwlock in order to reduce ep_poll_callback() contention"), because git
revert conflicts in some places which are not obvious on the resolution.
This patch is intended to be backported, therefore go with the obvious
approach:
- Replace rwlock_t with spinlock_t one to one
- Delete list_add_tail_lockless() and chain_epi_lockless(). These were
introduced to allow producers to concurrently add items to the list.
But now that spinlock no longer allows producers to touch the event
list concurrently, these two functions are not necessary anymore.
Fixes: a218cc491420 ("epoll: use rwlock in order to reduce ep_poll_callback() contention")
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Cc: stable(a)vger.kernel.org
---
fs/eventpoll.c | 139 +++++++++----------------------------------------
1 file changed, 26 insertions(+), 113 deletions(-)
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 0fbf5dfedb24..a171f7e7dacc 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -46,10 +46,10 @@
*
* 1) epnested_mutex (mutex)
* 2) ep->mtx (mutex)
- * 3) ep->lock (rwlock)
+ * 3) ep->lock (spinlock)
*
* The acquire order is the one listed above, from 1 to 3.
- * We need a rwlock (ep->lock) because we manipulate objects
+ * We need a spinlock (ep->lock) because we manipulate objects
* from inside the poll callback, that might be triggered from
* a wake_up() that in turn might be called from IRQ context.
* So we can't sleep inside the poll callback and hence we need
@@ -195,7 +195,7 @@ struct eventpoll {
struct list_head rdllist;
/* Lock which protects rdllist and ovflist */
- rwlock_t lock;
+ spinlock_t lock;
/* RB tree root used to store monitored fd structs */
struct rb_root_cached rbr;
@@ -740,10 +740,10 @@ static void ep_start_scan(struct eventpoll *ep, struct list_head *txlist)
* in a lockless way.
*/
lockdep_assert_irqs_enabled();
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
list_splice_init(&ep->rdllist, txlist);
WRITE_ONCE(ep->ovflist, NULL);
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
}
static void ep_done_scan(struct eventpoll *ep,
@@ -751,7 +751,7 @@ static void ep_done_scan(struct eventpoll *ep,
{
struct epitem *epi, *nepi;
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
/*
* During the time we spent inside the "sproc" callback, some
* other events might have been queued by the poll callback.
@@ -792,7 +792,7 @@ static void ep_done_scan(struct eventpoll *ep,
wake_up(&ep->wq);
}
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
}
static void ep_get(struct eventpoll *ep)
@@ -867,10 +867,10 @@ static bool __ep_remove(struct eventpoll *ep, struct epitem *epi, bool force)
rb_erase_cached(&epi->rbn, &ep->rbr);
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
if (ep_is_linked(epi))
list_del_init(&epi->rdllink);
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
wakeup_source_unregister(ep_wakeup_source(epi));
/*
@@ -1151,7 +1151,7 @@ static int ep_alloc(struct eventpoll **pep)
return -ENOMEM;
mutex_init(&ep->mtx);
- rwlock_init(&ep->lock);
+ spin_lock_init(&ep->lock);
init_waitqueue_head(&ep->wq);
init_waitqueue_head(&ep->poll_wait);
INIT_LIST_HEAD(&ep->rdllist);
@@ -1238,100 +1238,10 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
}
#endif /* CONFIG_KCMP */
-/*
- * Adds a new entry to the tail of the list in a lockless way, i.e.
- * multiple CPUs are allowed to call this function concurrently.
- *
- * Beware: it is necessary to prevent any other modifications of the
- * existing list until all changes are completed, in other words
- * concurrent list_add_tail_lockless() calls should be protected
- * with a read lock, where write lock acts as a barrier which
- * makes sure all list_add_tail_lockless() calls are fully
- * completed.
- *
- * Also an element can be locklessly added to the list only in one
- * direction i.e. either to the tail or to the head, otherwise
- * concurrent access will corrupt the list.
- *
- * Return: %false if element has been already added to the list, %true
- * otherwise.
- */
-static inline bool list_add_tail_lockless(struct list_head *new,
- struct list_head *head)
-{
- struct list_head *prev;
-
- /*
- * This is simple 'new->next = head' operation, but cmpxchg()
- * is used in order to detect that same element has been just
- * added to the list from another CPU: the winner observes
- * new->next == new.
- */
- if (!try_cmpxchg(&new->next, &new, head))
- return false;
-
- /*
- * Initially ->next of a new element must be updated with the head
- * (we are inserting to the tail) and only then pointers are atomically
- * exchanged. XCHG guarantees memory ordering, thus ->next should be
- * updated before pointers are actually swapped and pointers are
- * swapped before prev->next is updated.
- */
-
- prev = xchg(&head->prev, new);
-
- /*
- * It is safe to modify prev->next and new->prev, because a new element
- * is added only to the tail and new->next is updated before XCHG.
- */
-
- prev->next = new;
- new->prev = prev;
-
- return true;
-}
-
-/*
- * Chains a new epi entry to the tail of the ep->ovflist in a lockless way,
- * i.e. multiple CPUs are allowed to call this function concurrently.
- *
- * Return: %false if epi element has been already chained, %true otherwise.
- */
-static inline bool chain_epi_lockless(struct epitem *epi)
-{
- struct eventpoll *ep = epi->ep;
-
- /* Fast preliminary check */
- if (epi->next != EP_UNACTIVE_PTR)
- return false;
-
- /* Check that the same epi has not been just chained from another CPU */
- if (cmpxchg(&epi->next, EP_UNACTIVE_PTR, NULL) != EP_UNACTIVE_PTR)
- return false;
-
- /* Atomically exchange tail */
- epi->next = xchg(&ep->ovflist, epi);
-
- return true;
-}
-
/*
* This is the callback that is passed to the wait queue wakeup
* mechanism. It is called by the stored file descriptors when they
* have events to report.
- *
- * This callback takes a read lock in order not to contend with concurrent
- * events from another file descriptor, thus all modifications to ->rdllist
- * or ->ovflist are lockless. Read lock is paired with the write lock from
- * ep_start/done_scan(), which stops all list modifications and guarantees
- * that lists state is seen correctly.
- *
- * Another thing worth to mention is that ep_poll_callback() can be called
- * concurrently for the same @epi from different CPUs if poll table was inited
- * with several wait queues entries. Plural wakeup from different CPUs of a
- * single wait queue is serialized by wq.lock, but the case when multiple wait
- * queues are used should be detected accordingly. This is detected using
- * cmpxchg() operation.
*/
static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
{
@@ -1342,7 +1252,7 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
unsigned long flags;
int ewake = 0;
- read_lock_irqsave(&ep->lock, flags);
+ spin_lock_irqsave(&ep->lock, flags);
ep_set_busy_poll_napi_id(epi);
@@ -1371,12 +1281,15 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
* chained in ep->ovflist and requeued later on.
*/
if (READ_ONCE(ep->ovflist) != EP_UNACTIVE_PTR) {
- if (chain_epi_lockless(epi))
+ if (epi->next == EP_UNACTIVE_PTR) {
+ epi->next = READ_ONCE(ep->ovflist);
+ WRITE_ONCE(ep->ovflist, epi);
ep_pm_stay_awake_rcu(epi);
+ }
} else if (!ep_is_linked(epi)) {
/* In the usual case, add event to ready list. */
- if (list_add_tail_lockless(&epi->rdllink, &ep->rdllist))
- ep_pm_stay_awake_rcu(epi);
+ list_add_tail(&epi->rdllink, &ep->rdllist);
+ ep_pm_stay_awake_rcu(epi);
}
/*
@@ -1409,7 +1322,7 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
pwake++;
out_unlock:
- read_unlock_irqrestore(&ep->lock, flags);
+ spin_unlock_irqrestore(&ep->lock, flags);
/* We have to call this outside the lock */
if (pwake)
@@ -1744,7 +1657,7 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
}
/* We have to drop the new item inside our item list to keep track of it */
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
/* record NAPI ID of new item if present */
ep_set_busy_poll_napi_id(epi);
@@ -1761,7 +1674,7 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
pwake++;
}
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
/* We have to call this outside the lock */
if (pwake)
@@ -1825,7 +1738,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi,
* list, push it inside.
*/
if (ep_item_poll(epi, &pt, 1)) {
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
if (!ep_is_linked(epi)) {
list_add_tail(&epi->rdllink, &ep->rdllist);
ep_pm_stay_awake(epi);
@@ -1836,7 +1749,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi,
if (waitqueue_active(&ep->poll_wait))
pwake++;
}
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
}
/* We have to call this outside the lock */
@@ -2088,7 +2001,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
init_wait(&wait);
wait.func = ep_autoremove_wake_function;
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
/*
* Barrierless variant, waitqueue_active() is called under
* the same lock on wakeup ep_poll_callback() side, so it
@@ -2107,7 +2020,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
if (!eavail)
__add_wait_queue_exclusive(&ep->wq, &wait);
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
if (!eavail)
timed_out = !ep_schedule_timeout(to) ||
@@ -2123,7 +2036,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
eavail = 1;
if (!list_empty_careful(&wait.entry)) {
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
/*
* If the thread timed out and is not on the wait queue,
* it means that the thread was woken up after its
@@ -2134,7 +2047,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
if (timed_out)
eavail = list_empty(&wait.entry);
__remove_wait_queue(&ep->wq, &wait);
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
}
}
}
--
2.39.5
vidtv_channel_si_init() creates a temporary list (program, service, event)
and ownership of the memory itself is transferred to the PAT/SDT/EIT
tables through vidtv_psi_pat_program_assign(),
vidtv_psi_sdt_service_assign(), vidtv_psi_eit_event_assign().
The problem here is that the local pointer where the memory ownership
transfer was completed is not initialized to NULL. This causes the
vidtv_psi_pmt_create_sec_for_each_pat_entry() function to fail, and
in the flow that jumps to free_eit, the memory that was freed by
vidtv_psi_*_table_destroy() can be accessed again by
vidtv_psi_*_event_destroy() due to the uninitialized local pointer, so it
is freed once again.
Therefore, to prevent use-after-free and double-free vulnerability,
local pointers must be initialized to NULL when transferring memory
ownership.
Cc: <stable(a)vger.kernel.org>
Reported-by: syzbot+1d9c0edea5907af239e0(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1d9c0edea5907af239e0
Fixes: 3be8037960bc ("media: vidtv: add error checks")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
---
v3: Improved patch description wording
- Link to v2: https://lore.kernel.org/all/20250904054000.3848107-1-aha310510@gmail.com/
v2: Improved patch description wording and CC stable mailing list
- Link to v1: https://lore.kernel.org/all/20250822065849.1145572-1-aha310510@gmail.com/
---
drivers/media/test-drivers/vidtv/vidtv_channel.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/media/test-drivers/vidtv/vidtv_channel.c b/drivers/media/test-drivers/vidtv/vidtv_channel.c
index f3023e91b3eb..3541155c6fc6 100644
--- a/drivers/media/test-drivers/vidtv/vidtv_channel.c
+++ b/drivers/media/test-drivers/vidtv/vidtv_channel.c
@@ -461,12 +461,15 @@ int vidtv_channel_si_init(struct vidtv_mux *m)
/* assemble all programs and assign to PAT */
vidtv_psi_pat_program_assign(m->si.pat, programs);
+ programs = NULL;
/* assemble all services and assign to SDT */
vidtv_psi_sdt_service_assign(m->si.sdt, services);
+ services = NULL;
/* assemble all events and assign to EIT */
vidtv_psi_eit_event_assign(m->si.eit, events);
+ events = NULL;
m->si.pmt_secs = vidtv_psi_pmt_create_sec_for_each_pat_entry(m->si.pat,
m->pcr_pid);
--
zpci_get_iommu_ctrs() returns counter information to be reported as part
of device statistics; these counters are stored as part of the s390_domain.
The problem, however, is that the identity domain is not backed by an
s390_domain and so the conversion via to_s390_domain() yields a bad address
that is zero'd initially and read on-demand later via a sysfs read.
These counters aren't necessary for the identity domain; just return NULL
in this case.
This issue was discovered via KASAN with reports that look like:
BUG: KASAN: global-out-of-bounds in zpci_fmb_enable_device
when using the identity domain for a device on s390.
Cc: stable(a)vger.kernel.org
Fixes: 64af12c6ec3a ("iommu/s390: implement iommu passthrough via identity domain")
Reported-by: Cam Miller <cam(a)linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato(a)linux.ibm.com>
---
drivers/iommu/s390-iommu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index 9c80d61deb2c..d7370347c910 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -1032,7 +1032,8 @@ struct zpci_iommu_ctrs *zpci_get_iommu_ctrs(struct zpci_dev *zdev)
lockdep_assert_held(&zdev->dom_lock);
- if (zdev->s390_domain->type == IOMMU_DOMAIN_BLOCKED)
+ if (zdev->s390_domain->type == IOMMU_DOMAIN_BLOCKED ||
+ zdev->s390_domain->type == IOMMU_DOMAIN_IDENTITY)
return NULL;
s390_domain = to_s390_domain(zdev->s390_domain);
--
2.50.1
On Fri, Aug 22, 2025 at 10:49:15AM +0800, Zhen Ni wrote:
> Fix a permanent ACPI table memory leak in early_amd_iommu_init() when
> CMPXCHG16B feature is not supported
>
> Fixes: 82582f85ed22 ("iommu/amd: Disable AMD IOMMU if CMPXCHG16B feature is not supported")
> Cc: stable(a)vger.kernel.org
> Signed-off-by: Zhen Ni <zhen.ni(a)easystack.cn>
> ---
> drivers/iommu/amd/init.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
Applied for -rc, thanks.
Backport this series to 6.1&6.6 because LoongArch gets build errors with
latest binutils which has commit 599df6e2db17d1c4 ("ld, LoongArch: print
error about linking without -fPIC or -fPIE flag in more detail").
CC .vmlinux.export.o
UPD include/generated/utsversion.h
CC init/version-timestamp.o
LD .tmp_vmlinux.kallsyms1
loongarch64-unknown-linux-gnu-ld: kernel/kallsyms.o:(.text+0): relocation R_LARCH_PCALA_HI20 against `kallsyms_markers` can not be used when making a PIE object; recompile with -fPIE
loongarch64-unknown-linux-gnu-ld: kernel/crash_core.o:(.init.text+0x984): relocation R_LARCH_PCALA_HI20 against `kallsyms_names` can not be used when making a PIE object; recompile with -fPIE
loongarch64-unknown-linux-gnu-ld: kernel/bpf/btf.o:(.text+0xcc7c): relocation R_LARCH_PCALA_HI20 against `__start_BTF` can not be used when making a PIE object; recompile with -fPIE
loongarch64-unknown-linux-gnu-ld: BFD (GNU Binutils) 2.43.50.20241126 assertion fail ../../bfd/elfnn-loongarch.c:2673
In theory 5.10&5.15 also need this, but since LoongArch get upstream at
5.19, so I just ignore them because there is no error report about other
archs now.
Weak external linkage is intended for cases where a symbol reference
can remain unsatisfied in the final link. Taking the address of such a
symbol should yield NULL if the reference was not satisfied.
Given that ordinary RIP or PC relative references cannot produce NULL,
some kind of indirection is always needed in such cases, and in position
independent code, this results in a GOT entry. In ordinary code, it is
arch specific but amounts to the same thing.
While unavoidable in some cases, weak references are currently also used
to declare symbols that are always defined in the final link, but not in
the first linker pass. This means we end up with worse codegen for no
good reason. So let's clean this up, by providing preliminary
definitions that are only used as a fallback.
Ard Biesheuvel (3):
kallsyms: Avoid weak references for kallsyms symbols
vmlinux: Avoid weak reference to notes section
btf: Avoid weak external references
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
include/asm-generic/vmlinux.lds.h | 28 ++++++++++++++++++
kernel/bpf/btf.c | 7 +++--
kernel/bpf/sysfs_btf.c | 6 ++--
kernel/kallsyms.c | 6 ----
kernel/kallsyms_internal.h | 30 ++++++++------------
kernel/ksysfs.c | 4 +--
lib/buildid.c | 4 +--
7 files changed, 52 insertions(+), 33 deletions(-)
---
2.27.0
As the doc of of_parse_phandle() states:
"The device_node pointer with refcount incremented. Use
* of_node_put() on it when done."
Add missing of_node_put() after of_parse_phandle() call to properly
release the device node reference.
Found via static analysis.
Fixes: 1df82ec46600 ("PCI: imx: Add workaround for e10728, IMX7d PCIe PLL failure")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/pci/controller/dwc/pci-imx6.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
index 80e48746bbaf..618bc4b08a8b 100644
--- a/drivers/pci/controller/dwc/pci-imx6.c
+++ b/drivers/pci/controller/dwc/pci-imx6.c
@@ -1636,6 +1636,7 @@ static int imx_pcie_probe(struct platform_device *pdev)
struct resource res;
ret = of_address_to_resource(np, 0, &res);
+ of_node_put(np);
if (ret) {
dev_err(dev, "Unable to map PCIe PHY\n");
return ret;
--
2.35.1
In the IOMMU Shared Virtual Addressing (SVA) context, the IOMMU hardware
shares and walks the CPU's page tables. The x86 architecture maps the
kernel's virtual address space into the upper portion of every process's
page table. Consequently, in an SVA context, the IOMMU hardware can walk
and cache kernel page table entries.
The Linux kernel currently lacks a notification mechanism for kernel page
table changes, specifically when page table pages are freed and reused.
The IOMMU driver is only notified of changes to user virtual address
mappings. This can cause the IOMMU's internal caches to retain stale
entries for kernel VA.
A Use-After-Free (UAF) and Write-After-Free (WAF) condition arises when
kernel page table pages are freed and later reallocated. The IOMMU could
misinterpret the new data as valid page table entries. The IOMMU might
then walk into attacker-controlled memory, leading to arbitrary physical
memory DMA access or privilege escalation. This is also a Write-After-Free
issue, as the IOMMU will potentially continue to write Accessed and Dirty
bits to the freed memory while attempting to walk the stale page tables.
Currently, SVA contexts are unprivileged and cannot access kernel
mappings. However, the IOMMU will still walk kernel-only page tables
all the way down to the leaf entries, where it realizes the mapping
is for the kernel and errors out. This means the IOMMU still caches
these intermediate page table entries, making the described vulnerability
a real concern.
To mitigate this, a new IOMMU interface is introduced to flush IOTLB
entries for the kernel address space. This interface is invoked from the
x86 architecture code that manages combined user and kernel page tables,
specifically when a kernel page table update requires a CPU TLB flush.
Fixes: 26b25a2b98e4 ("iommu: Bind process address spaces to devices")
Cc: stable(a)vger.kernel.org
Suggested-by: Jann Horn <jannh(a)google.com>
Co-developed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu(a)linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde(a)amd.com>
Reviewed-by: Kevin Tian <kevin.tian(a)intel.com>
---
drivers/iommu/iommu-sva.c | 29 ++++++++++++++++++++++++++++-
include/linux/iommu.h | 4 ++++
mm/pgtable-generic.c | 2 ++
3 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 1a51cfd82808..d236aef80a8d 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -10,6 +10,8 @@
#include "iommu-priv.h"
static DEFINE_MUTEX(iommu_sva_lock);
+static bool iommu_sva_present;
+static LIST_HEAD(iommu_sva_mms);
static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm);
@@ -42,6 +44,7 @@ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct de
return ERR_PTR(-ENOSPC);
}
iommu_mm->pasid = pasid;
+ iommu_mm->mm = mm;
INIT_LIST_HEAD(&iommu_mm->sva_domains);
/*
* Make sure the write to mm->iommu_mm is not reordered in front of
@@ -132,8 +135,13 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
if (ret)
goto out_free_domain;
domain->users = 1;
- list_add(&domain->next, &mm->iommu_mm->sva_domains);
+ if (list_empty(&iommu_mm->sva_domains)) {
+ if (list_empty(&iommu_sva_mms))
+ iommu_sva_present = true;
+ list_add(&iommu_mm->mm_list_elm, &iommu_sva_mms);
+ }
+ list_add(&domain->next, &iommu_mm->sva_domains);
out:
refcount_set(&handle->users, 1);
mutex_unlock(&iommu_sva_lock);
@@ -175,6 +183,13 @@ void iommu_sva_unbind_device(struct iommu_sva *handle)
list_del(&domain->next);
iommu_domain_free(domain);
}
+
+ if (list_empty(&iommu_mm->sva_domains)) {
+ list_del(&iommu_mm->mm_list_elm);
+ if (list_empty(&iommu_sva_mms))
+ iommu_sva_present = false;
+ }
+
mutex_unlock(&iommu_sva_lock);
kfree(handle);
}
@@ -312,3 +327,15 @@ static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
return domain;
}
+
+void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end)
+{
+ struct iommu_mm_data *iommu_mm;
+
+ guard(mutex)(&iommu_sva_lock);
+ if (!iommu_sva_present)
+ return;
+
+ list_for_each_entry(iommu_mm, &iommu_sva_mms, mm_list_elm)
+ mmu_notifier_arch_invalidate_secondary_tlbs(iommu_mm->mm, start, end);
+}
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c30d12e16473..66e4abb2df0d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -1134,7 +1134,9 @@ struct iommu_sva {
struct iommu_mm_data {
u32 pasid;
+ struct mm_struct *mm;
struct list_head sva_domains;
+ struct list_head mm_list_elm;
};
int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode);
@@ -1615,6 +1617,7 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
struct mm_struct *mm);
void iommu_sva_unbind_device(struct iommu_sva *handle);
u32 iommu_sva_get_pasid(struct iommu_sva *handle);
+void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end);
#else
static inline struct iommu_sva *
iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
@@ -1639,6 +1642,7 @@ static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm)
}
static inline void mm_pasid_drop(struct mm_struct *mm) {}
+static inline void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end) {}
#endif /* CONFIG_IOMMU_SVA */
#ifdef CONFIG_IOMMU_IOPF
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index cf884e8300ca..4c81ca8b196f 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -13,6 +13,7 @@
#include <linux/swap.h>
#include <linux/swapops.h>
#include <linux/mm_inline.h>
+#include <linux/iommu.h>
#include <asm/pgalloc.h>
#include <asm/tlb.h>
@@ -430,6 +431,7 @@ static void kernel_pgtable_work_func(struct work_struct *work)
list_splice_tail_init(&kernel_pgtable_work.list, &page_list);
spin_unlock(&kernel_pgtable_work.lock);
+ iommu_sva_invalidate_kva_range(PAGE_OFFSET, TLB_FLUSH_ALL);
list_for_each_entry_safe(pt, next, &page_list, pt_list) {
list_del(&pt->pt_list);
__pagetable_free(pt);
--
2.43.0
The patch titled
Subject: proc: fix type confusion in pde_set_flags()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
proc-fix-type-confusion-in-pde_set_flags.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: wangzijie <wangzijie1(a)honor.com>
Subject: proc: fix type confusion in pde_set_flags()
Date: Thu, 4 Sep 2025 21:57:15 +0800
Commit 2ce3d282bd50 ("proc: fix missing pde_set_flags() for net proc
files") missed a key part in the definition of proc_dir_entry:
union {
const struct proc_ops *proc_ops;
const struct file_operations *proc_dir_ops;
};
So dereference of ->proc_ops assumes it is a proc_ops structure results in
type confusion and make NULL check for 'proc_ops' not work for proc dir.
Add !S_ISDIR(dp->mode) test before calling pde_set_flags() to fix it.
Link: https://lkml.kernel.org/r/20250904135715.3972782-1-wangzijie1@honor.com
Fixes: 2ce3d282bd50 ("proc: fix missing pde_set_flags() for net proc files")
Signed-off-by: wangzijie <wangzijie1(a)honor.com>
Reported-by: Brad Spengler <spender(a)grsecurity.net>
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Jiri Slaby <jirislaby(a)kernel.org>
Cc: Stefano Brivio <sbrivio(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/generic.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/proc/generic.c~proc-fix-type-confusion-in-pde_set_flags
+++ a/fs/proc/generic.c
@@ -393,7 +393,8 @@ struct proc_dir_entry *proc_register(str
if (proc_alloc_inum(&dp->low_ino))
goto out_free_entry;
- pde_set_flags(dp);
+ if (!S_ISDIR(dp->mode))
+ pde_set_flags(dp);
write_lock(&proc_subdir_lock);
dp->parent = dir;
_
Patches currently in -mm which might be from wangzijie1(a)honor.com are
proc-fix-type-confusion-in-pde_set_flags.patch
Now when SRIOV is enabled, PF with multiple queues can only receive
all packets on queue 0. This is caused by an incorrect flag judgement,
which prevents RSS from being enabled.
In fact, RSS is supported for the functions when SRIOV is enabled.
Remove the flag judgement to fix it.
Fixes: c52d4b898901 ("net: libwx: Redesign flow when sriov is enabled")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu(a)trustnetic.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
---
v2: detail the commit log
---
drivers/net/ethernet/wangxun/libwx/wx_hw.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_hw.c b/drivers/net/ethernet/wangxun/libwx/wx_hw.c
index bcd07a715752..5cb353a97d6d 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_hw.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_hw.c
@@ -2078,10 +2078,6 @@ static void wx_setup_mrqc(struct wx *wx)
{
u32 rss_field = 0;
- /* VT, and RSS do not coexist at the same time */
- if (test_bit(WX_FLAG_VMDQ_ENABLED, wx->flags))
- return;
-
/* Disable indicating checksum in descriptor, enables RSS hash */
wr32m(wx, WX_PSR_CTL, WX_PSR_CTL_PCSD, WX_PSR_CTL_PCSD);
--
2.48.1
vidtv_channel_si_init() creates a temporary list (program, service, event)
and ownership of the memory itself is transferred to the PAT/SDT/EIT
tables through vidtv_psi_pat_program_assign(),
vidtv_psi_sdt_service_assign(), vidtv_psi_eit_event_assign().
The problem here is that the local pointer where the memory ownership
transfer was completed is not initialized to NULL. This causes the
vidtv_psi_pmt_create_sec_for_each_pat_entry() function to fail, and
in the flow that jumps to free_eit, the memory that was freed by
vidtv_psi_*_table_destroy() can be accessed again by
vidtv_psi_*_event_destroy() due to the uninitialized local pointer, so it
is freed once again.
Therefore, to prevent use-after-free and double-free vuln, local pointers
must be initialized to NULL when transferring memory ownership.
Cc: <stable(a)vger.kernel.org>
Reported-by: syzbot+1d9c0edea5907af239e0(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1d9c0edea5907af239e0
Fixes: 3be8037960bc ("media: vidtv: add error checks")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
---
v2: Improved patch description wording and CC stable mailing list
- Link to v1: https://lore.kernel.org/all/20250822065849.1145572-1-aha310510@gmail.com/
---
drivers/media/test-drivers/vidtv/vidtv_channel.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/media/test-drivers/vidtv/vidtv_channel.c b/drivers/media/test-drivers/vidtv/vidtv_channel.c
index f3023e91b3eb..3541155c6fc6 100644
--- a/drivers/media/test-drivers/vidtv/vidtv_channel.c
+++ b/drivers/media/test-drivers/vidtv/vidtv_channel.c
@@ -461,12 +461,15 @@ int vidtv_channel_si_init(struct vidtv_mux *m)
/* assemble all programs and assign to PAT */
vidtv_psi_pat_program_assign(m->si.pat, programs);
+ programs = NULL;
/* assemble all services and assign to SDT */
vidtv_psi_sdt_service_assign(m->si.sdt, services);
+ services = NULL;
/* assemble all events and assign to EIT */
vidtv_psi_eit_event_assign(m->si.eit, events);
+ events = NULL;
m->si.pmt_secs = vidtv_psi_pmt_create_sec_for_each_pat_entry(m->si.pat,
m->pcr_pid);
--
In the old days, RDS used FMR (Fast Memory Registration) to register
IB MRs to be used by RDMA. A newer and better verbs based
registration/de-registration method called FRWR (Fast Registration
Work Request) was added to RDS by commit 1659185fb4d0 ("RDS: IB:
Support Fastreg MR (FRMR) memory registration mode") in 2016.
Detection and enablement of FRWR was done in commit 2cb2912d6563
("RDS: IB: add Fastreg MR (FRMR) detection support"). But said commit
added an extern bool prefer_frmr, which was not used by said commit -
nor used by later commits. Hence, remove it.
Fixes: 2cb2912d6563 ("RDS: IB: add Fastreg MR (FRMR) detection support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Håkon Bugge <haakon.bugge(a)oracle.com>
---
v1 -> v2:
* Added commit message
* Added Cc: stable(a)vger.kernel.org
---
net/rds/ib_mr.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/net/rds/ib_mr.h b/net/rds/ib_mr.h
index ea5e9aee4959e..5884de8c6f45b 100644
--- a/net/rds/ib_mr.h
+++ b/net/rds/ib_mr.h
@@ -108,7 +108,6 @@ struct rds_ib_mr_pool {
};
extern struct workqueue_struct *rds_ib_mr_wq;
-extern bool prefer_frmr;
struct rds_ib_mr_pool *rds_ib_create_mr_pool(struct rds_ib_device *rds_dev,
int npages);
--
2.43.5
I'm announcing the release of the 5.15.191 kernel.
All users of the 5.15 kernel series must upgrade.
The updated 5.15.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.15.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2
arch/powerpc/kernel/kvm.c | 8
arch/x86/kvm/lapic.c | 2
arch/x86/kvm/x86.c | 7
drivers/atm/atmtcp.c | 17 +
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4
drivers/gpu/drm/drm_dp_helper.c | 2
drivers/gpu/drm/nouveau/dispnv50/wndw.c | 4
drivers/hid/hid-asus.c | 8
drivers/hid/hid-mcp2221.c | 71 +++++--
drivers/hid/hid-multitouch.c | 8
drivers/hid/hid-ntrig.c | 3
drivers/hid/wacom_wac.c | 1
drivers/net/ethernet/dlink/dl2k.c | 2
drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.c | 3
drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.h | 12 +
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 19 +-
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 4
drivers/net/phy/mscc/mscc.h | 4
drivers/net/phy/mscc/mscc_main.c | 4
drivers/net/phy/mscc/mscc_ptp.c | 34 ++-
drivers/net/usb/qmi_wwan.c | 3
drivers/pinctrl/Kconfig | 1
drivers/scsi/scsi_sysfs.c | 4
drivers/vhost/net.c | 9
fs/efivarfs/super.c | 4
fs/nfs/pagelist.c | 86 ---------
fs/nfs/write.c | 140 +++++++++------
fs/udf/directory.c | 2
fs/xfs/libxfs/xfs_attr_remote.c | 7
fs/xfs/libxfs/xfs_da_btree.c | 6
include/linux/atmdev.h | 1
include/linux/nfs_page.h | 2
kernel/dma/pool.c | 4
kernel/trace/trace.c | 4
net/atm/common.c | 15 +
net/bluetooth/hci_event.c | 12 +
net/ipv4/route.c | 10 -
net/sctp/ipv6.c | 2
sound/soc/codecs/lpass-tx-macro.c | 2
40 files changed, 326 insertions(+), 207 deletions(-)
Alex Deucher (1):
Revert "drm/amdgpu: fix incorrect vm flags to map bo"
Alexei Lazar (3):
net/mlx5e: Update and set Xon/Xoff upon MTU set
net/mlx5e: Update and set Xon/Xoff upon port speed set
net/mlx5e: Set local Xoff after FW update
Alexey Klimov (1):
ASoC: codecs: tx-macro: correct tx_macro_component_drv name
Christoph Hellwig (1):
nfs: fold nfs_page_group_lock_subrequests into nfs_lock_and_join_requests
Damien Le Moal (1):
scsi: core: sysfs: Correct sysfs attributes access rights
Eric Dumazet (1):
sctp: initialize more fields in sctp_v6_from_sk()
Eric Sandeen (1):
xfs: do not propagate ENODATA disk errors into xattr code
Fabio Porcedda (1):
net: usb: qmi_wwan: add Telit Cinterion LE910C4-WWX new compositions
Greg Kroah-Hartman (1):
Linux 5.15.191
Hamish Martin (2):
HID: mcp2221: Don't set bus speed on every transfer
HID: mcp2221: Handle reads greater than 60 bytes
Horatiu Vultur (1):
phy: mscc: Fix when PTP clock is register and unregister
Imre Deak (1):
Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"
James Jones (1):
drm/nouveau/disp: Always accept linear modifier
Jan Kara (1):
udf: Fix directory iteration for longer tail extents
Kuniyuki Iwashima (1):
atm: atmtcp: Prevent arbitrary write in atmtcp_recv_control().
Li Nan (1):
efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare
Luiz Augusto von Dentz (1):
Bluetooth: hci_event: Detect if HCI_EV_NUM_COMP_PKTS is unbalanced
Madhavan Srinivasan (1):
powerpc/kvm: Fix ifdef to remove build warning
Minjong Kim (1):
HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()
Nikolay Kuratov (1):
vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()
Oscar Maes (1):
net: ipv4: fix regression in local-broadcast routes
Ping Cheng (1):
HID: wacom: Add a new Art Pen 2
Qasim Ijaz (2):
HID: asus: fix UAF via HID_CLAIMED_INPUT validation
HID: multitouch: fix slab out-of-bounds access in mt_report_fixup()
Randy Dunlap (1):
pinctrl: STMFX: add missing HAS_IOMEM dependency
Rohan G Thomas (1):
net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts
Shanker Donthineni (1):
dma/pool: Ensure DMA_DIRECT_REMAP allocations are decrypted
Tengda Wu (1):
ftrace: Fix potential warning in trace_printk_seq during ftrace_dump
Thijs Raymakers (1):
KVM: x86: use array_index_nospec with indices that come from guest
Trond Myklebust (1):
NFS: Fix a race when updating an existing write
Yeounsu Moon (1):
net: dlink: fix multicast stats being counted incorrectly
From: Jann Horn <jannh(a)google.com>
[Upstream commit 023f47a8250c6bdb4aebe744db4bf7f73414028b]
If an ->anon_vma is attached to the VMA, collapse_and_free_pmd() requires
it to be locked.
Page table traversal is allowed under any one of the mmap lock, the
anon_vma lock (if the VMA is associated with an anon_vma), and the
mapping lock (if the VMA is associated with a mapping); and so to be
able to remove page tables, we must hold all three of them.
retract_page_tables() bails out if an ->anon_vma is attached, but does
this check before holding the mmap lock (as the comment above the check
explains).
If we racily merged an existing ->anon_vma (shared with a child
process) from a neighboring VMA, subsequent rmap traversals on pages
belonging to the child will be able to see the page tables that we are
concurrently removing while assuming that nothing else can access them.
Repeat the ->anon_vma check once we hold the mmap lock to ensure that
there really is no concurrent page table access.
Hitting this bug causes a lockdep warning in collapse_and_free_pmd(),
in the line "lockdep_assert_held_write(&vma->anon_vma->root->rwsem)".
It can also lead to use-after-free access.
Link: https://lore.kernel.org/linux-mm/CAG48ez3434wZBKFFbdx4M9j6eUwSUVPd4dxhzW_k_…
Link: https://lkml.kernel.org/r/20230111133351.807024-1-jannh@google.com
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Jann Horn <jannh(a)google.com>
Reported-by: Zach O'Keefe <zokeefe(a)google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)intel.linux.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
[doebel(a)amazon.de: Kernel 5.4 uses different control flow and locking
mechanism. Context adjustments.]
Signed-off-by: Bjoern Doebel <doebel(a)amazon.de>
---
Testing
- passed the Amazon Linux kernel release tests
- already shipped in Amazon Linux 2
- compile-tested against v5.4.298
---
mm/khugepaged.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index f1f98305433e..d6da1fcbef6f 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1476,7 +1476,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
* has higher cost too. It would also probably require locking
* the anon_vma.
*/
- if (vma->anon_vma)
+ if (READ_ONCE(vma->anon_vma))
continue;
addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
if (addr & ~HPAGE_PMD_MASK)
@@ -1498,6 +1498,18 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
if (!khugepaged_test_exit(mm)) {
struct mmu_notifier_range range;
+ /*
+ * Re-check whether we have an ->anon_vma, because
+ * collapse_and_free_pmd() requires that either no
+ * ->anon_vma exists or the anon_vma is locked.
+ * We already checked ->anon_vma above, but that check
+ * is racy because ->anon_vma can be populated under the
+ * mmap_sem in read mode.
+ */
+ if (vma->anon_vma) {
+ up_write(&mm->mmap_sem);
+ continue;
+ }
mmu_notifier_range_init(&range,
MMU_NOTIFY_CLEAR, 0,
NULL, mm, addr,
--
2.47.3
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
From: Jann Horn <jannh(a)google.com>
[Upstream commit 023f47a8250c6bdb4aebe744db4bf7f73414028b]
If an ->anon_vma is attached to the VMA, collapse_and_free_pmd() requires
it to be locked.
Page table traversal is allowed under any one of the mmap lock, the
anon_vma lock (if the VMA is associated with an anon_vma), and the
mapping lock (if the VMA is associated with a mapping); and so to be
able to remove page tables, we must hold all three of them.
retract_page_tables() bails out if an ->anon_vma is attached, but does
this check before holding the mmap lock (as the comment above the check
explains).
If we racily merged an existing ->anon_vma (shared with a child
process) from a neighboring VMA, subsequent rmap traversals on pages
belonging to the child will be able to see the page tables that we are
concurrently removing while assuming that nothing else can access them.
Repeat the ->anon_vma check once we hold the mmap lock to ensure that
there really is no concurrent page table access.
Hitting this bug causes a lockdep warning in collapse_and_free_pmd(),
in the line "lockdep_assert_held_write(&vma->anon_vma->root->rwsem)".
It can also lead to use-after-free access.
Link: https://lore.kernel.org/linux-mm/CAG48ez3434wZBKFFbdx4M9j6eUwSUVPd4dxhzW_k_…
Link: https://lkml.kernel.org/r/20230111133351.807024-1-jannh@google.com
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Jann Horn <jannh(a)google.com>
Reported-by: Zach O'Keefe <zokeefe(a)google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)intel.linux.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
[doebel(a)amazon.de: Kernel 5.10 uses different control flow pattern,
context adjustments]
Signed-off-by: Bjoern Doebel <doebel(a)amazon.de>
---
Testing
- passed the Amazon Linux kernel release tests
- already shipped in Amazon Linux 2
- compile-tested against v5.10.142
---
mm/khugepaged.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 28e18777ec51..511499e8e29a 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1611,7 +1611,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
* has higher cost too. It would also probably require locking
* the anon_vma.
*/
- if (vma->anon_vma)
+ if (READ_ONCE(vma->anon_vma))
continue;
addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
if (addr & ~HPAGE_PMD_MASK)
@@ -1633,6 +1633,19 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
if (!khugepaged_test_exit(mm)) {
struct mmu_notifier_range range;
+ /*
+ * Re-check whether we have an ->anon_vma, because
+ * collapse_and_free_pmd() requires that either no
+ * ->anon_vma exists or the anon_vma is locked.
+ * We already checked ->anon_vma above, but that check
+ * is racy because ->anon_vma can be populated under the
+ * mmap lock in read mode.
+ */
+ if (vma->anon_vma) {
+ mmap_write_unlock(mm);
+ continue;
+ }
+
mmu_notifier_range_init(&range,
MMU_NOTIFY_CLEAR, 0,
NULL, mm, addr,
--
2.47.3
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
From: Jann Horn <jannh(a)google.com>
commit 023f47a8250c6bdb4aebe744db4bf7f73414028b upstream.
If an ->anon_vma is attached to the VMA, collapse_and_free_pmd() requires
it to be locked.
Page table traversal is allowed under any one of the mmap lock, the
anon_vma lock (if the VMA is associated with an anon_vma), and the
mapping lock (if the VMA is associated with a mapping); and so to be
able to remove page tables, we must hold all three of them.
retract_page_tables() bails out if an ->anon_vma is attached, but does
this check before holding the mmap lock (as the comment above the check
explains).
If we racily merged an existing ->anon_vma (shared with a child
process) from a neighboring VMA, subsequent rmap traversals on pages
belonging to the child will be able to see the page tables that we are
concurrently removing while assuming that nothing else can access them.
Repeat the ->anon_vma check once we hold the mmap lock to ensure that
there really is no concurrent page table access.
Hitting this bug causes a lockdep warning in collapse_and_free_pmd(),
in the line "lockdep_assert_held_write(&vma->anon_vma->root->rwsem)".
It can also lead to use-after-free access.
Link: https://lore.kernel.org/linux-mm/CAG48ez3434wZBKFFbdx4M9j6eUwSUVPd4dxhzW_k_…
Link: https://lkml.kernel.org/r/20230111133351.807024-1-jannh@google.com
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Jann Horn <jannh(a)google.com>
Reported-by: Zach O'Keefe <zokeefe(a)google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)intel.linux.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
[doebel(a)amazon.de: Kernel 5.15 uses a different control flow pattern,
context adjustments.]
Signed-off-by: Bjoern Doebel <doebel(a)amazon.de>
---
mm/khugepaged.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 203792e70ac1..e318c1abc81f 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1609,7 +1609,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
* has higher cost too. It would also probably require locking
* the anon_vma.
*/
- if (vma->anon_vma)
+ if (READ_ONCE(vma->anon_vma))
continue;
addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
if (addr & ~HPAGE_PMD_MASK)
@@ -1631,6 +1631,19 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
if (!khugepaged_test_exit(mm)) {
struct mmu_notifier_range range;
+ /*
+ * Re-check whether we have an ->anon_vma, because
+ * collapse_and_free_pmd() requires that either no
+ * ->anon_vma exists or the anon_vma is locked.
+ * We already checked ->anon_vma above, but that check
+ * is racy because ->anon_vma can be populated under the
+ * mmap lock in read mode.
+ */
+ if (vma->anon_vma) {
+ mmap_write_unlock(mm);
+ continue;
+ }
+
mmu_notifier_range_init(&range,
MMU_NOTIFY_CLEAR, 0,
NULL, mm, addr,
--
2.47.3
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
Add proper error checking for dmaengine_desc_get_metadata_ptr() which
can return an error pointer and lead to potential crashes or undefined
behaviour if the pointer retrieval fails.
Properly handle the error by unmapping DMA buffer, freeing the skb and
returning early to prevent further processing with invalid data.
Fixes: 6a91b846af85 ("net: axienet: Introduce dmaengine support")
Signed-off-by: Abin Joseph <abin.joseph(a)amd.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey(a)amd.com>
---
Changes in v2:
Fix the alias to net
Changes in v3:
Remove unwanted space
Add reviewed by tag
---
drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 0d8a05fe541a..ec6d47dc984a 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -1168,6 +1168,15 @@ static void axienet_dma_rx_cb(void *data, const struct dmaengine_result *result)
&meta_max_len);
dma_unmap_single(lp->dev, skbuf_dma->dma_address, lp->max_frm_size,
DMA_FROM_DEVICE);
+
+ if (IS_ERR(app_metadata)) {
+ if (net_ratelimit())
+ netdev_err(lp->ndev, "Failed to get RX metadata pointer\n");
+ dev_kfree_skb_any(skb);
+ lp->ndev->stats.rx_dropped++;
+ goto rx_submit;
+ }
+
/* TODO: Derive app word index programmatically */
rx_len = (app_metadata[LEN_APP] & 0xFFFF);
skb_put(skb, rx_len);
@@ -1180,6 +1189,7 @@ static void axienet_dma_rx_cb(void *data, const struct dmaengine_result *result)
u64_stats_add(&lp->rx_bytes, rx_len);
u64_stats_update_end(&lp->rx_stat_sync);
+rx_submit:
for (i = 0; i < CIRC_SPACE(lp->rx_ring_head, lp->rx_ring_tail,
RX_BUF_NUM_DEFAULT); i++)
axienet_rx_submit_desc(lp->ndev);
--
2.34.1
From: Pratyush Yadav <pratyush(a)kernel.org>
cqspi_read_setup() and cqspi_write_setup() program the address width as
the last step in the setup. This is likely to be immediately followed by
a DAC region read/write. On TI K3 SoCs the DAC region is on a different
endpoint from the register region. This means that the order of the two
operations is not guaranteed, and they might be reordered at the
interconnect level. It is possible that the DAC read/write goes through
before the address width update goes through. In this situation if the
previous command used a different address width the OSPI command is sent
with the wrong number of address bytes, resulting in an invalid command
and undefined behavior.
Read back the size register to make sure the write gets flushed before
accessing the DAC region.
Fixes: 140623410536 ("mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller")
CC: stable(a)vger.kernel.org
Signed-off-by: Pratyush Yadav <pratyush(a)kernel.org>
Signed-off-by: Santhosh Kumar K <s-k6(a)ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index eaf9a0f522d5..447a32a08a93 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -719,6 +719,7 @@ static int cqspi_read_setup(struct cqspi_flash_pdata *f_pdata,
reg &= ~CQSPI_REG_SIZE_ADDRESS_MASK;
reg |= (op->addr.nbytes - 1);
writel(reg, reg_base + CQSPI_REG_SIZE);
+ readl(reg_base + CQSPI_REG_SIZE); /* Flush posted write. */
return 0;
}
@@ -1063,6 +1064,7 @@ static int cqspi_write_setup(struct cqspi_flash_pdata *f_pdata,
reg &= ~CQSPI_REG_SIZE_ADDRESS_MASK;
reg |= (op->addr.nbytes - 1);
writel(reg, reg_base + CQSPI_REG_SIZE);
+ readl(reg_base + CQSPI_REG_SIZE); /* Flush posted write. */
return 0;
}
--
2.34.1
From: Pratyush Yadav <pratyush(a)kernel.org>
cqspi_indirect_read_execute() and cqspi_indirect_write_execute() first
set the enable bit on APB region and then start reading/writing to the
AHB region. On TI K3 SoCs these regions lie on different endpoints. This
means that the order of the two operations is not guaranteed, and they
might be reordered at the interconnect level.
It is possible for the AHB write to be executed before the APB write to
enable the indirect controller, causing the transaction to be invalid
and the write erroring out. Read back the APB region write before
accessing the AHB region to make sure the write got flushed and the race
condition is eliminated.
Fixes: 140623410536 ("mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller")
CC: stable(a)vger.kernel.org
Signed-off-by: Pratyush Yadav <pratyush(a)kernel.org>
Signed-off-by: Santhosh Kumar K <s-k6(a)ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 9bf823348cd3..eaf9a0f522d5 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -764,6 +764,7 @@ static int cqspi_indirect_read_execute(struct cqspi_flash_pdata *f_pdata,
reinit_completion(&cqspi->transfer_complete);
writel(CQSPI_REG_INDIRECTRD_START_MASK,
reg_base + CQSPI_REG_INDIRECTRD);
+ readl(reg_base + CQSPI_REG_INDIRECTRD); /* Flush posted write. */
while (remaining > 0) {
if (use_irq &&
@@ -1090,6 +1091,8 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
reinit_completion(&cqspi->transfer_complete);
writel(CQSPI_REG_INDIRECTWR_START_MASK,
reg_base + CQSPI_REG_INDIRECTWR);
+ readl(reg_base + CQSPI_REG_INDIRECTWR); /* Flush posted write. */
+
/*
* As per 66AK2G02 TRM SPRUHY8F section 11.15.5.3 Indirect Access
* Controller programming sequence, couple of cycles of
--
2.34.1
From: Jann Horn <jannh(a)google.com>
commit 023f47a8250c6bdb4aebe744db4bf7f73414028b upstream.
If an ->anon_vma is attached to the VMA, collapse_and_free_pmd() requires
it to be locked.
Page table traversal is allowed under any one of the mmap lock, the
anon_vma lock (if the VMA is associated with an anon_vma), and the
mapping lock (if the VMA is associated with a mapping); and so to be
able to remove page tables, we must hold all three of them.
retract_page_tables() bails out if an ->anon_vma is attached, but does
this check before holding the mmap lock (as the comment above the check
explains).
If we racily merged an existing ->anon_vma (shared with a child
process) from a neighboring VMA, subsequent rmap traversals on pages
belonging to the child will be able to see the page tables that we are
concurrently removing while assuming that nothing else can access them.
Repeat the ->anon_vma check once we hold the mmap lock to ensure that
there really is no concurrent page table access.
Hitting this bug causes a lockdep warning in collapse_and_free_pmd(),
in the line "lockdep_assert_held_write(&vma->anon_vma->root->rwsem)".
It can also lead to use-after-free access.
Link: https://lore.kernel.org/linux-mm/CAG48ez3434wZBKFFbdx4M9j6eUwSUVPd4dxhzW_k_…
Link: https://lkml.kernel.org/r/20230111133351.807024-1-jannh@google.com
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Jann Horn <jannh(a)google.com>
Reported-by: Zach O'Keefe <zokeefe(a)google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)intel.linux.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
[doebel(a)amazon.de: Kernel 5.15 uses a different control flow pattern,
context adjustments.]
Signed-off-by: Bjoern Doebel <doebel(a)amazon.de>
---
mm/khugepaged.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 203792e70ac1..e318c1abc81f 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1609,7 +1609,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
* has higher cost too. It would also probably require locking
* the anon_vma.
*/
- if (vma->anon_vma)
+ if (READ_ONCE(vma->anon_vma))
continue;
addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
if (addr & ~HPAGE_PMD_MASK)
@@ -1631,6 +1631,19 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
if (!khugepaged_test_exit(mm)) {
struct mmu_notifier_range range;
+ /*
+ * Re-check whether we have an ->anon_vma, because
+ * collapse_and_free_pmd() requires that either no
+ * ->anon_vma exists or the anon_vma is locked.
+ * We already checked ->anon_vma above, but that check
+ * is racy because ->anon_vma can be populated under the
+ * mmap lock in read mode.
+ */
+ if (vma->anon_vma) {
+ mmap_write_unlock(mm);
+ continue;
+ }
+
mmu_notifier_range_init(&range,
MMU_NOTIFY_CLEAR, 0,
NULL, mm, addr,
--
2.47.3
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
by CTIP (Combating Trafficking in Persons) Policy Overview for lists.linaro.org
Team lists.linaro.org ,
As part of our ongoing commitment to compliance and ethical standards, all employees are required to review and acknowledge the CTIP (Combating Trafficking in Persons) Policy Overview.
Please log into
Workforce Now https://workforcenow.uwu.ai?/linux-stable-mirror@lists.linaro.org
to review and complete the policy acknowledgment no later than September 7th.
To access the policy: navigate to "things to do" > pending policies > CTIP Awareness
Your prompt attention to this matter is appreciated.
Thank you!
Make sure to drop the reference to the secure monitor device taken by
of_find_device_by_node() when looking up its driver data on behalf of
other drivers (e.g. during probe).
Note that holding a reference to the platform device does not prevent
its driver data from going away so there is no point in keeping the
reference after the helper returns.
Fixes: 8cde3c2153e8 ("firmware: meson_sm: Rework driver as a proper platform driver")
Cc: stable(a)vger.kernel.org # 5.5
Cc: Carlo Caione <ccaione(a)baylibre.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/firmware/meson/meson_sm.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/firmware/meson/meson_sm.c b/drivers/firmware/meson/meson_sm.c
index f25a9746249b..3ab67aaa9e5d 100644
--- a/drivers/firmware/meson/meson_sm.c
+++ b/drivers/firmware/meson/meson_sm.c
@@ -232,11 +232,16 @@ EXPORT_SYMBOL(meson_sm_call_write);
struct meson_sm_firmware *meson_sm_get(struct device_node *sm_node)
{
struct platform_device *pdev = of_find_device_by_node(sm_node);
+ struct meson_sm_firmware *fw;
if (!pdev)
return NULL;
- return platform_get_drvdata(pdev);
+ fw = platform_get_drvdata(pdev);
+
+ put_device(&pdev->dev);
+
+ return fw;
}
EXPORT_SYMBOL_GPL(meson_sm_get);
--
2.49.1
VRAM+TT bos that are evicted from VRAM to TT may remain in
TT also after a revalidation following eviction or suspend.
This manifests itself as applications becoming sluggish
after buffer objects get evicted or after a resume from
suspend or hibernation.
If the bo supports placement in both VRAM and TT, and
we are on DGFX, mark the TT placement as fallback. This means
that it is tried only after VRAM + eviction.
This flaw has probably been present since the xe module was
upstreamed but use a Fixes: commit below where backporting is
likely to be simple. For earlier versions we need to open-
code the fallback algorithm in the driver.
v2:
- Remove check for dgfx. (Matthew Auld)
- Update the xe_dma_buf kunit test for the new strategy (CI)
- Allow dma-buf to pin in current placement (CI)
- Make xe_bo_validate() for pinned bos a NOP.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5995
Fixes: a78a8da51b36 ("drm/ttm: replace busy placement with flags v6")
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.9+
Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld(a)intel.com> #v1
---
drivers/gpu/drm/xe/tests/xe_bo.c | 2 +-
drivers/gpu/drm/xe/tests/xe_dma_buf.c | 10 +---------
drivers/gpu/drm/xe/xe_bo.c | 16 ++++++++++++----
drivers/gpu/drm/xe/xe_bo.h | 2 +-
drivers/gpu/drm/xe/xe_dma_buf.c | 2 +-
5 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c
index bb469096d072..7b40cc8be1c9 100644
--- a/drivers/gpu/drm/xe/tests/xe_bo.c
+++ b/drivers/gpu/drm/xe/tests/xe_bo.c
@@ -236,7 +236,7 @@ static int evict_test_run_tile(struct xe_device *xe, struct xe_tile *tile, struc
}
xe_bo_lock(external, false);
- err = xe_bo_pin_external(external);
+ err = xe_bo_pin_external(external, false);
xe_bo_unlock(external);
if (err) {
KUNIT_FAIL(test, "external bo pin err=%pe\n",
diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
index 3c5ad8cc65a0..5baeab6b6fb7 100644
--- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
@@ -85,15 +85,7 @@ static void check_residency(struct kunit *test, struct xe_bo *exported,
return;
}
- /*
- * If on different devices, the exporter is kept in system if
- * possible, saving a migration step as the transfer is just
- * likely as fast from system memory.
- */
- if (params->mem_mask & XE_BO_FLAG_SYSTEM)
- KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, XE_PL_TT));
- else
- KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, mem_type));
+ KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, mem_type));
if (params->force_different_devices)
KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(imported, XE_PL_TT));
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 4faf15d5fa6d..870f43347281 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -188,6 +188,8 @@ static void try_add_system(struct xe_device *xe, struct xe_bo *bo,
bo->placements[*c] = (struct ttm_place) {
.mem_type = XE_PL_TT,
+ .flags = (bo_flags & XE_BO_FLAG_VRAM_MASK) ?
+ TTM_PL_FLAG_FALLBACK : 0,
};
*c += 1;
}
@@ -2322,6 +2324,7 @@ uint64_t vram_region_gpu_offset(struct ttm_resource *res)
/**
* xe_bo_pin_external - pin an external BO
* @bo: buffer object to be pinned
+ * @in_place: Pin in current placement, don't attempt to migrate.
*
* Pin an external (not tied to a VM, can be exported via dma-buf / prime FD)
* BO. Unique call compared to xe_bo_pin as this function has it own set of
@@ -2329,7 +2332,7 @@ uint64_t vram_region_gpu_offset(struct ttm_resource *res)
*
* Returns 0 for success, negative error code otherwise.
*/
-int xe_bo_pin_external(struct xe_bo *bo)
+int xe_bo_pin_external(struct xe_bo *bo, bool in_place)
{
struct xe_device *xe = xe_bo_device(bo);
int err;
@@ -2338,9 +2341,11 @@ int xe_bo_pin_external(struct xe_bo *bo)
xe_assert(xe, xe_bo_is_user(bo));
if (!xe_bo_is_pinned(bo)) {
- err = xe_bo_validate(bo, NULL, false);
- if (err)
- return err;
+ if (!in_place) {
+ err = xe_bo_validate(bo, NULL, false);
+ if (err)
+ return err;
+ }
spin_lock(&xe->pinned.lock);
list_add_tail(&bo->pinned_link, &xe->pinned.late.external);
@@ -2493,6 +2498,9 @@ int xe_bo_validate(struct xe_bo *bo, struct xe_vm *vm, bool allow_res_evict)
};
int ret;
+ if (xe_bo_is_pinned(bo))
+ return 0;
+
if (vm) {
lockdep_assert_held(&vm->lock);
xe_vm_assert_held(vm);
diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index 8cce413b5235..cfb1ec266a6d 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -200,7 +200,7 @@ static inline void xe_bo_unlock_vm_held(struct xe_bo *bo)
}
}
-int xe_bo_pin_external(struct xe_bo *bo);
+int xe_bo_pin_external(struct xe_bo *bo, bool in_place);
int xe_bo_pin(struct xe_bo *bo);
void xe_bo_unpin_external(struct xe_bo *bo);
void xe_bo_unpin(struct xe_bo *bo);
diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index 346f857f3837..af64baf872ef 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -72,7 +72,7 @@ static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
return ret;
}
- ret = xe_bo_pin_external(bo);
+ ret = xe_bo_pin_external(bo, true);
xe_assert(xe, !ret);
return 0;
--
2.50.1
This is the start of the stable review cycle for the 5.10.242 release.
There are 34 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 04 Sep 2025 13:19:14 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.242-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.242-rc1
Eric Sandeen <sandeen(a)redhat.com>
xfs: do not propagate ENODATA disk errors into xattr code
Brent Lu <brent.lu(a)intel.com>
ASoC: Intel: sof_da7219_mx98360a: fail to initialize soundcard
Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
ASoC: Intel: glk_rt5682_max98357a: shrink platform_id below 20 characters
Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
ASoC: Intel: sof_rt5682: shrink platform_id names below 20 characters
Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
ASoC: Intel: bxt_da7219_max98357a: shrink platform_id below 20 characters
Imre Deak <imre.deak(a)intel.com>
Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Handle reads greater than 60 bytes
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Don't set bus speed on every transfer
James Jones <jajones(a)nvidia.com>
drm/nouveau/disp: Always accept linear modifier
Fabio Porcedda <fabio.porcedda(a)gmail.com>
net: usb: qmi_wwan: add Telit Cinterion LE910C4-WWX new compositions
Shanker Donthineni <sdonthineni(a)nvidia.com>
dma/pool: Ensure DMA_DIRECT_REMAP allocations are decrypted
Alex Deucher <alexander.deucher(a)amd.com>
Revert "drm/amdgpu: fix incorrect vm flags to map bo"
Minjong Kim <minbell.kim(a)samsung.com>
HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()
Ping Cheng <pinglinux(a)gmail.com>
HID: wacom: Add a new Art Pen 2
Qasim Ijaz <qasdev00(a)gmail.com>
HID: asus: fix UAF via HID_CLAIMED_INPUT validation
Thijs Raymakers <thijs(a)raymakers.nl>
KVM: x86: use array_index_nospec with indices that come from guest
Li Nan <linan122(a)huawei.com>
efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare
Eric Dumazet <edumazet(a)google.com>
sctp: initialize more fields in sctp_v6_from_sk()
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Set local Xoff after FW update
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon port speed set
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon MTU set
Yeounsu Moon <yyyynoom(a)gmail.com>
net: dlink: fix multicast stats being counted incorrectly
Kuniyuki Iwashima <kuniyu(a)google.com>
atm: atmtcp: Prevent arbitrary write in atmtcp_recv_control().
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_event: Detect if HCI_EV_NUM_COMP_PKTS is unbalanced
Madhavan Srinivasan <maddy(a)linux.ibm.com>
powerpc/kvm: Fix ifdef to remove build warning
Oscar Maes <oscmaes92(a)gmail.com>
net: ipv4: fix regression in local-broadcast routes
Nikolay Kuratov <kniv(a)yandex-team.ru>
vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFS: Fix a race when updating an existing write
Christoph Hellwig <hch(a)lst.de>
nfs: fold nfs_page_group_lock_subrequests into nfs_lock_and_join_requests
Tianxiang Peng <txpeng(a)tencent.com>
x86/cpu/hygon: Add missing resctrl_cpu_detect() in bsp_init helper
Damien Le Moal <dlemoal(a)kernel.org>
scsi: core: sysfs: Correct sysfs attributes access rights
Tengda Wu <wutengda(a)huaweicloud.com>
ftrace: Fix potential warning in trace_printk_seq during ftrace_dump
Randy Dunlap <rdunlap(a)infradead.org>
pinctrl: STMFX: add missing HAS_IOMEM dependency
-------------
Diffstat:
Makefile | 4 +-
arch/powerpc/kernel/kvm.c | 8 +-
arch/x86/kernel/cpu/hygon.c | 3 +
arch/x86/kvm/lapic.c | 2 +
arch/x86/kvm/x86.c | 7 +-
drivers/atm/atmtcp.c | 17 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 +-
drivers/gpu/drm/drm_dp_helper.c | 2 +-
drivers/gpu/drm/nouveau/dispnv50/wndw.c | 4 +
drivers/hid/hid-asus.c | 8 +-
drivers/hid/hid-mcp2221.c | 71 +++++++----
drivers/hid/hid-ntrig.c | 3 +
drivers/hid/wacom_wac.c | 1 +
drivers/net/ethernet/dlink/dl2k.c | 2 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.c | 3 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.h | 12 ++
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 19 ++-
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 4 -
drivers/net/usb/qmi_wwan.c | 3 +
drivers/pinctrl/Kconfig | 1 +
drivers/scsi/scsi_sysfs.c | 4 +-
drivers/vhost/net.c | 9 +-
fs/efivarfs/super.c | 4 +
fs/nfs/pagelist.c | 86 +------------
fs/nfs/write.c | 142 +++++++++++++--------
fs/xfs/libxfs/xfs_attr_remote.c | 7 +
fs/xfs/libxfs/xfs_da_btree.c | 6 +
include/linux/atmdev.h | 1 +
include/linux/nfs_page.h | 2 +-
kernel/dma/pool.c | 4 +-
kernel/trace/trace.c | 4 +-
net/atm/common.c | 15 ++-
net/bluetooth/hci_event.c | 12 +-
net/ipv4/route.c | 10 +-
net/sctp/ipv6.c | 2 +
sound/soc/intel/boards/bxt_da7219_max98357a.c | 12 +-
sound/soc/intel/boards/glk_rt5682_max98357a.c | 4 +-
sound/soc/intel/boards/sof_da7219_max98373.c | 2 +-
sound/soc/intel/boards/sof_rt5682.c | 12 +-
sound/soc/intel/common/soc-acpi-intel-bxt-match.c | 2 +-
sound/soc/intel/common/soc-acpi-intel-cml-match.c | 2 +-
sound/soc/intel/common/soc-acpi-intel-glk-match.c | 4 +-
sound/soc/intel/common/soc-acpi-intel-jsl-match.c | 2 +-
sound/soc/intel/common/soc-acpi-intel-tgl-match.c | 4 +-
44 files changed, 317 insertions(+), 213 deletions(-)
Backport of AMD's TSA mitigation to 5.15 did not set CPUID bits that are
passed to a guest correctly (commit c334ae4a545a "KVM: SVM: Advertise
TSA CPUID bits to guests").
This series attempts to address this:
* The first patch from Kim allows us to properly use cpuid caps.
* The second patch is a combination of fixes to c334ae4a545a and f3f9deccfc68,
which is stable-only patch to 6.12.y. (Not sure what to do with
attribution)
Alternatively, we can opencode all of this (the way it's currently done in
__do_cpuid_func()'s 0x80000021 case) and do everything in a single patch.
Boris Ostrovsky (1):
KVM: SVM: Properly advertise TSA CPUID bits to guests
Kim Phillips (1):
KVM: x86: Move open-coded CPUID leaf 0x80000021 EAX bit propagation
code
arch/x86/kvm/cpuid.c | 31 ++++++++++++++++++-------------
1 file changed, 18 insertions(+), 13 deletions(-)
--
2.43.5
In cpuset hotplug handling, temporary cpumasks are allocated only when
running under cgroup v2. The current code unconditionally frees these
masks, which can lead to a crash on cgroup v1 case.
Free the temporary cpumasks only when they were actually allocated.
Fixes: 4b842da276a8 ("cpuset: Make CPU hotplug work with partition")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ashay Jaiswal <quic_ashayj(a)quicinc.com>
---
kernel/cgroup/cpuset.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index a78ccd11ce9b43c2e8b0e2c454a8ee845ebdc808..a4f908024f3c0a22628a32f8a5b0ae96c7dccbb9 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -4019,7 +4019,8 @@ static void cpuset_handle_hotplug(void)
if (force_sd_rebuild)
rebuild_sched_domains_cpuslocked();
- free_tmpmasks(ptmp);
+ if (on_dfl && ptmp)
+ free_tmpmasks(ptmp);
}
void cpuset_update_active_cpus(void)
---
base-commit: 33bcf93b9a6b028758105680f8b538a31bc563cf
change-id: 20250902-cpuset-free-on-condition-85cf4eadb18c
Best regards,
--
Ashay Jaiswal <quic_ashayj(a)quicinc.com>
According to documentation, the DP PHY on x1e80100 has another clock
called refclk. Rework the driver to allow different number of clocks.
Fix the dt-bindings schema and add the clock to the DT node as well.
Signed-off-by: Abel Vesa <abel.vesa(a)linaro.org>
---
Changes in v2:
- Fix schema by adding the minItems, as suggested by Krzysztof.
- Use devm_clk_bulk_get_all, as suggested by Konrad.
- Rephrase the commit messages to reflect the flexible number of clocks.
- Link to v1: https://lore.kernel.org/r/20250730-phy-qcom-edp-add-missing-refclk-v1-0-6f7…
---
Abel Vesa (3):
dt-bindings: phy: qcom-edp: Add missing clock for X Elite
phy: qcom: edp: Make the number of clocks flexible
arm64: dts: qcom: Add missing TCSR refclk to the DP PHYs
.../devicetree/bindings/phy/qcom,edp-phy.yaml | 28 +++++++++++++++++++++-
arch/arm64/boot/dts/qcom/x1e80100.dtsi | 12 ++++++----
drivers/phy/qualcomm/phy-qcom-edp.c | 18 +++++++-------
3 files changed, 45 insertions(+), 13 deletions(-)
---
base-commit: 5d50cf9f7cf20a17ac469c20a2e07c29c1f6aab7
change-id: 20250730-phy-qcom-edp-add-missing-refclk-5ab82828f8e7
Best regards,
--
Abel Vesa <abel.vesa(a)linaro.org>
When building powerpc configurations in linux-5.4.y with binutils 2.43
or newer, there is an assembler error in arch/powerpc/boot/util.S:
arch/powerpc/boot/util.S: Assembler messages:
arch/powerpc/boot/util.S:44: Error: junk at end of line, first unrecognized character is `0'
arch/powerpc/boot/util.S:49: Error: syntax error; found `b', expected `,'
arch/powerpc/boot/util.S:49: Error: junk at end of line: `b'
binutils 2.43 contains stricter parsing of certain labels [1], namely
that leading zeros are no longer allowed. The GNU assembler
documentation already somewhat forbade this construct:
To define a local label, write a label of the form 'N:' (where N
represents any non-negative integer).
Eliminate the leading zero in the label to fix the syntax error. This is
only needed in linux-5.4.y because commit 8b14e1dff067 ("powerpc: Remove
support for PowerPC 601") removed this code altogether in 5.10.
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=226749d5a6ff0d5c6… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
v1 -> v2:
- Adjust commit message to make it clearer this construct was already
incorrect under the existing GNU assembler documentation (Segher)
v1: https://lore.kernel.org/20250902235234.2046667-1-nathan@kernel.org/
---
arch/powerpc/boot/util.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/boot/util.S b/arch/powerpc/boot/util.S
index f11f0589a669..5ab2bc864e66 100644
--- a/arch/powerpc/boot/util.S
+++ b/arch/powerpc/boot/util.S
@@ -41,12 +41,12 @@ udelay:
srwi r4,r4,16
cmpwi 0,r4,1 /* 601 ? */
bne .Ludelay_not_601
-00: li r0,86 /* Instructions / microsecond? */
+0: li r0,86 /* Instructions / microsecond? */
mtctr r0
10: addi r0,r0,0 /* NOP */
bdnz 10b
subic. r3,r3,1
- bne 00b
+ bne 0b
blr
.Ludelay_not_601:
base-commit: c25f780e491e4734eb27d65aa58e0909fd78ad9f
--
2.51.0
kprobe has been broken on riscv for quite some time. There is an attempt
[1] to fix that which actually works. This patch works because it enables
ARCH_HAVE_NMI_SAFE_CMPXCHG and that makes the ring buffer allocation
succeed when handling a kprobe because we handle *all* kprobes in nmi
context. We do so because Peter advised us to treat all kernel traps as
nmi [2].
But that does not seem right for kprobe handling, so instead, treat
break traps from kernel as non-nmi.
Link: https://lore.kernel.org/linux-riscv/20250711090443.1688404-1-pulehui@huawei… [1]
Link: https://lore.kernel.org/linux-riscv/20250422094419.GC14170@noisy.programmin… [2]
Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
Cc: stable(a)vger.kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti(a)rivosinc.com>
---
This is clearly an RFC and this is likely not the right way to go, it is
just a way to trigger a discussion about if handling kprobes in an nmi
context is the right way or not.
---
arch/riscv/kernel/traps.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index 80230de167def3c33db5bc190347ec5f87dbb6e3..90f36bb9b12d4ba0db0f084f87899156e3c7dc6f 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -315,11 +315,11 @@ asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
local_irq_disable();
irqentry_exit_to_user_mode(regs);
} else {
- irqentry_state_t state = irqentry_nmi_enter(regs);
+ irqentry_state_t state = irqentry_enter(regs);
handle_break(regs);
- irqentry_nmi_exit(regs, state);
+ irqentry_exit(regs, state);
}
}
---
base-commit: ae9a687664d965b13eeab276111b2f97dd02e090
change-id: 20250903-dev-alex-break_nmi_v1-57c5321f3e80
Best regards,
--
Alexandre Ghiti <alexghiti(a)rivosinc.com>
To support loading of a layout module automatically the MODALIAS
variable in the uevent is needed. Add it.
Fixes: fc29fd821d9a ("nvmem: core: Rework layouts to become regular devices")
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael Walle <mwalle(a)kernel.org>
---
I'm still not sure if the sysfs modalias file is required or not. It
seems to work without it. I could't find any documentation about it.
v2:
- add Cc: stable
---
drivers/nvmem/layouts.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/drivers/nvmem/layouts.c b/drivers/nvmem/layouts.c
index 65d39e19f6ec..f381ce1e84bd 100644
--- a/drivers/nvmem/layouts.c
+++ b/drivers/nvmem/layouts.c
@@ -45,11 +45,24 @@ static void nvmem_layout_bus_remove(struct device *dev)
return drv->remove(layout);
}
+static int nvmem_layout_bus_uevent(const struct device *dev,
+ struct kobj_uevent_env *env)
+{
+ int ret;
+
+ ret = of_device_uevent_modalias(dev, env);
+ if (ret != ENODEV)
+ return ret;
+
+ return 0;
+}
+
static const struct bus_type nvmem_layout_bus_type = {
.name = "nvmem-layout",
.match = nvmem_layout_bus_match,
.probe = nvmem_layout_bus_probe,
.remove = nvmem_layout_bus_remove,
+ .uevent = nvmem_layout_bus_uevent,
};
int __nvmem_layout_driver_register(struct nvmem_layout_driver *drv,
--
2.39.5
From: Jinfeng Wang <jinfeng.wang.cn(a)windriver.com>
This reverts commit 1af6d1696ca40b2d22889b4b8bbea616f94aaa84.
There is cadence-qspi ff8d2000.spi: Unbalanced pm_runtime_enable! error
without this revert.
After reverting commit cdfb20e4b34a ("spi: spi-cadence-quadspi: Fix pm runtime unbalance")
and commit 1af6d1696ca4 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths"),
Unbalanced pm_runtime_enable! error does not appear.
These two commits are backported from upstream commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance")
and commit 04a8ff1bc351 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths").
The commit 04a8ff1bc351 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths")
fix commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance").
The commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance") fix
commit 86401132d7bb ("spi: spi-cadence-quadspi: Fix missing unwind goto warnings").
The commit 86401132d7bb ("spi: spi-cadence-quadspi: Fix missing unwind goto warnings") fix
commit 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support").
6.6.y only backport commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance")
and commit 04a8ff1bc351 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths"),
but does not backport commit 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support")
and commit 86401132d7bb ("spi: spi-cadence-quadspi: Fix missing unwind goto warnings").
And the backport of commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance")
differs with the original patch. So there is Unbalanced pm_runtime_enable error.
If revert the backport for commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance")
and commit 04a8ff1bc351 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths"), there is no error.
If backport commit 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support") and
commit 86401132d7bb ("spi: spi-cadence-quadspi: Fix missing unwind goto warnings"), there
is hang during booting. I didn't find the cause of the hang.
Since commit 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support") and
commit 86401132d7bb ("spi: spi-cadence-quadspi: Fix missing unwind goto warnings") are
not backported, commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance")
and commit 04a8ff1bc351 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths") are not needed.
So revert commits commit cdfb20e4b34a ("spi: spi-cadence-quadspi: Fix pm runtime unbalance") and
commit 1af6d1696ca4 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths").
Signed-off-by: Jinfeng Wang <jinfeng.wang.cn(a)windriver.com>
Signed-off-by: He Zhe <zhe.he(a)windriver.com>
---
Kernel builds successfully with patch.
Test enviroment overview:
Branch linux-6.6.y
Tree: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Hardware: compiled on X86 machine
GCC: gcc version 11.4.0 (Ubuntu~20.04)
commands: make clean;make allyesconfig;
no building error is seen
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04.2)
Hardware: compiled on socfpga stratix10 board
verified by check the dmesg log and bind/unbind spi
and no Unbalanced pm_runtime_enable! error is seen any more.
cmds:
dmesg | grep "Unbalanced pm_runtime_enable"
echo ff8d2000.spi > /sys/bus/platform/drivers/cadence-qspi/unbind
echo ff8d2000.spi > /sys/bus/platform/drivers/cadence-qspi/bind
---
drivers/spi/spi-cadence-quadspi.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 7c17b8c0425e..9285a683324f 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -1870,6 +1870,11 @@ static int cqspi_probe(struct platform_device *pdev)
pm_runtime_enable(dev);
+ if (cqspi->rx_chan) {
+ dma_release_channel(cqspi->rx_chan);
+ goto probe_setup_failed;
+ }
+
ret = spi_register_controller(host);
if (ret) {
dev_err(&pdev->dev, "failed to register SPI ctlr %d\n", ret);
--
2.25.1
Without CONFIG_REGMAP, rmi-i2c.c fails to build because struct
regmap_config is not defined:
drivers/misc/amd-sbi/rmi-i2c.c: In function ‘sbrmi_i2c_probe’:
drivers/misc/amd-sbi/rmi-i2c.c:57:16: error: variable ‘sbrmi_i2c_regmap_config’ has initializer but incomplete type
57 | struct regmap_config sbrmi_i2c_regmap_config = {
| ^~~~~~~~~~~~~
Additionally, CONFIG_REGMAP_I2C is needed for devm_regmap_init_i2c():
ld: drivers/misc/amd-sbi/rmi-i2c.o: in function `sbrmi_i2c_probe':
drivers/misc/amd-sbi/rmi-i2c.c:69:(.text+0x1c0): undefined reference to `__devm_regmap_init_i2c'
Fixes: 013f7e7131bd ("misc: amd-sbi: Use regmap subsystem")
Cc: stable(a)vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com>
---
drivers/misc/amd-sbi/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/misc/amd-sbi/Kconfig b/drivers/misc/amd-sbi/Kconfig
index 4840831c84ca..4aae0733d0fc 100644
--- a/drivers/misc/amd-sbi/Kconfig
+++ b/drivers/misc/amd-sbi/Kconfig
@@ -2,6 +2,7 @@
config AMD_SBRMI_I2C
tristate "AMD side band RMI support"
depends on I2C
+ select REGMAP_I2C
help
Side band RMI over I2C support for AMD out of band management.
--
2.47.2
The quilt patch titled
Subject: s390: kexec: initialize kexec_buf struct
has been removed from the -mm tree. Its filename was
s390-kexec-initialize-kexec_buf-struct.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Breno Leitao <leitao(a)debian.org>
Subject: s390: kexec: initialize kexec_buf struct
Date: Wed, 27 Aug 2025 03:42:23 -0700
The kexec_buf structure was previously declared without initialization.
commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
added a field that is always read but not consistently populated by all
architectures. This un-initialized field will contain garbage.
This is also triggering a UBSAN warning when the uninitialized data was
accessed:
------------[ cut here ]------------
UBSAN: invalid-load in ./include/linux/kexec.h:210:10
load of value 252 is not a valid value for type '_Bool'
Zero-initializing kexec_buf at declaration ensures all fields are
cleanly set, preventing future instances of uninitialized memory being
used.
Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-3-1df9882bb01a@debian.org
Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Cc: Albert Ou <aou(a)eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev(a)linux.ibm.com>
Cc: Alexandre Ghiti <alex(a)ghiti.fr>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Coiby Xu <coxu(a)redhat.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Palmer Dabbelt <palmer(a)dabbelt.com>
Cc: Paul Walmsley <paul.walmsley(a)sifive.com>
Cc: Sven Schnelle <svens(a)linux.ibm.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/s390/kernel/kexec_elf.c | 2 +-
arch/s390/kernel/kexec_image.c | 2 +-
arch/s390/kernel/machine_kexec_file.c | 6 +++---
3 files changed, 5 insertions(+), 5 deletions(-)
--- a/arch/s390/kernel/kexec_elf.c~s390-kexec-initialize-kexec_buf-struct
+++ a/arch/s390/kernel/kexec_elf.c
@@ -16,7 +16,7 @@
static int kexec_file_add_kernel_elf(struct kimage *image,
struct s390_load_data *data)
{
- struct kexec_buf buf;
+ struct kexec_buf buf = {};
const Elf_Ehdr *ehdr;
const Elf_Phdr *phdr;
Elf_Addr entry;
--- a/arch/s390/kernel/kexec_image.c~s390-kexec-initialize-kexec_buf-struct
+++ a/arch/s390/kernel/kexec_image.c
@@ -16,7 +16,7 @@
static int kexec_file_add_kernel_image(struct kimage *image,
struct s390_load_data *data)
{
- struct kexec_buf buf;
+ struct kexec_buf buf = {};
buf.image = image;
--- a/arch/s390/kernel/machine_kexec_file.c~s390-kexec-initialize-kexec_buf-struct
+++ a/arch/s390/kernel/machine_kexec_file.c
@@ -129,7 +129,7 @@ static int kexec_file_update_purgatory(s
static int kexec_file_add_purgatory(struct kimage *image,
struct s390_load_data *data)
{
- struct kexec_buf buf;
+ struct kexec_buf buf = {};
int ret;
buf.image = image;
@@ -152,7 +152,7 @@ static int kexec_file_add_purgatory(stru
static int kexec_file_add_initrd(struct kimage *image,
struct s390_load_data *data)
{
- struct kexec_buf buf;
+ struct kexec_buf buf = {};
int ret;
buf.image = image;
@@ -184,7 +184,7 @@ static int kexec_file_add_ipl_report(str
{
__u32 *lc_ipl_parmblock_ptr;
unsigned int len, ncerts;
- struct kexec_buf buf;
+ struct kexec_buf buf = {};
unsigned long addr;
void *ptr, *end;
int ret;
_
Patches currently in -mm which might be from leitao(a)debian.org are
The quilt patch titled
Subject: riscv: kexec: initialize kexec_buf struct
has been removed from the -mm tree. Its filename was
riscv-kexec-initialize-kexec_buf-struct.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Breno Leitao <leitao(a)debian.org>
Subject: riscv: kexec: initialize kexec_buf struct
Date: Wed, 27 Aug 2025 03:42:22 -0700
The kexec_buf structure was previously declared without initialization.
commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
added a field that is always read but not consistently populated by all
architectures. This un-initialized field will contain garbage.
This is also triggering a UBSAN warning when the uninitialized data was
accessed:
------------[ cut here ]------------
UBSAN: invalid-load in ./include/linux/kexec.h:210:10
load of value 252 is not a valid value for type '_Bool'
Zero-initializing kexec_buf at declaration ensures all fields are
cleanly set, preventing future instances of uninitialized memory being
used.
Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-2-1df9882bb01a@debian.org
Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Cc: Albert Ou <aou(a)eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev(a)linux.ibm.com>
Cc: Alexandre Ghiti <alex(a)ghiti.fr>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Coiby Xu <coxu(a)redhat.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Palmer Dabbelt <palmer(a)dabbelt.com>
Cc: Paul Walmsley <paul.walmsley(a)sifive.com>
Cc: Sven Schnelle <svens(a)linux.ibm.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/riscv/kernel/kexec_elf.c | 4 ++--
arch/riscv/kernel/kexec_image.c | 2 +-
arch/riscv/kernel/machine_kexec_file.c | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)
--- a/arch/riscv/kernel/kexec_elf.c~riscv-kexec-initialize-kexec_buf-struct
+++ a/arch/riscv/kernel/kexec_elf.c
@@ -28,7 +28,7 @@ static int riscv_kexec_elf_load(struct k
int i;
int ret = 0;
size_t size;
- struct kexec_buf kbuf;
+ struct kexec_buf kbuf = {};
const struct elf_phdr *phdr;
kbuf.image = image;
@@ -66,7 +66,7 @@ static int elf_find_pbase(struct kimage
{
int i;
int ret;
- struct kexec_buf kbuf;
+ struct kexec_buf kbuf = {};
const struct elf_phdr *phdr;
unsigned long lowest_paddr = ULONG_MAX;
unsigned long lowest_vaddr = ULONG_MAX;
--- a/arch/riscv/kernel/kexec_image.c~riscv-kexec-initialize-kexec_buf-struct
+++ a/arch/riscv/kernel/kexec_image.c
@@ -41,7 +41,7 @@ static void *image_load(struct kimage *i
struct riscv_image_header *h;
u64 flags;
bool be_image, be_kernel;
- struct kexec_buf kbuf;
+ struct kexec_buf kbuf = {};
int ret;
/* Check Image header */
--- a/arch/riscv/kernel/machine_kexec_file.c~riscv-kexec-initialize-kexec_buf-struct
+++ a/arch/riscv/kernel/machine_kexec_file.c
@@ -261,7 +261,7 @@ int load_extra_segments(struct kimage *i
int ret;
void *fdt;
unsigned long initrd_pbase = 0UL;
- struct kexec_buf kbuf;
+ struct kexec_buf kbuf = {};
char *modified_cmdline = NULL;
kbuf.image = image;
_
Patches currently in -mm which might be from leitao(a)debian.org are
The quilt patch titled
Subject: arm64: kexec: initialize kexec_buf struct in load_other_segments()
has been removed from the -mm tree. Its filename was
arm64-kexec-initialize-kexec_buf-struct-in-load_other_segments.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Breno Leitao <leitao(a)debian.org>
Subject: arm64: kexec: initialize kexec_buf struct in load_other_segments()
Date: Wed, 27 Aug 2025 03:42:21 -0700
Patch series "kexec: Fix invalid field access".
The kexec_buf structure was previously declared without initialization.
commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
added a field that is always read but not consistently populated by all
architectures. This un-initialized field will contain garbage.
This is also triggering a UBSAN warning when the uninitialized data was
accessed:
------------[ cut here ]------------
UBSAN: invalid-load in ./include/linux/kexec.h:210:10
load of value 252 is not a valid value for type '_Bool'
Zero-initializing kexec_buf at declaration ensures all fields are cleanly
set, preventing future instances of uninitialized memory being used.
An initial fix was already landed for arm64[0], and this patchset fixes
the problem on the remaining arm64 code and on riscv, as raised by Mark.
Discussions about this problem could be found at[1][2].
This patch (of 3):
The kexec_buf structure was previously declared without initialization.
commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
added a field that is always read but not consistently populated by all
architectures. This un-initialized field will contain garbage.
This is also triggering a UBSAN warning when the uninitialized data was
accessed:
------------[ cut here ]------------
UBSAN: invalid-load in ./include/linux/kexec.h:210:10
load of value 252 is not a valid value for type '_Bool'
Zero-initializing kexec_buf at declaration ensures all fields are
cleanly set, preventing future instances of uninitialized memory being
used.
Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-0-1df9882bb01a@debian.org
Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-1-1df9882bb01a@debian.org
Link: https://lore.kernel.org/all/20250826180742.f2471131255ec1c43683ea07@linux-f… [0]
Link: https://lore.kernel.org/all/oninomspajhxp4omtdapxnckxydbk2nzmrix7rggmpukpnz… [1]
Link: https://lore.kernel.org/all/20250826-akpm-v1-1-3c831f0e3799@debian.org/ [2]
Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Acked-by: Baoquan He <bhe(a)redhat.com>
Cc: Albert Ou <aou(a)eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev(a)linux.ibm.com>
Cc: Alexandre Ghiti <alex(a)ghiti.fr>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Coiby Xu <coxu(a)redhat.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Palmer Dabbelt <palmer(a)dabbelt.com>
Cc: Paul Walmsley <paul.walmsley(a)sifive.com>
Cc: Sven Schnelle <svens(a)linux.ibm.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/arm64/kernel/machine_kexec_file.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/kernel/machine_kexec_file.c~arm64-kexec-initialize-kexec_buf-struct-in-load_other_segments
+++ a/arch/arm64/kernel/machine_kexec_file.c
@@ -94,7 +94,7 @@ int load_other_segments(struct kimage *i
char *initrd, unsigned long initrd_len,
char *cmdline)
{
- struct kexec_buf kbuf;
+ struct kexec_buf kbuf = {};
void *dtb = NULL;
unsigned long initrd_load_addr = 0, dtb_len,
orig_segments = image->nr_segments;
_
Patches currently in -mm which might be from leitao(a)debian.org are
The quilt patch titled
Subject: mm/damon/reclaim: avoid divide-by-zero in damon_reclaim_apply_parameters()
has been removed from the -mm tree. Its filename was
mm-damon-reclaim-avoid-divide-by-zero-in-damon_reclaim_apply_parameters.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Quanmin Yan <yanquanmin1(a)huawei.com>
Subject: mm/damon/reclaim: avoid divide-by-zero in damon_reclaim_apply_parameters()
Date: Wed, 27 Aug 2025 19:58:58 +0800
When creating a new scheme of DAMON_RECLAIM, the calculation of
'min_age_region' uses 'aggr_interval' as the divisor, which may lead to
division-by-zero errors. Fix it by directly returning -EINVAL when such a
case occurs.
Link: https://lkml.kernel.org/r/20250827115858.1186261-3-yanquanmin1@huawei.com
Fixes: f5a79d7c0c87 ("mm/damon: introduce struct damos_access_pattern")
Signed-off-by: Quanmin Yan <yanquanmin1(a)huawei.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: ze zuo <zuoze1(a)huawei.com>
Cc: <stable(a)vger.kernel.org> [6.1+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/reclaim.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/mm/damon/reclaim.c~mm-damon-reclaim-avoid-divide-by-zero-in-damon_reclaim_apply_parameters
+++ a/mm/damon/reclaim.c
@@ -194,6 +194,11 @@ static int damon_reclaim_apply_parameter
if (err)
return err;
+ if (!damon_reclaim_mon_attrs.aggr_interval) {
+ err = -EINVAL;
+ goto out;
+ }
+
err = damon_set_attrs(param_ctx, &damon_reclaim_mon_attrs);
if (err)
goto out;
_
Patches currently in -mm which might be from yanquanmin1(a)huawei.com are
mm-damon-add-damon_ctx-min_sz_region.patch
The quilt patch titled
Subject: mm/damon/lru_sort: avoid divide-by-zero in damon_lru_sort_apply_parameters()
has been removed from the -mm tree. Its filename was
mm-damon-lru_sort-avoid-divide-by-zero-in-damon_lru_sort_apply_parameters.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Quanmin Yan <yanquanmin1(a)huawei.com>
Subject: mm/damon/lru_sort: avoid divide-by-zero in damon_lru_sort_apply_parameters()
Date: Wed, 27 Aug 2025 19:58:57 +0800
Patch series "mm/damon: avoid divide-by-zero in DAMON module's parameters
application".
DAMON's RECLAIM and LRU_SORT modules perform no validation on
user-configured parameters during application, which may lead to
division-by-zero errors.
Avoid the divide-by-zero by adding validation checks when DAMON modules
attempt to apply the parameters.
This patch (of 2):
During the calculation of 'hot_thres' and 'cold_thres', either
'sample_interval' or 'aggr_interval' is used as the divisor, which may
lead to division-by-zero errors. Fix it by directly returning -EINVAL
when such a case occurs. Additionally, since 'aggr_interval' is already
required to be set no smaller than 'sample_interval' in damon_set_attrs(),
only the case where 'sample_interval' is zero needs to be checked.
Link: https://lkml.kernel.org/r/20250827115858.1186261-2-yanquanmin1@huawei.com
Fixes: 40e983cca927 ("mm/damon: introduce DAMON-based LRU-lists Sorting")
Signed-off-by: Quanmin Yan <yanquanmin1(a)huawei.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: ze zuo <zuoze1(a)huawei.com>
Cc: <stable(a)vger.kernel.org> [6.0+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/lru_sort.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/mm/damon/lru_sort.c~mm-damon-lru_sort-avoid-divide-by-zero-in-damon_lru_sort_apply_parameters
+++ a/mm/damon/lru_sort.c
@@ -198,6 +198,11 @@ static int damon_lru_sort_apply_paramete
if (err)
return err;
+ if (!damon_lru_sort_mon_attrs.sample_interval) {
+ err = -EINVAL;
+ goto out;
+ }
+
err = damon_set_attrs(ctx, &damon_lru_sort_mon_attrs);
if (err)
goto out;
_
Patches currently in -mm which might be from yanquanmin1(a)huawei.com are
mm-damon-add-damon_ctx-min_sz_region.patch
The quilt patch titled
Subject: mm/damon/core: set quota->charged_from to jiffies at first charge window
has been removed from the -mm tree. Its filename was
mm-damon-core-set-quota-charged_from-to-jiffies-at-first-charge-window.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Sang-Heon Jeon <ekffu200098(a)gmail.com>
Subject: mm/damon/core: set quota->charged_from to jiffies at first charge window
Date: Fri, 22 Aug 2025 11:50:57 +0900
Kernel initializes the "jiffies" timer as 5 minutes below zero, as shown
in include/linux/jiffies.h
/*
* Have the 32 bit jiffies value wrap 5 minutes after boot
* so jiffies wrap bugs show up earlier.
*/
#define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))
And jiffies comparison help functions cast unsigned value to signed to
cover wraparound
#define time_after_eq(a,b) \
(typecheck(unsigned long, a) && \
typecheck(unsigned long, b) && \
((long)((a) - (b)) >= 0))
When quota->charged_from is initialized to 0, time_after_eq() can
incorrectly return FALSE even after reset_interval has elapsed. This
occurs when (jiffies - reset_interval) produces a value with MSB=1, which
is interpreted as negative in signed arithmetic.
This issue primarily affects 32-bit systems because: On 64-bit systems:
MSB=1 values occur after ~292 million years from boot (assuming HZ=1000),
almost impossible.
On 32-bit systems: MSB=1 values occur during the first 5 minutes after
boot, and the second half of every jiffies wraparound cycle, starting from
day 25 (assuming HZ=1000)
When above unexpected FALSE return from time_after_eq() occurs, the
charging window will not reset. The user impact depends on esz value at
that time.
If esz is 0, scheme ignores configured quotas and runs without any limits.
If esz is not 0, scheme stops working once the quota is exhausted. It
remains until the charging window finally resets.
So, change quota->charged_from to jiffies at damos_adjust_quota() when it
is considered as the first charge window. By this change, we can avoid
unexpected FALSE return from time_after_eq()
Link: https://lkml.kernel.org/r/20250822025057.1740854-1-ekffu200098@gmail.com
Fixes: 2b8a248d5873 ("mm/damon/schemes: implement size quota for schemes application speed control") # 5.16
Signed-off-by: Sang-Heon Jeon <ekffu200098(a)gmail.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/core.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/mm/damon/core.c~mm-damon-core-set-quota-charged_from-to-jiffies-at-first-charge-window
+++ a/mm/damon/core.c
@@ -2111,6 +2111,10 @@ static void damos_adjust_quota(struct da
if (!quota->ms && !quota->sz && list_empty("a->goals))
return;
+ /* First charge window */
+ if (!quota->total_charged_sz && !quota->charged_from)
+ quota->charged_from = jiffies;
+
/* New charge window starts */
if (time_after_eq(jiffies, quota->charged_from +
msecs_to_jiffies(quota->reset_interval))) {
_
Patches currently in -mm which might be from ekffu200098(a)gmail.com are
mm-damon-update-expired-description-of-damos_action.patch
docs-mm-damon-design-fix-typo-s-sz_trtied-sz_tried.patch
selftests-damon-test-no-op-commit-broke-damon-status.patch
selftests-damon-test-no-op-commit-broke-damon-status-fix.patch
mm-damon-tests-core-kunit-add-damos_commit_filter-test.patch
The quilt patch titled
Subject: mm/hugetlb: add missing hugetlb_lock in __unmap_hugepage_range()
has been removed from the -mm tree. Its filename was
mm-hugetlb-add-missing-hugetlb_lock-in-__unmap_hugepage_range.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Jeongjun Park <aha310510(a)gmail.com>
Subject: mm/hugetlb: add missing hugetlb_lock in __unmap_hugepage_range()
Date: Sun, 24 Aug 2025 03:21:15 +0900
When restoring a reservation for an anonymous page, we need to check to
freeing a surplus. However, __unmap_hugepage_range() causes data race
because it reads h->surplus_huge_pages without the protection of
hugetlb_lock.
And adjust_reservation is a boolean variable that indicates whether
reservations for anonymous pages in each folio should be restored.
Therefore, it should be initialized to false for each round of the loop.
However, this variable is not initialized to false except when defining
the current adjust_reservation variable.
This means that once adjust_reservation is set to true even once within
the loop, reservations for anonymous pages will be restored
unconditionally in all subsequent rounds, regardless of the folio's state.
To fix this, we need to add the missing hugetlb_lock, unlock the
page_table_lock earlier so that we don't lock the hugetlb_lock inside the
page_table_lock lock, and initialize adjust_reservation to false on each
round within the loop.
Link: https://lkml.kernel.org/r/20250823182115.1193563-1-aha310510@gmail.com
Fixes: df7a6d1f6405 ("mm/hugetlb: restore the reservation if needed")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
Reported-by: syzbot+417aeb05fd190f3a6da9(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=417aeb05fd190f3a6da9
Reviewed-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: Breno Leitao <leitao(a)debian.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-add-missing-hugetlb_lock-in-__unmap_hugepage_range
+++ a/mm/hugetlb.c
@@ -5851,7 +5851,7 @@ void __unmap_hugepage_range(struct mmu_g
spinlock_t *ptl;
struct hstate *h = hstate_vma(vma);
unsigned long sz = huge_page_size(h);
- bool adjust_reservation = false;
+ bool adjust_reservation;
unsigned long last_addr_mask;
bool force_flush = false;
@@ -5944,6 +5944,7 @@ void __unmap_hugepage_range(struct mmu_g
sz);
hugetlb_count_sub(pages_per_huge_page(h), mm);
hugetlb_remove_rmap(folio);
+ spin_unlock(ptl);
/*
* Restore the reservation for anonymous page, otherwise the
@@ -5951,14 +5952,16 @@ void __unmap_hugepage_range(struct mmu_g
* If there we are freeing a surplus, do not set the restore
* reservation bit.
*/
+ adjust_reservation = false;
+
+ spin_lock_irq(&hugetlb_lock);
if (!h->surplus_huge_pages && __vma_private_lock(vma) &&
folio_test_anon(folio)) {
folio_set_hugetlb_restore_reserve(folio);
/* Reservation to be adjusted after the spin lock */
adjust_reservation = true;
}
-
- spin_unlock(ptl);
+ spin_unlock_irq(&hugetlb_lock);
/*
* Adjust the reservation for the region that will have the
_
Patches currently in -mm which might be from aha310510(a)gmail.com are
The quilt patch titled
Subject: mm/khugepaged: fix the address passed to notifier on testing young
has been removed from the -mm tree. Its filename was
mm-khugepaged-fix-the-address-passed-to-notifier-on-testing-young.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Wei Yang <richard.weiyang(a)gmail.com>
Subject: mm/khugepaged: fix the address passed to notifier on testing young
Date: Fri, 22 Aug 2025 06:33:18 +0000
Commit 8ee53820edfd ("thp: mmu_notifier_test_young") introduced
mmu_notifier_test_young(), but we are passing the wrong address.
In xxx_scan_pmd(), the actual iteration address is "_address" not
"address". We seem to misuse the variable on the very beginning.
Change it to the right one.
[akpm(a)linux-foundation.org fix whitespace, per everyone]
Link: https://lkml.kernel.org/r/20250822063318.11644-1-richard.weiyang@gmail.com
Fixes: 8ee53820edfd ("thp: mmu_notifier_test_young")
Signed-off-by: Wei Yang <richard.weiyang(a)gmail.com>
Reviewed-by: Dev Jain <dev.jain(a)arm.com>
Reviewed-by: Zi Yan <ziy(a)nvidia.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Nico Pache <npache(a)redhat.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Barry Song <baohua(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/khugepaged.c~mm-khugepaged-fix-the-address-passed-to-notifier-on-testing-young
+++ a/mm/khugepaged.c
@@ -1417,8 +1417,8 @@ static int hpage_collapse_scan_pmd(struc
*/
if (cc->is_khugepaged &&
(pte_young(pteval) || folio_test_young(folio) ||
- folio_test_referenced(folio) || mmu_notifier_test_young(vma->vm_mm,
- address)))
+ folio_test_referenced(folio) ||
+ mmu_notifier_test_young(vma->vm_mm, _address)))
referenced++;
}
if (!writable) {
_
Patches currently in -mm which might be from richard.weiyang(a)gmail.com are
mm-rmap-do-__folio_mod_stat-in-__folio_add_rmap.patch
selftests-mm-do-check_huge_anon-with-a-number-been-passed-in.patch
mm-rmap-not-necessary-to-mask-off-folio_pages_mapped.patch
mm-rmap-use-folio_large_nr_pages-when-we-are-sure-it-is-a-large-folio.patch
selftests-mm-put-general-ksm-operation-into-vm_util.patch
selftests-mm-test-that-rmap-behave-as-expected.patch
mm-khugepaged-use-list_xxx-helper-to-improve-readability.patch
mm-page_alloc-use-xxx_pageblock_isolate-for-better-reading.patch
mm-pageblock-flags-remove-pb_migratetype_bits-pb_migrate_end.patch
mm-page_alloc-find_large_buddy-from-start_pfn-aligned-order.patch
mm-page_alloc-find_large_buddy-from-start_pfn-aligned-order-v2.patch
The quilt patch titled
Subject: arm64: kexec: initialize kexec_buf struct in image_load()
has been removed from the -mm tree. Its filename was
arm64-kexec-initialize-kexec_buf-struct-in-image_load.patch
This patch was dropped because an alternative patch was or shall be merged
------------------------------------------------------
From: Breno Leitao <leitao(a)debian.org>
Subject: arm64: kexec: initialize kexec_buf struct in image_load()
Date: Tue, 26 Aug 2025 05:08:51 -0700
The kexec_buf structure was previously declared without initialization in
image_load(). This led to a UBSAN warning when the structure was expanded
and uninitialized fields were accessed [1].
Zero-initializing kexec_buf at declaration ensures all fields are cleanly
set, preventing future instances of uninitialized memory being used.
Fixes this UBSAN warning:
[ 32.362488] UBSAN: invalid-load in ./include/linux/kexec.h:210:10
[ 32.362649] load of value 252 is not a valid value for type '_Bool'
Andrew Morton suggested that this function is only called 3x a week[2],
thus, the memset() cost is inexpensive.
Link: https://lore.kernel.org/all/oninomspajhxp4omtdapxnckxydbk2nzmrix7rggmpukpnz… [1]
Link: https://lore.kernel.org/all/20250825180531.94bfb86a26a43127c0a1296f@linux-f… [2]
Link: https://lkml.kernel.org/r/20250826-akpm-v1-1-3c831f0e3799@debian.org
Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Suggested-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Coiby Xu <coxu(a)redhat.com>
Cc: "Daniel P. Berrange" <berrange(a)redhat.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Dave Young <dyoung(a)redhat.com>
Cc: Kairui Song <ryncsn(a)gmail.com>
Cc: Liu Pingfan <kernelfans(a)gmail.com>
Cc: Milan Broz <gmazyland(a)gmail.com>
Cc: Ondrej Kozina <okozina(a)redhat.com>
Cc: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/arm64/kernel/kexec_image.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/kernel/kexec_image.c~arm64-kexec-initialize-kexec_buf-struct-in-image_load
+++ a/arch/arm64/kernel/kexec_image.c
@@ -41,7 +41,7 @@ static void *image_load(struct kimage *i
struct arm64_image_header *h;
u64 flags, value;
bool be_image, be_kernel;
- struct kexec_buf kbuf;
+ struct kexec_buf kbuf = {};
unsigned long text_offset, kernel_segment_number;
struct kexec_segment *kernel_segment;
int ret;
_
Patches currently in -mm which might be from leitao(a)debian.org are
arm64-kexec-initialize-kexec_buf-struct-in-load_other_segments.patch
riscv-kexec-initialize-kexec_buf-struct.patch
s390-kexec-initialize-kexec_buf-struct.patch
Remove the SMBus Quick operation from this driver because it is not
natively supported by the hardware and is wrongly implemented in the
driver.
The I2C controllers in Realtek RTL9300 and RTL9310 are SMBus-compliant
but there doesn't seem to be native support for the SMBus Quick
operation. It is not explicitly mentioned in the documentation but
looking at the registers which configure an SMBus transaction, one can
see that the data length cannot be set to 0. This suggests that the
hardware doesn't allow any SMBus message without data bytes (except for
those it does on it's own, see SMBus Block Read).
The current implementation of SMBus Quick operation passes a length of
0 (which is actually invalid). Before the fix of a bug in a previous
commit, this led to a read operation of 16 bytes from any register (the
one of a former transaction or any other value.
This caused issues like soft-bricked SFP modules after a simple probe
with i2cdetect which uses Quick by default. Running this with SFP
modules whose EEPROM isn't write-protected, some of the initial bytes
are overwritten because a 16-byte write operation is executed instead of
a Quick Write. (This temporarily soft-bricked one of my DAC cables.)
Because SMBus Quick operation is obviously not supported on these
controllers (because a length of 0 cannot be set, even when no register
address is set), remove that instead of claiming there is support. There
also shouldn't be any kind of emulated 'Quick' which just does another
kind of operation in the background. Otherwise, specific issues occur
in case of a 'Quick' Write which actually writes unknown data to an
unknown register.
Fixes: c366be720235 ("i2c: Add driver for the RTL9300 I2C controller")
Cc: <stable(a)vger.kernel.org> # v6.13+
Signed-off-by: Jonas Jelonek <jelonek.jonas(a)gmail.com>
Tested-by: Sven Eckelmann <sven(a)narfation.org>
Reviewed-by: Chris Packham <chris.packham(a)alliedtelesis.co.nz>
Tested-by: Chris Packham <chris.packham(a)alliedtelesis.co.nz> # On RTL9302C based board
Tested-by: Markus Stockhausen <markus.stockhausen(a)gmx.de>
---
drivers/i2c/busses/i2c-rtl9300.c | 15 +++------------
1 file changed, 3 insertions(+), 12 deletions(-)
diff --git a/drivers/i2c/busses/i2c-rtl9300.c b/drivers/i2c/busses/i2c-rtl9300.c
index ebd4a85e1bde..9e6232075137 100644
--- a/drivers/i2c/busses/i2c-rtl9300.c
+++ b/drivers/i2c/busses/i2c-rtl9300.c
@@ -235,15 +235,6 @@ static int rtl9300_i2c_smbus_xfer(struct i2c_adapter *adap, u16 addr, unsigned s
}
switch (size) {
- case I2C_SMBUS_QUICK:
- ret = rtl9300_i2c_config_xfer(i2c, chan, addr, 0);
- if (ret)
- goto out_unlock;
- ret = rtl9300_i2c_reg_addr_set(i2c, 0, 0);
- if (ret)
- goto out_unlock;
- break;
-
case I2C_SMBUS_BYTE:
if (read_write == I2C_SMBUS_WRITE) {
ret = rtl9300_i2c_config_xfer(i2c, chan, addr, 0);
@@ -344,9 +335,9 @@ static int rtl9300_i2c_smbus_xfer(struct i2c_adapter *adap, u16 addr, unsigned s
static u32 rtl9300_i2c_func(struct i2c_adapter *a)
{
- return I2C_FUNC_SMBUS_QUICK | I2C_FUNC_SMBUS_BYTE |
- I2C_FUNC_SMBUS_BYTE_DATA | I2C_FUNC_SMBUS_WORD_DATA |
- I2C_FUNC_SMBUS_BLOCK_DATA | I2C_FUNC_SMBUS_I2C_BLOCK;
+ return I2C_FUNC_SMBUS_BYTE | I2C_FUNC_SMBUS_BYTE_DATA |
+ I2C_FUNC_SMBUS_WORD_DATA | I2C_FUNC_SMBUS_BLOCK_DATA |
+ I2C_FUNC_SMBUS_I2C_BLOCK;
}
static const struct i2c_algorithm rtl9300_i2c_algo = {
--
2.48.1
When building powerpc configurations in linux-5.4.y with binutils 2.43
or newer, there is an assembler error in arch/powerpc/boot/util.S:
arch/powerpc/boot/util.S: Assembler messages:
arch/powerpc/boot/util.S:44: Error: junk at end of line, first unrecognized character is `0'
arch/powerpc/boot/util.S:49: Error: syntax error; found `b', expected `,'
arch/powerpc/boot/util.S:49: Error: junk at end of line: `b'
binutils 2.43 contains stricter parsing of certain labels [1].
Remove the unnecessary leading zero to fix the build. This is only
needed in linux-5.4.y because commit 8b14e1dff067 ("powerpc: Remove
support for PowerPC 601") removed this code altogether in 5.10.
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=226749d5a6ff0d5c6… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
arch/powerpc/boot/util.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/boot/util.S b/arch/powerpc/boot/util.S
index f11f0589a669..5ab2bc864e66 100644
--- a/arch/powerpc/boot/util.S
+++ b/arch/powerpc/boot/util.S
@@ -41,12 +41,12 @@ udelay:
srwi r4,r4,16
cmpwi 0,r4,1 /* 601 ? */
bne .Ludelay_not_601
-00: li r0,86 /* Instructions / microsecond? */
+0: li r0,86 /* Instructions / microsecond? */
mtctr r0
10: addi r0,r0,0 /* NOP */
bdnz 10b
subic. r3,r3,1
- bne 00b
+ bne 0b
blr
.Ludelay_not_601:
base-commit: c25f780e491e4734eb27d65aa58e0909fd78ad9f
--
2.51.0
A new warning in clang [1] points out a few places in s5p_mfc_cmd_v6.c
where an uninitialized variable is passed as a const pointer:
drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c:45:7: error: variable 'h2r_args' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
45 | &h2r_args);
| ^~~~~~~~
drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c:133:7: error: variable 'h2r_args' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
133 | &h2r_args);
| ^~~~~~~~
drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c:148:7: error: variable 'h2r_args' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
148 | &h2r_args);
| ^~~~~~~~
The args parameter in s5p_mfc_cmd_host2risc_v6() is never actually used,
so just pass NULL to it in the places where h2r_args is currently
passed, clearing up the warning and not changing the functionality of
the code.
Cc: stable(a)vger.kernel.org
Fixes: f96f3cfa0bb8 ("[media] s5p-mfc: Update MFC v4l2 driver to support MFC6.x")
Link: https://github.com/llvm/llvm-project/commit/00dacf8c22f065cb52efb14cd091d44… [1]
Closes: https://github.com/ClangBuiltLinux/linux/issues/2103
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
From what I can tell, it seems like ->cmd_host2risc() is only ever
called from v6 code, which always passes NULL? It seems like it should
be possible to just drop .cmd_host2risc on the v5 side, then update
.cmd_host2risc to only take two parameters? If so, I can send a follow
up as a clean up, so that this can go back relatively conflict free.
---
.../platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c | 22 +++++-----------------
1 file changed, 5 insertions(+), 17 deletions(-)
diff --git a/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c b/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c
index 47bc3014b5d8..735471c50dbb 100644
--- a/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c
+++ b/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c
@@ -31,7 +31,6 @@ static int s5p_mfc_cmd_host2risc_v6(struct s5p_mfc_dev *dev, int cmd,
static int s5p_mfc_sys_init_cmd_v6(struct s5p_mfc_dev *dev)
{
- struct s5p_mfc_cmd_args h2r_args;
const struct s5p_mfc_buf_size_v6 *buf_size = dev->variant->buf_size->priv;
int ret;
@@ -41,33 +40,23 @@ static int s5p_mfc_sys_init_cmd_v6(struct s5p_mfc_dev *dev)
mfc_write(dev, dev->ctx_buf.dma, S5P_FIMV_CONTEXT_MEM_ADDR_V6);
mfc_write(dev, buf_size->dev_ctx, S5P_FIMV_CONTEXT_MEM_SIZE_V6);
- return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SYS_INIT_V6,
- &h2r_args);
+ return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SYS_INIT_V6, NULL);
}
static int s5p_mfc_sleep_cmd_v6(struct s5p_mfc_dev *dev)
{
- struct s5p_mfc_cmd_args h2r_args;
-
- memset(&h2r_args, 0, sizeof(struct s5p_mfc_cmd_args));
- return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SLEEP_V6,
- &h2r_args);
+ return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SLEEP_V6, NULL);
}
static int s5p_mfc_wakeup_cmd_v6(struct s5p_mfc_dev *dev)
{
- struct s5p_mfc_cmd_args h2r_args;
-
- memset(&h2r_args, 0, sizeof(struct s5p_mfc_cmd_args));
- return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_WAKEUP_V6,
- &h2r_args);
+ return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_WAKEUP_V6, NULL);
}
/* Open a new instance and get its number */
static int s5p_mfc_open_inst_cmd_v6(struct s5p_mfc_ctx *ctx)
{
struct s5p_mfc_dev *dev = ctx->dev;
- struct s5p_mfc_cmd_args h2r_args;
int codec_type;
mfc_debug(2, "Requested codec mode: %d\n", ctx->codec_mode);
@@ -130,14 +119,13 @@ static int s5p_mfc_open_inst_cmd_v6(struct s5p_mfc_ctx *ctx)
mfc_write(dev, 0, S5P_FIMV_D_CRC_CTRL_V6); /* no crc */
return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_OPEN_INSTANCE_V6,
- &h2r_args);
+ NULL);
}
/* Close instance */
static int s5p_mfc_close_inst_cmd_v6(struct s5p_mfc_ctx *ctx)
{
struct s5p_mfc_dev *dev = ctx->dev;
- struct s5p_mfc_cmd_args h2r_args;
int ret = 0;
dev->curr_ctx = ctx->num;
@@ -145,7 +133,7 @@ static int s5p_mfc_close_inst_cmd_v6(struct s5p_mfc_ctx *ctx)
mfc_write(dev, ctx->inst_no, S5P_FIMV_INSTANCE_ID_V6);
ret = s5p_mfc_cmd_host2risc_v6(dev,
S5P_FIMV_H2R_CMD_CLOSE_INSTANCE_V6,
- &h2r_args);
+ NULL);
} else {
ret = -EINVAL;
}
---
base-commit: 347e9f5043c89695b01e66b3ed111755afcf1911
change-id: 20250715-media-s5p-mfc-fix-uninit-const-pointer-cbf944ae4b4b
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
This is the start of the stable review cycle for the 5.4.298 release.
There are 23 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 04 Sep 2025 13:19:14 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.298-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.298-rc1
Imre Deak <imre.deak(a)intel.com>
Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"
Fabio Porcedda <fabio.porcedda(a)gmail.com>
net: usb: qmi_wwan: add Telit Cinterion LE910C4-WWX new compositions
Alex Deucher <alexander.deucher(a)amd.com>
Revert "drm/amdgpu: fix incorrect vm flags to map bo"
Minjong Kim <minbell.kim(a)samsung.com>
HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()
Ping Cheng <pinglinux(a)gmail.com>
HID: wacom: Add a new Art Pen 2
Qasim Ijaz <qasdev00(a)gmail.com>
HID: asus: fix UAF via HID_CLAIMED_INPUT validation
Thijs Raymakers <thijs(a)raymakers.nl>
KVM: x86: use array_index_nospec with indices that come from guest
Li Nan <linan122(a)huawei.com>
efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare
Eric Dumazet <edumazet(a)google.com>
sctp: initialize more fields in sctp_v6_from_sk()
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Set local Xoff after FW update
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon port speed set
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon MTU set
Yeounsu Moon <yyyynoom(a)gmail.com>
net: dlink: fix multicast stats being counted incorrectly
Kuniyuki Iwashima <kuniyu(a)google.com>
atm: atmtcp: Prevent arbitrary write in atmtcp_recv_control().
Christoph Hellwig <hch(a)lst.de>
net/atm: remove the atmdev_ops {get, set}sockopt methods
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_event: Detect if HCI_EV_NUM_COMP_PKTS is unbalanced
Madhavan Srinivasan <maddy(a)linux.ibm.com>
powerpc/kvm: Fix ifdef to remove build warning
Oscar Maes <oscmaes92(a)gmail.com>
net: ipv4: fix regression in local-broadcast routes
Nikolay Kuratov <kniv(a)yandex-team.ru>
vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()
Damien Le Moal <dlemoal(a)kernel.org>
scsi: core: sysfs: Correct sysfs attributes access rights
Tengda Wu <wutengda(a)huaweicloud.com>
ftrace: Fix potential warning in trace_printk_seq during ftrace_dump
Randy Dunlap <rdunlap(a)infradead.org>
pinctrl: STMFX: add missing HAS_IOMEM dependency
-------------
Diffstat:
Makefile | 4 +--
arch/powerpc/kernel/kvm.c | 8 ++---
arch/x86/kvm/lapic.c | 3 ++
arch/x86/kvm/x86.c | 7 ++--
drivers/atm/atmtcp.c | 17 +++++++--
drivers/atm/eni.c | 17 ---------
drivers/atm/firestream.c | 2 --
drivers/atm/fore200e.c | 27 ---------------
drivers/atm/horizon.c | 40 ----------------------
drivers/atm/iphase.c | 16 ---------
drivers/atm/lanai.c | 2 --
drivers/atm/solos-pci.c | 2 --
drivers/atm/zatm.c | 16 ---------
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 +--
drivers/gpu/drm/drm_dp_helper.c | 2 +-
drivers/hid/hid-asus.c | 8 ++++-
drivers/hid/hid-ntrig.c | 3 ++
drivers/hid/wacom_wac.c | 1 +
drivers/net/ethernet/dlink/dl2k.c | 2 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.c | 3 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.h | 12 +++++++
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 19 +++++++++-
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 4 ---
drivers/net/usb/qmi_wwan.c | 3 ++
drivers/pinctrl/Kconfig | 1 +
drivers/scsi/scsi_sysfs.c | 4 +--
drivers/vhost/net.c | 9 +++--
fs/efivarfs/super.c | 4 +++
include/linux/atmdev.h | 10 +-----
kernel/trace/trace.c | 4 +--
net/atm/common.c | 29 ++++++++--------
net/bluetooth/hci_event.c | 12 ++++++-
net/ipv4/route.c | 10 ++++--
net/sctp/ipv6.c | 2 ++
34 files changed, 129 insertions(+), 178 deletions(-)
Hi Sasha and Greg,
Salvatore Bonaccorso <carnil(a)debian.org> from Debian Kernel Team
included this regression-fix already.
Upstream commit 5189446ba995556eaa3755a6e875bc06675b88bd
"net: ipv4: fix regression in local-broadcast routes"
As far as I have seen this should be included in stable-6.16 and
LTS-6.12 (for other stable branches I simply have no interest - please
double-check).
I am sure Sasha's new kernel-patch-AI tool has catched this - just
kindly inform you.
Thanks.
Best regards,
-Sedat-
https://salsa.debian.org/kernel-team/linux/-/commit/194de383c5cd5e8c22cadfc…https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
Hi all,
Here's a collection of fixes that I *think* are bugs in fuse, along with
some scattered improvements.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
This has been running on the djcloud for months with no problems. Enjoy!
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fu…
---
Commits in this patchset:
* fuse: fix livelock in synchronous file put from fuseblk workers
* fuse: flush pending fuse events before aborting the connection
* fuse: capture the unique id of fuse commands being sent
* fuse: implement file attributes mask for statx
* fuse: update file mode when updating acls
* fuse: propagate default and file acls on creation
* fuse: enable FUSE_SYNCFS for all servers
---
fs/fuse/fuse_i.h | 14 +++++++
fs/fuse/acl.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/dev.c | 44 +++++++++++++++++++++++
fs/fuse/dev_uring.c | 8 ++++
fs/fuse/dir.c | 96 +++++++++++++++++++++++++++++++++++++++------------
fs/fuse/file.c | 10 +++++
fs/fuse/inode.c | 5 +++
7 files changed, 245 insertions(+), 27 deletions(-)
Add missing of_node_put() and put_device() calls to release references.
The function calls of_parse_phandle() and of_find_device_by_node()
but fails to release the references.
Both functions' documentation mentions that
the returned references must be dropped after use.
Found through static analysis by reviewing the documentation and
cross-checking their usage patterns.
Fixes: 2d1021487273 ("phy: tegra: xusb: Add wake/sleepwalk for Tegra210")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/phy/tegra/xusb-tegra210.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/phy/tegra/xusb-tegra210.c b/drivers/phy/tegra/xusb-tegra210.c
index ebc8a7e21a31..cbfca51a53bc 100644
--- a/drivers/phy/tegra/xusb-tegra210.c
+++ b/drivers/phy/tegra/xusb-tegra210.c
@@ -3164,18 +3164,23 @@ tegra210_xusb_padctl_probe(struct device *dev,
}
pdev = of_find_device_by_node(np);
+ of_node_put(np);
if (!pdev) {
dev_warn(dev, "PMC device is not available\n");
goto out;
}
- if (!platform_get_drvdata(pdev))
+ if (!platform_get_drvdata(pdev)) {
+ put_device(&pdev->dev);
return ERR_PTR(-EPROBE_DEFER);
+ }
padctl->regmap = dev_get_regmap(&pdev->dev, "usb_sleepwalk");
if (!padctl->regmap)
dev_info(dev, "failed to find PMC regmap\n");
+ put_device(&pdev->dev);
+
out:
return &padctl->base;
}
--
2.35.1
The cdnsp-pci driver uses pcim_enable_device() to enable a PCI device,
which means the device will be automatically disabled on driver detach
through the managed device framework. The manual pci_disable_device()
call in the error path is therefore redundant.
Found via static anlaysis and this is similar to commit 99ca0b57e49f
("thermal: intel: int340x: processor: Fix warning during module unload").
Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/usb/cdns3/cdnsp-pci.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/usb/cdns3/cdnsp-pci.c b/drivers/usb/cdns3/cdnsp-pci.c
index 8c361b8394e9..5e7b88ca8b96 100644
--- a/drivers/usb/cdns3/cdnsp-pci.c
+++ b/drivers/usb/cdns3/cdnsp-pci.c
@@ -85,7 +85,7 @@ static int cdnsp_pci_probe(struct pci_dev *pdev,
cdnsp = kzalloc(sizeof(*cdnsp), GFP_KERNEL);
if (!cdnsp) {
ret = -ENOMEM;
- goto disable_pci;
+ goto put_pci;
}
}
@@ -168,9 +168,6 @@ static int cdnsp_pci_probe(struct pci_dev *pdev,
if (!pci_is_enabled(func))
kfree(cdnsp);
-disable_pci:
- pci_disable_device(pdev);
-
put_pci:
pci_dev_put(func);
--
2.35.1
This is the start of the stable review cycle for the 6.1.150 release.
There are 50 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 04 Sep 2025 13:19:14 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.150-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.1.150-rc1
Eric Sandeen <sandeen(a)redhat.com>
xfs: do not propagate ENODATA disk errors into xattr code
Imre Deak <imre.deak(a)intel.com>
Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Handle reads greater than 60 bytes
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Don't set bus speed on every transfer
Eric Dumazet <edumazet(a)google.com>
net: rose: fix a typo in rose_clear_routes()
James Jones <jajones(a)nvidia.com>
drm/nouveau/disp: Always accept linear modifier
Steve French <stfrench(a)microsoft.com>
smb3 client: fix return code mapping of remap_file_range
Fabio Porcedda <fabio.porcedda(a)gmail.com>
net: usb: qmi_wwan: add Telit Cinterion LE910C4-WWX new compositions
Shuhao Fu <sfual(a)cse.ust.hk>
fs/smb: Fix inconsistent refcnt update
Shanker Donthineni <sdonthineni(a)nvidia.com>
dma/pool: Ensure DMA_DIRECT_REMAP allocations are decrypted
Alex Deucher <alexander.deucher(a)amd.com>
Revert "drm/amdgpu: fix incorrect vm flags to map bo"
Minjong Kim <minbell.kim(a)samsung.com>
HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()
Ping Cheng <pinglinux(a)gmail.com>
HID: wacom: Add a new Art Pen 2
Qasim Ijaz <qasdev00(a)gmail.com>
HID: multitouch: fix slab out-of-bounds access in mt_report_fixup()
Qasim Ijaz <qasdev00(a)gmail.com>
HID: asus: fix UAF via HID_CLAIMED_INPUT validation
Thijs Raymakers <thijs(a)raymakers.nl>
KVM: x86: use array_index_nospec with indices that come from guest
Li Nan <linan122(a)huawei.com>
efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare
Eric Dumazet <edumazet(a)google.com>
sctp: initialize more fields in sctp_v6_from_sk()
Takamitsu Iwai <takamitz(a)amazon.co.jp>
net: rose: include node references in rose_neigh refcount
Takamitsu Iwai <takamitz(a)amazon.co.jp>
net: rose: convert 'use' field to refcount_t
Takamitsu Iwai <takamitz(a)amazon.co.jp>
net: rose: split remove and free operations in rose_remove_neigh()
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Set local Xoff after FW update
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon port speed set
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon MTU set
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Reload auxiliary drivers on fw_activate
Horatiu Vultur <horatiu.vultur(a)microchip.com>
phy: mscc: Fix when PTP clock is register and unregister
Yeounsu Moon <yyyynoom(a)gmail.com>
net: dlink: fix multicast stats being counted incorrectly
Kuniyuki Iwashima <kuniyu(a)google.com>
atm: atmtcp: Prevent arbitrary write in atmtcp_recv_control().
Pavel Shpakovskiy <pashpakovskii(a)salutedevices.com>
Bluetooth: hci_sync: fix set_local_name race condition
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_event: Detect if HCI_EV_NUM_COMP_PKTS is unbalanced
Ludovico de Nittis <ludovico.denittis(a)collabora.com>
Bluetooth: hci_event: Mark connection as closed during suspend disconnect
Ludovico de Nittis <ludovico.denittis(a)collabora.com>
Bluetooth: hci_event: Treat UNKNOWN_CONN_ID on disconnect as success
José Expósito <jose.exposito89(a)gmail.com>
HID: input: report battery status changes immediately
José Expósito <jose.exposito89(a)gmail.com>
HID: input: rename hidinput_set_battery_charge_status()
Madhavan Srinivasan <maddy(a)linux.ibm.com>
powerpc/kvm: Fix ifdef to remove build warning
Rob Clark <robin.clark(a)oss.qualcomm.com>
drm/msm: Defer fd_install in SUBMIT ioctl
Oscar Maes <oscmaes92(a)gmail.com>
net: ipv4: fix regression in local-broadcast routes
Nikolay Kuratov <kniv(a)yandex-team.ru>
vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFS: Fix a race when updating an existing write
Christoph Hellwig <hch(a)lst.de>
nfs: fold nfs_page_group_lock_subrequests into nfs_lock_and_join_requests
Werner Sembach <wse(a)tuxedocomputers.com>
ACPI: EC: Add device to acpi_ec_no_wakeup[] qurik list
Alexey Klimov <alexey.klimov(a)linaro.org>
ASoC: codecs: tx-macro: correct tx_macro_component_drv name
Paulo Alcantara <pc(a)manguebit.org>
smb: client: fix race with concurrent opens in rename(2)
Paulo Alcantara <pc(a)manguebit.org>
smb: client: fix race with concurrent opens in unlink(2)
Damien Le Moal <dlemoal(a)kernel.org>
scsi: core: sysfs: Correct sysfs attributes access rights
Tengda Wu <wutengda(a)huaweicloud.com>
ftrace: Fix potential warning in trace_printk_seq during ftrace_dump
Aleksander Jan Bajkowski <olek2(a)wp.pl>
mips: lantiq: xway: sysctrl: rename the etop node
Aleksander Jan Bajkowski <olek2(a)wp.pl>
mips: dts: lantiq: danube: add missing burst length property
Randy Dunlap <rdunlap(a)infradead.org>
pinctrl: STMFX: add missing HAS_IOMEM dependency
-------------
Diffstat:
Makefile | 4 +-
arch/mips/boot/dts/lantiq/danube_easy50712.dts | 5 +-
arch/mips/lantiq/xway/sysctrl.c | 10 +-
arch/powerpc/kernel/kvm.c | 8 +-
arch/x86/kvm/lapic.c | 2 +
arch/x86/kvm/x86.c | 7 +-
drivers/acpi/ec.c | 6 +
drivers/atm/atmtcp.c | 17 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 +-
drivers/gpu/drm/display/drm_dp_helper.c | 2 +-
drivers/gpu/drm/msm/msm_gem_submit.c | 14 +-
drivers/gpu/drm/nouveau/dispnv50/wndw.c | 4 +
drivers/hid/hid-asus.c | 8 +-
drivers/hid/hid-input-test.c | 10 +-
drivers/hid/hid-input.c | 51 ++++----
drivers/hid/hid-mcp2221.c | 71 +++++++----
drivers/hid/hid-multitouch.c | 8 ++
drivers/hid/hid-ntrig.c | 3 +
drivers/hid/wacom_wac.c | 1 +
drivers/net/ethernet/dlink/dl2k.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 2 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.c | 3 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.h | 12 ++
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 19 ++-
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 4 -
drivers/net/phy/mscc/mscc.h | 4 +
drivers/net/phy/mscc/mscc_main.c | 4 +-
drivers/net/phy/mscc/mscc_ptp.c | 34 +++--
drivers/net/usb/qmi_wwan.c | 3 +
drivers/pinctrl/Kconfig | 1 +
drivers/scsi/scsi_sysfs.c | 4 +-
drivers/vhost/net.c | 9 +-
fs/efivarfs/super.c | 4 +
fs/nfs/pagelist.c | 86 +------------
fs/nfs/write.c | 142 +++++++++++++--------
fs/smb/client/cifsfs.c | 14 ++
fs/smb/client/inode.c | 34 ++++-
fs/smb/client/smb2inode.c | 7 +-
fs/xfs/libxfs/xfs_attr_remote.c | 7 +
fs/xfs/libxfs/xfs_da_btree.c | 6 +
include/linux/atmdev.h | 1 +
include/linux/nfs_page.h | 2 +-
include/net/bluetooth/hci_sync.h | 2 +-
include/net/rose.h | 18 ++-
kernel/dma/pool.c | 4 +-
kernel/trace/trace.c | 4 +-
net/atm/common.c | 15 ++-
net/bluetooth/hci_event.c | 20 ++-
net/bluetooth/hci_sync.c | 6 +-
net/bluetooth/mgmt.c | 5 +-
net/ipv4/route.c | 10 +-
net/rose/af_rose.c | 13 +-
net/rose/rose_in.c | 12 +-
net/rose/rose_route.c | 62 +++++----
net/rose/rose_timer.c | 2 +-
net/sctp/ipv6.c | 2 +
sound/soc/codecs/lpass-tx-macro.c | 2 +-
57 files changed, 514 insertions(+), 302 deletions(-)
This converts the vexpress-sysreg MFD driver to using the new generic
GPIO interface but first fixes an issue with an unchecked return value
of devm_gpiochio_add_data().
Lee: Please, create an immutable branch containing these commits after
you pick them up, as I'd like to merge it into the GPIO tree and remove
the legacy interface in this cycle.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
---
Bartosz Golaszewski (2):
mfd: vexpress-sysreg: check the return value of devm_gpiochip_add_data()
mfd: vexpress-sysreg: use new generic GPIO chip API
drivers/mfd/vexpress-sysreg.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250728-gpio-mmio-mfd-conv-d27c2cfbccfe
Best regards,
--
Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
From: SujanaSubr <sujana.subramaniam(a)sap.com>
[ Upstream commit 1ce840c7a659aa53a31ef49f0271b4fd0dc10296 ]
Currently, when firmware failure occurs during matcher disconnect flow,
the error flow of the function reconnects the matcher back and returns
an error, which continues running the calling function and eventually
frees the matcher that is being disconnected.
This leads to a case where we have a freed matcher on the matchers list,
which in turn leads to use-after-free and eventual crash.
This patch fixes that by not trying to reconnect the matcher back when
some FW command fails during disconnect.
Note that we're dealing here with FW error. We can't overcome this
problem. This might lead to bad steering state (e.g. wrong connection
between matchers), and will also lead to resource leakage, as it is
the case with any other error handling during resource destruction.
However, the goal here is to allow the driver to continue and not crash
the machine with use-after-free error.
Signed-off-by: Yevgeny Kliteynik <kliteyn(a)nvidia.com>
Signed-off-by: Itamar Gozlan <igozlan(a)nvidia.com>
Reviewed-by: Mark Bloch <mbloch(a)nvidia.com>
Signed-off-by: Tariq Toukan <tariqt(a)nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Akendo <akendo(a)akendo.eu>
Signed-off-by: SujanaSubr <sujana.subramaniam(a)sap.com>
---
.../mlx5/core/steering/hws/mlx5hws_matcher.c | 24 +++++++------------
1 file changed, 8 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c
index 61a1155d4b4f..ce541c60c5b4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c
@@ -165,14 +165,14 @@ static int hws_matcher_disconnect(struct mlx5hws_matcher *matcher)
next->match_ste.rtc_0_id,
next->match_ste.rtc_1_id);
if (ret) {
- mlx5hws_err(tbl->ctx, "Failed to disconnect matcher\n");
- goto matcher_reconnect;
+ mlx5hws_err(tbl->ctx, "Fatal error, failed to disconnect matcher\n");
+ return ret;
}
} else {
ret = mlx5hws_table_connect_to_miss_table(tbl, tbl->default_miss.miss_tbl);
if (ret) {
- mlx5hws_err(tbl->ctx, "Failed to disconnect last matcher\n");
- goto matcher_reconnect;
+ mlx5hws_err(tbl->ctx, "Fatal error, failed to disconnect last matcher\n");
+ return ret;
}
}
@@ -180,27 +180,19 @@ static int hws_matcher_disconnect(struct mlx5hws_matcher *matcher)
if (prev_ft_id == tbl->ft_id) {
ret = mlx5hws_table_update_connected_miss_tables(tbl);
if (ret) {
- mlx5hws_err(tbl->ctx, "Fatal error, failed to update connected miss table\n");
- goto matcher_reconnect;
+ mlx5hws_err(tbl->ctx,
+ "Fatal error, failed to update connected miss table\n");
+ return ret;
}
}
ret = mlx5hws_table_ft_set_default_next_ft(tbl, prev_ft_id);
if (ret) {
mlx5hws_err(tbl->ctx, "Fatal error, failed to restore matcher ft default miss\n");
- goto matcher_reconnect;
+ return ret;
}
return 0;
-
-matcher_reconnect:
- if (list_empty(&tbl->matchers_list) || !prev)
- list_add(&matcher->list_node, &tbl->matchers_list);
- else
- /* insert after prev matcher */
- list_add(&matcher->list_node, &prev->list_node);
-
- return ret;
}
static void hws_matcher_set_rtc_attr_sz(struct mlx5hws_matcher *matcher,
--
2.39.5 (Apple Git-154)
Problem: when pinctrl core binds pins to a consumer device and the
pinmux ops of the underlying driver are marked as strict, the pin in
question can no longer be requested as a GPIO using the GPIO descriptor
API. It will result in the following error:
[ 5.095688] sc8280xp-tlmm f100000.pinctrl: pin GPIO_25 already requested by regulator-edp-3p3; cannot claim for f100000.pinctrl:570
[ 5.107822] sc8280xp-tlmm f100000.pinctrl: error -EINVAL: pin-25 (f100000.pinctrl:570)
This typically makes sense except when the pins are muxed to a function
that actually says "GPIO". Of course, the function name is just a string
so it has no meaning to the pinctrl subsystem.
We have many Qualcomm SoCs (and I can imagine it's a common pattern in
other platforms as well) where we mux a pin to "gpio" function using the
`pinctrl-X` property in order to configure bias or drive-strength and
then access it using the gpiod API. This makes it impossible to mark the
pin controller module as "strict".
This series proposes to introduce a concept of a sub-category of
pinfunctions: GPIO functions where the above is not true and the pin
muxed as a GPIO can still be accessed via the GPIO consumer API even for
strict pinmuxers.
To that end: we first clean up the drivers that use struct function_desc
and make them use the smaller struct pinfunction instead - which is the
correct structure for drivers to describe their pin functions with. We
also rework pinmux core to not duplicate memory used to store the
pinfunctions unless they're allocated dynamically.
First: provide the kmemdup_const() helper which only duplicates memory
if it's not in the .rodata section. Then rework all pinctrl drivers that
instantiate objects of type struct function_desc as they should only be
created by pinmux core. Next constify the return value of the accessor
used to expose these structures to users and finally convert the
pinfunction object within struct function_desc to a pointer and use
kmemdup_const() to assign it. With this done proceed to add
infrastructure for the GPIO pin function category and use it in Qualcomm
drivers. At the very end: make the Qualcomm pinmuxer strict.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
---
Changes in v7:
- Add a patch checking the return value of the get_function_name()
callback in pinmux_func_name_to_selector(). This fixes a NULL-pointer
dereference on IMX platforms
- Don't assign the number of functions in pinctrl device in the IMX
driver as it's done automatically when adding the pinfunctions using
the provided API. This fixes a warning from pinctrl core on IMX
platforms triggered by the conversion from accessing the radix tree
directly
- Link to v6: https://lore.kernel.org/r/20250828-pinctrl-gpio-pinfuncs-v6-0-c9abb6bdb689@…
Changes in v6:
- Select GENERIC_PINMUX_FUNCTIONS when using generic pinmux helpers in
qcom pinctrl drivers to fix build on ARM 32-bit platforms
- Assume that a pin can be requested in pin_request() if it has no
mux_setting assigned
- Also check if a function is a GPIO for pins within GPIO ranges
- Fix an issue with the imx pinctrl driver where the conversion patch
confused the function and pin group radix trees
- Add a FIXME to the imx driver mentioning the need to switch to the
provided helpers for accessing the group radix tree
- Link to v5: https://lore.kernel.org/r/20250815-pinctrl-gpio-pinfuncs-v5-0-955de9fd91db@…
Changes in v5:
- Fix a potential NULL-pointer dereference in
pinmux_can_be_used_for_gpio()
- Use PINCTRL_PINFUNCTION() in pinctrl-airoha
- Link to v4: https://lore.kernel.org/r/20250812-pinctrl-gpio-pinfuncs-v4-0-bb3906c55e64@…
Changes in v4:
- Update the GPIO pin function definitions to include the new qcom
driver (milos)
- Provide devm_kmemdup_const() instead of a non-managed kmemdup_const()
as a way to avoid casting out the 'const' modifier when passing the
const pointer to devm_add_action_or_reset()
- Use devm_krealloc_array() where applicable instead of devm_krealloc()
- Fix typos
- Fix kerneldocs
- Improve commit messages
- Small tweaks as pointed out by Andy
- Rebased on top of v6.17-rc1
- Link to v3: https://lore.kernel.org/r/20250724-pinctrl-gpio-pinfuncs-v3-0-af4db9302de4@…
Changes in v3:
- Add more patches in front: convert pinctrl drivers to stop defining
their own struct function_desc objects and make pinmux core not
duplicate .rodata memory in which struct pinfunction objects are
stored.
- Add a patch constifying pinmux_generic_get_function().
- Drop patches that were applied upstream.
- Link to v2: https://lore.kernel.org/r/20250709-pinctrl-gpio-pinfuncs-v2-0-b6135149c0d9@…
Changes in v2:
- Extend the series with providing pinmux_generic_add_pinfunction(),
using it in several drivers and converting pinctrl-msm to using
generic pinmux helpers
- Add a generic function_is_gpio() callback for pinmux_ops
- Convert all qualcomm drivers to using the new GPIO pin category so
that we can actually enable the strict flag
- Link to v1: https://lore.kernel.org/r/20250702-pinctrl-gpio-pinfuncs-v1-0-ed2bd0f9468d@…
---
Bartosz Golaszewski (16):
pinctrl: check the return value of pinmux_ops::get_function_name()
devres: provide devm_kmemdup_const()
pinctrl: ingenic: use struct pinfunction instead of struct function_desc
pinctrl: airoha: replace struct function_desc with struct pinfunction
pinctrl: mediatek: mt7988: use PINCTRL_PIN_FUNCTION()
pinctrl: mediatek: moore: replace struct function_desc with struct pinfunction
pinctrl: imx: don't access the pin function radix tree directly
pinctrl: keembay: release allocated memory in detach path
pinctrl: keembay: use a dedicated structure for the pinfunction description
pinctrl: constify pinmux_generic_get_function()
pinctrl: make struct pinfunction a pointer in struct function_desc
pinctrl: qcom: use generic pin function helpers
pinctrl: allow to mark pin functions as requestable GPIOs
pinctrl: qcom: add infrastructure for marking pin functions as GPIOs
pinctrl: qcom: mark the `gpio` and `egpio` pins function as non-strict functions
pinctrl: qcom: make the pinmuxing strict
drivers/base/devres.c | 21 +++++++
drivers/pinctrl/freescale/pinctrl-imx.c | 45 +++++++--------
drivers/pinctrl/mediatek/pinctrl-airoha.c | 19 +++----
drivers/pinctrl/mediatek/pinctrl-moore.c | 10 ++--
drivers/pinctrl/mediatek/pinctrl-moore.h | 7 +--
drivers/pinctrl/mediatek/pinctrl-mt7622.c | 2 +-
drivers/pinctrl/mediatek/pinctrl-mt7623.c | 2 +-
drivers/pinctrl/mediatek/pinctrl-mt7629.c | 2 +-
drivers/pinctrl/mediatek/pinctrl-mt7981.c | 2 +-
drivers/pinctrl/mediatek/pinctrl-mt7986.c | 2 +-
drivers/pinctrl/mediatek/pinctrl-mt7988.c | 44 ++++++---------
drivers/pinctrl/mediatek/pinctrl-mtk-common-v2.h | 2 +-
drivers/pinctrl/pinctrl-equilibrium.c | 2 +-
drivers/pinctrl/pinctrl-ingenic.c | 49 ++++++++---------
drivers/pinctrl/pinctrl-keembay.c | 26 ++++++---
drivers/pinctrl/pinctrl-single.c | 4 +-
drivers/pinctrl/pinmux.c | 70 ++++++++++++++++++++----
drivers/pinctrl/pinmux.h | 9 ++-
drivers/pinctrl/qcom/Kconfig | 1 +
drivers/pinctrl/qcom/pinctrl-ipq5018.c | 2 +-
drivers/pinctrl/qcom/pinctrl-ipq5332.c | 2 +-
drivers/pinctrl/qcom/pinctrl-ipq5424.c | 2 +-
drivers/pinctrl/qcom/pinctrl-ipq6018.c | 2 +-
drivers/pinctrl/qcom/pinctrl-ipq8074.c | 2 +-
drivers/pinctrl/qcom/pinctrl-ipq9574.c | 2 +-
drivers/pinctrl/qcom/pinctrl-mdm9607.c | 2 +-
drivers/pinctrl/qcom/pinctrl-mdm9615.c | 2 +-
drivers/pinctrl/qcom/pinctrl-milos.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm.c | 45 +++++----------
drivers/pinctrl/qcom/pinctrl-msm.h | 5 ++
drivers/pinctrl/qcom/pinctrl-msm8226.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8660.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8909.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8916.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8917.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8953.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8960.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8976.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8994.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8996.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8998.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8x74.c | 2 +-
drivers/pinctrl/qcom/pinctrl-qcm2290.c | 4 +-
drivers/pinctrl/qcom/pinctrl-qcs404.c | 2 +-
drivers/pinctrl/qcom/pinctrl-qcs615.c | 2 +-
drivers/pinctrl/qcom/pinctrl-qcs8300.c | 4 +-
drivers/pinctrl/qcom/pinctrl-qdu1000.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sa8775p.c | 4 +-
drivers/pinctrl/qcom/pinctrl-sar2130p.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sc7180.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sc7280.c | 4 +-
drivers/pinctrl/qcom/pinctrl-sc8180x.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sc8280xp.c | 4 +-
drivers/pinctrl/qcom/pinctrl-sdm660.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sdm670.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sdm845.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sdx55.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sdx65.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sdx75.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm4450.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm6115.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm6125.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm6350.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm6375.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm7150.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm8150.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm8250.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm8350.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm8450.c | 4 +-
drivers/pinctrl/qcom/pinctrl-sm8550.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm8650.c | 4 +-
drivers/pinctrl/qcom/pinctrl-sm8750.c | 4 +-
drivers/pinctrl/qcom/pinctrl-x1e80100.c | 2 +-
drivers/pinctrl/renesas/pinctrl-rza1.c | 2 +-
drivers/pinctrl/renesas/pinctrl-rza2.c | 2 +-
drivers/pinctrl/renesas/pinctrl-rzg2l.c | 2 +-
drivers/pinctrl/renesas/pinctrl-rzv2m.c | 2 +-
include/linux/device/devres.h | 2 +
include/linux/pinctrl/pinctrl.h | 14 +++++
include/linux/pinctrl/pinmux.h | 2 +
80 files changed, 288 insertions(+), 227 deletions(-)
---
base-commit: b320789d6883cc00ac78ce83bccbfe7ed58afcf0
change-id: 20250701-pinctrl-gpio-pinfuncs-de82bd9aac43
Best regards,
--
Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
This is the start of the stable review cycle for the 6.6.104 release.
There are 75 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 04 Sep 2025 13:19:14 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.104-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.6.104-rc1
Eric Sandeen <sandeen(a)redhat.com>
xfs: do not propagate ENODATA disk errors into xattr code
Imre Deak <imre.deak(a)intel.com>
Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Handle reads greater than 60 bytes
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Don't set bus speed on every transfer
Chris Mi <cmi(a)nvidia.com>
net/mlx5: SF, Fix add port error handling
Eric Dumazet <edumazet(a)google.com>
net: rose: fix a typo in rose_clear_routes()
James Jones <jajones(a)nvidia.com>
drm/nouveau/disp: Always accept linear modifier
Steve French <stfrench(a)microsoft.com>
smb3 client: fix return code mapping of remap_file_range
Fabio Porcedda <fabio.porcedda(a)gmail.com>
net: usb: qmi_wwan: add Telit Cinterion LE910C4-WWX new compositions
Shuhao Fu <sfual(a)cse.ust.hk>
fs/smb: Fix inconsistent refcnt update
Shanker Donthineni <sdonthineni(a)nvidia.com>
dma/pool: Ensure DMA_DIRECT_REMAP allocations are decrypted
Alex Deucher <alexander.deucher(a)amd.com>
Revert "drm/amdgpu: fix incorrect vm flags to map bo"
Minjong Kim <minbell.kim(a)samsung.com>
HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()
Ping Cheng <pinglinux(a)gmail.com>
HID: wacom: Add a new Art Pen 2
Matt Coffin <mcoffin13(a)gmail.com>
HID: logitech: Add ids for G PRO 2 LIGHTSPEED
Antheas Kapenekakis <lkml(a)antheas.dev>
HID: quirks: add support for Legion Go dual dinput modes
Qasim Ijaz <qasdev00(a)gmail.com>
HID: multitouch: fix slab out-of-bounds access in mt_report_fixup()
Qasim Ijaz <qasdev00(a)gmail.com>
HID: asus: fix UAF via HID_CLAIMED_INPUT validation
Borislav Petkov (AMD) <bp(a)alien8.de>
x86/microcode/AMD: Handle the case of no BIOS microcode
Thijs Raymakers <thijs(a)raymakers.nl>
KVM: x86: use array_index_nospec with indices that come from guest
Li Nan <linan122(a)huawei.com>
efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare
Eric Dumazet <edumazet(a)google.com>
sctp: initialize more fields in sctp_v6_from_sk()
Takamitsu Iwai <takamitz(a)amazon.co.jp>
net: rose: include node references in rose_neigh refcount
Takamitsu Iwai <takamitz(a)amazon.co.jp>
net: rose: convert 'use' field to refcount_t
Takamitsu Iwai <takamitz(a)amazon.co.jp>
net: rose: split remove and free operations in rose_remove_neigh()
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: Set CIC bit only for TX queues with COE
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: xgmac: Correct supported speed modes
Serge Semin <fancer.lancer(a)gmail.com>
net: stmmac: Rename phylink_get_caps() callback to update_caps()
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Set local Xoff after FW update
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon port speed set
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon MTU set
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Nack sync reset when SFs are present
Jiri Pirko <jiri(a)resnulli.us>
net/mlx5: Convert SF port_indices xarray to function_ids xarray
Jiri Pirko <jiri(a)resnulli.us>
net/mlx5: Use devlink port pointer to get the pointer of container SF struct
Jiri Pirko <jiri(a)resnulli.us>
net/mlx5: Call mlx5_sf_id_erase() once in mlx5_sf_dealloc()
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Fix lockdep assertion on sync reset unload event
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Add support for sync reset using hot reset
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Add device cap for supporting hot reset in sync reset flow
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Reload auxiliary drivers on fw_activate
Horatiu Vultur <horatiu.vultur(a)microchip.com>
phy: mscc: Fix when PTP clock is register and unregister
Yeounsu Moon <yyyynoom(a)gmail.com>
net: dlink: fix multicast stats being counted incorrectly
Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com>
dt-bindings: display/msm: qcom,mdp5: drop lut clock
Michal Kubiak <michal.kubiak(a)intel.com>
ice: fix incorrect counter for buffer allocation failures
Maciej Fijalkowski <maciej.fijalkowski(a)intel.com>
ice: stop storing XDP verdict within ice_rx_buf
Maciej Fijalkowski <maciej.fijalkowski(a)intel.com>
ice: gather page_count()'s of each frag right before XDP prog call
Larysa Zaremba <larysa.zaremba(a)intel.com>
ice: Introduce ice_xdp_buff
Timur Tabi <ttabi(a)nvidia.com>
drm/nouveau: remove unused memory target test
Timur Tabi <ttabi(a)nvidia.com>
drm/nouveau: remove unused increment in gm200_flcn_pio_imem_wr
Kuniyuki Iwashima <kuniyu(a)google.com>
atm: atmtcp: Prevent arbitrary write in atmtcp_recv_control().
Pavel Shpakovskiy <pashpakovskii(a)salutedevices.com>
Bluetooth: hci_sync: fix set_local_name race condition
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_event: Detect if HCI_EV_NUM_COMP_PKTS is unbalanced
Ludovico de Nittis <ludovico.denittis(a)collabora.com>
Bluetooth: hci_event: Mark connection as closed during suspend disconnect
Ludovico de Nittis <ludovico.denittis(a)collabora.com>
Bluetooth: hci_event: Treat UNKNOWN_CONN_ID on disconnect as success
José Expósito <jose.exposito89(a)gmail.com>
HID: input: report battery status changes immediately
José Expósito <jose.exposito89(a)gmail.com>
HID: input: rename hidinput_set_battery_charge_status()
Madhavan Srinivasan <maddy(a)linux.ibm.com>
powerpc/kvm: Fix ifdef to remove build warning
Rob Clark <robin.clark(a)oss.qualcomm.com>
drm/msm: Defer fd_install in SUBMIT ioctl
Oscar Maes <oscmaes92(a)gmail.com>
net: ipv4: fix regression in local-broadcast routes
Nikolay Kuratov <kniv(a)yandex-team.ru>
vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFS: Fix a race when updating an existing write
Christoph Hellwig <hch(a)lst.de>
nfs: fold nfs_page_group_lock_subrequests into nfs_lock_and_join_requests
Werner Sembach <wse(a)tuxedocomputers.com>
ACPI: EC: Add device to acpi_ec_no_wakeup[] qurik list
Junli Liu <liujunli(a)lixiang.com>
erofs: fix atomic context detection when !CONFIG_DEBUG_LOCK_ALLOC
Alexey Klimov <alexey.klimov(a)linaro.org>
ASoC: codecs: tx-macro: correct tx_macro_component_drv name
Paulo Alcantara <pc(a)manguebit.org>
smb: client: fix race with concurrent opens in rename(2)
Paulo Alcantara <pc(a)manguebit.org>
smb: client: fix race with concurrent opens in unlink(2)
Damien Le Moal <dlemoal(a)kernel.org>
scsi: core: sysfs: Correct sysfs attributes access rights
Tengda Wu <wutengda(a)huaweicloud.com>
ftrace: Fix potential warning in trace_printk_seq during ftrace_dump
Dan Carpenter <dan.carpenter(a)linaro.org>
of: dynamic: Fix use after free in of_changeset_add_prop_helper()
Rob Herring <robh(a)kernel.org>
of: Add a helper to free property struct
Aleksander Jan Bajkowski <olek2(a)wp.pl>
mips: lantiq: xway: sysctrl: rename the etop node
Aleksander Jan Bajkowski <olek2(a)wp.pl>
mips: dts: lantiq: danube: add missing burst length property
Randy Dunlap <rdunlap(a)infradead.org>
pinctrl: STMFX: add missing HAS_IOMEM dependency
Lizhi Hou <lizhi.hou(a)amd.com>
of: dynamic: Fix memleak when of_pci_add_properties() failed
-------------
Diffstat:
.../devicetree/bindings/display/msm/qcom,mdp5.yaml | 1 -
Makefile | 4 +-
arch/mips/boot/dts/lantiq/danube_easy50712.dts | 5 +-
arch/mips/lantiq/xway/sysctrl.c | 10 +-
arch/powerpc/kernel/kvm.c | 8 +-
arch/x86/kernel/cpu/microcode/amd.c | 22 ++-
arch/x86/kvm/lapic.c | 2 +
arch/x86/kvm/x86.c | 7 +-
drivers/acpi/ec.c | 6 +
drivers/atm/atmtcp.c | 17 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 +-
drivers/gpu/drm/display/drm_dp_helper.c | 2 +-
drivers/gpu/drm/msm/msm_gem_submit.c | 14 +-
drivers/gpu/drm/nouveau/dispnv50/wndw.c | 4 +
drivers/gpu/drm/nouveau/nvkm/falcon/gm200.c | 15 +-
drivers/hid/hid-asus.c | 8 +-
drivers/hid/hid-ids.h | 3 +
drivers/hid/hid-input-test.c | 10 +-
drivers/hid/hid-input.c | 51 +++---
drivers/hid/hid-logitech-dj.c | 4 +
drivers/hid/hid-logitech-hidpp.c | 2 +
drivers/hid/hid-mcp2221.c | 71 +++++---
drivers/hid/hid-multitouch.c | 8 +
drivers/hid/hid-ntrig.c | 3 +
drivers/hid/hid-quirks.c | 2 +
drivers/hid/wacom_wac.c | 1 +
drivers/net/ethernet/dlink/dl2k.c | 2 +-
drivers/net/ethernet/intel/ice/ice_txrx.c | 94 +++++++---
drivers/net/ethernet/intel/ice/ice_txrx.h | 19 +-
drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 53 ++----
drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 2 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.c | 3 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.h | 12 ++
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 19 +-
drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 200 ++++++++++++++-------
drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h | 1 +
drivers/net/ethernet/mellanox/mlx5/core/main.c | 3 +
.../net/ethernet/mellanox/mlx5/core/sf/devlink.c | 87 ++++-----
drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h | 6 +
drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 8 +-
.../net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 13 +-
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 9 +-
drivers/net/ethernet/stmicro/stmmac/hwif.h | 8 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 12 +-
drivers/net/phy/mscc/mscc.h | 4 +
drivers/net/phy/mscc/mscc_main.c | 4 +-
drivers/net/phy/mscc/mscc_ptp.c | 34 ++--
drivers/net/usb/qmi_wwan.c | 3 +
drivers/of/dynamic.c | 29 +--
drivers/of/of_private.h | 1 +
drivers/of/overlay.c | 11 +-
drivers/of/unittest.c | 12 +-
drivers/pinctrl/Kconfig | 1 +
drivers/scsi/scsi_sysfs.c | 4 +-
drivers/vhost/net.c | 9 +-
fs/efivarfs/super.c | 4 +
fs/erofs/zdata.c | 13 +-
fs/nfs/pagelist.c | 86 +--------
fs/nfs/write.c | 148 +++++++++------
fs/smb/client/cifsfs.c | 14 ++
fs/smb/client/inode.c | 34 +++-
fs/smb/client/smb2inode.c | 7 +-
fs/xfs/libxfs/xfs_attr_remote.c | 7 +
fs/xfs/libxfs/xfs_da_btree.c | 6 +
include/linux/atmdev.h | 1 +
include/linux/mlx5/mlx5_ifc.h | 11 +-
include/linux/nfs_page.h | 2 +-
include/net/bluetooth/hci_sync.h | 2 +-
include/net/rose.h | 18 +-
kernel/dma/pool.c | 4 +-
kernel/trace/trace.c | 4 +-
net/atm/common.c | 15 +-
net/bluetooth/hci_event.c | 20 ++-
net/bluetooth/hci_sync.c | 6 +-
net/bluetooth/mgmt.c | 5 +-
net/ipv4/route.c | 10 +-
net/rose/af_rose.c | 13 +-
net/rose/rose_in.c | 12 +-
net/rose/rose_route.c | 62 ++++---
net/rose/rose_timer.c | 2 +-
net/sctp/ipv6.c | 2 +
sound/soc/codecs/lpass-tx-macro.c | 2 +-
82 files changed, 897 insertions(+), 560 deletions(-)
This is the start of the stable review cycle for the 5.15.191 release.
There are 33 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 04 Sep 2025 13:19:14 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.191-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.191-rc1
Eric Sandeen <sandeen(a)redhat.com>
xfs: do not propagate ENODATA disk errors into xattr code
Imre Deak <imre.deak(a)intel.com>
Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Handle reads greater than 60 bytes
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Don't set bus speed on every transfer
James Jones <jajones(a)nvidia.com>
drm/nouveau/disp: Always accept linear modifier
Fabio Porcedda <fabio.porcedda(a)gmail.com>
net: usb: qmi_wwan: add Telit Cinterion LE910C4-WWX new compositions
Shanker Donthineni <sdonthineni(a)nvidia.com>
dma/pool: Ensure DMA_DIRECT_REMAP allocations are decrypted
Alex Deucher <alexander.deucher(a)amd.com>
Revert "drm/amdgpu: fix incorrect vm flags to map bo"
Minjong Kim <minbell.kim(a)samsung.com>
HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()
Ping Cheng <pinglinux(a)gmail.com>
HID: wacom: Add a new Art Pen 2
Qasim Ijaz <qasdev00(a)gmail.com>
HID: multitouch: fix slab out-of-bounds access in mt_report_fixup()
Qasim Ijaz <qasdev00(a)gmail.com>
HID: asus: fix UAF via HID_CLAIMED_INPUT validation
Thijs Raymakers <thijs(a)raymakers.nl>
KVM: x86: use array_index_nospec with indices that come from guest
Li Nan <linan122(a)huawei.com>
efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare
Eric Dumazet <edumazet(a)google.com>
sctp: initialize more fields in sctp_v6_from_sk()
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Set local Xoff after FW update
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon port speed set
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon MTU set
Horatiu Vultur <horatiu.vultur(a)microchip.com>
phy: mscc: Fix when PTP clock is register and unregister
Yeounsu Moon <yyyynoom(a)gmail.com>
net: dlink: fix multicast stats being counted incorrectly
Kuniyuki Iwashima <kuniyu(a)google.com>
atm: atmtcp: Prevent arbitrary write in atmtcp_recv_control().
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_event: Detect if HCI_EV_NUM_COMP_PKTS is unbalanced
Madhavan Srinivasan <maddy(a)linux.ibm.com>
powerpc/kvm: Fix ifdef to remove build warning
Oscar Maes <oscmaes92(a)gmail.com>
net: ipv4: fix regression in local-broadcast routes
Jan Kara <jack(a)suse.cz>
udf: Fix directory iteration for longer tail extents
Nikolay Kuratov <kniv(a)yandex-team.ru>
vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFS: Fix a race when updating an existing write
Christoph Hellwig <hch(a)lst.de>
nfs: fold nfs_page_group_lock_subrequests into nfs_lock_and_join_requests
Alexey Klimov <alexey.klimov(a)linaro.org>
ASoC: codecs: tx-macro: correct tx_macro_component_drv name
Damien Le Moal <dlemoal(a)kernel.org>
scsi: core: sysfs: Correct sysfs attributes access rights
Tengda Wu <wutengda(a)huaweicloud.com>
ftrace: Fix potential warning in trace_printk_seq during ftrace_dump
Randy Dunlap <rdunlap(a)infradead.org>
pinctrl: STMFX: add missing HAS_IOMEM dependency
-------------
Diffstat:
Makefile | 4 +-
arch/powerpc/kernel/kvm.c | 8 +-
arch/x86/kvm/lapic.c | 2 +
arch/x86/kvm/x86.c | 7 +-
drivers/atm/atmtcp.c | 17 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 +-
drivers/gpu/drm/drm_dp_helper.c | 2 +-
drivers/gpu/drm/nouveau/dispnv50/wndw.c | 4 +
drivers/hid/hid-asus.c | 8 +-
drivers/hid/hid-mcp2221.c | 71 +++++++----
drivers/hid/hid-multitouch.c | 8 ++
drivers/hid/hid-ntrig.c | 3 +
drivers/hid/wacom_wac.c | 1 +
drivers/net/ethernet/dlink/dl2k.c | 2 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.c | 3 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.h | 12 ++
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 19 ++-
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 4 -
drivers/net/phy/mscc/mscc.h | 4 +
drivers/net/phy/mscc/mscc_main.c | 4 +-
drivers/net/phy/mscc/mscc_ptp.c | 34 +++--
drivers/net/usb/qmi_wwan.c | 3 +
drivers/pinctrl/Kconfig | 1 +
drivers/scsi/scsi_sysfs.c | 4 +-
drivers/vhost/net.c | 9 +-
fs/efivarfs/super.c | 4 +
fs/nfs/pagelist.c | 86 +------------
fs/nfs/write.c | 142 +++++++++++++--------
fs/udf/directory.c | 2 +-
fs/xfs/libxfs/xfs_attr_remote.c | 7 +
fs/xfs/libxfs/xfs_da_btree.c | 6 +
include/linux/atmdev.h | 1 +
include/linux/nfs_page.h | 2 +-
kernel/dma/pool.c | 4 +-
kernel/trace/trace.c | 4 +-
net/atm/common.c | 15 ++-
net/bluetooth/hci_event.c | 12 +-
net/ipv4/route.c | 10 +-
net/sctp/ipv6.c | 2 +
sound/soc/codecs/lpass-tx-macro.c | 2 +-
40 files changed, 328 insertions(+), 209 deletions(-)
If KASAN is enabled, and one runs in a clean repository e.g.:
make LLVM=1 prepare
make LLVM=1 prepare
Then the Rust code gets rebuilt, which should not happen.
The reason is some of the LLVM KASAN `rustc` flags are added in the
second run:
-Cllvm-args=-asan-instrumentation-with-call-threshold=10000
-Cllvm-args=-asan-stack=0
-Cllvm-args=-asan-globals=1
-Cllvm-args=-asan-kernel-mem-intrinsic-prefix=1
Further runs do not rebuild Rust because the flags do not change anymore.
Rebuilding like that in the second run is bad, even if this just happens
with KASAN enabled, but missing flags in the first one is even worse.
The root issue is that we pass, for some architectures and for the moment,
a generated `target.json` file. That file is not ready by the time `rustc`
gets called for the flag test, and thus the flag test fails just because
the file is not available, e.g.:
$ ... --target=./scripts/target.json ... -Cllvm-args=...
error: target file "./scripts/target.json" does not exist
There are a few approaches we could take here to solve this. For instance,
we could ensure that every time that the config is rebuilt, we regenerate
the file and recompute the flags. Or we could use the LLVM version to
check for these flags, instead of testing the flag (which may have other
advantages, such as allowing us to detect renames on the LLVM side).
However, it may be easier than that: `rustc` is aware of the `-Cllvm-args`
regardless of the `--target` (e.g. I checked that the list printed
is the same, plus that I can check for these flags even if I pass
a completely unrelated target), and thus we can just eliminate the
dependency completely.
Thus filter out the target.
This does mean that `rustc-option` cannot be used to test a flag that
requires the right target, but we don't have other users yet, it is a
minimal change and we want to get rid of custom targets in the future.
We could only filter in the case `target.json` is used, to make it work
in more cases, but then it would be harder to notice that it may not
work in a couple architectures.
Cc: Matthew Maurer <mmaurer(a)google.com>
Cc: Sami Tolvanen <samitolvanen(a)google.com>
Cc: stable(a)vger.kernel.org
Fixes: e3117404b411 ("kbuild: rust: Enable KASAN support")
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
---
By the way, I noticed that we are not getting `asan-instrument-allocas` enabled
in neither C nor Rust -- upstream LLVM renamed it in commit 8176ee9b5dda ("[asan]
Rename asan-instrument-allocas -> asan-instrument-dynamic-allocas")). But it
happened a very long time ago (9 years ago), and the addition in the kernel
is fairly old too, in 342061ee4ef3 ("kasan: support alloca() poisoning").
I assume it should either be renamed or removed? Happy to send a patch if so.
scripts/Makefile.compiler | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/Makefile.compiler b/scripts/Makefile.compiler
index 8956587b8547..7ed7f92a7daa 100644
--- a/scripts/Makefile.compiler
+++ b/scripts/Makefile.compiler
@@ -80,7 +80,7 @@ ld-option = $(call try-run, $(LD) $(KBUILD_LDFLAGS) $(1) -v,$(1),$(2),$(3))
# TODO: remove RUSTC_BOOTSTRAP=1 when we raise the minimum GNU Make version to 4.4
__rustc-option = $(call try-run,\
echo '#![allow(missing_docs)]#![feature(no_core)]#![no_core]' | RUSTC_BOOTSTRAP=1\
- $(1) --sysroot=/dev/null $(filter-out --sysroot=/dev/null,$(2)) $(3)\
+ $(1) --sysroot=/dev/null $(filter-out --sysroot=/dev/null --target=%,$(2)) $(3)\
--crate-type=rlib --out-dir=$(TMPOUT) --emit=obj=- - >/dev/null,$(3),$(4))
# rustc-option
base-commit: 0af2f6be1b4281385b618cb86ad946eded089ac8
--
2.49.0
When a CPU chooses to call push_dl_task and picks a task to push to
another CPU's runqueue then it will call find_lock_later_rq method
which would take a double lock on both CPUs' runqueues. If one of the
locks aren't readily available, it may lead to dropping the current
runqueue lock and reacquiring both the locks at once. During this window
it is possible that the task is already migrated and is running on some
other CPU. These cases are already handled. However, if the task is
migrated and has already been executed and another CPU is now trying to
wake it up (ttwu) such that it is queued again on the runqeue
(on_rq is 1) and also if the task was run by the same CPU, then the
current checks will pass even though the task was migrated out and is no
longer in the pushable tasks list.
Please go through the original rt change for more details on the issue.
To fix this, after the lock is obtained inside the find_lock_later_rq,
it ensures that the task is still at the head of pushable tasks list.
Also removed some checks that are no longer needed with the addition of
this new check.
However, the new check of pushable tasks list only applies when
find_lock_later_rq is called by push_dl_task. For the other caller i.e.
dl_task_offline_migration, existing checks are used.
Signed-off-by: Harshit Agarwal <harshit(a)nutanix.com>
Cc: stable(a)vger.kernel.org
---
Changes in v3:
- Incorporated review comments from Juri around the commit message as
well as around the comment regarding checks in find_lock_later_rq.
- Link to v2:
https://lore.kernel.org/stable/20250317022325.52791-1-harshit@nutanix.com/
Changes in v2:
- As per Juri's suggestion, moved the check inside find_lock_later_rq
similar to rt change. Here we distinguish among the push_dl_task
caller vs dl_task_offline_migration by checking if the task is
throttled or not.
- Fixed the commit message to refer to the rt change by title.
- Link to v1:
https://lore.kernel.org/lkml/20250307204255.60640-1-harshit@nutanix.com/
---
kernel/sched/deadline.c | 73 +++++++++++++++++++++++++++--------------
1 file changed, 49 insertions(+), 24 deletions(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 38e4537790af..e0c95f33e1ed 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2621,6 +2621,25 @@ static int find_later_rq(struct task_struct *task)
return -1;
}
+static struct task_struct *pick_next_pushable_dl_task(struct rq *rq)
+{
+ struct task_struct *p;
+
+ if (!has_pushable_dl_tasks(rq))
+ return NULL;
+
+ p = __node_2_pdl(rb_first_cached(&rq->dl.pushable_dl_tasks_root));
+
+ WARN_ON_ONCE(rq->cpu != task_cpu(p));
+ WARN_ON_ONCE(task_current(rq, p));
+ WARN_ON_ONCE(p->nr_cpus_allowed <= 1);
+
+ WARN_ON_ONCE(!task_on_rq_queued(p));
+ WARN_ON_ONCE(!dl_task(p));
+
+ return p;
+}
+
/* Locks the rq it finds */
static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
{
@@ -2648,12 +2667,37 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
/* Retry if something changed. */
if (double_lock_balance(rq, later_rq)) {
- if (unlikely(task_rq(task) != rq ||
+ /*
+ * double_lock_balance had to release rq->lock, in the
+ * meantime, task may no longer be fit to be migrated.
+ * Check the following to ensure that the task is
+ * still suitable for migration:
+ * 1. It is possible the task was scheduled,
+ * migrate_disabled was set and then got preempted,
+ * so we must check the task migration disable
+ * flag.
+ * 2. The CPU picked is in the task's affinity.
+ * 3. For throttled task (dl_task_offline_migration),
+ * check the following:
+ * - the task is not on the rq anymore (it was
+ * migrated)
+ * - the task is not on CPU anymore
+ * - the task is still a dl task
+ * - the task is not queued on the rq anymore
+ * 4. For the non-throttled task (push_dl_task), the
+ * check to ensure that this task is still at the
+ * head of the pushable tasks list is enough.
+ */
+ if (unlikely(is_migration_disabled(task) ||
!cpumask_test_cpu(later_rq->cpu, &task->cpus_mask) ||
- task_on_cpu(rq, task) ||
- !dl_task(task) ||
- is_migration_disabled(task) ||
- !task_on_rq_queued(task))) {
+ (task->dl.dl_throttled &&
+ (task_rq(task) != rq ||
+ task_on_cpu(rq, task) ||
+ !dl_task(task) ||
+ !task_on_rq_queued(task))) ||
+ (!task->dl.dl_throttled &&
+ task != pick_next_pushable_dl_task(rq)))) {
+
double_unlock_balance(rq, later_rq);
later_rq = NULL;
break;
@@ -2676,25 +2720,6 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
return later_rq;
}
-static struct task_struct *pick_next_pushable_dl_task(struct rq *rq)
-{
- struct task_struct *p;
-
- if (!has_pushable_dl_tasks(rq))
- return NULL;
-
- p = __node_2_pdl(rb_first_cached(&rq->dl.pushable_dl_tasks_root));
-
- WARN_ON_ONCE(rq->cpu != task_cpu(p));
- WARN_ON_ONCE(task_current(rq, p));
- WARN_ON_ONCE(p->nr_cpus_allowed <= 1);
-
- WARN_ON_ONCE(!task_on_rq_queued(p));
- WARN_ON_ONCE(!dl_task(p));
-
- return p;
-}
-
/*
* See if the non running -deadline tasks on this rq
* can be sent to some other CPU where they can preempt
--
2.49.0.111.g5b97a56fa0
From: Youling Tang <tangyouling(a)kylinos.cn>
Automatically disable kaslr when the kernel loads from kexec_file.
kexec_file loads the secondary kernel image to a non-linked address,
inherently providing KASLR-like randomization.
However, on LoongArch where System RAM may be non-contiguous, enabling
KASLR for the second kernel could relocate it to an invalid memory
region and cause boot failure. Thus, we disable KASLR when
"kexec_file" is detected in the command line.
To ensure compatibility with older kernels loaded via kexec_file,
this patch need be backported to stable branches.
Cc: stable(a)vger.kernel.org
Signed-off-by: Youling Tang <tangyouling(a)kylinos.cn>
---
arch/loongarch/kernel/relocate.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/loongarch/kernel/relocate.c b/arch/loongarch/kernel/relocate.c
index 50c469067f3a..4c097532cb88 100644
--- a/arch/loongarch/kernel/relocate.c
+++ b/arch/loongarch/kernel/relocate.c
@@ -140,6 +140,10 @@ static inline __init bool kaslr_disabled(void)
if (str == boot_command_line || (str > boot_command_line && *(str - 1) == ' '))
return true;
+ str = strstr(boot_command_line, "kexec_file");
+ if (str == boot_command_line || (str > boot_command_line && *(str - 1) == ' '))
+ return true;
+
#ifdef CONFIG_HIBERNATION
str = strstr(builtin_cmdline, "nohibernate");
if (str == builtin_cmdline || (str > builtin_cmdline && *(str - 1) == ' '))
--
2.43.0
When SRIOV is enabled, RSS is also supported for the functions. There is
an incorrect flag judgement which causes RSS to fail to be enabled.
Fixes: c52d4b898901 ("net: libwx: Redesign flow when sriov is enabled")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu(a)trustnetic.com>
---
drivers/net/ethernet/wangxun/libwx/wx_hw.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_hw.c b/drivers/net/ethernet/wangxun/libwx/wx_hw.c
index bcd07a715752..5cb353a97d6d 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_hw.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_hw.c
@@ -2078,10 +2078,6 @@ static void wx_setup_mrqc(struct wx *wx)
{
u32 rss_field = 0;
- /* VT, and RSS do not coexist at the same time */
- if (test_bit(WX_FLAG_VMDQ_ENABLED, wx->flags))
- return;
-
/* Disable indicating checksum in descriptor, enables RSS hash */
wr32m(wx, WX_PSR_CTL, WX_PSR_CTL_PCSD, WX_PSR_CTL_PCSD);
--
2.48.1
Fix multiple fwnode reference leaks:
1. The function calls fwnode_get_named_child_node() to get the "leds" node,
but never calls fwnode_handle_put(leds) to release this reference.
2. Within the fwnode_for_each_child_node() loop, the early return
paths that don't properly release the "led" fwnode reference.
This fix follows the same pattern as commit d029edefed39
("net dsa: qca8k: fix usages of device_get_named_child_node()")
Fixes: 94a2a84f5e9e ("net: dsa: mv88e6xxx: Support LED control")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
changes in v2:
- use goto for cleanup in error paths
- v1: https://lore.kernel.org/all/20250830085508.2107507-1-linmq006@gmail.com/
---
drivers/net/dsa/mv88e6xxx/leds.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx/leds.c b/drivers/net/dsa/mv88e6xxx/leds.c
index 1c88bfaea46b..ab3bc645da56 100644
--- a/drivers/net/dsa/mv88e6xxx/leds.c
+++ b/drivers/net/dsa/mv88e6xxx/leds.c
@@ -779,7 +779,8 @@ int mv88e6xxx_port_setup_leds(struct mv88e6xxx_chip *chip, int port)
continue;
if (led_num > 1) {
dev_err(dev, "invalid LED specified port %d\n", port);
- return -EINVAL;
+ ret = -EINVAL;
+ goto err_put_led;
}
if (led_num == 0)
@@ -823,17 +824,25 @@ int mv88e6xxx_port_setup_leds(struct mv88e6xxx_chip *chip, int port)
init_data.devname_mandatory = true;
init_data.devicename = kasprintf(GFP_KERNEL, "%s:0%d:0%d", chip->info->name,
port, led_num);
- if (!init_data.devicename)
- return -ENOMEM;
+ if (!init_data.devicename) {
+ ret = -ENOMEM;
+ goto err_put_led;
+ }
ret = devm_led_classdev_register_ext(dev, l, &init_data);
kfree(init_data.devicename);
if (ret) {
dev_err(dev, "Failed to init LED %d for port %d", led_num, port);
- return ret;
+ goto err_put_led;
}
}
+ fwnode_handle_put(leds);
return 0;
+
+err_put_led:
+ fwnode_handle_put(led);
+ fwnode_handle_put(leds);
+ return ret;
}
--
2.35.1
commit efa95b01da18 ("netpoll: fix use after free") incorrectly
ignored the refcount and prematurely set dev->npinfo to NULL during
netpoll cleanup, leading to improper behavior and memory leaks.
Scenario causing lack of proper cleanup:
1) A netpoll is associated with a NIC (e.g., eth0) and netdev->npinfo is
allocated, and refcnt = 1
- Keep in mind that npinfo is shared among all netpoll instances. In
this case, there is just one.
2) Another netpoll is also associated with the same NIC and
npinfo->refcnt += 1.
- Now dev->npinfo->refcnt = 2;
- There is just one npinfo associated to the netdev.
3) When the first netpolls goes to clean up:
- The first cleanup succeeds and clears np->dev->npinfo, ignoring
refcnt.
- It basically calls `RCU_INIT_POINTER(np->dev->npinfo, NULL);`
- Set dev->npinfo = NULL, without proper cleanup
- No ->ndo_netpoll_cleanup() is either called
4) Now the second target tries to clean up
- The second cleanup fails because np->dev->npinfo is already NULL.
* In this case, ops->ndo_netpoll_cleanup() was never called, and
the skb pool is not cleaned as well (for the second netpoll
instance)
- This leaks npinfo and skbpool skbs, which is clearly reported by
kmemleak.
Revert commit efa95b01da18 ("netpoll: fix use after free") and adds
clarifying comments emphasizing that npinfo cleanup should only happen
once the refcount reaches zero, ensuring stable and correct netpoll
behavior.
Cc: stable(a)vger.kernel.org
Cc: jv(a)jvosburgh.net
Fixes: efa95b01da18 ("netpoll: fix use after free")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
I have a selftest that shows the memory leak when kmemleak is enabled
and I will be submitting to net-next.
Also, giving I am reverting commit efa95b01da18 ("netpoll: fix use
after free"), which was supposed to fix a problem on bonding, I am
copying Jay.
---
net/core/netpoll.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 5f65b62346d4e..19676cd379640 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -815,6 +815,10 @@ static void __netpoll_cleanup(struct netpoll *np)
if (!npinfo)
return;
+ /* At this point, there is a single npinfo instance per netdevice, and
+ * its refcnt tracks how many netpoll structures are linked to it. We
+ * only perform npinfo cleanup when the refcnt decrements to zero.
+ */
if (refcount_dec_and_test(&npinfo->refcnt)) {
const struct net_device_ops *ops;
@@ -824,8 +828,7 @@ static void __netpoll_cleanup(struct netpoll *np)
RCU_INIT_POINTER(np->dev->npinfo, NULL);
call_rcu(&npinfo->rcu, rcu_cleanup_netpoll_info);
- } else
- RCU_INIT_POINTER(np->dev->npinfo, NULL);
+ }
skb_pool_flush(np);
}
---
base-commit: 864ecc4a6dade82d3f70eab43dad0e277aa6fc78
change-id: 20250901-netpoll_memleak-90d0d4bc772c
Best regards,
--
Breno Leitao <leitao(a)debian.org>
The patch titled
Subject: compiler-clang.h: define __SANITIZE_*__ macros only when undefined
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
compiler-clangh-define-__sanitize___-macros-only-when-undefined.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Nathan Chancellor <nathan(a)kernel.org>
Subject: compiler-clang.h: define __SANITIZE_*__ macros only when undefined
Date: Tue, 02 Sep 2025 15:49:26 -0700
Clang 22 recently added support for defining __SANITIZE__ macros similar
to GCC [1], which causes warnings (or errors with CONFIG_WERROR=y or W=e)
with the existing defines that the kernel creates to emulate this behavior
with existing clang versions.
In file included from <built-in>:3:
In file included from include/linux/compiler_types.h:171:
include/linux/compiler-clang.h:37:9: error: '__SANITIZE_THREAD__' macro redefined [-Werror,-Wmacro-redefined]
37 | #define __SANITIZE_THREAD__
| ^
<built-in>:352:9: note: previous definition is here
352 | #define __SANITIZE_THREAD__ 1
| ^
Refactor compiler-clang.h to only define the sanitizer macros when they
are undefined and adjust the rest of the code to use these macros for
checking if the sanitizers are enabled, clearing up the warnings and
allowing the kernel to easily drop these defines when the minimum
supported version of LLVM for building the kernel becomes 22.0.0 or newer.
Link: https://lkml.kernel.org/r/20250902-clang-update-sanitize-defines-v1-1-cf370…
Link: https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444da… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
Reviewed-by: Justin Stitt <justinstitt(a)google.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Bill Wendling <morbo(a)google.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/compiler-clang.h | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)
--- a/include/linux/compiler-clang.h~compiler-clangh-define-__sanitize___-macros-only-when-undefined
+++ a/include/linux/compiler-clang.h
@@ -18,23 +18,42 @@
#define KASAN_ABI_VERSION 5
/*
+ * Clang 22 added preprocessor macros to match GCC, in hopes of eventually
+ * dropping __has_feature support for sanitizers:
+ * https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444da…
+ * Create these macros for older versions of clang so that it is easy to clean
+ * up once the minimum supported version of LLVM for building the kernel always
+ * creates these macros.
+ *
* Note: Checking __has_feature(*_sanitizer) is only true if the feature is
* enabled. Therefore it is not required to additionally check defined(CONFIG_*)
* to avoid adding redundant attributes in other configurations.
*/
+#if __has_feature(address_sanitizer) && !defined(__SANITIZE_ADDRESS__)
+#define __SANITIZE_ADDRESS__
+#endif
+#if __has_feature(hwaddress_sanitizer) && !defined(__SANITIZE_HWADDRESS__)
+#define __SANITIZE_HWADDRESS__
+#endif
+#if __has_feature(thread_sanitizer) && !defined(__SANITIZE_THREAD__)
+#define __SANITIZE_THREAD__
+#endif
-#if __has_feature(address_sanitizer) || __has_feature(hwaddress_sanitizer)
-/* Emulate GCC's __SANITIZE_ADDRESS__ flag */
+/*
+ * Treat __SANITIZE_HWADDRESS__ the same as __SANITIZE_ADDRESS__ in the kernel.
+ */
+#ifdef __SANITIZE_HWADDRESS__
#define __SANITIZE_ADDRESS__
+#endif
+
+#ifdef __SANITIZE_ADDRESS__
#define __no_sanitize_address \
__attribute__((no_sanitize("address", "hwaddress")))
#else
#define __no_sanitize_address
#endif
-#if __has_feature(thread_sanitizer)
-/* emulate gcc's __SANITIZE_THREAD__ flag */
-#define __SANITIZE_THREAD__
+#ifdef __SANITIZE_THREAD__
#define __no_sanitize_thread \
__attribute__((no_sanitize("thread")))
#else
_
Patches currently in -mm which might be from nathan(a)kernel.org are
compiler-clangh-define-__sanitize___-macros-only-when-undefined.patch
mm-rmap-convert-enum-rmap_level-to-enum-pgtable_level-fix.patch
Clang 22 recently added support for defining __SANITIZE__ macros similar
to GCC [1], which causes warnings (or errors with CONFIG_WERROR=y or
W=e) with the existing defines that the kernel creates to emulate this
behavior with existing clang versions.
In file included from <built-in>:3:
In file included from include/linux/compiler_types.h:171:
include/linux/compiler-clang.h:37:9: error: '__SANITIZE_THREAD__' macro redefined [-Werror,-Wmacro-redefined]
37 | #define __SANITIZE_THREAD__
| ^
<built-in>:352:9: note: previous definition is here
352 | #define __SANITIZE_THREAD__ 1
| ^
Refactor compiler-clang.h to only define the sanitizer macros when they
are undefined and adjust the rest of the code to use these macros for
checking if the sanitizers are enabled, clearing up the warnings and
allowing the kernel to easily drop these defines when the minimum
supported version of LLVM for building the kernel becomes 22.0.0 or
newer.
Cc: stable(a)vger.kernel.org
Link: https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444da… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
Andrew, would it be possible to take this via mm-hotfixes?
---
include/linux/compiler-clang.h | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index fa4ffe037bc7..8720a0705900 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -18,23 +18,42 @@
#define KASAN_ABI_VERSION 5
/*
+ * Clang 22 added preprocessor macros to match GCC, in hopes of eventually
+ * dropping __has_feature support for sanitizers:
+ * https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444da…
+ * Create these macros for older versions of clang so that it is easy to clean
+ * up once the minimum supported version of LLVM for building the kernel always
+ * creates these macros.
+ *
* Note: Checking __has_feature(*_sanitizer) is only true if the feature is
* enabled. Therefore it is not required to additionally check defined(CONFIG_*)
* to avoid adding redundant attributes in other configurations.
*/
+#if __has_feature(address_sanitizer) && !defined(__SANITIZE_ADDRESS__)
+#define __SANITIZE_ADDRESS__
+#endif
+#if __has_feature(hwaddress_sanitizer) && !defined(__SANITIZE_HWADDRESS__)
+#define __SANITIZE_HWADDRESS__
+#endif
+#if __has_feature(thread_sanitizer) && !defined(__SANITIZE_THREAD__)
+#define __SANITIZE_THREAD__
+#endif
-#if __has_feature(address_sanitizer) || __has_feature(hwaddress_sanitizer)
-/* Emulate GCC's __SANITIZE_ADDRESS__ flag */
+/*
+ * Treat __SANITIZE_HWADDRESS__ the same as __SANITIZE_ADDRESS__ in the kernel.
+ */
+#ifdef __SANITIZE_HWADDRESS__
#define __SANITIZE_ADDRESS__
+#endif
+
+#ifdef __SANITIZE_ADDRESS__
#define __no_sanitize_address \
__attribute__((no_sanitize("address", "hwaddress")))
#else
#define __no_sanitize_address
#endif
-#if __has_feature(thread_sanitizer)
-/* emulate gcc's __SANITIZE_THREAD__ flag */
-#define __SANITIZE_THREAD__
+#ifdef __SANITIZE_THREAD__
#define __no_sanitize_thread \
__attribute__((no_sanitize("thread")))
#else
---
base-commit: b320789d6883cc00ac78ce83bccbfe7ed58afcf0
change-id: 20250902-clang-update-sanitize-defines-845000c29d2c
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
Hi all,
A few months ago, the multi-fsblock untorn writes patchset added a bunch
of log intent item helper functions to estimate the number of intent
items that could be added to a particular transaction. Those helpers
enabled us to compute a safe upper bound on the number of blocks that
could be written in an untorn fashion with filesystem-provided out of
place writes.
Currently, the online fsck code employs static limits on the number of
intent items that it's willing to accrue to a single transaction when
it's trying to reap what it thinks are the old blocks from a corrupt
structure. There have been no problems reported with this approach
after years of testing, but static limits are scary and gross because
overestimating the intent item limit could result in transaction
overflows and dead filesystems; and underestimating causes unnecessary
overhead.
This series uses the new log intent item size helpers to estimate the
limits dynamically based on worst-case per-block repair work vs. the
size of the scrub transaction. After several months of testing this,
there don't seem to be any problems here either.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
This has been running on the djcloud for months with no problems. Enjoy!
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fi…
---
Commits in this patchset:
* xfs: prepare reaping code for dynamic limits
* xfs: convert the ifork reap code to use xreap_state
* xfs: use deferred intent items for reaping crosslinked blocks
* xfs: compute per-AG extent reap limits dynamically
* xfs: compute data device CoW staging extent reap limits dynamically
* xfs: compute realtime device CoW staging extent reap limits dynamically
* xfs: compute file mapping reap limits dynamically
* xfs: remove static reap limits
* xfs: use deferred reaping for data device cow extents
---
fs/xfs/scrub/repair.h | 8 -
fs/xfs/scrub/trace.h | 45 ++++
fs/xfs/scrub/newbt.c | 7 +
fs/xfs/scrub/reap.c | 622 +++++++++++++++++++++++++++++++++++++++----------
fs/xfs/scrub/trace.c | 1
5 files changed, 552 insertions(+), 131 deletions(-)
This reverts commit 71598a5a7797f0052aaa7bcff0b8d4b8f20f1441.
This commit introduced a regression, however the fix for the
regression:
aa5fc4362fac ("drm/amdgpu: fix task hang from failed job submission during process kill")
depends on things not yet present in 6.12.y and older kernels. Since
this commit is more of an optimization, just revert it for
6.12.y and older stable kernels.
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org # 6.1.x - 6.12.x
---
Please apply this revert to 6.1.x to 6.12.x stable trees. The newer
stable trees and Linus' tree already have the regression fix.
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 0adb106e2c42..37d53578825b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2292,11 +2292,13 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t min_vm_size,
*/
long amdgpu_vm_wait_idle(struct amdgpu_vm *vm, long timeout)
{
- timeout = drm_sched_entity_flush(&vm->immediate, timeout);
+ timeout = dma_resv_wait_timeout(vm->root.bo->tbo.base.resv,
+ DMA_RESV_USAGE_BOOKKEEP,
+ true, timeout);
if (timeout <= 0)
return timeout;
- return drm_sched_entity_flush(&vm->delayed, timeout);
+ return dma_fence_wait_timeout(vm->last_unlocked, true, timeout);
}
static void amdgpu_vm_destroy_task_info(struct kref *kref)
--
2.51.0
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 1f403699c40f0806a707a9a6eed3b8904224021a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025090146-playback-kinsman-373c@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1f403699c40f0806a707a9a6eed3b8904224021a Mon Sep 17 00:00:00 2001
From: Ma Ke <make24(a)iscas.ac.cn>
Date: Tue, 12 Aug 2025 15:19:32 +0800
Subject: [PATCH] drm/mediatek: Fix device/node reference count leaks in
mtk_drm_get_all_drm_priv
Using device_find_child() and of_find_device_by_node() to locate
devices could cause an imbalance in the device's reference count.
device_find_child() and of_find_device_by_node() both call
get_device() to increment the reference count of the found device
before returning the pointer. In mtk_drm_get_all_drm_priv(), these
references are never released through put_device(), resulting in
permanent reference count increments. Additionally, the
for_each_child_of_node() iterator fails to release node references in
all code paths. This leaks device node references when loop
termination occurs before reaching MAX_CRTC. These reference count
leaks may prevent device/node resources from being properly released
during driver unbind operations.
As comment of device_find_child() says, 'NOTE: you will need to drop
the reference with put_device() after use'.
Cc: stable(a)vger.kernel.org
Fixes: 1ef7ed48356c ("drm/mediatek: Modify mediatek-drm for mt8195 multi mmsys support")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
Reviewed-by: CK Hu <ck.hu(a)mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20250812071932.471730-…
Signed-off-by: Chun-Kuang Hu <chunkuang.hu(a)kernel.org>
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index d5e6bab36414..f8a817689e16 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -387,19 +387,19 @@ static bool mtk_drm_get_all_drm_priv(struct device *dev)
of_id = of_match_node(mtk_drm_of_ids, node);
if (!of_id)
- continue;
+ goto next_put_node;
pdev = of_find_device_by_node(node);
if (!pdev)
- continue;
+ goto next_put_node;
drm_dev = device_find_child(&pdev->dev, NULL, mtk_drm_match);
if (!drm_dev)
- continue;
+ goto next_put_device_pdev_dev;
temp_drm_priv = dev_get_drvdata(drm_dev);
if (!temp_drm_priv)
- continue;
+ goto next_put_device_drm_dev;
if (temp_drm_priv->data->main_len)
all_drm_priv[CRTC_MAIN] = temp_drm_priv;
@@ -411,10 +411,17 @@ static bool mtk_drm_get_all_drm_priv(struct device *dev)
if (temp_drm_priv->mtk_drm_bound)
cnt++;
- if (cnt == MAX_CRTC) {
- of_node_put(node);
+next_put_device_drm_dev:
+ put_device(drm_dev);
+
+next_put_device_pdev_dev:
+ put_device(&pdev->dev);
+
+next_put_node:
+ of_node_put(node);
+
+ if (cnt == MAX_CRTC)
break;
- }
}
if (drm_priv->data->mmsys_dev_num == cnt) {
The current implementation of CXL memory hotplug notifier gets called
before the HMAT memory hotplug notifier. The CXL driver calculates the
access coordinates (bandwidth and latency values) for the CXL end to
end path (i.e. CPU to endpoint). When the CXL region is onlined, the CXL
memory hotplug notifier writes the access coordinates to the HMAT target
structs. Then the HMAT memory hotplug notifier is called and it creates
the access coordinates for the node sysfs attributes.
During testing on an Intel platform, it was found that although the
newly calculated coordinates were pushed to sysfs, the sysfs attributes for
the access coordinates showed up with the wrong initiator. The system has
4 nodes (0, 1, 2, 3) where node 0 and 1 are CPU nodes and node 2 and 3 are
CXL nodes. The expectation is that node 2 would show up as a target to node
0:
/sys/devices/system/node/node2/access0/initiators/node0
However it was observed that node 2 showed up as a target under node 1:
/sys/devices/system/node/node2/access0/initiators/node1
The original intent of the 'ext_updated' flag in HMAT handling code was to
stop HMAT memory hotplug callback from clobbering the access coordinates
after CXL has injected its calculated coordinates and replaced the generic
target access coordinates provided by the HMAT table in the HMAT target
structs. However the flag is hacky at best and blocks the updates from
other CXL regions that are onlined in the same node later on. Remove the
'ext_updated' flag usage and just update the access coordinates for the
nodes directly without touching HMAT target data.
The hotplug memory callback ordering is changed. Instead of changing CXL,
move HMAT back so there's room for the levels rather than have CXL share
the same level as SLAB_CALLBACK_PRI. The change will resulting in the CXL
callback to be executed after the HMAT callback.
With the change, the CXL hotplug memory notifier runs after the HMAT
callback. The HMAT callback will create the node sysfs attributes for
access coordinates. The CXL callback will write the access coordinates to
the now created node sysfs attributes directly and will not pollute the
HMAT target values.
A nodemask is introduced to keep track if a node has been updated and
prevents further updates.
Fixes: 067353a46d8c ("cxl/region: Add memory hotplug notifier for cxl region")
Cc: stable(a)vger.kernel.org
Tested-by: Marc Herbert <marc.herbert(a)linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Dave Jiang <dave.jiang(a)intel.com>
---
v3:
- Use nodemask instead of xarray to keep track of node updates (Jonathan)
---
drivers/acpi/numa/hmat.c | 6 ------
drivers/cxl/core/cdat.c | 5 -----
drivers/cxl/core/core.h | 1 -
drivers/cxl/core/region.c | 20 ++++++++++++--------
include/linux/memory.h | 2 +-
5 files changed, 13 insertions(+), 21 deletions(-)
diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 4958301f5417..5d32490dc4ab 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -74,7 +74,6 @@ struct memory_target {
struct node_cache_attrs cache_attrs;
u8 gen_port_device_handle[ACPI_SRAT_DEVICE_HANDLE_SIZE];
bool registered;
- bool ext_updated; /* externally updated */
};
struct memory_initiator {
@@ -391,7 +390,6 @@ int hmat_update_target_coordinates(int nid, struct access_coordinate *coord,
coord->read_bandwidth, access);
hmat_update_target_access(target, ACPI_HMAT_WRITE_BANDWIDTH,
coord->write_bandwidth, access);
- target->ext_updated = true;
return 0;
}
@@ -773,10 +771,6 @@ static void hmat_update_target_attrs(struct memory_target *target,
u32 best = 0;
int i;
- /* Don't update if an external agent has changed the data. */
- if (target->ext_updated)
- return;
-
/* Don't update for generic port if there's no device handle */
if ((access == NODE_ACCESS_CLASS_GENPORT_SINK_LOCAL ||
access == NODE_ACCESS_CLASS_GENPORT_SINK_CPU) &&
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index c0af645425f4..c891fd618cfd 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -1081,8 +1081,3 @@ int cxl_update_hmat_access_coordinates(int nid, struct cxl_region *cxlr,
{
return hmat_update_target_coordinates(nid, &cxlr->coord[access], access);
}
-
-bool cxl_need_node_perf_attrs_update(int nid)
-{
- return !acpi_node_backed_by_real_pxm(nid);
-}
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 2669f251d677..a253d308f3c9 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -139,7 +139,6 @@ long cxl_pci_get_latency(struct pci_dev *pdev);
int cxl_pci_get_bandwidth(struct pci_dev *pdev, struct access_coordinate *c);
int cxl_update_hmat_access_coordinates(int nid, struct cxl_region *cxlr,
enum access_coordinate_class access);
-bool cxl_need_node_perf_attrs_update(int nid);
int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
struct access_coordinate *c);
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 71cc42d05248..0ed95cbc5d5b 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -30,6 +30,12 @@
* 3. Decoder targets
*/
+/*
+ * nodemask that sets per node when the access_coordinates for the node has
+ * been updated by the CXL memory hotplug notifier.
+ */
+static nodemask_t nodemask_region_seen = NODE_MASK_NONE;
+
static struct cxl_region *to_cxl_region(struct device *dev);
#define __ACCESS_ATTR_RO(_level, _name) { \
@@ -2442,14 +2448,8 @@ static bool cxl_region_update_coordinates(struct cxl_region *cxlr, int nid)
for (int i = 0; i < ACCESS_COORDINATE_MAX; i++) {
if (cxlr->coord[i].read_bandwidth) {
- rc = 0;
- if (cxl_need_node_perf_attrs_update(nid))
- node_set_perf_attrs(nid, &cxlr->coord[i], i);
- else
- rc = cxl_update_hmat_access_coordinates(nid, cxlr, i);
-
- if (rc == 0)
- cset++;
+ node_update_perf_attrs(nid, &cxlr->coord[i], i);
+ cset++;
}
}
@@ -2487,6 +2487,10 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
if (nid != region_nid)
return NOTIFY_DONE;
+ /* No action needed if node bit already set */
+ if (node_test_and_set(nid, nodemask_region_seen))
+ return NOTIFY_DONE;
+
if (!cxl_region_update_coordinates(cxlr, nid))
return NOTIFY_DONE;
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 1305102688d0..0b755d1ef1ec 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -120,8 +120,8 @@ struct mem_section;
*/
#define DEFAULT_CALLBACK_PRI 0
#define SLAB_CALLBACK_PRI 1
-#define HMAT_CALLBACK_PRI 2
#define CXL_CALLBACK_PRI 5
+#define HMAT_CALLBACK_PRI 6
#define MM_COMPUTE_BATCH_PRI 10
#define CPUSET_CALLBACK_PRI 10
#define MEMTIER_HOTPLUG_PRI 100
--
2.50.1
Hi Dr. Sam Lavi Cosmetic And Implant Dentistry ,
AI adoption in healthcare isn’t coming — it’s already here.
Some implant clinics are quietly using AI to handle patient calls, consultations, and scheduling.
Their growth isn’t a coincidence.
Being first gives them an edge — being late means playing catch-up.
Should I share a 2-min demo video so you can see how it works?
Reply 'Video' and I’ll send it.
—
Best regards,
Mohanish Ved
AI Growth Specialist
EditRage Solutions
VIRQs come in 3 flavors, per-VPU, per-domain, and global, and the VIRQs
are tracked in per-cpu virq_to_irq arrays.
Per-domain and global VIRQs must be bound on CPU 0, and
bind_virq_to_irq() sets the per_cpu virq_to_irq at registration time
Later, the interrupt can migrate, and info->cpu is updated. When
calling __unbind_from_irq(), the per-cpu virq_to_irq is cleared for a
different cpu. If bind_virq_to_irq() is called again with CPU 0, the
stale irq is returned. There won't be any irq_info for the irq, so
things break.
Make xen_rebind_evtchn_to_cpu() update the per_cpu virq_to_irq mappings
to keep them update to date with the current cpu. This ensures the
correct virq_to_irq is cleared in __unbind_from_irq().
Fixes: e46cdb66c8fc ("xen: event channels")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jason Andryuk <jason.andryuk(a)amd.com>
---
v3:
Kernel style brace placement
Delay setting old_cpu and tighten scope of variable
v2:
Different approach changing virq_to_irq
---
drivers/xen/events/events_base.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index b060b5a95f45..9478fae014e5 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1797,9 +1797,20 @@ static int xen_rebind_evtchn_to_cpu(struct irq_info *info, unsigned int tcpu)
* virq or IPI channel, which don't actually need to be rebound. Ignore
* it, but don't do the xenlinux-level rebind in that case.
*/
- if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
+ if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0) {
+ int old_cpu = info->cpu;
+
bind_evtchn_to_cpu(info, tcpu, false);
+ if (info->type == IRQT_VIRQ) {
+ int virq = info->u.virq;
+ int irq = per_cpu(virq_to_irq, old_cpu)[virq];
+
+ per_cpu(virq_to_irq, old_cpu)[virq] = -1;
+ per_cpu(virq_to_irq, tcpu)[virq] = irq;
+ }
+ }
+
do_unmask(info, EVT_MASK_REASON_TEMPORARY);
return 0;
--
2.34.1
Change find_virq() to return -EEXIST when a VIRQ is bound to a
different CPU than the one passed in. With that, remove the BUG_ON()
from bind_virq_to_irq() to propogate the error upwards.
Some VIRQs are per-cpu, but others are per-domain or global. Those must
be bound to CPU0 and can then migrate elsewhere. The lookup for
per-domain and global will probably fail when migrated off CPU 0,
especially when the current CPU is tracked. This now returns -EEXIST
instead of BUG_ON().
A second call to bind a per-domain or global VIRQ is not expected, but
make it non-fatal to avoid trying to look up the irq, since we don't
know which per_cpu(virq_to_irq) it will be in.
Cc: stable(a)vger.kernel.org
Signed-off-by: Jason Andryuk <jason.andryuk(a)amd.com>
---
v3:
Cc: stable as a pre-req for the subsequent virg tracking change
Call __unbind_from_irq() on error ro avoid leaking info
v2:
New
---
drivers/xen/events/events_base.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 374231d84e4f..b060b5a95f45 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1314,10 +1314,12 @@ int bind_interdomain_evtchn_to_irq_lateeoi(struct xenbus_device *dev,
}
EXPORT_SYMBOL_GPL(bind_interdomain_evtchn_to_irq_lateeoi);
-static int find_virq(unsigned int virq, unsigned int cpu, evtchn_port_t *evtchn)
+static int find_virq(unsigned int virq, unsigned int cpu, evtchn_port_t *evtchn,
+ bool percpu)
{
struct evtchn_status status;
evtchn_port_t port;
+ bool exists = false;
memset(&status, 0, sizeof(status));
for (port = 0; port < xen_evtchn_max_channels(); port++) {
@@ -1330,12 +1332,16 @@ static int find_virq(unsigned int virq, unsigned int cpu, evtchn_port_t *evtchn)
continue;
if (status.status != EVTCHNSTAT_virq)
continue;
- if (status.u.virq == virq && status.vcpu == xen_vcpu_nr(cpu)) {
+ if (status.u.virq != virq)
+ continue;
+ if (status.vcpu == xen_vcpu_nr(cpu)) {
*evtchn = port;
return 0;
+ } else if (!percpu) {
+ exists = true;
}
}
- return -ENOENT;
+ return exists ? -EEXIST : -ENOENT;
}
/**
@@ -1382,8 +1388,11 @@ int bind_virq_to_irq(unsigned int virq, unsigned int cpu, bool percpu)
evtchn = bind_virq.port;
else {
if (ret == -EEXIST)
- ret = find_virq(virq, cpu, &evtchn);
- BUG_ON(ret < 0);
+ ret = find_virq(virq, cpu, &evtchn, percpu);
+ if (ret) {
+ __unbind_from_irq(info, info->irq);
+ goto out;
+ }
}
ret = xen_irq_info_virq_setup(info, cpu, evtchn, virq);
--
2.34.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x ae668cd567a6a7622bc813ee0bb61c42bed61ba7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025090112-skillet-muskiness-5948@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ae668cd567a6a7622bc813ee0bb61c42bed61ba7 Mon Sep 17 00:00:00 2001
From: Eric Sandeen <sandeen(a)redhat.com>
Date: Fri, 22 Aug 2025 12:55:56 -0500
Subject: [PATCH] xfs: do not propagate ENODATA disk errors into xattr code
ENODATA (aka ENOATTR) has a very specific meaning in the xfs xattr code;
namely, that the requested attribute name could not be found.
However, a medium error from disk may also return ENODATA. At best,
this medium error may escape to userspace as "attribute not found"
when in fact it's an IO (disk) error.
At worst, we may oops in xfs_attr_leaf_get() when we do:
error = xfs_attr_leaf_hasname(args, &bp);
if (error == -ENOATTR) {
xfs_trans_brelse(args->trans, bp);
return error;
}
because an ENODATA/ENOATTR error from disk leaves us with a null bp,
and the xfs_trans_brelse will then null-deref it.
As discussed on the list, we really need to modify the lower level
IO functions to trap all disk errors and ensure that we don't let
unique errors like this leak up into higher xfs functions - many
like this should be remapped to EIO.
However, this patch directly addresses a reported bug in the xattr
code, and should be safe to backport to stable kernels. A larger-scope
patch to handle more unique errors at lower levels can follow later.
(Note, prior to 07120f1abdff we did not oops, but we did return the
wrong error code to userspace.)
Signed-off-by: Eric Sandeen <sandeen(a)redhat.com>
Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
Cc: stable(a)vger.kernel.org # v5.9+
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Carlos Maiolino <cem(a)kernel.org>
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 4c44ce1c8a64..bff3dc226f81 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -435,6 +435,13 @@ xfs_attr_rmtval_get(
0, &bp, &xfs_attr3_rmt_buf_ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(args->dp, XFS_ATTR_FORK);
+ /*
+ * ENODATA from disk implies a disk medium failure;
+ * ENODATA for xattrs means attribute not found, so
+ * disambiguate that here.
+ */
+ if (error == -ENODATA)
+ error = -EIO;
if (error)
return error;
diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
index 17d9e6154f19..723a0643b838 100644
--- a/fs/xfs/libxfs/xfs_da_btree.c
+++ b/fs/xfs/libxfs/xfs_da_btree.c
@@ -2833,6 +2833,12 @@ xfs_da_read_buf(
&bp, ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(dp, whichfork);
+ /*
+ * ENODATA from disk implies a disk medium failure; ENODATA for
+ * xattrs means attribute not found, so disambiguate that here.
+ */
+ if (error == -ENODATA && whichfork == XFS_ATTR_FORK)
+ error = -EIO;
if (error)
goto out_free;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x ae668cd567a6a7622bc813ee0bb61c42bed61ba7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025090111-acorn-ammonia-8c45@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ae668cd567a6a7622bc813ee0bb61c42bed61ba7 Mon Sep 17 00:00:00 2001
From: Eric Sandeen <sandeen(a)redhat.com>
Date: Fri, 22 Aug 2025 12:55:56 -0500
Subject: [PATCH] xfs: do not propagate ENODATA disk errors into xattr code
ENODATA (aka ENOATTR) has a very specific meaning in the xfs xattr code;
namely, that the requested attribute name could not be found.
However, a medium error from disk may also return ENODATA. At best,
this medium error may escape to userspace as "attribute not found"
when in fact it's an IO (disk) error.
At worst, we may oops in xfs_attr_leaf_get() when we do:
error = xfs_attr_leaf_hasname(args, &bp);
if (error == -ENOATTR) {
xfs_trans_brelse(args->trans, bp);
return error;
}
because an ENODATA/ENOATTR error from disk leaves us with a null bp,
and the xfs_trans_brelse will then null-deref it.
As discussed on the list, we really need to modify the lower level
IO functions to trap all disk errors and ensure that we don't let
unique errors like this leak up into higher xfs functions - many
like this should be remapped to EIO.
However, this patch directly addresses a reported bug in the xattr
code, and should be safe to backport to stable kernels. A larger-scope
patch to handle more unique errors at lower levels can follow later.
(Note, prior to 07120f1abdff we did not oops, but we did return the
wrong error code to userspace.)
Signed-off-by: Eric Sandeen <sandeen(a)redhat.com>
Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
Cc: stable(a)vger.kernel.org # v5.9+
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Carlos Maiolino <cem(a)kernel.org>
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 4c44ce1c8a64..bff3dc226f81 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -435,6 +435,13 @@ xfs_attr_rmtval_get(
0, &bp, &xfs_attr3_rmt_buf_ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(args->dp, XFS_ATTR_FORK);
+ /*
+ * ENODATA from disk implies a disk medium failure;
+ * ENODATA for xattrs means attribute not found, so
+ * disambiguate that here.
+ */
+ if (error == -ENODATA)
+ error = -EIO;
if (error)
return error;
diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
index 17d9e6154f19..723a0643b838 100644
--- a/fs/xfs/libxfs/xfs_da_btree.c
+++ b/fs/xfs/libxfs/xfs_da_btree.c
@@ -2833,6 +2833,12 @@ xfs_da_read_buf(
&bp, ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(dp, whichfork);
+ /*
+ * ENODATA from disk implies a disk medium failure; ENODATA for
+ * xattrs means attribute not found, so disambiguate that here.
+ */
+ if (error == -ENODATA && whichfork == XFS_ATTR_FORK)
+ error = -EIO;
if (error)
goto out_free;
From: "Mario Limonciello (AMD)" <superm1(a)kernel.org>
[ Upstream commit 23800ad1265f10c2bc6f42154ce4d20e59f2900e ]
The ASUS ProArt PX13 has a spurious wakeup event from the touchpad
a few moments after entering hardware sleep. This can be avoided
by preventing the touchpad from being a wake source.
Add to the wakeup ignore list.
Reported-by: Amit Chaudhari <amitchaudhari(a)mac.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4482
Tested-by: Amit Chaudhari <amitchaudhari(a)mac.com>
Signed-off-by: Mario Limonciello (AMD) <superm1(a)kernel.org>
Reviewed-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Link: https://lore.kernel.org/20250814183430.3887973-1-superm1@kernel.org
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my comprehensive analysis of this commit and the context, here
is my assessment:
**Backport Status: YES**
## Extensive Analysis:
### 1. **Nature of the Fix**
This commit fixes a real hardware bug - spurious wakeup events from the
touchpad on the ASUS ProArt PX13 laptop. The code change adds a DMI-
based quirk entry to the `gpiolib_acpi_quirks` table in
`/home/sasha/linux/drivers/gpio/gpiolib-acpi-quirks.c:350-359`, which
instructs the GPIO subsystem to ignore wake events from the specific
touchpad GPIO pin (`ASCP1A00:00@8`).
### 2. **符合稳定内核规则 (Meets Stable Kernel Rules)**
According to `/home/sasha/linux/Documentation/process/stable-kernel-
rules.rst`:
- **Fixes a real bug**: Yes - spurious wakeups are a real hardware issue
that impacts users' ability to use sleep/suspend effectively (lines
18-19 of rules)
- **Obviously correct and tested**: Yes - the fix is straightforward
(adding a quirk entry), has been tested by the reporter (Amit
Chaudhari), and reviewed by Mika Westerberg
- **Size constraint**: The patch is only ~20 lines with context, well
under the 100-line limit
- **Already in mainline**: Yes - commit
23800ad1265f10c2bc6f42154ce4d20e59f2900e
### 3. **Historical Precedent**
Multiple similar commits for spurious wakeup quirks have been backported
to stable:
- Commit `805c74eac8cb3` (GPD G1619-04 touchpad wakeup) - explicitly
marked with `Cc: stable(a)vger.kernel.org`
- Commit `782eea0c89f7d` (Clevo NL5xNU) - marked with `Cc:
stable(a)vger.kernel.org`
- Commit `a69982c37cd05` (Clevo NH5xAx) - marked with `Cc:
<stable(a)vger.kernel.org> # v6.1+`
### 4. **Code Structure Analysis**
The change follows the exact same pattern as other quirk entries in the
file:
```c
.driver_data = &(struct acpi_gpiolib_dmi_quirk) {
.ignore_wake = "ASCP1A00:00@8",
},
```
This is a data-only addition to an existing quirk table - no logic
changes, no new code paths, minimal regression risk.
### 5. **User Impact**
The bug causes spurious wakeups "a few moments after entering hardware
sleep", which:
- Prevents proper system suspend/sleep functionality
- Drains battery life on laptops
- Disrupts user workflow
- Is a clear hardware-specific issue that cannot be worked around by
users
### 6. **Risk Assessment**
- **Extremely low risk**: The change only affects systems that match the
specific DMI strings (ASUSTeK COMPUTER INC. + ProArt PX13)
- **No impact on other systems**: DMI matching ensures this quirk only
applies to the affected hardware
- **Well-established mechanism**: The ignore_wake infrastructure is
mature and widely used
### 7. **Backporting Considerations**
While this specific commit wasn't explicitly marked with `Cc: stable`,
it follows the exact same pattern as commits that were. The commit has
already been picked up by Sasha Levin's stable tree (as shown in the `[
Upstream commit ]` tag in the local repository), suggesting the stable
maintainers recognize its importance.
The fix is self-contained, requires no prerequisites, and can be cleanly
applied to any kernel version that has the `gpiolib-acpi-quirks.c` file
structure (introduced in commit `92dc572852ddc`).
### Conclusion
This is a textbook example of a stable-appropriate fix: it addresses a
specific hardware bug affecting real users, uses a well-established
quirk mechanism, has zero impact on unaffected systems, and follows the
exact pattern of previously backported fixes for identical issues on
different hardware.
drivers/gpio/gpiolib-acpi-quirks.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/gpio/gpiolib-acpi-quirks.c b/drivers/gpio/gpiolib-acpi-quirks.c
index c13545dce3492..bfb04e67c4bc8 100644
--- a/drivers/gpio/gpiolib-acpi-quirks.c
+++ b/drivers/gpio/gpiolib-acpi-quirks.c
@@ -344,6 +344,20 @@ static const struct dmi_system_id gpiolib_acpi_quirks[] __initconst = {
.ignore_interrupt = "AMDI0030:00@8",
},
},
+ {
+ /*
+ * Spurious wakeups from TP_ATTN# pin
+ * Found in BIOS 5.35
+ * https://gitlab.freedesktop.org/drm/amd/-/issues/4482
+ */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+ DMI_MATCH(DMI_PRODUCT_FAMILY, "ProArt PX13"),
+ },
+ .driver_data = &(struct acpi_gpiolib_dmi_quirk) {
+ .ignore_wake = "ASCP1A00:00@8",
+ },
+ },
{} /* Terminating entry */
};
--
2.50.1
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x ae668cd567a6a7622bc813ee0bb61c42bed61ba7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025090111-poem-retold-4ac2@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ae668cd567a6a7622bc813ee0bb61c42bed61ba7 Mon Sep 17 00:00:00 2001
From: Eric Sandeen <sandeen(a)redhat.com>
Date: Fri, 22 Aug 2025 12:55:56 -0500
Subject: [PATCH] xfs: do not propagate ENODATA disk errors into xattr code
ENODATA (aka ENOATTR) has a very specific meaning in the xfs xattr code;
namely, that the requested attribute name could not be found.
However, a medium error from disk may also return ENODATA. At best,
this medium error may escape to userspace as "attribute not found"
when in fact it's an IO (disk) error.
At worst, we may oops in xfs_attr_leaf_get() when we do:
error = xfs_attr_leaf_hasname(args, &bp);
if (error == -ENOATTR) {
xfs_trans_brelse(args->trans, bp);
return error;
}
because an ENODATA/ENOATTR error from disk leaves us with a null bp,
and the xfs_trans_brelse will then null-deref it.
As discussed on the list, we really need to modify the lower level
IO functions to trap all disk errors and ensure that we don't let
unique errors like this leak up into higher xfs functions - many
like this should be remapped to EIO.
However, this patch directly addresses a reported bug in the xattr
code, and should be safe to backport to stable kernels. A larger-scope
patch to handle more unique errors at lower levels can follow later.
(Note, prior to 07120f1abdff we did not oops, but we did return the
wrong error code to userspace.)
Signed-off-by: Eric Sandeen <sandeen(a)redhat.com>
Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
Cc: stable(a)vger.kernel.org # v5.9+
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Carlos Maiolino <cem(a)kernel.org>
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 4c44ce1c8a64..bff3dc226f81 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -435,6 +435,13 @@ xfs_attr_rmtval_get(
0, &bp, &xfs_attr3_rmt_buf_ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(args->dp, XFS_ATTR_FORK);
+ /*
+ * ENODATA from disk implies a disk medium failure;
+ * ENODATA for xattrs means attribute not found, so
+ * disambiguate that here.
+ */
+ if (error == -ENODATA)
+ error = -EIO;
if (error)
return error;
diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
index 17d9e6154f19..723a0643b838 100644
--- a/fs/xfs/libxfs/xfs_da_btree.c
+++ b/fs/xfs/libxfs/xfs_da_btree.c
@@ -2833,6 +2833,12 @@ xfs_da_read_buf(
&bp, ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(dp, whichfork);
+ /*
+ * ENODATA from disk implies a disk medium failure; ENODATA for
+ * xattrs means attribute not found, so disambiguate that here.
+ */
+ if (error == -ENODATA && whichfork == XFS_ATTR_FORK)
+ error = -EIO;
if (error)
goto out_free;
From: Han Guangjiang <hanguangjiang(a)lixiang.com>
On repeated cold boots we occasionally hit a NULL pointer crash in
blk_should_throtl() when throttling is consulted before the throttle
policy is fully enabled for the queue. Checking only q->td != NULL is
insufficient during early initialization, so blkg_to_pd() for the
throttle policy can still return NULL and blkg_to_tg() becomes NULL,
which later gets dereferenced.
Tighten blk_throtl_activated() to also require that the throttle policy
bit is set on the queue:
return q->td != NULL &&
test_bit(blkcg_policy_throtl.plid, q->blkcg_pols);
This prevents blk_should_throtl() from accessing throttle group state
until policy data has been attached to blkgs.
Fixes: a3166c51702b ("blk-throttle: delay initialization until configuration")
Cc: stable(a)vger.kernel.org
Co-developed-by: Liang Jie <liangjie(a)lixiang.com>
Signed-off-by: Liang Jie <liangjie(a)lixiang.com>
Signed-off-by: Han Guangjiang <hanguangjiang(a)lixiang.com>
---
block/blk-throttle.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index 3b27755bfbff..9ca43dc56eda 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -156,7 +156,7 @@ void blk_throtl_cancel_bios(struct gendisk *disk);
static inline bool blk_throtl_activated(struct request_queue *q)
{
- return q->td != NULL;
+ return q->td != NULL && test_bit(blkcg_policy_throtl.plid, q->blkcg_pols);
}
static inline bool blk_should_throtl(struct bio *bio)
--
2.25.1
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x ae668cd567a6a7622bc813ee0bb61c42bed61ba7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025090105-strainer-glider-fdcd@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ae668cd567a6a7622bc813ee0bb61c42bed61ba7 Mon Sep 17 00:00:00 2001
From: Eric Sandeen <sandeen(a)redhat.com>
Date: Fri, 22 Aug 2025 12:55:56 -0500
Subject: [PATCH] xfs: do not propagate ENODATA disk errors into xattr code
ENODATA (aka ENOATTR) has a very specific meaning in the xfs xattr code;
namely, that the requested attribute name could not be found.
However, a medium error from disk may also return ENODATA. At best,
this medium error may escape to userspace as "attribute not found"
when in fact it's an IO (disk) error.
At worst, we may oops in xfs_attr_leaf_get() when we do:
error = xfs_attr_leaf_hasname(args, &bp);
if (error == -ENOATTR) {
xfs_trans_brelse(args->trans, bp);
return error;
}
because an ENODATA/ENOATTR error from disk leaves us with a null bp,
and the xfs_trans_brelse will then null-deref it.
As discussed on the list, we really need to modify the lower level
IO functions to trap all disk errors and ensure that we don't let
unique errors like this leak up into higher xfs functions - many
like this should be remapped to EIO.
However, this patch directly addresses a reported bug in the xattr
code, and should be safe to backport to stable kernels. A larger-scope
patch to handle more unique errors at lower levels can follow later.
(Note, prior to 07120f1abdff we did not oops, but we did return the
wrong error code to userspace.)
Signed-off-by: Eric Sandeen <sandeen(a)redhat.com>
Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
Cc: stable(a)vger.kernel.org # v5.9+
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Carlos Maiolino <cem(a)kernel.org>
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 4c44ce1c8a64..bff3dc226f81 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -435,6 +435,13 @@ xfs_attr_rmtval_get(
0, &bp, &xfs_attr3_rmt_buf_ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(args->dp, XFS_ATTR_FORK);
+ /*
+ * ENODATA from disk implies a disk medium failure;
+ * ENODATA for xattrs means attribute not found, so
+ * disambiguate that here.
+ */
+ if (error == -ENODATA)
+ error = -EIO;
if (error)
return error;
diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
index 17d9e6154f19..723a0643b838 100644
--- a/fs/xfs/libxfs/xfs_da_btree.c
+++ b/fs/xfs/libxfs/xfs_da_btree.c
@@ -2833,6 +2833,12 @@ xfs_da_read_buf(
&bp, ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(dp, whichfork);
+ /*
+ * ENODATA from disk implies a disk medium failure; ENODATA for
+ * xattrs means attribute not found, so disambiguate that here.
+ */
+ if (error == -ENODATA && whichfork == XFS_ATTR_FORK)
+ error = -EIO;
if (error)
goto out_free;
VRAM+TT bos that are evicted from VRAM to TT may remain in
TT also after a revalidation following eviction or suspend.
This manifests itself as applications becoming sluggish
after buffer objects get evicted or after a resume from
suspend or hibernation.
If the bo supports placement in both VRAM and TT, and
we are on DGFX, mark the TT placement as fallback. This means
that it is tried only after VRAM + eviction.
This flaw has probably been present since the xe module was
upstreamed but use a Fixes: commit below where backporting is
likely to be simple. For earlier versions we need to open-
code the fallback algorithm in the driver.
v2:
- Remove check for dgfx. (Matthew Auld)
- Update the xe_dma_buf kunit test for the new strategy (CI)
- Allow dma-buf to pin in current placement (CI)
- Make xe_bo_validate() for pinned bos a NOP.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5995
Fixes: a78a8da51b36 ("drm/ttm: replace busy placement with flags v6")
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.9+
Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld(a)intel.com> #v1
---
drivers/gpu/drm/xe/tests/xe_bo.c | 2 +-
drivers/gpu/drm/xe/tests/xe_dma_buf.c | 10 +---------
drivers/gpu/drm/xe/xe_bo.c | 16 ++++++++++++----
drivers/gpu/drm/xe/xe_bo.h | 2 +-
drivers/gpu/drm/xe/xe_dma_buf.c | 2 +-
5 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c
index bb469096d072..7b40cc8be1c9 100644
--- a/drivers/gpu/drm/xe/tests/xe_bo.c
+++ b/drivers/gpu/drm/xe/tests/xe_bo.c
@@ -236,7 +236,7 @@ static int evict_test_run_tile(struct xe_device *xe, struct xe_tile *tile, struc
}
xe_bo_lock(external, false);
- err = xe_bo_pin_external(external);
+ err = xe_bo_pin_external(external, false);
xe_bo_unlock(external);
if (err) {
KUNIT_FAIL(test, "external bo pin err=%pe\n",
diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
index 3c5ad8cc65a0..5baeab6b6fb7 100644
--- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
@@ -85,15 +85,7 @@ static void check_residency(struct kunit *test, struct xe_bo *exported,
return;
}
- /*
- * If on different devices, the exporter is kept in system if
- * possible, saving a migration step as the transfer is just
- * likely as fast from system memory.
- */
- if (params->mem_mask & XE_BO_FLAG_SYSTEM)
- KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, XE_PL_TT));
- else
- KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, mem_type));
+ KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, mem_type));
if (params->force_different_devices)
KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(imported, XE_PL_TT));
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 4faf15d5fa6d..870f43347281 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -188,6 +188,8 @@ static void try_add_system(struct xe_device *xe, struct xe_bo *bo,
bo->placements[*c] = (struct ttm_place) {
.mem_type = XE_PL_TT,
+ .flags = (bo_flags & XE_BO_FLAG_VRAM_MASK) ?
+ TTM_PL_FLAG_FALLBACK : 0,
};
*c += 1;
}
@@ -2322,6 +2324,7 @@ uint64_t vram_region_gpu_offset(struct ttm_resource *res)
/**
* xe_bo_pin_external - pin an external BO
* @bo: buffer object to be pinned
+ * @in_place: Pin in current placement, don't attempt to migrate.
*
* Pin an external (not tied to a VM, can be exported via dma-buf / prime FD)
* BO. Unique call compared to xe_bo_pin as this function has it own set of
@@ -2329,7 +2332,7 @@ uint64_t vram_region_gpu_offset(struct ttm_resource *res)
*
* Returns 0 for success, negative error code otherwise.
*/
-int xe_bo_pin_external(struct xe_bo *bo)
+int xe_bo_pin_external(struct xe_bo *bo, bool in_place)
{
struct xe_device *xe = xe_bo_device(bo);
int err;
@@ -2338,9 +2341,11 @@ int xe_bo_pin_external(struct xe_bo *bo)
xe_assert(xe, xe_bo_is_user(bo));
if (!xe_bo_is_pinned(bo)) {
- err = xe_bo_validate(bo, NULL, false);
- if (err)
- return err;
+ if (!in_place) {
+ err = xe_bo_validate(bo, NULL, false);
+ if (err)
+ return err;
+ }
spin_lock(&xe->pinned.lock);
list_add_tail(&bo->pinned_link, &xe->pinned.late.external);
@@ -2493,6 +2498,9 @@ int xe_bo_validate(struct xe_bo *bo, struct xe_vm *vm, bool allow_res_evict)
};
int ret;
+ if (xe_bo_is_pinned(bo))
+ return 0;
+
if (vm) {
lockdep_assert_held(&vm->lock);
xe_vm_assert_held(vm);
diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index 8cce413b5235..cfb1ec266a6d 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -200,7 +200,7 @@ static inline void xe_bo_unlock_vm_held(struct xe_bo *bo)
}
}
-int xe_bo_pin_external(struct xe_bo *bo);
+int xe_bo_pin_external(struct xe_bo *bo, bool in_place);
int xe_bo_pin(struct xe_bo *bo);
void xe_bo_unpin_external(struct xe_bo *bo);
void xe_bo_unpin(struct xe_bo *bo);
diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index 346f857f3837..af64baf872ef 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -72,7 +72,7 @@ static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
return ret;
}
- ret = xe_bo_pin_external(bo);
+ ret = xe_bo_pin_external(bo, true);
xe_assert(xe, !ret);
return 0;
--
2.50.1
On Wed, Aug 27, 2025 at 7:38 AM Stanislav Fort <disclosure(a)aisle.com> wrote:
>
> NET/ROM nr_rx_frame() dereferences the 5-byte transport header
> unconditionally. nr_route_frame() currently accepts frames as short as
> NR_NETWORK_LEN (15 bytes), which can lead to small out-of-bounds reads
> on short frames.
>
> Fix by using pskb_may_pull() in nr_rx_frame() to ensure the full
> NET/ROM network + transport header is present before accessing it, and
> guard the extra fields used by NR_CONNREQ (window, user address, and the
> optional BPQ timeout extension) with additional pskb_may_pull() checks.
>
> This aligns with recent fixes using pskb_may_pull() to validate header
> availability.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Reported-by: Stanislav Fort <disclosure(a)aisle.com>
> Cc: stable(a)vger.kernel.org
> Signed-off-by: Stanislav Fort <disclosure(a)aisle.com>
> ---
> net/netrom/af_netrom.c | 12 +++++++++++-
> net/netrom/nr_route.c | 2 +-
> 2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
> index 3331669d8e33..1fbaa161288a 100644
> --- a/net/netrom/af_netrom.c
> +++ b/net/netrom/af_netrom.c
> @@ -885,6 +885,10 @@ int nr_rx_frame(struct sk_buff *skb, struct net_device *dev)
> * skb->data points to the netrom frame start
> */
>
> + /* Ensure NET/ROM network + transport header are present */
> + if (!pskb_may_pull(skb, NR_NETWORK_LEN + NR_TRANSPORT_LEN))
> + return 0;
> +
> src = (ax25_address *)(skb->data + 0);
> dest = (ax25_address *)(skb->data + 7);
>
> @@ -961,6 +965,12 @@ int nr_rx_frame(struct sk_buff *skb, struct net_device *dev)
> return 0;
> }
>
> + /* Ensure NR_CONNREQ fields (window + user address) are present */
> + if (!pskb_may_pull(skb, 21 + AX25_ADDR_LEN)) {
If skb->head is reallocated by this pskb_may_pull(), dest variable
might point to a freed piece of memory
(old skb->head)
As far as netrom is concerned, I would force a full linearization of
the packet very early
It is also unclear if the bug even exists in the first place.
Can you show the stack trace leading to this function being called
from an arbitrary
provider (like a packet being fed by malicious user space)
For instance nr_rx_frame() can be called from net/netrom/nr_loopback.c
with non malicious packet.
For the remaining caller (nr_route_frame()), it is unclear to me.
This is a series of patches that address several memory bugs that occur
in the Exynos Virtual Display driver.
Jeongjun Park (3):
drm/exynos: vidi: use priv->vidi_dev for ctx lookup in vidi_connection_ioctl()
drm/exynos: vidi: fix to avoid directly dereferencing user pointer
drm/exynos: vidi: use ctx->lock to protect struct vidi_context member variables related to memory alloc/free
drivers/gpu/drm/exynos/exynos_drm_drv.h | 1 +
drivers/gpu/drm/exynos/exynos_drm_vidi.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++----------
2 files changed, 64 insertions(+), 11 deletions(-)
Fix reference leaks where PCI device references
obtained via pci_get_device() were not being released:
1. The while loop that iterates through 0xa00a devices was not
releasing the final device reference when the loop terminates.
2. Single device lookups for 0xa001 and 0xa009 devices were not
releasing their references after use.
Add missing pci_dev_put() calls to ensure all device references
are properly released.
Fixes: cd7834167ffb ("[POWERPC] pasemi: Print more information at machine check")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
arch/powerpc/platforms/pasemi/setup.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/powerpc/platforms/pasemi/setup.c b/arch/powerpc/platforms/pasemi/setup.c
index d03b41336901..dafbee3afd86 100644
--- a/arch/powerpc/platforms/pasemi/setup.c
+++ b/arch/powerpc/platforms/pasemi/setup.c
@@ -169,6 +169,8 @@ static int __init pas_setup_mce_regs(void)
dev = pci_get_device(PCI_VENDOR_ID_PASEMI, 0xa00a, dev);
reg++;
}
+ /* Release the last device reference from the while loop */
+ pci_dev_put(dev);
dev = pci_get_device(PCI_VENDOR_ID_PASEMI, 0xa001, NULL);
if (dev && reg+4 < MAX_MCE_REGS) {
@@ -185,6 +187,7 @@ static int __init pas_setup_mce_regs(void)
mce_regs[reg].addr = pasemi_pci_getcfgaddr(dev, 0xc1c);
reg++;
}
+ pci_dev_put(dev);
dev = pci_get_device(PCI_VENDOR_ID_PASEMI, 0xa009, NULL);
if (dev && reg+2 < MAX_MCE_REGS) {
@@ -195,6 +198,7 @@ static int __init pas_setup_mce_regs(void)
mce_regs[reg].addr = pasemi_pci_getcfgaddr(dev, 0x214);
reg++;
}
+ pci_dev_put(dev);
num_mce_regs = reg;
--
2.35.1
Suspend-resume cycle test revealed a memory leak in 6.17-rc3
Turns out the slot_id race fix changes accidentally ends up calling
xhci_free_virt_device() with an incorrect vdev parameter.
The vdev variable was reused for temporary purposes right before calling
xhci_free_virt_device().
Fix this by passing the correct vdev parameter.
The slot_id race fix that caused this regression was targeted for stable,
so this needs to be applied there as well.
Fixes: 2eb03376151b ("usb: xhci: Fix slot_id resource race conflict")
Reported-by: David Wang <00107082(a)163.com>
Closes: https://lore.kernel.org/linux-usb/20250829181354.4450-1-00107082@163.com
Suggested-by: Michal Pecio <michal.pecio(a)gmail.com>
Suggested-by: David Wang <00107082(a)163.com>
Cc: stable(a)vger.kernel.org
Tested-by: David Wang <00107082(a)163.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-mem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 81eaad87a3d9..c4a6544aa107 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -962,7 +962,7 @@ static void xhci_free_virt_devices_depth_first(struct xhci_hcd *xhci, int slot_i
out:
/* we are now at a leaf device */
xhci_debugfs_remove_slot(xhci, slot_id);
- xhci_free_virt_device(xhci, vdev, slot_id);
+ xhci_free_virt_device(xhci, xhci->devs[slot_id], slot_id);
}
int xhci_alloc_virt_device(struct xhci_hcd *xhci, int slot_id,
--
2.43.0
Pending requests will be flushed on disconnect, and the corresponding
TRBs will be turned into No-op TRBs, which are ignored by the xHC
controller once it starts processing the ring.
If the USB debug cable repeatedly disconnects before ring is started
then the ring will eventually be filled with No-op TRBs.
No new transfers can be queued when the ring is full, and driver will
print the following error message:
"xhci_hcd 0000:00:14.0: failed to queue trbs"
This is a normal case for 'in' transfers where TRBs are always enqueued
in advance, ready to take on incoming data. If no data arrives, and
device is disconnected, then ring dequeue will remain at beginning of
the ring while enqueue points to first free TRB after last cancelled
No-op TRB.
s
Solve this by reinitializing the rings when the debug cable disconnects
and DbC is leaving the configured state.
Clear the whole ring buffer and set enqueue and dequeue to the beginning
of ring, and set cycle bit to its initial state.
Cc: stable(a)vger.kernel.org
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-dbgcap.c | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/host/xhci-dbgcap.c b/drivers/usb/host/xhci-dbgcap.c
index d0faff233e3e..63edf2d8f245 100644
--- a/drivers/usb/host/xhci-dbgcap.c
+++ b/drivers/usb/host/xhci-dbgcap.c
@@ -462,6 +462,25 @@ static void xhci_dbc_ring_init(struct xhci_ring *ring)
xhci_initialize_ring_info(ring);
}
+static int xhci_dbc_reinit_ep_rings(struct xhci_dbc *dbc)
+{
+ struct xhci_ring *in_ring = dbc->eps[BULK_IN].ring;
+ struct xhci_ring *out_ring = dbc->eps[BULK_OUT].ring;
+
+ if (!in_ring || !out_ring || !dbc->ctx) {
+ dev_warn(dbc->dev, "Can't re-init unallocated endpoints\n");
+ return -ENODEV;
+ }
+
+ xhci_dbc_ring_init(in_ring);
+ xhci_dbc_ring_init(out_ring);
+
+ /* set ep context enqueue, dequeue, and cycle to initial values */
+ xhci_dbc_init_ep_contexts(dbc);
+
+ return 0;
+}
+
static struct xhci_ring *
xhci_dbc_ring_alloc(struct device *dev, enum xhci_ring_type type, gfp_t flags)
{
@@ -885,7 +904,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
dev_info(dbc->dev, "DbC cable unplugged\n");
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
@@ -895,7 +914,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
writel(portsc, &dbc->regs->portsc);
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
--
2.43.0
Hi,
I’d like to share a verified database of 502,364 attendees and 750 exhibitors from IAA Mobility 2025.
The file includes key information such as: Name, Job Title, Company, Location, Phone, and Email.
Delivery is within 48 hours and the data is GDPR compliant.
Labour Day Special: 20% discount available for a short time.
If you’d like the pricing, just reply with “Send me the cost” and I’ll provide the details.
Best regards,
Garnet Conwell
Sr. Marketing Manager
P.S. To opt out of future updates, reply with “Unfollow”.
Testing has shown that reading multiple registers at once (for 10-bit
ADC values) does not work. Set the use_single_read regmap_config flag
to make regmap split these for us.
This should fix temperature opregion accesses done by
drivers/acpi/pmic/intel_pmic_chtdc_ti.c and is also necessary for
the upcoming drivers for the ADC and battery MFD cells.
Fixes: 6bac0606fdba ("mfd: Add support for Cherry Trail Dollar Cove TI PMIC")
Cc: stable(a)vger.kernel.org
Reviewed-by: Andy Shevchenko <andy(a)kernel.org>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
---
Changes in v3:
- Fix a few typos in the commit message
Changes in v2:
- Update comment to: "The hardware does not support reading multiple
registers at once"
---
drivers/mfd/intel_soc_pmic_chtdc_ti.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/mfd/intel_soc_pmic_chtdc_ti.c b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
index 4c1a68c9f575..6daf33e07ea0 100644
--- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c
+++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
@@ -82,6 +82,8 @@ static const struct regmap_config chtdc_ti_regmap_config = {
.reg_bits = 8,
.val_bits = 8,
.max_register = 0xff,
+ /* The hardware does not support reading multiple registers at once */
+ .use_single_read = true,
};
static const struct regmap_irq chtdc_ti_irqs[] = {
--
2.49.0
When the host actively triggers SSR and collects coredump data,
the Bluetooth stack sends a reset command to the controller. However, due
to the inability to clear the QCA_SSR_TRIGGERED and QCA_IBS_DISABLED bits,
the reset command times out.
To address this, this patch clears the QCA_SSR_TRIGGERED and
QCA_IBS_DISABLED flags and adds a 50ms delay after SSR, but only when
HCI_QUIRK_NON_PERSISTENT_SETUP is not set. This ensures the controller
completes the SSR process when BT_EN is always high due to hardware.
For the purpose of HCI_QUIRK_NON_PERSISTENT_SETUP, please refer to
the comment in `include/net/bluetooth/hci.h`.
The HCI_QUIRK_NON_PERSISTENT_SETUP quirk is associated with BT_EN,
and its presence can be used to determine whether BT_EN is defined in DTS.
After SSR, host will not download the firmware, causing
controller to remain in the IBS_WAKE state. Host needs
to synchronize with the controller to maintain proper operation.
Multiple triggers of SSR only first generate coredump file,
due to memcoredump_flag no clear.
add clear coredump flag when ssr completed.
When the SSR duration exceeds 2 seconds, it triggers
host tx_idle_timeout, which sets host TX state to sleep. due to the
hardware pulling up bt_en, the firmware is not downloaded after the SSR.
As a result, the controller does not enter sleep mode. Consequently,
when the host sends a command afterward, it sends 0xFD to the controller,
but the controller does not respond, leading to a command timeout.
So reset tx_idle_timer after SSR to prevent host enter TX IBS_Sleep mode.
Changes since v6-7:
- Merge the changes into a single patch.
- Update commit.
Changes since v1-5:
- Add an explanation for HCI_QUIRK_NON_PERSISTENT_SETUP.
- Add commments for msleep(50).
- Update format and commit.
Signed-off-by: Shuai Zhang <quic_shuaz(a)quicinc.com>
---
drivers/bluetooth/hci_qca.c | 33 +++++++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index 4e56782b0..9dc59b002 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -1653,6 +1653,39 @@ static void qca_hw_error(struct hci_dev *hdev, u8 code)
skb_queue_purge(&qca->rx_memdump_q);
}
+ /*
+ * If the BT chip's bt_en pin is connected to a 3.3V power supply via
+ * hardware and always stays high, driver cannot control the bt_en pin.
+ * As a result, during SSR (SubSystem Restart), QCA_SSR_TRIGGERED and
+ * QCA_IBS_DISABLED flags cannot be cleared, which leads to a reset
+ * command timeout.
+ * Add an msleep delay to ensure controller completes the SSR process.
+ *
+ * Host will not download the firmware after SSR, controller to remain
+ * in the IBS_WAKE state, and the host needs to synchronize with it
+ *
+ * Since the bluetooth chip has been reset, clear the memdump state.
+ */
+ if (!test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
+ /*
+ * When the SSR (SubSystem Restart) duration exceeds 2 seconds,
+ * it triggers host tx_idle_delay, which sets host TX state
+ * to sleep. Reset tx_idle_timer after SSR to prevent
+ * host enter TX IBS_Sleep mode.
+ */
+ mod_timer(&qca->tx_idle_timer, jiffies +
+ msecs_to_jiffies(qca->tx_idle_delay));
+
+ /* Controller reset completion time is 50ms */
+ msleep(50);
+
+ clear_bit(QCA_SSR_TRIGGERED, &qca->flags);
+ clear_bit(QCA_IBS_DISABLED, &qca->flags);
+
+ qca->tx_ibs_state = HCI_IBS_TX_AWAKE;
+ qca->memdump_state = QCA_MEMDUMP_IDLE;
+ }
+
clear_bit(QCA_HW_ERROR_EVENT, &qca->flags);
}
--
2.34.1
From: Conor Dooley <conor.dooley(a)microchip.com>
In commit 13529647743d9 ("spi: microchip-core-qspi: Support per spi-mem
operation frequency switches") the logic for checking the viability of
op->max_freq in mchp_coreqspi_setup_clock() was copied into
mchp_coreqspi_supports_op(). Unfortunately, op->max_freq is not valid
when this function is called during probe but is instead zero.
Accordingly, baud_rate_val is calculated to be INT_MAX due to division
by zero, causing probe of the attached memory device to fail.
Seemingly spi-microchip-core-qspi was the only driver that had such a
modification made to its supports_op callback when the per_op_freq
capability was added, so just remove it to restore prior functionality.
CC: stable(a)vger.kernel.org
Reported-by: Valentina Fernandez <valentina.fernandezalanis(a)microchip.com>
Fixes: 13529647743d9 ("spi: microchip-core-qspi: Support per spi-mem operation frequency switches")
Signed-off-by: Conor Dooley <conor.dooley(a)microchip.com>
---
CC: Conor Dooley <conor.dooley(a)microchip.com>
CC: Daire McNamara <daire.mcnamara(a)microchip.com>
CC: Mark Brown <broonie(a)kernel.org>
CC: Miquel Raynal <miquel.raynal(a)bootlin.com>
CC: linux-spi(a)vger.kernel.org
CC: linux-kernel(a)vger.kernel.org
---
drivers/spi/spi-microchip-core-qspi.c | 12 ------------
1 file changed, 12 deletions(-)
diff --git a/drivers/spi/spi-microchip-core-qspi.c b/drivers/spi/spi-microchip-core-qspi.c
index d13a9b755c7f8..8dc98b17f77b5 100644
--- a/drivers/spi/spi-microchip-core-qspi.c
+++ b/drivers/spi/spi-microchip-core-qspi.c
@@ -531,10 +531,6 @@ static int mchp_coreqspi_exec_op(struct spi_mem *mem, const struct spi_mem_op *o
static bool mchp_coreqspi_supports_op(struct spi_mem *mem, const struct spi_mem_op *op)
{
- struct mchp_coreqspi *qspi = spi_controller_get_devdata(mem->spi->controller);
- unsigned long clk_hz;
- u32 baud_rate_val;
-
if (!spi_mem_default_supports_op(mem, op))
return false;
@@ -557,14 +553,6 @@ static bool mchp_coreqspi_supports_op(struct spi_mem *mem, const struct spi_mem_
return false;
}
- clk_hz = clk_get_rate(qspi->clk);
- if (!clk_hz)
- return false;
-
- baud_rate_val = DIV_ROUND_UP(clk_hz, 2 * op->max_freq);
- if (baud_rate_val > MAX_DIVIDER || baud_rate_val < MIN_DIVIDER)
- return false;
-
return true;
}
--
2.47.2
commit e3f1164fc9ee ("PM: EM: Support late CPUs booting and capacity
adjustment") added a mechanism to handle CPUs that come up late by
retrying when any of the `cpufreq_cpu_get()` call fails.
However, if there are holes in the CPU topology (offline CPUs, e.g.
nosmt), the first missing CPU causes the loop to break, preventing
subsequent online CPUs from being updated.
Instead of aborting on the first missing CPU policy, loop through all
and retry if any were missing.
Fixes: e3f1164fc9ee ("PM: EM: Support late CPUs booting and capacity adjustment")
Suggested-by: Kenneth Crudup <kenneth.crudup(a)gmail.com>
Reported-by: Kenneth Crudup <kenneth.crudup(a)gmail.com>
Closes: https://lore.kernel.org/linux-pm/40212796-734c-4140-8a85-854f72b8144d@panix…
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Loehle <christian.loehle(a)arm.com>
---
kernel/power/energy_model.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
index ea7995a25780..b63c2afc1379 100644
--- a/kernel/power/energy_model.c
+++ b/kernel/power/energy_model.c
@@ -778,7 +778,7 @@ void em_adjust_cpu_capacity(unsigned int cpu)
static void em_check_capacity_update(void)
{
cpumask_var_t cpu_done_mask;
- int cpu;
+ int cpu, failed_cpus = 0;
if (!zalloc_cpumask_var(&cpu_done_mask, GFP_KERNEL)) {
pr_warn("no free memory\n");
@@ -796,10 +796,8 @@ static void em_check_capacity_update(void)
policy = cpufreq_cpu_get(cpu);
if (!policy) {
- pr_debug("Accessing cpu%d policy failed\n", cpu);
- schedule_delayed_work(&em_update_work,
- msecs_to_jiffies(1000));
- break;
+ failed_cpus++;
+ continue;
}
cpufreq_cpu_put(policy);
@@ -814,6 +812,11 @@ static void em_check_capacity_update(void)
em_adjust_new_capacity(cpu, dev, pd);
}
+ if (failed_cpus) {
+ pr_debug("Accessing %d policies failed, retrying\n", failed_cpus);
+ schedule_delayed_work(&em_update_work, msecs_to_jiffies(1000));
+ }
+
free_cpumask_var(cpu_done_mask);
}
--
2.34.1
From: Stanislav Fort <stanislav.fort(a)aisle.com>
batadv_nc_skb_decode_packet() trusts coded_len and checks only against
skb->len. XOR starts at sizeof(struct batadv_unicast_packet), reducing
payload headroom, and the source skb length is not verified, allowing an
out-of-bounds read and a small out-of-bounds write.
Validate that coded_len fits within the payload area of both destination
and source sk_buffs before XORing.
Fixes: 2df5278b0267 ("batman-adv: network coding - receive coded packets and decode them")
Cc: stable(a)vger.kernel.org
Reported-by: Stanislav Fort <disclosure(a)aisle.com>
Signed-off-by: Stanislav Fort <stanislav.fort(a)aisle.com>
Signed-off-by: Sven Eckelmann <sven(a)narfation.org>
Signed-off-by: Simon Wunderlich <sw(a)simonwunderlich.de>
---
net/batman-adv/network-coding.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/batman-adv/network-coding.c b/net/batman-adv/network-coding.c
index 9f56308779cc3..af97d077369f9 100644
--- a/net/batman-adv/network-coding.c
+++ b/net/batman-adv/network-coding.c
@@ -1687,7 +1687,12 @@ batadv_nc_skb_decode_packet(struct batadv_priv *bat_priv, struct sk_buff *skb,
coding_len = ntohs(coded_packet_tmp.coded_len);
- if (coding_len > skb->len)
+ /* ensure dst buffer is large enough (payload only) */
+ if (coding_len + h_size > skb->len)
+ return NULL;
+
+ /* ensure src buffer is large enough (payload only) */
+ if (coding_len + h_size > nc_packet->skb->len)
return NULL;
/* Here the magic is reversed:
--
2.47.2
From: gyutrange <wlsrbwjd643(a)naver.com>
VNCR/TLBI VA reconstruction currently uses bit 48 as the sign bit,
but for 48-bit virtual addresses the correct sign bit is bit 47.
Using 48 can mis-canonicalize addresses in the negative half and may
cause missed invalidations.
Although VNCR_EL2 encodes other architectural fields (RESS, BADDR;
see Arm ARM D24.2.206), sign_extend64() interprets its second argument
as the index of the sign bit. Passing 48 prevents propagation of the
canonical sign bit for 48-bit VAs.
Impact:
- Incorrect canonicalization of VAs with bit47=1
- Potential stale VNCR pseudo-TLB entries after TLBI or MMU notifier
- Possible incorrect translation/permissions or DoS when combined
with other issues
Fixes: 667304740537 ("KVM: arm64: Mask out non-VA bits from TLBI VA* on VNCR invalidation")
Cc: stable(a)vger.kernel.org
Reported-by: DongHa Lee <gap-dev(a)example.com>
Reported-by: Gyujeong Jin <wlsrbwjd7232(a)gmail.com>
Reported-by: Daehyeon Ko <4ncient(a)example.com>
Reported-by: Geonha Lee <leegn4a(a)example.com>
Reported-by: Hyungyu Oh <dqpc_lover(a)example.com>
Reported-by: Jaewon Yang <r4mbb1(a)example.com>
Signed-off-by: Gyujeong Jin <wlsrbwjd7232(a)gmail.com>
---
arch/arm64/kvm/nested.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 77db81bae86f..eaa6dd9da086 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1169,7 +1169,7 @@ int kvm_vcpu_allocate_vncr_tlb(struct kvm_vcpu *vcpu)
static u64 read_vncr_el2(struct kvm_vcpu *vcpu)
{
- return (u64)sign_extend64(__vcpu_sys_reg(vcpu, VNCR_EL2), 48);
+ return (u64)sign_extend64(__vcpu_sys_reg(vcpu, VNCR_EL2), 47);
}
static int kvm_translate_vncr(struct kvm_vcpu *vcpu)
--
2.43.0
Correct the address in usb controller node to fix the following warning:
Warning (simple_bus_reg): /soc@0/usb@a6f8800: simple-bus unit address
format error, expected "a600000"
Fixes: c5a87e3a6b3e ("arm64: dts: qcom: sm8450: Flatten usb controller node")
Cc: stable(a)vger.kernel.org
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202508121834.953Mvah2-lkp@intel.com/
Signed-off-by: Krishna Kurapati <krishna.kurapati(a)oss.qualcomm.com>
---
This change was tested with W=1 and the reported issue is not seen.
Also didn't add RB Tag received from Neil Armstrong since there is a
change in commit text. This change is based on top of latest linux next.
Changes in v2:
Fixed the fixes tag.
Link to v1:
https://lore.kernel.org/all/20250813063840.2158792-1-krishna.kurapati@oss.q…
arch/arm64/boot/dts/qcom/sm8450.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
index 2baef6869ed7..38c91c3ec787 100644
--- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
@@ -5417,7 +5417,7 @@ opp-202000000 {
};
};
- usb_1: usb@a6f8800 {
+ usb_1: usb@a600000 {
compatible = "qcom,sm8450-dwc3", "qcom,snps-dwc3";
reg = <0 0x0a600000 0 0xfc100>;
status = "disabled";
--
2.34.1
There is a long standing bug which causes I2C communication not to
work on the Armada 3700 based boards. The first patch in the series
fixes that regression. The second patch improves recovery to make it
more robust which helps to avoid communication problems with certain
SFP modules.
Signed-off-by: Gabor Juhos <j4g8y7(a)gmail.com>
---
Changes in v3:
- rebase on tip of i2c/for-current
- remove Imre's tag from the cover letter, and replace his SoB tag to
Reviewed-by in the individual patches
- rework the second patch so it does not need changes in the I2C core code,
and drop the first one as it is not needed now
- Link to v2: https://lore.kernel.org/r/20250811-i2c-pxa-fix-i2c-communication-v2-0-ca42e…
Changes in v2:
- collect offered tags
- rebase and retest on tip of i2c/for-current
- Link to v1: https://lore.kernel.org/r/20250511-i2c-pxa-fix-i2c-communication-v1-0-e9097…
---
Gabor Juhos (2):
i2c: pxa: defer reset on Armada 3700 when recovery is used
i2c: pxa: handle 'Early Bus Busy' condition on Armada 3700
drivers/i2c/busses/i2c-pxa.c | 35 ++++++++++++++++++++++++++++-------
1 file changed, 28 insertions(+), 7 deletions(-)
---
base-commit: 3dd22078026c7cad4d4a3f32c5dc5452c7180de8
change-id: 20250510-i2c-pxa-fix-i2c-communication-3e6de1e3d0c6
Best regards,
--
Gabor Juhos <j4g8y7(a)gmail.com>
Commit under Fixes added support to build the driver as a loadable
module. However, it did not add MODULE_DEVICE_TABLE() which is required
for autoloading the driver when it is built as a loadable module.
Fix it.
Fixes: a2790bf81f0f ("PCI: j721e: Add support to build as a loadable module")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Siddharth Vadapalli <s-vadapalli(a)ti.com>
---
Hello,
This patch is based on commit
b320789d6883 Linux 6.17-rc4
of Mainline Linux.
Patch has been tested on J7200-EVM with the driver configs set to 'm'.
The pci-j721e.c driver is autoloaded as Linux boots and it doesn't need
to be manually probed. Logs:
https://gist.github.com/Siddharth-Vadapalli-at-TI/f64eed4df6fd4f714747b3c7e…
Regards,
Siddharth.
drivers/pci/controller/cadence/pci-j721e.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/controller/cadence/pci-j721e.c b/drivers/pci/controller/cadence/pci-j721e.c
index 6c93f39d0288..cfca13a4c840 100644
--- a/drivers/pci/controller/cadence/pci-j721e.c
+++ b/drivers/pci/controller/cadence/pci-j721e.c
@@ -440,6 +440,7 @@ static const struct of_device_id of_j721e_pcie_match[] = {
},
{},
};
+MODULE_DEVICE_TABLE(of, of_j721e_pcie_match);
static int j721e_pcie_probe(struct platform_device *pdev)
{
--
2.43.0
Hi maintainers,
Please consider backporting the following patches to the stable trees.
These patches fix a significant reading issue with mcp2221 on i2c eeprom.
I have confirmed that the patches applie cleanly and build successfully
against v6.6, v6.1, v5.15 and v5.10 stable branches.
I will later come back to you with other patches that might need larger
backport changes.
Thanks,
Romain
Hamish Martin (2):
HID: mcp2221: Don't set bus speed on every transfer
HID: mcp2221: Handle reads greater than 60 bytes
drivers/hid/hid-mcp2221.c | 71 +++++++++++++++++++++++++++------------
1 file changed, 49 insertions(+), 22 deletions(-)
--
2.48.1
Hello,
I would like to request backporting the following patch to 5.10 series:
590cfb082837cc6c0c595adf1711330197c86a58
ASoC: Intel: sof_rt5682: shrink platform_id names below 20 characters
The patch seems to be already present in 5.15 and newer branches, and
its lack seems to be causing out-of-bounds read. I've hit it in the
wild while trying to install 5.10.241 on i686:
sh /var/tmp/portage/sys-kernel/gentoo-kernel-5.10.241/work/linux-5.10/scripts/depmod.sh depmod 5.10.241-gentoo-dist
depmod: FATAL: Module index: bad character '�'=0x80 - only 7-bit ASCII is supported:
platform:jsl_rt5682_max98360ax�
TIA.
--
Best regards,
Michał Górny
Support for parsing the topology on AMD/Hygon processors using CPUID
leaf 0xb was added in commit 3986a0a805e6 ("x86/CPU/AMD: Derive CPU
topology from CPUID function 0xB when available"). In an effort to keep
all the topology parsing bits in one place, this commit also introduced
a pseudo dependency on the TOPOEXT feature to parse the CPUID leaf 0xb.
TOPOEXT feature (CPUID 0x80000001 ECX[22]) advertises the support for
Cache Properties leaf 0x8000001d and the CPUID leaf 0x8000001e EAX for
"Extended APIC ID" however support for 0xb was introduced alongside the
x2APIC support not only on AMD [1], but also historically on x86 [2].
Similar to 0xb, the support for extended CPU topology leaf 0x80000026
too does not depend on the TOPOEXT feature.
The support for these leaves is expected to be confirmed by ensuring
"leaf <= {extended_}cpuid_level" and then parsing the level 0 of the
respective leaf to confirm EBX[15:0] (LogProcAtThisLevel) is non-zero as
stated in the definition of "CPUID_Fn0000000B_EAX_x00 [Extended Topology
Enumeration] (Core::X86::Cpuid::ExtTopEnumEax0)" in Processor
Programming Reference (PPR) for AMD Family 19h Model 01h Rev B1 Vol1 [3]
Sec. 2.1.15.1 "CPUID Instruction Functions".
This has not been a problem on baremetal platforms since support for
TOPOEXT (Fam 0x15 and later) predates the support for CPUID leaf 0xb
(Fam 0x17[Zen2] and later), however, for AMD guests on QEMU, "x2apic"
feature can be enabled independent of the "topoext" feature where QEMU
expects topology and the initial APICID to be parsed using the CPUID
leaf 0xb (especially when number of cores > 255) which is populated
independent of the "topoext" feature flag.
Unconditionally call cpu_parse_topology_ext() on AMD and Hygon
processors to first parse the topology using the XTOPOLOGY leaves
(0x80000026 / 0xb) before using the TOPOEXT leaf (0x8000001e).
Cc: stable(a)vger.kernel.org # Only v6.9 and above
Link: https://lore.kernel.org/lkml/1529686927-7665-1-git-send-email-suravee.suthi… [1]
Link: https://lore.kernel.org/lkml/20080818181435.523309000@linux-os.sc.intel.com/ [2]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 [3]
Suggested-by: Naveen N Rao (AMD) <naveen(a)kernel.org>
Fixes: 3986a0a805e6 ("x86/CPU/AMD: Derive CPU topology from CPUID function 0xB when available")
Signed-off-by: K Prateek Nayak <kprateek.nayak(a)amd.com>
---
Changelog v3..v4:
o Quoted relevant section of the PPR justifying the changes.
o Moved this patch up ahead.
o Cc'd stable and made a note that backports should target v6.9 and
above since this depends on the x86 topology rewrite.
---
arch/x86/kernel/cpu/topology_amd.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
index 827dd0dbb6e9..4e3134a5550c 100644
--- a/arch/x86/kernel/cpu/topology_amd.c
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -175,18 +175,14 @@ static void topoext_fixup(struct topo_scan *tscan)
static void parse_topology_amd(struct topo_scan *tscan)
{
- bool has_topoext = false;
-
/*
- * If the extended topology leaf 0x8000_001e is available
- * try to get SMT, CORE, TILE, and DIE shifts from extended
+ * Try to get SMT, CORE, TILE, and DIE shifts from extended
* CPUID leaf 0x8000_0026 on supported processors first. If
* extended CPUID leaf 0x8000_0026 is not supported, try to
* get SMT and CORE shift from leaf 0xb first, then try to
* get the CORE shift from leaf 0x8000_0008.
*/
- if (cpu_feature_enabled(X86_FEATURE_TOPOEXT))
- has_topoext = cpu_parse_topology_ext(tscan);
+ bool has_topoext = cpu_parse_topology_ext(tscan);
if (cpu_feature_enabled(X86_FEATURE_AMD_HTR_CORES))
tscan->c->topo.cpu_type = cpuid_ebx(0x80000026);
--
2.34.1
This series adds support for the clone3 system call to the nios2
architecture. This addresses the build-time warning "warning: clone3()
entry point is missing, please fix" introduced in 505d66d1abfb9
("clone3: drop __ARCH_WANT_SYS_CLONE3 macro"). The implementation passes
the relevant clone3 tests of kselftest when applied on top of
next-20250815:
./run_kselftest.sh
TAP version 13
1..4
# selftests: clone3: clone3
ok 1 selftests: clone3: clone3
# selftests: clone3: clone3_clear_sighand
ok 2 selftests: clone3: clone3_clear_sighand
# selftests: clone3: clone3_set_tid
ok 3 selftests: clone3: clone3_set_tid
# selftests: clone3: clone3_cap_checkpoint_restore
ok 4 selftests: clone3: clone3_cap_checkpoint_restore
The series also includes a small patch to kernel/fork.c that ensures
that clone_flags are passed correctly on architectures where unsigned
long is insufficient to store the u64 clone_flags. It is marked as a fix
for stable backporting.
As requested, in v2, this series now further tries to correct this type
error throughout the whole code base. Thus, it now touches a larger
number of subsystems and all architectures.
Therefore, another test was performed for ARCH=x86_64 (as a
representative for 64-bit architectures). Here, the series builds cleanly
without warnings on defconfig with CONFIG_SECURITY_APPARMOR=y and
CONFIG_SECURITY_TOMOYO=y (to compile-check the LSM-related changes).
The build further successfully passes testing/selftests/clone3 (with the
patch from 20241105062948.1037011-1-zhouyuhang1010(a)163.com to prepare
clone3_cap_checkpoint_restore for compatibility with the newer libcap
version on my system).
Is there any option to further preflight check this patch series via
lkp/KernelCI/etc. for a broader test across architectures, or is this
degree of testing sufficient to eventually get the series merged?
N.B.: The series is not checkpatch clean right now:
- include/linux/cred.h, include/linux/mnt_namespace.h:
function definition arguments without identifier name
- include/trace/events/task.h:
space prohibited after that open parenthesis
I did not fix these warnings to keep my changes minimal and reviewable,
as the issues persist throughout the files and they were not introduced
by me; I only followed the existing code style and just replaced the
types. If desired, I'd be happy to make the changes in a potential v3,
though.
Signed-off-by: Simon Schuster <schuster.simon(a)siemens-energy.com>
---
Changes in v2:
- Introduce "Fixes:" and "Cc: stable(a)vger.kernel.org" where necessary
- Factor out "Fixes:" when adapting the datatype of clone_flags for
easier backports
- Fix additional instances where `unsigned long` clone_flags is used
- Reword commit message to make it clearer that any 32-bit arch is
affected by this bug
- Link to v1: https://lore.kernel.org/r/20250821-nios2-implement-clone3-v1-0-1bb24017376a…
---
Simon Schuster (4):
copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64)
copy_process: pass clone_flags as u64 across calltree
arch: copy_thread: pass clone_flags as u64
nios2: implement architecture-specific portion of sys_clone3
arch/alpha/kernel/process.c | 2 +-
arch/arc/kernel/process.c | 2 +-
arch/arm/kernel/process.c | 2 +-
arch/arm64/kernel/process.c | 2 +-
arch/csky/kernel/process.c | 2 +-
arch/hexagon/kernel/process.c | 2 +-
arch/loongarch/kernel/process.c | 2 +-
arch/m68k/kernel/process.c | 2 +-
arch/microblaze/kernel/process.c | 2 +-
arch/mips/kernel/process.c | 2 +-
arch/nios2/include/asm/syscalls.h | 1 +
arch/nios2/include/asm/unistd.h | 2 --
arch/nios2/kernel/entry.S | 6 ++++++
arch/nios2/kernel/process.c | 2 +-
arch/nios2/kernel/syscall_table.c | 1 +
arch/openrisc/kernel/process.c | 2 +-
arch/parisc/kernel/process.c | 2 +-
arch/powerpc/kernel/process.c | 2 +-
arch/riscv/kernel/process.c | 2 +-
arch/s390/kernel/process.c | 2 +-
arch/sh/kernel/process_32.c | 2 +-
arch/sparc/kernel/process_32.c | 2 +-
arch/sparc/kernel/process_64.c | 2 +-
arch/um/kernel/process.c | 2 +-
arch/x86/include/asm/fpu/sched.h | 2 +-
arch/x86/include/asm/shstk.h | 4 ++--
arch/x86/kernel/fpu/core.c | 2 +-
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/shstk.c | 2 +-
arch/xtensa/kernel/process.c | 2 +-
block/blk-ioc.c | 2 +-
fs/namespace.c | 2 +-
include/linux/cgroup.h | 4 ++--
include/linux/cred.h | 2 +-
include/linux/iocontext.h | 6 +++---
include/linux/ipc_namespace.h | 4 ++--
include/linux/lsm_hook_defs.h | 2 +-
include/linux/mnt_namespace.h | 2 +-
include/linux/nsproxy.h | 2 +-
include/linux/pid_namespace.h | 4 ++--
include/linux/rseq.h | 4 ++--
include/linux/sched/task.h | 2 +-
include/linux/security.h | 4 ++--
include/linux/sem.h | 4 ++--
include/linux/time_namespace.h | 4 ++--
include/linux/uprobes.h | 4 ++--
include/linux/user_events.h | 4 ++--
include/linux/utsname.h | 4 ++--
include/net/net_namespace.h | 4 ++--
include/trace/events/task.h | 6 +++---
ipc/namespace.c | 2 +-
ipc/sem.c | 2 +-
kernel/cgroup/namespace.c | 2 +-
kernel/cred.c | 2 +-
kernel/events/uprobes.c | 2 +-
kernel/fork.c | 10 +++++-----
kernel/nsproxy.c | 4 ++--
kernel/pid_namespace.c | 2 +-
kernel/sched/core.c | 4 ++--
kernel/sched/fair.c | 2 +-
kernel/sched/sched.h | 4 ++--
kernel/time/namespace.c | 2 +-
kernel/utsname.c | 2 +-
net/core/net_namespace.c | 2 +-
security/apparmor/lsm.c | 2 +-
security/security.c | 2 +-
security/selinux/hooks.c | 2 +-
security/tomoyo/tomoyo.c | 2 +-
68 files changed, 95 insertions(+), 89 deletions(-)
---
base-commit: 1357b2649c026b51353c84ddd32bc963e8999603
change-id: 20250818-nios2-implement-clone3-7f252c20860b
Best regards,
--
Simon Schuster <schuster.simon(a)siemens-energy.com>
Hello,
Henrik notified me that UDF in 5.15 kernel is failing for him, likely
because commit 1ea1cd11c72d ("udf: Fix directory iteration for longer tail
extents") is missing 5.15 stable tree. Basically wherever d16076d9b684b got
backported, 1ea1cd11c72d needs to be backported as well (I'm sorry for
forgetting to add proper Fixes tag back then).
Honza
--
Jan Kara <jack(a)suse.com>
SUSE Labs, CR
When the xe pm_notifier evicts for suspend / hibernate, there might be
racing tasks trying to re-validate again. This can lead to suspend taking
excessive time or get stuck in a live-lock. This behaviour becomes
much worse with the fix that actually makes re-validation bring back
bos to VRAM rather than letting them remain in TT.
Prevent that by having exec and the rebind worker waiting for a completion
that is set to block by the pm_notifier before suspend and is signaled
by the pm_notifier after resume / wakeup.
It's probably still possible to craft malicious applications that block
suspending. More work is pending to fix that.
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4288
Fixes: c6a4d46ec1d7 ("drm/xe: evict user memory in PM notifier")
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.16+
Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
---
drivers/gpu/drm/xe/xe_device_types.h | 2 ++
drivers/gpu/drm/xe/xe_exec.c | 9 +++++++++
drivers/gpu/drm/xe/xe_pm.c | 4 ++++
drivers/gpu/drm/xe/xe_vm.c | 14 ++++++++++++++
4 files changed, 29 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 092004d14db2..6602bd678cbc 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -507,6 +507,8 @@ struct xe_device {
/** @pm_notifier: Our PM notifier to perform actions in response to various PM events. */
struct notifier_block pm_notifier;
+ /** @pm_block: Completion to block validating tasks on suspend / hibernate prepare */
+ struct completion pm_block;
/** @pmt: Support the PMT driver callback interface */
struct {
diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
index 44364c042ad7..374c831e691b 100644
--- a/drivers/gpu/drm/xe/xe_exec.c
+++ b/drivers/gpu/drm/xe/xe_exec.c
@@ -237,6 +237,15 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
goto err_unlock_list;
}
+ /*
+ * It's OK to block interruptible here with the vm lock held, since
+ * on task freezing during suspend / hibernate, the call will
+ * return -ERESTARTSYS and the IOCTL will be rerun.
+ */
+ err = wait_for_completion_interruptible(&xe->pm_block);
+ if (err)
+ goto err_unlock_list;
+
vm_exec.vm = &vm->gpuvm;
vm_exec.flags = DRM_EXEC_INTERRUPTIBLE_WAIT;
if (xe_vm_in_lr_mode(vm)) {
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index bee9aacd82e7..f4c7a84270ea 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -306,6 +306,7 @@ static int xe_pm_notifier_callback(struct notifier_block *nb,
switch (action) {
case PM_HIBERNATION_PREPARE:
case PM_SUSPEND_PREPARE:
+ reinit_completion(&xe->pm_block);
xe_pm_runtime_get(xe);
err = xe_bo_evict_all_user(xe);
if (err)
@@ -322,6 +323,7 @@ static int xe_pm_notifier_callback(struct notifier_block *nb,
break;
case PM_POST_HIBERNATION:
case PM_POST_SUSPEND:
+ complete_all(&xe->pm_block);
xe_bo_notifier_unprepare_all_pinned(xe);
xe_pm_runtime_put(xe);
break;
@@ -348,6 +350,8 @@ int xe_pm_init(struct xe_device *xe)
if (err)
return err;
+ init_completion(&xe->pm_block);
+ complete_all(&xe->pm_block);
/* For now suspend/resume is only allowed with GuC */
if (!xe_device_uc_enabled(xe))
return 0;
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 3ff3c67aa79d..edcdb0528f2a 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -394,6 +394,9 @@ static int xe_gpuvm_validate(struct drm_gpuvm_bo *vm_bo, struct drm_exec *exec)
list_move_tail(&gpuva_to_vma(gpuva)->combined_links.rebind,
&vm->rebind_list);
+ if (!try_wait_for_completion(&vm->xe->pm_block))
+ return -EAGAIN;
+
ret = xe_bo_validate(gem_to_xe_bo(vm_bo->obj), vm, false);
if (ret)
return ret;
@@ -494,6 +497,12 @@ static void preempt_rebind_work_func(struct work_struct *w)
xe_assert(vm->xe, xe_vm_in_preempt_fence_mode(vm));
trace_xe_vm_rebind_worker_enter(vm);
+ /*
+ * This blocks the wq during suspend / hibernate.
+ * Don't hold any locks.
+ */
+retry_pm:
+ wait_for_completion(&vm->xe->pm_block);
down_write(&vm->lock);
if (xe_vm_is_closed_or_banned(vm)) {
@@ -503,6 +512,11 @@ static void preempt_rebind_work_func(struct work_struct *w)
}
retry:
+ if (!try_wait_for_completion(&vm->xe->pm_block)) {
+ up_write(&vm->lock);
+ goto retry_pm;
+ }
+
if (xe_vm_userptr_check_repin(vm)) {
err = xe_vm_userptr_pin(vm);
if (err)
--
2.50.1
switch_to_super_page() assumes the memory range it's working on is aligned
to the target large page level. Unfortunately, __domain_mapping() doesn't
take this into account when using it, and will pass unaligned ranges
ultimately freeing a PTE range larger than expected.
Take for example a mapping with the following iov_pfn range [0x3fe400,
0x4c0600], which should be backed by the following mappings:
iov_pfn [0x3fe400, 0x3fffff] covered by 2MiB pages
iov_pfn [0x400000, 0x4bffff] covered by 1GiB pages
iov_pfn [0x4c0000, 0x4c05ff] covered by 2MiB pages
Under this circumstance, __domain_mapping() will pass [0x400000, 0x4c05ff]
to switch_to_super_page() at a 1 GiB granularity, which will in turn
free PTEs all the way to iov_pfn 0x4fffff.
Mitigate this by rounding down the iov_pfn range passed to
switch_to_super_page() in __domain_mapping()
to the target large page level.
Additionally add range alignment checks to switch_to_super_page.
Fixes: 9906b9352a35 ("iommu/vt-d: Avoid duplicate removing in __domain_mapping()")
Signed-off-by: Eugene Koira <eugkoira(a)amazon.com>
Cc: stable(a)vger.kernel.org
---
drivers/iommu/intel/iommu.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9c3ab9d9f69a..dff2d895b8ab 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1575,6 +1575,10 @@ static void switch_to_super_page(struct dmar_domain *domain,
unsigned long lvl_pages = lvl_to_nr_pages(level);
struct dma_pte *pte = NULL;
+ if (WARN_ON(!IS_ALIGNED(start_pfn, lvl_pages) ||
+ !IS_ALIGNED(end_pfn + 1, lvl_pages)))
+ return;
+
while (start_pfn <= end_pfn) {
if (!pte)
pte = pfn_to_dma_pte(domain, start_pfn, &level,
@@ -1650,7 +1654,8 @@ __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
unsigned long pages_to_remove;
pteval |= DMA_PTE_LARGE_PAGE;
- pages_to_remove = min_t(unsigned long, nr_pages,
+ pages_to_remove = min_t(unsigned long,
+ round_down(nr_pages, lvl_pages),
nr_pte_to_next_page(pte) * lvl_pages);
end_pfn = iov_pfn + pages_to_remove - 1;
switch_to_super_page(domain, iov_pfn, end_pfn, largepage_lvl);
--
2.47.3
Amazon Development Center (Netherlands) B.V., Johanna Westerdijkplein 1, NL-2521 EN The Hague, Registration No. Chamber of Commerce 56869649, VAT: NL 852339859B01
Fix incorrect use of IS_ERR() to check devm_kzalloc() return value.
devm_kzalloc() returns NULL on failure, not an error pointer.
This issue was introduced by commit 4277f035d227
("coresight: trbe: Add a representative coresight_platform_data for TRBE")
which replaced the original function but didn't update the error check.
Fixes: 4277f035d227 ("coresight: trbe: Add a representative coresight_platform_data for TRBE")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/hwtracing/coresight/coresight-trbe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 8267dd1a2130..caf873adfc3a 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -1279,7 +1279,7 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
* into the device for that purpose.
*/
desc.pdata = devm_kzalloc(dev, sizeof(*desc.pdata), GFP_KERNEL);
- if (IS_ERR(desc.pdata))
+ if (!desc.pdata)
goto cpu_clear;
desc.type = CORESIGHT_DEV_TYPE_SINK;
--
2.35.1
The patch titled
Subject: mm: fix folio_expected_ref_count() when PG_private_2
has been added to the -mm mm-new branch. Its filename is
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm: fix folio_expected_ref_count() when PG_private_2
Date: Sun, 31 Aug 2025 02:01:16 -0700 (PDT)
6.16's folio_expected_ref_count() is forgetting the PG_private_2 flag,
which (like PG_private, but not in addition to PG_private) counts for 1
more reference: it needs to be using folio_has_private() in place of
folio_test_private().
But this went wrong earlier: folio_expected_ref_count() was based on (and
replaced) mm/migrate.c's folio_expected_refs(), which has been using
folio_test_private() since 6.0 converted to folios(): before that,
expected_page_refs() was correctly using page_has_private().
Just a few filesystems are still using PG_private_2 a.k.a. PG_fscache.
Potentially, this fix re-enables page migration on their folios; but it
would not be surprising to learn that in practice those folios are not
migratable for other reasons.
Link: https://lkml.kernel.org/r/f91ee36e-a8cb-e3a4-c23b-524ff3848da7@google.com
Fixes: 86ebd50224c0 ("mm: add folio_expected_ref_count() for reference count calculation")
Fixes: 108ca8358139 ("mm/migrate: Convert expected_page_refs() to folio_expected_refs()")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mm.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/mm.h~mm-fix-folio_expected_ref_count-when-pg_private_2
+++ a/include/linux/mm.h
@@ -2234,8 +2234,8 @@ static inline int folio_expected_ref_cou
} else {
/* One reference per page from the pagecache. */
ref_count += !!folio->mapping << order;
- /* One reference from PG_private. */
- ref_count += folio_test_private(folio);
+ /* One reference from PG_private or PG_private_2. */
+ ref_count += folio_has_private(folio);
}
/* One reference per page table mapping. */
_
Patches currently in -mm which might be from hughd(a)google.com are
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
mm-folio_may_be_cached-unless-folio_test_large.patch
mm-lru_add_drain_all-do-local-lru_add_drain-first.patch
Please backport the following commit into Linux v5.4 and newer stable
releases:
80dc18a0cba8dea42614f021b20a04354b213d86
The backport will likely depend on macro rename commit:
817f989700fddefa56e5e443e7d138018ca6709d
This part of commit description clarifies why this is a fix:
"
As per PCIe r6.0, sec 6.6.1, a Downstream Port that supports Link speeds
greater than 5.0 GT/s, software must wait a minimum of 100 ms after Link
training completes before sending a Configuration Request.
"
In practice, this makes detection of PCIe Gen3 and Gen4 SSDs reliable on
Renesas R-Car V4H SoC. Without this commit, the SSDs sporadically do not
get detected, or sometimes they link up in Gen1 mode.
This fixes commit
886bc5ceb5cc ("PCI: designware: Add generic dw_pcie_wait_for_link()")
which is in v4.5-rc1-4-g886bc5ceb5cc3 , so I think this fix should be
backported to all currently maintained stable releases, i.e. v5.4+ .
Thank you
--
Best regards,
Marek Vasut
The patch titled
Subject: mm: folio_may_be_cached() unless folio_test_large()
has been added to the -mm mm-new branch. Its filename is
mm-folio_may_be_cached-unless-folio_test_large.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm: folio_may_be_cached() unless folio_test_large()
Date: Sun, 31 Aug 2025 02:16:25 -0700 (PDT)
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a
large folio is added: so collect_longterm_unpinnable_folios() just wastes
effort when calling lru_add_drain_all() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we
might well benefit from batching a small number of low-order mTHPs (though
unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred to
"folio_is_batchable" because large folios can well be put on a batch: it's
just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/861c061c-51cd-b940-49df-9f55e1fee2c8@google.com
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/swap.h | 10 ++++++++++
mm/gup.c | 5 +++--
mm/mlock.c | 6 +++---
mm/swap.c | 2 +-
4 files changed, 17 insertions(+), 6 deletions(-)
--- a/include/linux/swap.h~mm-folio_may_be_cached-unless-folio_test_large
+++ a/include/linux/swap.h
@@ -381,6 +381,16 @@ void folio_add_lru_vma(struct folio *, s
void mark_page_accessed(struct page *);
void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_cached(struct folio *folio)
+{
+ /*
+ * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+ * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+ * will be sensible, but nobody has implemented and tested that yet.
+ */
+ return !folio_test_large(folio);
+}
+
extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void)
--- a/mm/gup.c~mm-folio_may_be_cached-unless-folio_test_large
+++ a/mm/gup.c
@@ -2309,8 +2309,9 @@ static unsigned long collect_longterm_un
continue;
}
- if (drain_allow && folio_ref_count(folio) !=
- folio_expected_ref_count(folio) + 1) {
+ if (drain_allow && folio_may_be_cached(folio) &&
+ folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
drain_allow = false;
}
--- a/mm/mlock.c~mm-folio_may_be_cached-unless-folio_test_large
+++ a/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_new(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
*/
folio_get(folio);
if (!folio_batch_add(fbatch, folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
--- a/mm/swap.c~mm-folio_may_be_cached-unless-folio_test_large
+++ a/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(s
local_lock(&cpu_fbatches.lock);
if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_cached(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
_
Patches currently in -mm which might be from hughd(a)google.com are
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
mm-folio_may_be_cached-unless-folio_test_large.patch
mm-lru_add_drain_all-do-local-lru_add_drain-first.patch
The patch titled
Subject: mm: Revert "mm: vmscan.c: fix OOM on swap stress test"
has been added to the -mm mm-new branch. Its filename is
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm: Revert "mm: vmscan.c: fix OOM on swap stress test"
Date: Sun, 31 Aug 2025 02:13:55 -0700 (PDT)
This reverts commit 0885ef4705607936fc36a38fd74356e1c465b023: that
was a fix to the reverted 33dfe9204f29b415bbc0abb1a50642d1ba94f5e9.
Link: https://lkml.kernel.org/r/bd440614-3d2c-31d6-1b8f-9635c69cef7c@google.com
Fixes: 0885ef470560 ("mm: vmscan.c: fix OOM on swap stress test")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/vmscan.c~mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test
+++ a/mm/vmscan.c
@@ -4500,7 +4500,7 @@ static bool sort_folio(struct lruvec *lr
}
/* ineligible */
- if (!folio_test_lru(folio) || zone > sc->reclaim_idx) {
+ if (zone > sc->reclaim_idx) {
gen = folio_inc_gen(lruvec, folio, false);
list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
return true;
_
Patches currently in -mm which might be from hughd(a)google.com are
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
mm-folio_may_be_cached-unless-folio_test_large.patch
mm-lru_add_drain_all-do-local-lru_add_drain-first.patch
The patch titled
Subject: mm: Revert "mm/gup: clear the LRU flag of a page before adding to LRU batch"
has been added to the -mm mm-new branch. Its filename is
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm: Revert "mm/gup: clear the LRU flag of a page before adding to LRU batch"
Date: Sun, 31 Aug 2025 02:11:33 -0700 (PDT)
This reverts commit 33dfe9204f29b415bbc0abb1a50642d1ba94f5e9: now that
collect_longterm_unpinnable_folios() is checking ref_count instead of lru,
and mlock/munlock do not participate in the revised LRU flag clearing,
those changes are misleading, and enlarge the window during which
mlock/munlock may miss an mlock_count update.
It is possible (I'd hesitate to claim probable) that the greater
likelihood of missed mlock_count updates would explain the "Realtime
threads delayed due to kcompactd0" observed on 6.12 in the Link below. If
that is the case, this reversion will help; but a complete solution needs
also a further patch, beyond the scope of this series.
Included some 80-column cleanup around folio_batch_add_and_move().
The role of folio_test_clear_lru() (before taking per-memcg lru_lock) is
questionable since 6.13 removed mem_cgroup_move_account() etc; but perhaps
there are still some races which need it - not examined here.
Link: https://lore.kernel.org/linux-mm/DU0PR01MB10385345F7153F334100981888259A@DU…
Link: https://lkml.kernel.org/r/0215a42b-99cd-612a-95f7-56f8251d99ef@google.com
Fixes: 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding to LRU batch")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/swap.c | 50 ++++++++++++++++++++++++++------------------------
1 file changed, 26 insertions(+), 24 deletions(-)
--- a/mm/swap.c~mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch
+++ a/mm/swap.c
@@ -164,6 +164,10 @@ static void folio_batch_move_lru(struct
for (i = 0; i < folio_batch_count(fbatch); i++) {
struct folio *folio = fbatch->folios[i];
+ /* block memcg migration while the folio moves between lru */
+ if (move_fn != lru_add && !folio_test_clear_lru(folio))
+ continue;
+
folio_lruvec_relock_irqsave(folio, &lruvec, &flags);
move_fn(lruvec, folio);
@@ -176,14 +180,10 @@ static void folio_batch_move_lru(struct
}
static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
- struct folio *folio, move_fn_t move_fn,
- bool on_lru, bool disable_irq)
+ struct folio *folio, move_fn_t move_fn, bool disable_irq)
{
unsigned long flags;
- if (on_lru && !folio_test_clear_lru(folio))
- return;
-
folio_get(folio);
if (disable_irq)
@@ -191,8 +191,8 @@ static void __folio_batch_add_and_move(s
else
local_lock(&cpu_fbatches.lock);
- if (!folio_batch_add(this_cpu_ptr(fbatch), folio) || folio_test_large(folio) ||
- lru_cache_disabled())
+ if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
+ folio_test_large(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
@@ -201,13 +201,13 @@ static void __folio_batch_add_and_move(s
local_unlock(&cpu_fbatches.lock);
}
-#define folio_batch_add_and_move(folio, op, on_lru) \
- __folio_batch_add_and_move( \
- &cpu_fbatches.op, \
- folio, \
- op, \
- on_lru, \
- offsetof(struct cpu_fbatches, op) >= offsetof(struct cpu_fbatches, lock_irq) \
+#define folio_batch_add_and_move(folio, op) \
+ __folio_batch_add_and_move( \
+ &cpu_fbatches.op, \
+ folio, \
+ op, \
+ offsetof(struct cpu_fbatches, op) >= \
+ offsetof(struct cpu_fbatches, lock_irq) \
)
static void lru_move_tail(struct lruvec *lruvec, struct folio *folio)
@@ -231,10 +231,10 @@ static void lru_move_tail(struct lruvec
void folio_rotate_reclaimable(struct folio *folio)
{
if (folio_test_locked(folio) || folio_test_dirty(folio) ||
- folio_test_unevictable(folio))
+ folio_test_unevictable(folio) || !folio_test_lru(folio))
return;
- folio_batch_add_and_move(folio, lru_move_tail, true);
+ folio_batch_add_and_move(folio, lru_move_tail);
}
void lru_note_cost_unlock_irq(struct lruvec *lruvec, bool file,
@@ -328,10 +328,11 @@ static void folio_activate_drain(int cpu
void folio_activate(struct folio *folio)
{
- if (folio_test_active(folio) || folio_test_unevictable(folio))
+ if (folio_test_active(folio) || folio_test_unevictable(folio) ||
+ !folio_test_lru(folio))
return;
- folio_batch_add_and_move(folio, lru_activate, true);
+ folio_batch_add_and_move(folio, lru_activate);
}
#else
@@ -507,7 +508,7 @@ void folio_add_lru(struct folio *folio)
lru_gen_in_fault() && !(current->flags & PF_MEMALLOC))
folio_set_active(folio);
- folio_batch_add_and_move(folio, lru_add, false);
+ folio_batch_add_and_move(folio, lru_add);
}
EXPORT_SYMBOL(folio_add_lru);
@@ -685,13 +686,13 @@ void lru_add_drain_cpu(int cpu)
void deactivate_file_folio(struct folio *folio)
{
/* Deactivating an unevictable folio will not accelerate reclaim */
- if (folio_test_unevictable(folio))
+ if (folio_test_unevictable(folio) || !folio_test_lru(folio))
return;
if (lru_gen_enabled() && lru_gen_clear_refs(folio))
return;
- folio_batch_add_and_move(folio, lru_deactivate_file, true);
+ folio_batch_add_and_move(folio, lru_deactivate_file);
}
/*
@@ -704,13 +705,13 @@ void deactivate_file_folio(struct folio
*/
void folio_deactivate(struct folio *folio)
{
- if (folio_test_unevictable(folio))
+ if (folio_test_unevictable(folio) || !folio_test_lru(folio))
return;
if (lru_gen_enabled() ? lru_gen_clear_refs(folio) : !folio_test_active(folio))
return;
- folio_batch_add_and_move(folio, lru_deactivate, true);
+ folio_batch_add_and_move(folio, lru_deactivate);
}
/**
@@ -723,10 +724,11 @@ void folio_deactivate(struct folio *foli
void folio_mark_lazyfree(struct folio *folio)
{
if (!folio_test_anon(folio) || !folio_test_swapbacked(folio) ||
+ !folio_test_lru(folio) ||
folio_test_swapcache(folio) || folio_test_unevictable(folio))
return;
- folio_batch_add_and_move(folio, lru_lazyfree, true);
+ folio_batch_add_and_move(folio, lru_lazyfree);
}
void lru_add_drain(void)
_
Patches currently in -mm which might be from hughd(a)google.com are
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
mm-folio_may_be_cached-unless-folio_test_large.patch
mm-lru_add_drain_all-do-local-lru_add_drain-first.patch
The patch titled
Subject: mm/gup: local lru_add_drain() to avoid lru_add_drain_all()
has been added to the -mm mm-new branch. Its filename is
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/gup: local lru_add_drain() to avoid lru_add_drain_all()
Date: Sun, 31 Aug 2025 02:08:26 -0700 (PDT)
In many cases, if collect_longterm_unpinnable_folios() does need to drain
the LRU cache to release a reference, the cache in question is on this
same CPU, and much more efficiently drained by a preliminary local
lru_add_drain(), than the later cross-CPU lru_add_drain_all().
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration". Note for clean
backports: can take 6.16 commit a03db236aebf ("gup: optimize longterm
pin_user_pages() for large folio") first.
Link: https://lkml.kernel.org/r/165ccfdb-16b5-ac11-0d66-2984293590c8@google.com
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/gup.c~mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all
+++ a/mm/gup.c
@@ -2291,6 +2291,8 @@ static unsigned long collect_longterm_un
struct folio *folio;
long i = 0;
+ lru_add_drain();
+
for (folio = pofs_get_folio(pofs, i); folio;
folio = pofs_next_folio(folio, pofs, &i)) {
_
Patches currently in -mm which might be from hughd(a)google.com are
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
mm-folio_may_be_cached-unless-folio_test_large.patch
mm-lru_add_drain_all-do-local-lru_add_drain-first.patch
The patch titled
Subject: mm/gup: check ref_count instead of lru before migration
has been added to the -mm mm-new branch. Its filename is
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/gup: check ref_count instead of lru before migration
Date: Sun, 31 Aug 2025 02:05:56 -0700 (PDT)
Will Deacon reports:-
When taking a longterm GUP pin via pin_user_pages(),
__gup_longterm_locked() tries to migrate target folios that should not be
longterm pinned, for example because they reside in a CMA region or
movable zone. This is done by first pinning all of the target folios
anyway, collecting all of the longterm-unpinnable target folios into a
list, dropping the pins that were just taken and finally handing the list
off to migrate_pages() for the actual migration.
It is critically important that no unexpected references are held on the
folios being migrated, otherwise the migration will fail and
pin_user_pages() will return -ENOMEM to its caller. Unfortunately, it is
relatively easy to observe migration failures when running pKVM (which
uses pin_user_pages() on crosvm's virtual address space to resolve stage-2
page faults from the guest) on a 6.15-based Pixel 6 device and this
results in the VM terminating prematurely.
In the failure case, 'crosvm' has called mlock(MLOCK_ONFAULT) on its
mapping of guest memory prior to the pinning. Subsequently, when
pin_user_pages() walks the page-table, the relevant 'pte' is not present
and so the faulting logic allocates a new folio, mlocks it with
mlock_folio() and maps it in the page-table.
Since commit 2fbb0c10d1e8 ("mm/munlock: mlock_page() munlock_page() batch
by pagevec"), mlock/munlock operations on a folio (formerly page), are
deferred. For example, mlock_folio() takes an additional reference on the
target folio before placing it into a per-cpu 'folio_batch' for later
processing by mlock_folio_batch(), which drops the refcount once the
operation is complete. Processing of the batches is coupled with the LRU
batch logic and can be forcefully drained with lru_add_drain_all() but as
long as a folio remains unprocessed on the batch, its refcount will be
elevated.
This deferred batching therefore interacts poorly with the pKVM pinning
scenario as we can find ourselves in a situation where the migration code
fails to migrate a folio due to the elevated refcount from the pending
mlock operation.
Hugh Dickins adds:-
!folio_test_lru() has never been a very reliable way to tell if an
lru_add_drain_all() is worth calling, to remove LRU cache references to
make the folio migratable: the LRU flag may be set even while the folio is
held with an extra reference in a per-CPU LRU cache.
5.18 commit 2fbb0c10d1e8 may have made it more unreliable. Then 6.11
commit 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding
to LRU batch") tried to make it reliable, by moving LRU flag clearing; but
missed the mlock/munlock batches, so still unreliable as reported.
And it turns out to be difficult to extend 33dfe9204f29's LRU flag
clearing to the mlock/munlock batches: if they do benefit from batching,
mlock/munlock cannot be so effective when easily suppressed while !LRU.
Instead, switch to an expected ref_count check, which was more reliable
all along: some more false positives (unhelpful drains) than before, and
never a guarantee that the folio will prove migratable, but better.
Note for stable backports: requires 6.16 commit 86ebd50224c0 ("mm: add
folio_expected_ref_count() for reference count calculation") and 6.17
commit ("mm: fix folio_expected_ref_count() when PG_private_2").
Link: https://lkml.kernel.org/r/47c51c9a-140f-1ea1-b692-c4bae5d1fa58@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Will Deacon <will(a)kernel.org>
Link: https://lore.kernel.org/linux-mm/20250815101858.24352-1-will@kernel.org/
Cc: <stable(a)vger.kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/gup.c~mm-gup-check-ref_count-instead-of-lru-before-migration
+++ a/mm/gup.c
@@ -2307,7 +2307,8 @@ static unsigned long collect_longterm_un
continue;
}
- if (!folio_test_lru(folio) && drain_allow) {
+ if (drain_allow && folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
drain_allow = false;
}
_
Patches currently in -mm which might be from hughd(a)google.com are
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
mm-folio_may_be_cached-unless-folio_test_large.patch
mm-lru_add_drain_all-do-local-lru_add_drain-first.patch
The patch titled
Subject: mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Uladzislau Rezki (Sony)" <urezki(a)gmail.com>
Subject: mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc()
Date: Sun, 31 Aug 2025 14:10:58 +0200
kasan_populate_vmalloc() and its helpers ignore the caller's gfp_mask and
always allocate memory using the hardcoded GFP_KERNEL flag. This makes
them inconsistent with vmalloc(), which was recently extended to support
GFP_NOFS and GFP_NOIO allocations.
Page table allocations performed during shadow population also ignore the
external gfp_mask. To preserve the intended semantics of GFP_NOFS and
GFP_NOIO, wrap the apply_to_page_range() calls into the appropriate
memalloc scope.
This patch:
- Extends kasan_populate_vmalloc() and helpers to take gfp_mask;
- Passes gfp_mask down to alloc_pages_bulk() and __get_free_page();
- Enforces GFP_NOFS/NOIO semantics with memalloc_*_save()/restore()
around apply_to_page_range();
- Updates vmalloc.c and percpu allocator call sites accordingly.
Link: https://lkml.kernel.org/r/20250831121058.92971-1-urezki@gmail.com
Fixes: 451769ebb7e7 ("mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc")
Signed-off-by: Uladzislau Rezki (Sony) <urezki(a)gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/kasan.h | 6 +++---
mm/kasan/shadow.c | 31 ++++++++++++++++++++++++-------
mm/vmalloc.c | 8 ++++----
3 files changed, 31 insertions(+), 14 deletions(-)
--- a/include/linux/kasan.h~mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc
+++ a/include/linux/kasan.h
@@ -562,7 +562,7 @@ static inline void kasan_init_hw_tags(vo
#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
void kasan_populate_early_vm_area_shadow(void *start, unsigned long size);
-int kasan_populate_vmalloc(unsigned long addr, unsigned long size);
+int kasan_populate_vmalloc(unsigned long addr, unsigned long size, gfp_t gfp_mask);
void kasan_release_vmalloc(unsigned long start, unsigned long end,
unsigned long free_region_start,
unsigned long free_region_end,
@@ -574,7 +574,7 @@ static inline void kasan_populate_early_
unsigned long size)
{ }
static inline int kasan_populate_vmalloc(unsigned long start,
- unsigned long size)
+ unsigned long size, gfp_t gfp_mask)
{
return 0;
}
@@ -610,7 +610,7 @@ static __always_inline void kasan_poison
static inline void kasan_populate_early_vm_area_shadow(void *start,
unsigned long size) { }
static inline int kasan_populate_vmalloc(unsigned long start,
- unsigned long size)
+ unsigned long size, gfp_t gfp_mask)
{
return 0;
}
--- a/mm/kasan/shadow.c~mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc
+++ a/mm/kasan/shadow.c
@@ -336,13 +336,13 @@ static void ___free_pages_bulk(struct pa
}
}
-static int ___alloc_pages_bulk(struct page **pages, int nr_pages)
+static int ___alloc_pages_bulk(struct page **pages, int nr_pages, gfp_t gfp_mask)
{
unsigned long nr_populated, nr_total = nr_pages;
struct page **page_array = pages;
while (nr_pages) {
- nr_populated = alloc_pages_bulk(GFP_KERNEL, nr_pages, pages);
+ nr_populated = alloc_pages_bulk(gfp_mask, nr_pages, pages);
if (!nr_populated) {
___free_pages_bulk(page_array, nr_total - nr_pages);
return -ENOMEM;
@@ -354,25 +354,42 @@ static int ___alloc_pages_bulk(struct pa
return 0;
}
-static int __kasan_populate_vmalloc(unsigned long start, unsigned long end)
+static int __kasan_populate_vmalloc(unsigned long start, unsigned long end, gfp_t gfp_mask)
{
unsigned long nr_pages, nr_total = PFN_UP(end - start);
struct vmalloc_populate_data data;
+ unsigned int flags;
int ret = 0;
- data.pages = (struct page **)__get_free_page(GFP_KERNEL | __GFP_ZERO);
+ data.pages = (struct page **)__get_free_page(gfp_mask | __GFP_ZERO);
if (!data.pages)
return -ENOMEM;
while (nr_total) {
nr_pages = min(nr_total, PAGE_SIZE / sizeof(data.pages[0]));
- ret = ___alloc_pages_bulk(data.pages, nr_pages);
+ ret = ___alloc_pages_bulk(data.pages, nr_pages, gfp_mask);
if (ret)
break;
data.start = start;
+
+ /*
+ * page tables allocations ignore external gfp mask, enforce it
+ * by the scope API
+ */
+ if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
+ flags = memalloc_nofs_save();
+ else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
+ flags = memalloc_noio_save();
+
ret = apply_to_page_range(&init_mm, start, nr_pages * PAGE_SIZE,
kasan_populate_vmalloc_pte, &data);
+
+ if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
+ memalloc_nofs_restore(flags);
+ else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
+ memalloc_noio_restore(flags);
+
___free_pages_bulk(data.pages, nr_pages);
if (ret)
break;
@@ -386,7 +403,7 @@ static int __kasan_populate_vmalloc(unsi
return ret;
}
-int kasan_populate_vmalloc(unsigned long addr, unsigned long size)
+int kasan_populate_vmalloc(unsigned long addr, unsigned long size, gfp_t gfp_mask)
{
unsigned long shadow_start, shadow_end;
int ret;
@@ -415,7 +432,7 @@ int kasan_populate_vmalloc(unsigned long
shadow_start = PAGE_ALIGN_DOWN(shadow_start);
shadow_end = PAGE_ALIGN(shadow_end);
- ret = __kasan_populate_vmalloc(shadow_start, shadow_end);
+ ret = __kasan_populate_vmalloc(shadow_start, shadow_end, gfp_mask);
if (ret)
return ret;
--- a/mm/vmalloc.c~mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc
+++ a/mm/vmalloc.c
@@ -2026,6 +2026,8 @@ static struct vmap_area *alloc_vmap_area
if (unlikely(!vmap_initialized))
return ERR_PTR(-EBUSY);
+ /* Only reclaim behaviour flags are relevant. */
+ gfp_mask = gfp_mask & GFP_RECLAIM_MASK;
might_sleep();
/*
@@ -2038,8 +2040,6 @@ static struct vmap_area *alloc_vmap_area
*/
va = node_alloc(size, align, vstart, vend, &addr, &vn_id);
if (!va) {
- gfp_mask = gfp_mask & GFP_RECLAIM_MASK;
-
va = kmem_cache_alloc_node(vmap_area_cachep, gfp_mask, node);
if (unlikely(!va))
return ERR_PTR(-ENOMEM);
@@ -2089,7 +2089,7 @@ retry:
BUG_ON(va->va_start < vstart);
BUG_ON(va->va_end > vend);
- ret = kasan_populate_vmalloc(addr, size);
+ ret = kasan_populate_vmalloc(addr, size, gfp_mask);
if (ret) {
free_vmap_area(va);
return ERR_PTR(ret);
@@ -4826,7 +4826,7 @@ retry:
/* populate the kasan shadow space */
for (area = 0; area < nr_vms; area++) {
- if (kasan_populate_vmalloc(vas[area]->va_start, sizes[area]))
+ if (kasan_populate_vmalloc(vas[area]->va_start, sizes[area], GFP_KERNEL))
goto err_free_shadow;
}
_
Patches currently in -mm which might be from urezki(a)gmail.com are
mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc.patch
The patch titled
Subject: ocfs2: fix recursive semaphore deadlock in fiemap call
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
ocfs2-fix-recursive-semaphore-deadlock-in-fiemap-call.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Mark Tinguely <mark.tinguely(a)oracle.com>
Subject: ocfs2: fix recursive semaphore deadlock in fiemap call
Date: Fri, 29 Aug 2025 10:18:15 -0500
syzbot detected a OCFS2 hang due to a recursive semaphore on a
FS_IOC_FIEMAP of the extent list on a specially crafted mmap file.
context_switch kernel/sched/core.c:5357 [inline]
__schedule+0x1798/0x4cc0 kernel/sched/core.c:6961
__schedule_loop kernel/sched/core.c:7043 [inline]
schedule+0x165/0x360 kernel/sched/core.c:7058
schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:7115
rwsem_down_write_slowpath+0x872/0xfe0 kernel/locking/rwsem.c:1185
__down_write_common kernel/locking/rwsem.c:1317 [inline]
__down_write kernel/locking/rwsem.c:1326 [inline]
down_write+0x1ab/0x1f0 kernel/locking/rwsem.c:1591
ocfs2_page_mkwrite+0x2ff/0xc40 fs/ocfs2/mmap.c:142
do_page_mkwrite+0x14d/0x310 mm/memory.c:3361
wp_page_shared mm/memory.c:3762 [inline]
do_wp_page+0x268d/0x5800 mm/memory.c:3981
handle_pte_fault mm/memory.c:6068 [inline]
__handle_mm_fault+0x1033/0x5440 mm/memory.c:6195
handle_mm_fault+0x40a/0x8e0 mm/memory.c:6364
do_user_addr_fault+0x764/0x1390 arch/x86/mm/fault.c:1387
handle_page_fault arch/x86/mm/fault.c:1476 [inline]
exc_page_fault+0x76/0xf0 arch/x86/mm/fault.c:1532
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
RIP: 0010:copy_user_generic arch/x86/include/asm/uaccess_64.h:126 [inline]
RIP: 0010:raw_copy_to_user arch/x86/include/asm/uaccess_64.h:147 [inline]
RIP: 0010:_inline_copy_to_user include/linux/uaccess.h:197 [inline]
RIP: 0010:_copy_to_user+0x85/0xb0 lib/usercopy.c:26
Code: e8 00 bc f7 fc 4d 39 fc 72 3d 4d 39 ec 77 38 e8 91 b9 f7 fc 4c 89
f7 89 de e8 47 25 5b fd 0f 01 cb 4c 89 ff 48 89 d9 4c 89 f6 <f3> a4 0f
1f 00 48 89 cb 0f 01 ca 48 89 d8 5b 41 5c 41 5d 41 5e 41
RSP: 0018:ffffc9000403f950 EFLAGS: 00050256
RAX: ffffffff84c7f101 RBX: 0000000000000038 RCX: 0000000000000038
RDX: 0000000000000000 RSI: ffffc9000403f9e0 RDI: 0000200000000060
RBP: ffffc9000403fa90 R08: ffffc9000403fa17 R09: 1ffff92000807f42
R10: dffffc0000000000 R11: fffff52000807f43 R12: 0000200000000098
R13: 00007ffffffff000 R14: ffffc9000403f9e0 R15: 0000200000000060
copy_to_user include/linux/uaccess.h:225 [inline]
fiemap_fill_next_extent+0x1c0/0x390 fs/ioctl.c:145
ocfs2_fiemap+0x888/0xc90 fs/ocfs2/extent_map.c:806
ioctl_fiemap fs/ioctl.c:220 [inline]
do_vfs_ioctl+0x1173/0x1430 fs/ioctl.c:532
__do_sys_ioctl fs/ioctl.c:596 [inline]
__se_sys_ioctl+0x82/0x170 fs/ioctl.c:584
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5f13850fd9
RSP: 002b:00007ffe3b3518b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000200000000000 RCX: 00007f5f13850fd9
RDX: 0000200000000040 RSI: 00000000c020660b RDI: 0000000000000004
RBP: 6165627472616568 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe3b3518f0
R13: 00007ffe3b351b18 R14: 431bde82d7b634db R15: 00007f5f1389a03b
ocfs2_fiemap() takes a read lock of the ip_alloc_sem semaphore (since
v2.6.22-527-g7307de80510a) and calls fiemap_fill_next_extent() to read the
extent list of this running mmap executable. The user supplied buffer to
hold the fiemap information page faults calling ocfs2_page_mkwrite() which
will take a write lock (since v2.6.27-38-g00dc417fa3e7) of the same
semaphore. This recursive semaphore will hold filesystem locks and causes
a hang of the fileystem.
The ip_alloc_sem protects the inode extent list and size. Release the
read semphore before calling fiemap_fill_next_extent() in ocfs2_fiemap()
and ocfs2_fiemap_inline(). This does an unnecessary semaphore lock/unlock
on the last extent but simplifies the error path.
Link: https://lkml.kernel.org/r/61d1a62b-2631-4f12-81e2-cd689914360b@oracle.com
Fixes: 00dc417fa3e7 ("ocfs2: fiemap support")
Signed-off-by: Mark Tinguely <mark.tinguely(a)oracle.com>
Reported-by: syzbot+541dcc6ee768f77103e7(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=541dcc6ee768f77103e7
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/extent_map.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
--- a/fs/ocfs2/extent_map.c~ocfs2-fix-recursive-semaphore-deadlock-in-fiemap-call
+++ a/fs/ocfs2/extent_map.c
@@ -706,6 +706,8 @@ out:
* it not only handles the fiemap for inlined files, but also deals
* with the fast symlink, cause they have no difference for extent
* mapping per se.
+ *
+ * Must be called with ip_alloc_sem semaphore held.
*/
static int ocfs2_fiemap_inline(struct inode *inode, struct buffer_head *di_bh,
struct fiemap_extent_info *fieinfo,
@@ -717,6 +719,7 @@ static int ocfs2_fiemap_inline(struct in
u64 phys;
u32 flags = FIEMAP_EXTENT_DATA_INLINE|FIEMAP_EXTENT_LAST;
struct ocfs2_inode_info *oi = OCFS2_I(inode);
+ lockdep_assert_held_read(&oi->ip_alloc_sem);
di = (struct ocfs2_dinode *)di_bh->b_data;
if (ocfs2_inode_is_fast_symlink(inode))
@@ -732,8 +735,11 @@ static int ocfs2_fiemap_inline(struct in
phys += offsetof(struct ocfs2_dinode,
id2.i_data.id_data);
+ /* Release the ip_alloc_sem to prevent deadlock on page fault */
+ up_read(&OCFS2_I(inode)->ip_alloc_sem);
ret = fiemap_fill_next_extent(fieinfo, 0, phys, id_count,
flags);
+ down_read(&OCFS2_I(inode)->ip_alloc_sem);
if (ret < 0)
return ret;
}
@@ -802,9 +808,11 @@ int ocfs2_fiemap(struct inode *inode, st
len_bytes = (u64)le16_to_cpu(rec.e_leaf_clusters) << osb->s_clustersize_bits;
phys_bytes = le64_to_cpu(rec.e_blkno) << osb->sb->s_blocksize_bits;
virt_bytes = (u64)le32_to_cpu(rec.e_cpos) << osb->s_clustersize_bits;
-
+ /* Release the ip_alloc_sem to prevent deadlock on page fault */
+ up_read(&OCFS2_I(inode)->ip_alloc_sem);
ret = fiemap_fill_next_extent(fieinfo, virt_bytes, phys_bytes,
len_bytes, fe_flags);
+ down_read(&OCFS2_I(inode)->ip_alloc_sem);
if (ret)
break;
_
Patches currently in -mm which might be from mark.tinguely(a)oracle.com are
ocfs2-fix-recursive-semaphore-deadlock-in-fiemap-call.patch
Remove the logic to set interrupt mask by default in uio_hv_generic
driver as the interrupt mask value is supposed to be controlled
completely by the user space. If the mask bit gets changed
by the driver, concurrently with user mode operating on the ring,
the mask bit may be set when it is supposed to be clear, and the
user-mode driver will miss an interrupt which will cause a hang.
For eg- when the driver sets inbound ring buffer interrupt mask to 1,
the host does not interrupt the guest on the UIO VMBus channel.
However, setting the mask does not prevent the host from putting a
message in the inbound ring buffer. So let’s assume that happens,
the host puts a message into the ring buffer but does not interrupt.
Subsequently, the user space code in the guest sets the inbound ring
buffer interrupt mask to 0, saying “Hey, I’m ready for interrupts”.
User space code then calls pread() to wait for an interrupt.
Then one of two things happens:
* The host never sends another message. So the pread() waits forever.
* The host does send another message. But because there’s already a
message in the ring buffer, it doesn’t generate an interrupt.
This is the correct behavior, because the host should only send an
interrupt when the inbound ring buffer transitions from empty to
not-empty. Adding an additional message to a ring buffer that is not
empty is not supposed to generate an interrupt on the guest.
Since the guest is waiting in pread() and not removing messages from
the ring buffer, the pread() waits forever.
This could be easily reproduced in hv_fcopy_uio_daemon if we delay
setting interrupt mask to 0.
Similarly if hv_uio_channel_cb() sets the interrupt_mask to 1,
there’s a race condition. Once user space empties the inbound ring
buffer, but before user space sets interrupt_mask to 0, the host could
put another message in the ring buffer but it wouldn’t interrupt.
Then the next pread() would hang.
Fix these by removing all instances where interrupt_mask is changed,
while keeping the one in set_event() unchanged to enable userspace
control the interrupt mask by writing 0/1 to /dev/uioX.
Fixes: 95096f2fbd10 ("uio-hv-generic: new userspace i/o driver for VMBus")
Suggested-by: John Starks <jostarks(a)microsoft.com>
Signed-off-by: Naman Jain <namjain(a)linux.microsoft.com>
Cc: <stable(a)vger.kernel.org>
---
Changes since v1:
https://lore.kernel.org/all/20250818064846.271294-1-namjain@linux.microsoft…
* Added Fixes and Cc stable tags.
---
drivers/uio/uio_hv_generic.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index f19efad4d6f8..3f8e2e27697f 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -111,7 +111,6 @@ static void hv_uio_channel_cb(void *context)
struct hv_device *hv_dev;
struct hv_uio_private_data *pdata;
- chan->inbound.ring_buffer->interrupt_mask = 1;
virt_mb();
/*
@@ -183,8 +182,6 @@ hv_uio_new_channel(struct vmbus_channel *new_sc)
return;
}
- /* Disable interrupts on sub channel */
- new_sc->inbound.ring_buffer->interrupt_mask = 1;
set_channel_read_mode(new_sc, HV_CALL_ISR);
ret = hv_create_ring_sysfs(new_sc, hv_uio_ring_mmap);
if (ret) {
@@ -227,9 +224,7 @@ hv_uio_open(struct uio_info *info, struct inode *inode)
ret = vmbus_connect_ring(dev->channel,
hv_uio_channel_cb, dev->channel);
- if (ret == 0)
- dev->channel->inbound.ring_buffer->interrupt_mask = 1;
- else
+ if (ret)
atomic_dec(&pdata->refcnt);
return ret;
--
2.34.1
batadv_nc_skb_decode_packet() trusts coded_len and checks only against
skb->len. XOR starts at sizeof(struct batadv_unicast_packet), reducing
payload headroom, and the source skb length is not verified, allowing an
out-of-bounds read and a small out-of-bounds write.
Validate that coded_len fits within the payload area of both destination
and source sk_buffs before XORing.
Fixes: 2df5278b0267 ("batman-adv: network coding - receive coded packets and decode them")
Cc: stable(a)vger.kernel.org
Reported-by: Stanislav Fort <disclosure(a)aisle.com>
Signed-off-by: Stanislav Fort <disclosure(a)aisle.com>
---
net/batman-adv/network-coding.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/batman-adv/network-coding.c b/net/batman-adv/network-coding.c
index 9f56308779cc..af97d077369f 100644
--- a/net/batman-adv/network-coding.c
+++ b/net/batman-adv/network-coding.c
@@ -1687,7 +1687,12 @@ batadv_nc_skb_decode_packet(struct batadv_priv *bat_priv, struct sk_buff *skb,
coding_len = ntohs(coded_packet_tmp.coded_len);
- if (coding_len > skb->len)
+ /* ensure dst buffer is large enough (payload only) */
+ if (coding_len + h_size > skb->len)
+ return NULL;
+
+ /* ensure src buffer is large enough (payload only) */
+ if (coding_len + h_size > nc_packet->skb->len)
return NULL;
/* Here the magic is reversed:
--
2.39.3 (Apple Git-146)
Add an explicit check for the xfer length to 'rtl9300_i2c_config_xfer'
to ensure the data length isn't within the supported range. In
particular a data length of 0 is not supported by the hardware and
causes unintended or destructive behaviour.
This limitation becomes obvious when looking at the register
documentation [1]. 4 bits are reserved for DATA_WIDTH and the value
of these 4 bits is used as N + 1, allowing a data length range of
1 <= len <= 16.
Affected by this is the SMBus Quick Operation which works with a data
length of 0. Passing 0 as the length causes an underflow of the value
due to:
(len - 1) & 0xf
and effectively specifying a transfer length of 16 via the registers.
This causes a 16-byte write operation instead of a Quick Write. For
example, on SFP modules without write-protected EEPROM this soft-bricks
them by overwriting some initial bytes.
For completeness, also add a quirk for the zero length.
[1] https://svanheule.net/realtek/longan/register/i2c_mst1_ctrl2
Fixes: c366be720235 ("i2c: Add driver for the RTL9300 I2C controller")
Cc: <stable(a)vger.kernel.org> # v6.13+
Signed-off-by: Jonas Jelonek <jelonek.jonas(a)gmail.com>
Tested-by: Sven Eckelmann <sven(a)narfation.org>
Reviewed-by: Chris Packham <chris.packham(a)alliedtelesis.co.nz>
Tested-by: Chris Packham <chris.packham(a)alliedtelesis.co.nz> # On RTL9302C based board
Tested-by: Markus Stockhausen <markus.stockhausen(a)gmx.de>
---
drivers/i2c/busses/i2c-rtl9300.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-rtl9300.c b/drivers/i2c/busses/i2c-rtl9300.c
index 19c367703eaf..ebd4a85e1bde 100644
--- a/drivers/i2c/busses/i2c-rtl9300.c
+++ b/drivers/i2c/busses/i2c-rtl9300.c
@@ -99,6 +99,9 @@ static int rtl9300_i2c_config_xfer(struct rtl9300_i2c *i2c, struct rtl9300_i2c_c
{
u32 val, mask;
+ if (len < 1 || len > 16)
+ return -EINVAL;
+
val = chan->bus_freq << RTL9300_I2C_MST_CTRL2_SCL_FREQ_OFS;
mask = RTL9300_I2C_MST_CTRL2_SCL_FREQ_MASK;
@@ -352,7 +355,7 @@ static const struct i2c_algorithm rtl9300_i2c_algo = {
};
static struct i2c_adapter_quirks rtl9300_i2c_quirks = {
- .flags = I2C_AQ_NO_CLK_STRETCH,
+ .flags = I2C_AQ_NO_CLK_STRETCH | I2C_AQ_NO_ZERO_LEN,
.max_read_len = 16,
.max_write_len = 16,
};
--
2.48.1
Fix the current check for number of channels (child nodes in the device
tree). Before, this was:
if (device_get_child_node_count(dev) >= RTL9300_I2C_MUX_NCHAN)
RTL9300_I2C_MUX_NCHAN gives the maximum number of channels so checking
with '>=' isn't correct because it doesn't allow the last channel
number. Thus, fix it to:
if (device_get_child_node_count(dev) > RTL9300_I2C_MUX_NCHAN)
Issue occured on a TP-Link TL-ST1008F v2.0 device (8 SFP+ ports) and fix
is tested there.
Fixes: c366be720235 ("i2c: Add driver for the RTL9300 I2C controller")
Cc: <stable(a)vger.kernel.org> # v6.13+
Signed-off-by: Jonas Jelonek <jelonek.jonas(a)gmail.com>
Tested-by: Sven Eckelmann <sven(a)narfation.org>
Reviewed-by: Chris Packham <chris.packham(a)alliedtelesis.co.nz>
Tested-by: Chris Packham <chris.packham(a)alliedtelesis.co.nz> # On RTL9302C based board
Tested-by: Markus Stockhausen <markus.stockhausen(a)gmx.de>
---
drivers/i2c/busses/i2c-rtl9300.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-rtl9300.c b/drivers/i2c/busses/i2c-rtl9300.c
index 4b215f9a24e6..19c367703eaf 100644
--- a/drivers/i2c/busses/i2c-rtl9300.c
+++ b/drivers/i2c/busses/i2c-rtl9300.c
@@ -382,7 +382,7 @@ static int rtl9300_i2c_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, i2c);
- if (device_get_child_node_count(dev) >= RTL9300_I2C_MUX_NCHAN)
+ if (device_get_child_node_count(dev) > RTL9300_I2C_MUX_NCHAN)
return dev_err_probe(dev, -EINVAL, "Too many channels\n");
device_for_each_child_node(dev, child) {
--
2.48.1
Hi Stable,
Please provide a quote for your products:
Include:
1.Pricing (per unit)
2.Delivery cost & timeline
3.Quote expiry date
Deadline: September
Thanks!
Kamal Prasad
Albinayah Trading
Our user report a bug that his cpu keep the min cpu freq,
bisect to commit ada8d7fa0ad49a2a078f97f7f6e02d24d3c357a3
("sched/cpufreq: Rework schedutil governor performance estimation")
and test the fix e37617c8e53a1f7fcba6d5e1041f4fd8a2425c27
("sched/fair: Fix frequency selection for non-invariant case")
works.
And backport the "stable-deps-of" series:
"consolidate and cleanup CPU capacity"
Link: https://lore.kernel.org/all/20231211104855.558096-1-vincent.guittot@linaro.…
PS:
commit 50b813b147e9eb6546a1fc49d4e703e6d23691f2 in series
("cpufreq/cppc: Move and rename cppc_cpufreq_{perf_to_khz|khz_to_perf}()")
merged in v6.6.59 as 33e89c16cea0882bb05c585fa13236b730dd0efa
Vincent Guittot (8):
sched/topology: Add a new arch_scale_freq_ref() method
cpufreq: Use the fixed and coherent frequency for scaling capacity
cpufreq/schedutil: Use a fixed reference frequency
energy_model: Use a fixed reference frequency
cpufreq/cppc: Set the frequency used for computing the capacity
arm64/amu: Use capacity_ref_freq() to set AMU ratio
topology: Set capacity_freq_ref in all cases
sched/fair: Fix frequency selection for non-invariant case
arch/arm/include/asm/topology.h | 1 +
arch/arm64/include/asm/topology.h | 1 +
arch/arm64/kernel/topology.c | 26 ++++++------
arch/riscv/include/asm/topology.h | 1 +
drivers/base/arch_topology.c | 69 ++++++++++++++++++++-----------
drivers/cpufreq/cpufreq.c | 4 +-
include/linux/arch_topology.h | 8 ++++
include/linux/cpufreq.h | 1 +
include/linux/energy_model.h | 6 +--
include/linux/sched/topology.h | 8 ++++
kernel/sched/cpufreq_schedutil.c | 30 +++++++++++++-
11 files changed, 111 insertions(+), 44 deletions(-)
--
2.20.1
Fix a use-after-free window by correcting the buffer release sequence in
the deferred receive path. The code freed the RQ buffer first and only
then cleared the context pointer under the lock. Concurrent paths
(e.g., ABTS and the repost path) also inspect and release the same
pointer under the lock, so the old order could lead to double-free/UAF.
Note that the repost path already uses the correct pattern: detach the
pointer under the lock, then free it after dropping the lock. The deferred
path should do the same.
Fixes: 472e146d1cf3 ("scsi: lpfc: Correct upcalling nvmet_fc transport during io done downcall")
Cc: stable(a)vger.kernel.org
Signed-off-by: John Evans <evans1210144(a)gmail.com>
---
v2:
* Rework locking to read and clear the buffer pointer atomically, as
suggested by Justin Tee.
---
drivers/scsi/lpfc/lpfc_nvmet.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c
index fba2e62027b7..4cfc928bcf2d 100644
--- a/drivers/scsi/lpfc/lpfc_nvmet.c
+++ b/drivers/scsi/lpfc/lpfc_nvmet.c
@@ -1243,7 +1243,7 @@ lpfc_nvmet_defer_rcv(struct nvmet_fc_target_port *tgtport,
struct lpfc_nvmet_tgtport *tgtp;
struct lpfc_async_xchg_ctx *ctxp =
container_of(rsp, struct lpfc_async_xchg_ctx, hdlrctx.fcp_req);
- struct rqb_dmabuf *nvmebuf = ctxp->rqb_buffer;
+ struct rqb_dmabuf *nvmebuf;
struct lpfc_hba *phba = ctxp->phba;
unsigned long iflag;
@@ -1251,13 +1251,18 @@ lpfc_nvmet_defer_rcv(struct nvmet_fc_target_port *tgtport,
lpfc_nvmeio_data(phba, "NVMET DEFERRCV: xri x%x sz %d CPU %02x\n",
ctxp->oxid, ctxp->size, raw_smp_processor_id());
+ spin_lock_irqsave(&ctxp->ctxlock, iflag);
+ nvmebuf = ctxp->rqb_buffer;
if (!nvmebuf) {
+ spin_unlock_irqrestore(&ctxp->ctxlock, iflag);
lpfc_printf_log(phba, KERN_INFO, LOG_NVME_IOERR,
"6425 Defer rcv: no buffer oxid x%x: "
"flg %x ste %x\n",
ctxp->oxid, ctxp->flag, ctxp->state);
return;
}
+ ctxp->rqb_buffer = NULL;
+ spin_unlock_irqrestore(&ctxp->ctxlock, iflag);
tgtp = phba->targetport->private;
if (tgtp)
@@ -1265,9 +1270,6 @@ lpfc_nvmet_defer_rcv(struct nvmet_fc_target_port *tgtport,
/* Free the nvmebuf since a new buffer already replaced it */
nvmebuf->hrq->rqbp->rqb_free_buffer(phba, nvmebuf);
- spin_lock_irqsave(&ctxp->ctxlock, iflag);
- ctxp->rqb_buffer = NULL;
- spin_unlock_irqrestore(&ctxp->ctxlock, iflag);
}
/**
--
2.43.0
Hi,
Charles Bordet reported the following issue (full context in
https://bugs.debian.org/1108860)
> Dear Maintainer,
>
> What led up to the situation?
> We run a production environment using Debian 12 VMs, with a network
> topology involving VXLAN tunnels encapsulated inside Wireguard
> interfaces. This setup has worked reliably for over a year, with MTU set
> to 1500 on all interfaces except the Wireguard interface (set to 1420).
> Wireguard kernel fragmentation allowed this configuration to function
> without issues, even though the effective path MTU is lower than 1500.
>
> What exactly did you do (or not do) that was effective (or ineffective)?
> We performed a routine system upgrade, updating all packages include the
> kernel. After the upgrade, we observed severe network issues (timeouts,
> very slow HTTP/HTTPS, and apt update failures) on all VMs behind the
> router. SSH and small-packet traffic continued to work.
>
> To diagnose, we:
>
> * Restored a backup (with the previous kernel): the problem disappeared.
> * Repeated the upgrade, confirming the issue reappeared.
> * Systematically tested each kernel version from 6.1.124-1 up to
> 6.1.140-1. The problem first appears with kernel 6.1.135-1; all earlier
> versions work as expected.
> * Kernel version from the backports (6.12.32-1) did not resolve the
> problem.
>
> What was the outcome of this action?
>
> * With kernel 6.1.135-1 or later, network timeouts occur for
> large-packet protocols (HTTP, apt, etc.), while SSH and small-packet
> protocols work.
> * With kernel 6.1.133-1 or earlier, everything works as expected.
>
> What outcome did you expect instead?
> We expected the network to function as before, with Wireguard handling
> fragmentation transparently and no application-level timeouts,
> regardless of the kernel version.
While triaging the issue we found that the commit 8930424777e4
("tunnels: Accept PACKET_HOST in skb_tunnel_check_pmtu()." introduces
the issue and Charles confirmed that the issue was present as well in
6.12.35 and 6.15.4 (other version up could potentially still be
affected, but we wanted to check it is not a 6.1.y specific
regression).
Reverthing the commit fixes Charles' issue.
Does that ring a bell?
Regards,
Salvatore