From: Pratyush Yadav <pratyush(a)kernel.org>
cqspi_indirect_read_execute() and cqspi_indirect_write_execute() first
set the enable bit on the APB region and then start reading/writing to the
AHB region. On TI K3 SoCs these regions lie on different endpoints. This
means that the order of the two operations is not guaranteed, and they
might be reordered at the interconnect level.
It is possible for the AHB write to be executed before the APB write to
enable the indirect controller, causing the transaction to be invalid
and the write to error out. Read back the APB region write before
accessing the AHB region to make sure the write got flushed and the race
condition is eliminated.
Fixes: 140623410536 ("mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller")
CC: stable(a)vger.kernel.org
Signed-off-by: Pratyush Yadav <pratyush(a)kernel.org>
Signed-off-by: Santhosh Kumar K <s-k6(a)ti.com>
---
drivers/spi/spi-cadence-quadspi.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 9bf823348cd3..eaf9a0f522d5 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -764,6 +764,7 @@ static int cqspi_indirect_read_execute(struct cqspi_flash_pdata *f_pdata,
reinit_completion(&cqspi->transfer_complete);
writel(CQSPI_REG_INDIRECTRD_START_MASK,
reg_base + CQSPI_REG_INDIRECTRD);
+ readl(reg_base + CQSPI_REG_INDIRECTRD); /* Flush posted write. */
while (remaining > 0) {
if (use_irq &&
@@ -1090,6 +1091,8 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
reinit_completion(&cqspi->transfer_complete);
writel(CQSPI_REG_INDIRECTWR_START_MASK,
reg_base + CQSPI_REG_INDIRECTWR);
+ readl(reg_base + CQSPI_REG_INDIRECTWR); /* Flush posted write. */
+
/*
* As per 66AK2G02 TRM SPRUHY8F section 11.15.5.3 Indirect Access
* Controller programming sequence, couple of cycles of
--
2.34.1
From: Jann Horn <jannh(a)google.com>
commit 023f47a8250c6bdb4aebe744db4bf7f73414028b upstream.
If an ->anon_vma is attached to the VMA, collapse_and_free_pmd() requires
it to be locked.
Page table traversal is allowed under any one of the mmap lock, the
anon_vma lock (if the VMA is associated with an anon_vma), and the
mapping lock (if the VMA is associated with a mapping); and so to be
able to remove page tables, we must hold all three of them.
retract_page_tables() bails out if an ->anon_vma is attached, but does
this check before holding the mmap lock (as the comment above the check
explains).
If we racily merged an existing ->anon_vma (shared with a child
process) from a neighboring VMA, subsequent rmap traversals on pages
belonging to the child will be able to see the page tables that we are
concurrently removing while assuming that nothing else can access them.
Repeat the ->anon_vma check once we hold the mmap lock to ensure that
there really is no concurrent page table access.
Hitting this bug causes a lockdep warning in collapse_and_free_pmd(),
in the line "lockdep_assert_held_write(&vma->anon_vma->root->rwsem)".
It can also lead to use-after-free access.
Link: https://lore.kernel.org/linux-mm/CAG48ez3434wZBKFFbdx4M9j6eUwSUVPd4dxhzW_k_…
Link: https://lkml.kernel.org/r/20230111133351.807024-1-jannh@google.com
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Jann Horn <jannh(a)google.com>
Reported-by: Zach O'Keefe <zokeefe(a)google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
[doebel(a)amazon.de: Kernel 5.15 uses a different control flow pattern,
context adjustments.]
Signed-off-by: Bjoern Doebel <doebel(a)amazon.de>
---
mm/khugepaged.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 203792e70ac1..e318c1abc81f 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1609,7 +1609,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
* has higher cost too. It would also probably require locking
* the anon_vma.
*/
- if (vma->anon_vma)
+ if (READ_ONCE(vma->anon_vma))
continue;
addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
if (addr & ~HPAGE_PMD_MASK)
@@ -1631,6 +1631,19 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
if (!khugepaged_test_exit(mm)) {
struct mmu_notifier_range range;
+ /*
+ * Re-check whether we have an ->anon_vma, because
+ * collapse_and_free_pmd() requires that either no
+ * ->anon_vma exists or the anon_vma is locked.
+ * We already checked ->anon_vma above, but that check
+ * is racy because ->anon_vma can be populated under the
+ * mmap lock in read mode.
+ */
+ if (vma->anon_vma) {
+ mmap_write_unlock(mm);
+ continue;
+ }
+
mmu_notifier_range_init(&range,
MMU_NOTIFY_CLEAR, 0,
NULL, mm, addr,
--
2.47.3
Make sure to drop the reference to the secure monitor device taken by
of_find_device_by_node() when looking up its driver data on behalf of
other drivers (e.g. during probe).
Note that holding a reference to the platform device does not prevent
its driver data from going away so there is no point in keeping the
reference after the helper returns.
Fixes: 8cde3c2153e8 ("firmware: meson_sm: Rework driver as a proper platform driver")
Cc: stable(a)vger.kernel.org # 5.5
Cc: Carlo Caione <ccaione(a)baylibre.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/firmware/meson/meson_sm.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/firmware/meson/meson_sm.c b/drivers/firmware/meson/meson_sm.c
index f25a9746249b..3ab67aaa9e5d 100644
--- a/drivers/firmware/meson/meson_sm.c
+++ b/drivers/firmware/meson/meson_sm.c
@@ -232,11 +232,16 @@ EXPORT_SYMBOL(meson_sm_call_write);
struct meson_sm_firmware *meson_sm_get(struct device_node *sm_node)
{
struct platform_device *pdev = of_find_device_by_node(sm_node);
+ struct meson_sm_firmware *fw;
if (!pdev)
return NULL;
- return platform_get_drvdata(pdev);
+ fw = platform_get_drvdata(pdev);
+
+ put_device(&pdev->dev);
+
+ return fw;
}
EXPORT_SYMBOL_GPL(meson_sm_get);
--
2.49.1
VRAM+TT bos that are evicted from VRAM to TT may remain in
TT also after a revalidation following eviction or suspend.
This manifests itself as applications becoming sluggish
after buffer objects get evicted or after a resume from
suspend or hibernation.
If the bo supports placement in both VRAM and TT, and
we are on DGFX, mark the TT placement as fallback. This means
that it is tried only after VRAM + eviction.
This flaw has probably been present since the xe module was
upstreamed, but use a Fixes: commit below from which backporting is
likely to be simple. For earlier versions we need to open-code the
fallback algorithm in the driver.
v2:
- Remove check for dgfx. (Matthew Auld)
- Update the xe_dma_buf kunit test for the new strategy (CI)
- Allow dma-buf to pin in current placement (CI)
- Make xe_bo_validate() for pinned bos a NOP.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5995
Fixes: a78a8da51b36 ("drm/ttm: replace busy placement with flags v6")
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.9+
Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld(a)intel.com> #v1
---
drivers/gpu/drm/xe/tests/xe_bo.c | 2 +-
drivers/gpu/drm/xe/tests/xe_dma_buf.c | 10 +---------
drivers/gpu/drm/xe/xe_bo.c | 16 ++++++++++++----
drivers/gpu/drm/xe/xe_bo.h | 2 +-
drivers/gpu/drm/xe/xe_dma_buf.c | 2 +-
5 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c
index bb469096d072..7b40cc8be1c9 100644
--- a/drivers/gpu/drm/xe/tests/xe_bo.c
+++ b/drivers/gpu/drm/xe/tests/xe_bo.c
@@ -236,7 +236,7 @@ static int evict_test_run_tile(struct xe_device *xe, struct xe_tile *tile, struc
}
xe_bo_lock(external, false);
- err = xe_bo_pin_external(external);
+ err = xe_bo_pin_external(external, false);
xe_bo_unlock(external);
if (err) {
KUNIT_FAIL(test, "external bo pin err=%pe\n",
diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
index 3c5ad8cc65a0..5baeab6b6fb7 100644
--- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
@@ -85,15 +85,7 @@ static void check_residency(struct kunit *test, struct xe_bo *exported,
return;
}
- /*
- * If on different devices, the exporter is kept in system if
- * possible, saving a migration step as the transfer is just
- * likely as fast from system memory.
- */
- if (params->mem_mask & XE_BO_FLAG_SYSTEM)
- KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, XE_PL_TT));
- else
- KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, mem_type));
+ KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, mem_type));
if (params->force_different_devices)
KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(imported, XE_PL_TT));
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 4faf15d5fa6d..870f43347281 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -188,6 +188,8 @@ static void try_add_system(struct xe_device *xe, struct xe_bo *bo,
bo->placements[*c] = (struct ttm_place) {
.mem_type = XE_PL_TT,
+ .flags = (bo_flags & XE_BO_FLAG_VRAM_MASK) ?
+ TTM_PL_FLAG_FALLBACK : 0,
};
*c += 1;
}
@@ -2322,6 +2324,7 @@ uint64_t vram_region_gpu_offset(struct ttm_resource *res)
/**
* xe_bo_pin_external - pin an external BO
* @bo: buffer object to be pinned
+ * @in_place: Pin in current placement, don't attempt to migrate.
*
* Pin an external (not tied to a VM, can be exported via dma-buf / prime FD)
* BO. Unique call compared to xe_bo_pin as this function has it own set of
@@ -2329,7 +2332,7 @@ uint64_t vram_region_gpu_offset(struct ttm_resource *res)
*
* Returns 0 for success, negative error code otherwise.
*/
-int xe_bo_pin_external(struct xe_bo *bo)
+int xe_bo_pin_external(struct xe_bo *bo, bool in_place)
{
struct xe_device *xe = xe_bo_device(bo);
int err;
@@ -2338,9 +2341,11 @@ int xe_bo_pin_external(struct xe_bo *bo)
xe_assert(xe, xe_bo_is_user(bo));
if (!xe_bo_is_pinned(bo)) {
- err = xe_bo_validate(bo, NULL, false);
- if (err)
- return err;
+ if (!in_place) {
+ err = xe_bo_validate(bo, NULL, false);
+ if (err)
+ return err;
+ }
spin_lock(&xe->pinned.lock);
list_add_tail(&bo->pinned_link, &xe->pinned.late.external);
@@ -2493,6 +2498,9 @@ int xe_bo_validate(struct xe_bo *bo, struct xe_vm *vm, bool allow_res_evict)
};
int ret;
+ if (xe_bo_is_pinned(bo))
+ return 0;
+
if (vm) {
lockdep_assert_held(&vm->lock);
xe_vm_assert_held(vm);
diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index 8cce413b5235..cfb1ec266a6d 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -200,7 +200,7 @@ static inline void xe_bo_unlock_vm_held(struct xe_bo *bo)
}
}
-int xe_bo_pin_external(struct xe_bo *bo);
+int xe_bo_pin_external(struct xe_bo *bo, bool in_place);
int xe_bo_pin(struct xe_bo *bo);
void xe_bo_unpin_external(struct xe_bo *bo);
void xe_bo_unpin(struct xe_bo *bo);
diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index 346f857f3837..af64baf872ef 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -72,7 +72,7 @@ static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
return ret;
}
- ret = xe_bo_pin_external(bo);
+ ret = xe_bo_pin_external(bo, true);
xe_assert(xe, !ret);
return 0;
--
2.50.1
Hello.
We have observed a huge latency increase using `fork()` after ingesting the CVE-2025-38085 fix, which leads to the commit `1013af4f585f: mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race`. On large machines with 1.5TB of memory and 196 cores, a workload that mmaps 1.2TB of shared memory and forks itself dozens or hundreds of times shows an increase of execution times by a factor of 4. The reproducer is at the end of the email.
Comparing a kernel without this patch with a kernel with this patch applied, when spawning 1000 children we see these execution times:
Patched kernel:
$ time make stress
...
real 0m11.275s
user 0m0.177s
sys 0m23.905s
Original kernel:
$ time make stress
...
real 0m2.475s
user 0m1.398s
sys 0m2.501s
The patch in question: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
My observations/assumptions are:
- each child touches 100 random pages and despawns
- on each despawn `huge_pmd_unshare()` is called
- each call to `huge_pmd_unshare()` synchronizes all threads using `tlb_remove_table_sync_one()`, leading to the regression
I'm happy to provide more information.
Thank you
Stanislav Uschakow
=== Reproducer ===
Setup:
#!/bin/bash
echo "Setting up hugepages for reproduction..."
# hugepages (1.2TB / 2MB = 614400 pages)
REQUIRED_PAGES=614400
# Check current hugepage allocation
CURRENT_PAGES=$(cat /proc/sys/vm/nr_hugepages)
echo "Current hugepages: $CURRENT_PAGES"
if [ "$CURRENT_PAGES" -lt "$REQUIRED_PAGES" ]; then
echo "Allocating $REQUIRED_PAGES hugepages..."
echo $REQUIRED_PAGES | sudo tee /proc/sys/vm/nr_hugepages
ALLOCATED=$(cat /proc/sys/vm/nr_hugepages)
echo "Allocated hugepages: $ALLOCATED"
if [ "$ALLOCATED" -lt "$REQUIRED_PAGES" ]; then
echo "Warning: Could not allocate all required hugepages"
echo "Available: $ALLOCATED, Required: $REQUIRED_PAGES"
fi
fi
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
echo -e "\nHugepage information:"
cat /proc/meminfo | grep -i huge
echo -e "\nSetup complete. You can now run the reproduction test."
Makefile:
CXX = gcc
CXXFLAGS = -O2 -Wall
TARGET = hugepage_repro
SOURCE = hugepage_repro.c
$(TARGET): $(SOURCE)
$(CXX) $(CXXFLAGS) -o $(TARGET) $(SOURCE)
clean:
rm -f $(TARGET)
setup:
chmod +x setup_hugepages.sh
./setup_hugepages.sh
test: $(TARGET)
./$(TARGET) 20 3
stress: $(TARGET)
./$(TARGET) 1000 1
.PHONY: clean setup test stress
hugepage_repro.c:
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <stdio.h>
#define HUGEPAGE_SIZE (2 * 1024 * 1024) // 2MB
#define TOTAL_SIZE (1200ULL * 1024 * 1024 * 1024) // 1.2TB
#define NUM_HUGEPAGES (TOTAL_SIZE / HUGEPAGE_SIZE)
void* create_hugepage_mapping() {
void* addr = mmap(NULL, TOTAL_SIZE, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
if (addr == MAP_FAILED) {
perror("mmap hugepages failed");
exit(1);
}
return addr;
}
void touch_random_pages(void* addr, int num_touches) {
char* base = (char*)addr;
for (int i = 0; i < num_touches; ++i) {
size_t offset = (rand() % NUM_HUGEPAGES) * HUGEPAGE_SIZE;
volatile char val = base[offset];
(void)val;
}
}
void child_process(void* shared_mem, int child_id) {
struct timespec start, end;
clock_gettime(CLOCK_MONOTONIC, &start);
touch_random_pages(shared_mem, 100);
clock_gettime(CLOCK_MONOTONIC, &end);
long duration = (end.tv_sec - start.tv_sec) * 1000000 +
(end.tv_nsec - start.tv_nsec) / 1000;
printf("Child %d completed in %ld μs\n", child_id, duration);
}
int main(int argc, char* argv[]) {
int num_processes = argc > 1 ? atoi(argv[1]) : 50;
int iterations = argc > 2 ? atoi(argv[2]) : 5;
printf("Creating %lluGB hugepage mapping...\n", TOTAL_SIZE / (1024*1024*1024));
void* shared_mem = create_hugepage_mapping();
for (int iter = 0; iter < iterations; ++iter) {
printf("\nIteration %d: Forking %d processes\n", iter + 1, num_processes);
pid_t children[num_processes];
struct timespec iter_start, iter_end;
clock_gettime(CLOCK_MONOTONIC, &iter_start);
for (int i = 0; i < num_processes; ++i) {
pid_t pid = fork();
if (pid == 0) {
child_process(shared_mem, i);
exit(0);
} else if (pid > 0) {
children[i] = pid;
}
}
for (int i = 0; i < num_processes; ++i) {
waitpid(children[i], NULL, 0);
}
clock_gettime(CLOCK_MONOTONIC, &iter_end);
long iter_duration = (iter_end.tv_sec - iter_start.tv_sec) * 1000 +
(iter_end.tv_nsec - iter_start.tv_nsec) / 1000000;
printf("Iteration completed in %ld ms\n", iter_duration);
}
munmap(shared_mem, TOTAL_SIZE);
return 0;
}
This is the start of the stable review cycle for the 5.10.242 release.
There are 34 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 04 Sep 2025 13:19:14 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.242-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.242-rc1
Eric Sandeen <sandeen(a)redhat.com>
xfs: do not propagate ENODATA disk errors into xattr code
Brent Lu <brent.lu(a)intel.com>
ASoC: Intel: sof_da7219_mx98360a: fail to initialize soundcard
Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
ASoC: Intel: glk_rt5682_max98357a: shrink platform_id below 20 characters
Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
ASoC: Intel: sof_rt5682: shrink platform_id names below 20 characters
Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
ASoC: Intel: bxt_da7219_max98357a: shrink platform_id below 20 characters
Imre Deak <imre.deak(a)intel.com>
Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Handle reads greater than 60 bytes
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Don't set bus speed on every transfer
James Jones <jajones(a)nvidia.com>
drm/nouveau/disp: Always accept linear modifier
Fabio Porcedda <fabio.porcedda(a)gmail.com>
net: usb: qmi_wwan: add Telit Cinterion LE910C4-WWX new compositions
Shanker Donthineni <sdonthineni(a)nvidia.com>
dma/pool: Ensure DMA_DIRECT_REMAP allocations are decrypted
Alex Deucher <alexander.deucher(a)amd.com>
Revert "drm/amdgpu: fix incorrect vm flags to map bo"
Minjong Kim <minbell.kim(a)samsung.com>
HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()
Ping Cheng <pinglinux(a)gmail.com>
HID: wacom: Add a new Art Pen 2
Qasim Ijaz <qasdev00(a)gmail.com>
HID: asus: fix UAF via HID_CLAIMED_INPUT validation
Thijs Raymakers <thijs(a)raymakers.nl>
KVM: x86: use array_index_nospec with indices that come from guest
Li Nan <linan122(a)huawei.com>
efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare
Eric Dumazet <edumazet(a)google.com>
sctp: initialize more fields in sctp_v6_from_sk()
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Set local Xoff after FW update
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon port speed set
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon MTU set
Yeounsu Moon <yyyynoom(a)gmail.com>
net: dlink: fix multicast stats being counted incorrectly
Kuniyuki Iwashima <kuniyu(a)google.com>
atm: atmtcp: Prevent arbitrary write in atmtcp_recv_control().
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_event: Detect if HCI_EV_NUM_COMP_PKTS is unbalanced
Madhavan Srinivasan <maddy(a)linux.ibm.com>
powerpc/kvm: Fix ifdef to remove build warning
Oscar Maes <oscmaes92(a)gmail.com>
net: ipv4: fix regression in local-broadcast routes
Nikolay Kuratov <kniv(a)yandex-team.ru>
vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFS: Fix a race when updating an existing write
Christoph Hellwig <hch(a)lst.de>
nfs: fold nfs_page_group_lock_subrequests into nfs_lock_and_join_requests
Tianxiang Peng <txpeng(a)tencent.com>
x86/cpu/hygon: Add missing resctrl_cpu_detect() in bsp_init helper
Damien Le Moal <dlemoal(a)kernel.org>
scsi: core: sysfs: Correct sysfs attributes access rights
Tengda Wu <wutengda(a)huaweicloud.com>
ftrace: Fix potential warning in trace_printk_seq during ftrace_dump
Randy Dunlap <rdunlap(a)infradead.org>
pinctrl: STMFX: add missing HAS_IOMEM dependency
-------------
Diffstat:
Makefile | 4 +-
arch/powerpc/kernel/kvm.c | 8 +-
arch/x86/kernel/cpu/hygon.c | 3 +
arch/x86/kvm/lapic.c | 2 +
arch/x86/kvm/x86.c | 7 +-
drivers/atm/atmtcp.c | 17 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 +-
drivers/gpu/drm/drm_dp_helper.c | 2 +-
drivers/gpu/drm/nouveau/dispnv50/wndw.c | 4 +
drivers/hid/hid-asus.c | 8 +-
drivers/hid/hid-mcp2221.c | 71 +++++++----
drivers/hid/hid-ntrig.c | 3 +
drivers/hid/wacom_wac.c | 1 +
drivers/net/ethernet/dlink/dl2k.c | 2 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.c | 3 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.h | 12 ++
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 19 ++-
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 4 -
drivers/net/usb/qmi_wwan.c | 3 +
drivers/pinctrl/Kconfig | 1 +
drivers/scsi/scsi_sysfs.c | 4 +-
drivers/vhost/net.c | 9 +-
fs/efivarfs/super.c | 4 +
fs/nfs/pagelist.c | 86 +------------
fs/nfs/write.c | 142 +++++++++++++--------
fs/xfs/libxfs/xfs_attr_remote.c | 7 +
fs/xfs/libxfs/xfs_da_btree.c | 6 +
include/linux/atmdev.h | 1 +
include/linux/nfs_page.h | 2 +-
kernel/dma/pool.c | 4 +-
kernel/trace/trace.c | 4 +-
net/atm/common.c | 15 ++-
net/bluetooth/hci_event.c | 12 +-
net/ipv4/route.c | 10 +-
net/sctp/ipv6.c | 2 +
sound/soc/intel/boards/bxt_da7219_max98357a.c | 12 +-
sound/soc/intel/boards/glk_rt5682_max98357a.c | 4 +-
sound/soc/intel/boards/sof_da7219_max98373.c | 2 +-
sound/soc/intel/boards/sof_rt5682.c | 12 +-
sound/soc/intel/common/soc-acpi-intel-bxt-match.c | 2 +-
sound/soc/intel/common/soc-acpi-intel-cml-match.c | 2 +-
sound/soc/intel/common/soc-acpi-intel-glk-match.c | 4 +-
sound/soc/intel/common/soc-acpi-intel-jsl-match.c | 2 +-
sound/soc/intel/common/soc-acpi-intel-tgl-match.c | 4 +-
44 files changed, 317 insertions(+), 213 deletions(-)
The backport of AMD's TSA mitigation to 5.15 did not correctly set the CPUID bits that
are passed to a guest (commit c334ae4a545a "KVM: SVM: Advertise
TSA CPUID bits to guests").
This series attempts to address this:
* The first patch from Kim allows us to properly use cpuid caps.
* The second patch is a combination of fixes to c334ae4a545a and f3f9deccfc68,
which is a stable-only patch to 6.12.y. (Not sure what to do with
attribution)
Alternatively, we can opencode all of this (the way it's currently done in
__do_cpuid_func()'s 0x80000021 case) and do everything in a single patch.
Boris Ostrovsky (1):
KVM: SVM: Properly advertise TSA CPUID bits to guests
Kim Phillips (1):
KVM: x86: Move open-coded CPUID leaf 0x80000021 EAX bit propagation
code
arch/x86/kvm/cpuid.c | 31 ++++++++++++++++++-------------
1 file changed, 18 insertions(+), 13 deletions(-)
--
2.43.5
In cpuset hotplug handling, temporary cpumasks are allocated only when
running under cgroup v2. The current code unconditionally frees these
masks, which can lead to a crash in the cgroup v1 case.
Free the temporary cpumasks only when they were actually allocated.
Fixes: 4b842da276a8 ("cpuset: Make CPU hotplug work with partition")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ashay Jaiswal <quic_ashayj(a)quicinc.com>
---
kernel/cgroup/cpuset.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index a78ccd11ce9b43c2e8b0e2c454a8ee845ebdc808..a4f908024f3c0a22628a32f8a5b0ae96c7dccbb9 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -4019,7 +4019,8 @@ static void cpuset_handle_hotplug(void)
if (force_sd_rebuild)
rebuild_sched_domains_cpuslocked();
- free_tmpmasks(ptmp);
+ if (on_dfl && ptmp)
+ free_tmpmasks(ptmp);
}
void cpuset_update_active_cpus(void)
---
base-commit: 33bcf93b9a6b028758105680f8b538a31bc563cf
change-id: 20250902-cpuset-free-on-condition-85cf4eadb18c
Best regards,
--
Ashay Jaiswal <quic_ashayj(a)quicinc.com>
According to documentation, the DP PHY on x1e80100 has another clock
called refclk. Rework the driver to allow a flexible number of clocks.
Fix the dt-bindings schema and add the clock to the DT node as well.
Signed-off-by: Abel Vesa <abel.vesa(a)linaro.org>
---
Changes in v2:
- Fix schema by adding the minItems, as suggested by Krzysztof.
- Use devm_clk_bulk_get_all, as suggested by Konrad.
- Rephrase the commit messages to reflect the flexible number of clocks.
- Link to v1: https://lore.kernel.org/r/20250730-phy-qcom-edp-add-missing-refclk-v1-0-6f7…
---
Abel Vesa (3):
dt-bindings: phy: qcom-edp: Add missing clock for X Elite
phy: qcom: edp: Make the number of clocks flexible
arm64: dts: qcom: Add missing TCSR refclk to the DP PHYs
.../devicetree/bindings/phy/qcom,edp-phy.yaml | 28 +++++++++++++++++++++-
arch/arm64/boot/dts/qcom/x1e80100.dtsi | 12 ++++++----
drivers/phy/qualcomm/phy-qcom-edp.c | 18 +++++++-------
3 files changed, 45 insertions(+), 13 deletions(-)
---
base-commit: 5d50cf9f7cf20a17ac469c20a2e07c29c1f6aab7
change-id: 20250730-phy-qcom-edp-add-missing-refclk-5ab82828f8e7
Best regards,
--
Abel Vesa <abel.vesa(a)linaro.org>
When building powerpc configurations in linux-5.4.y with binutils 2.43
or newer, there is an assembler error in arch/powerpc/boot/util.S:
arch/powerpc/boot/util.S: Assembler messages:
arch/powerpc/boot/util.S:44: Error: junk at end of line, first unrecognized character is `0'
arch/powerpc/boot/util.S:49: Error: syntax error; found `b', expected `,'
arch/powerpc/boot/util.S:49: Error: junk at end of line: `b'
binutils 2.43 contains stricter parsing of certain labels [1], namely
that leading zeros are no longer allowed. The GNU assembler
documentation already somewhat forbade this construct:
To define a local label, write a label of the form 'N:' (where N
represents any non-negative integer).
Eliminate the leading zero in the label to fix the syntax error. This is
only needed in linux-5.4.y because commit 8b14e1dff067 ("powerpc: Remove
support for PowerPC 601") removed this code altogether in 5.10.
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=226749d5a6ff0d5c6… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
v1 -> v2:
- Adjust commit message to make it clearer this construct was already
incorrect under the existing GNU assembler documentation (Segher)
v1: https://lore.kernel.org/20250902235234.2046667-1-nathan@kernel.org/
---
arch/powerpc/boot/util.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/boot/util.S b/arch/powerpc/boot/util.S
index f11f0589a669..5ab2bc864e66 100644
--- a/arch/powerpc/boot/util.S
+++ b/arch/powerpc/boot/util.S
@@ -41,12 +41,12 @@ udelay:
srwi r4,r4,16
cmpwi 0,r4,1 /* 601 ? */
bne .Ludelay_not_601
-00: li r0,86 /* Instructions / microsecond? */
+0: li r0,86 /* Instructions / microsecond? */
mtctr r0
10: addi r0,r0,0 /* NOP */
bdnz 10b
subic. r3,r3,1
- bne 00b
+ bne 0b
blr
.Ludelay_not_601:
base-commit: c25f780e491e4734eb27d65aa58e0909fd78ad9f
--
2.51.0
kprobe has been broken on riscv for quite some time. There is an attempt
[1] to fix that which actually works. This patch works because it enables
ARCH_HAVE_NMI_SAFE_CMPXCHG and that makes the ring buffer allocation
succeed when handling a kprobe because we handle *all* kprobes in nmi
context. We do so because Peter advised us to treat all kernel traps as
nmi [2].
But that does not seem right for kprobe handling, so instead, treat
break traps from the kernel as non-NMI.
Link: https://lore.kernel.org/linux-riscv/20250711090443.1688404-1-pulehui@huawei… [1]
Link: https://lore.kernel.org/linux-riscv/20250422094419.GC14170@noisy.programmin… [2]
Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
Cc: stable(a)vger.kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti(a)rivosinc.com>
---
This is clearly an RFC and this is likely not the right way to go, it is
just a way to trigger a discussion about if handling kprobes in an nmi
context is the right way or not.
---
arch/riscv/kernel/traps.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index 80230de167def3c33db5bc190347ec5f87dbb6e3..90f36bb9b12d4ba0db0f084f87899156e3c7dc6f 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -315,11 +315,11 @@ asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
local_irq_disable();
irqentry_exit_to_user_mode(regs);
} else {
- irqentry_state_t state = irqentry_nmi_enter(regs);
+ irqentry_state_t state = irqentry_enter(regs);
handle_break(regs);
- irqentry_nmi_exit(regs, state);
+ irqentry_exit(regs, state);
}
}
---
base-commit: ae9a687664d965b13eeab276111b2f97dd02e090
change-id: 20250903-dev-alex-break_nmi_v1-57c5321f3e80
Best regards,
--
Alexandre Ghiti <alexghiti(a)rivosinc.com>
To support automatic loading of a layout module, the MODALIAS
variable in the uevent is needed. Add it.
Fixes: fc29fd821d9a ("nvmem: core: Rework layouts to become regular devices")
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael Walle <mwalle(a)kernel.org>
---
I'm still not sure if the sysfs modalias file is required or not. It
seems to work without it. I couldn't find any documentation about it.
v2:
- add Cc: stable
---
drivers/nvmem/layouts.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/drivers/nvmem/layouts.c b/drivers/nvmem/layouts.c
index 65d39e19f6ec..f381ce1e84bd 100644
--- a/drivers/nvmem/layouts.c
+++ b/drivers/nvmem/layouts.c
@@ -45,11 +45,24 @@ static void nvmem_layout_bus_remove(struct device *dev)
return drv->remove(layout);
}
+static int nvmem_layout_bus_uevent(const struct device *dev,
+ struct kobj_uevent_env *env)
+{
+ int ret;
+
+ ret = of_device_uevent_modalias(dev, env);
+ if (ret != -ENODEV)
+ return ret;
+
+ return 0;
+}
+
static const struct bus_type nvmem_layout_bus_type = {
.name = "nvmem-layout",
.match = nvmem_layout_bus_match,
.probe = nvmem_layout_bus_probe,
.remove = nvmem_layout_bus_remove,
+ .uevent = nvmem_layout_bus_uevent,
};
int __nvmem_layout_driver_register(struct nvmem_layout_driver *drv,
--
2.39.5
From: Jinfeng Wang <jinfeng.wang.cn(a)windriver.com>
This reverts commit 1af6d1696ca40b2d22889b4b8bbea616f94aaa84.
Without this revert, the following error appears:
cadence-qspi ff8d2000.spi: Unbalanced pm_runtime_enable!
After reverting commit cdfb20e4b34a ("spi: spi-cadence-quadspi: Fix pm runtime unbalance")
and commit 1af6d1696ca4 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths"),
the Unbalanced pm_runtime_enable! error no longer appears.
These two commits were backported from upstream commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance")
and commit 04a8ff1bc351 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths").
Commit 04a8ff1bc351 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths")
fixes commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance").
Commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance") fixes
commit 86401132d7bb ("spi: spi-cadence-quadspi: Fix missing unwind goto warnings").
Commit 86401132d7bb ("spi: spi-cadence-quadspi: Fix missing unwind goto warnings") fixes
commit 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support").
6.6.y only backported commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance")
and commit 04a8ff1bc351 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths"),
but did not backport commit 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support")
or commit 86401132d7bb ("spi: spi-cadence-quadspi: Fix missing unwind goto warnings").
In addition, the backport of commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance")
differs from the original patch, so the Unbalanced pm_runtime_enable error occurs.
If the backports of commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance")
and commit 04a8ff1bc351 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths") are reverted, there is no error.
If commit 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support") and
commit 86401132d7bb ("spi: spi-cadence-quadspi: Fix missing unwind goto warnings") are backported instead,
there is a hang during booting. I didn't find the cause of the hang.
Since commit 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support") and
commit 86401132d7bb ("spi: spi-cadence-quadspi: Fix missing unwind goto warnings") are
not backported, commit b07f349d1864 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance")
and commit 04a8ff1bc351 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths") are not needed.
So revert commit cdfb20e4b34a ("spi: spi-cadence-quadspi: Fix pm runtime unbalance") and
commit 1af6d1696ca4 ("spi: cadence-quadspi: fix cleanup of rx_chan on failure paths").
Signed-off-by: Jinfeng Wang <jinfeng.wang.cn(a)windriver.com>
Signed-off-by: He Zhe <zhe.he(a)windriver.com>
---
Kernel builds successfully with this patch.
Test environment overview:
Branch: linux-6.6.y
Tree: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Hardware: compiled on an x86 machine
GCC: gcc version 11.4.0 (Ubuntu~20.04)
Commands: make clean; make allyesconfig;
no build error is seen
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04.2)
Hardware: socfpga stratix10 board
Verified by checking the dmesg log and binding/unbinding the spi device;
no Unbalanced pm_runtime_enable! error is seen any more.
cmds:
dmesg | grep "Unbalanced pm_runtime_enable"
echo ff8d2000.spi > /sys/bus/platform/drivers/cadence-qspi/unbind
echo ff8d2000.spi > /sys/bus/platform/drivers/cadence-qspi/bind
---
drivers/spi/spi-cadence-quadspi.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 7c17b8c0425e..9285a683324f 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -1870,6 +1870,11 @@ static int cqspi_probe(struct platform_device *pdev)
pm_runtime_enable(dev);
+ if (cqspi->rx_chan) {
+ dma_release_channel(cqspi->rx_chan);
+ goto probe_setup_failed;
+ }
+
ret = spi_register_controller(host);
if (ret) {
dev_err(&pdev->dev, "failed to register SPI ctlr %d\n", ret);
--
2.25.1
Without CONFIG_REGMAP, rmi-i2c.c fails to build because struct
regmap_config is not defined:
drivers/misc/amd-sbi/rmi-i2c.c: In function ‘sbrmi_i2c_probe’:
drivers/misc/amd-sbi/rmi-i2c.c:57:16: error: variable ‘sbrmi_i2c_regmap_config’ has initializer but incomplete type
57 | struct regmap_config sbrmi_i2c_regmap_config = {
| ^~~~~~~~~~~~~
Additionally, CONFIG_REGMAP_I2C is needed for devm_regmap_init_i2c():
ld: drivers/misc/amd-sbi/rmi-i2c.o: in function `sbrmi_i2c_probe':
drivers/misc/amd-sbi/rmi-i2c.c:69:(.text+0x1c0): undefined reference to `__devm_regmap_init_i2c'
Fixes: 013f7e7131bd ("misc: amd-sbi: Use regmap subsystem")
Cc: stable(a)vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com>
---
drivers/misc/amd-sbi/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/misc/amd-sbi/Kconfig b/drivers/misc/amd-sbi/Kconfig
index 4840831c84ca..4aae0733d0fc 100644
--- a/drivers/misc/amd-sbi/Kconfig
+++ b/drivers/misc/amd-sbi/Kconfig
@@ -2,6 +2,7 @@
config AMD_SBRMI_I2C
tristate "AMD side band RMI support"
depends on I2C
+ select REGMAP_I2C
help
Side band RMI over I2C support for AMD out of band management.
--
2.47.2
In as102_usb driver, the following race condition occurs:
```
CPU0 CPU1
as102_usb_probe()
kzalloc(); // alloc as102_dev_t
....
usb_register_dev();
open("/path/to/dev"); // open as102 dev
....
usb_deregister_dev();
....
kfree(); // free as102_dev_t
....
close(fd);
as102_release() // UAF!!
as102_usb_release()
kfree(); // DFB!!
```
When a USB character device registered with usb_register_dev() is later
unregistered (via usb_deregister_dev() or disconnect), the device node is
removed so new open() calls fail. However, file descriptors that are
already open do not go away immediately: they remain valid until the last
reference is dropped and the driver's .release() is invoked.
In as102, as102_usb_probe() calls usb_register_dev() and then, on an
error path, does usb_deregister_dev() and frees as102_dev_t right away.
If userspace raced a successful open() before the deregistration, that
open FD will later hit as102_release() --> as102_usb_release() and access
or free as102_dev_t again, occur a race to use-after-free and
double-free vuln.
The fix is to never kfree(as102_dev_t) directly once usb_register_dev()
has succeeded. After deregistration, defer freeing memory to .release().
In other words, let release() perform the last kfree when the final open
FD is closed.
Cc: <stable(a)vger.kernel.org>
Reported-by: syzbot+47321e8fd5a4c84088db(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=47321e8fd5a4c84088db
Fixes: cd19f7d3e39b ("[media] as102: fix leaks at failure paths in as102_usb_probe()")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
---
v2: Fix incorrect patch description style and CC stable mailing list
- Link to v1: https://lore.kernel.org/all/20250822143539.1157329-1-aha310510@gmail.com/
---
drivers/media/usb/as102/as102_usb_drv.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/media/usb/as102/as102_usb_drv.c b/drivers/media/usb/as102/as102_usb_drv.c
index e0ef66a522e2..abde5666b2ee 100644
--- a/drivers/media/usb/as102/as102_usb_drv.c
+++ b/drivers/media/usb/as102/as102_usb_drv.c
@@ -404,6 +404,7 @@ static int as102_usb_probe(struct usb_interface *intf,
as102_free_usb_stream_buffer(as102_dev);
failed_stream:
usb_deregister_dev(intf, &as102_usb_class_driver);
+ return ret;
failed:
usb_put_dev(as102_dev->bus_adap.usb_dev);
usb_set_intfdata(intf, NULL);
--
In the hackrf driver, the following race condition occurs:
```
CPU0 CPU1
hackrf_probe()
kzalloc(); // alloc hackrf_dev
....
v4l2_device_register();
....
open("/path/to/dev"); // open hackrf dev
....
v4l2_device_unregister();
....
kfree(); // free hackrf_dev
....
ioctl(fd, ...);
v4l2_ioctl();
video_is_registered() // UAF!!
....
close(fd);
v4l2_release() // UAF!!
hackrf_video_release()
kfree(); // DFB!!
```
When a V4L2 or video device is unregistered, the device node is removed so
new open() calls are blocked.
However, file descriptors that are already open, and any in-flight I/O, do
not terminate immediately; they remain valid until the last reference is
dropped and the driver's release() is invoked.
Therefore, freeing device memory on the error path after hackrf_probe()
has registered the device leads to a use-after-free race, since
those already-open handles haven't been released yet.
And since release() frees the memory too, use-after-free and
double-free races occur.
To prevent this, once the device has been registered from probe(), the
driver should free its memory only through release() rather than calling
kfree() directly.
Cc: <stable(a)vger.kernel.org>
Reported-by: syzbot+6ffd76b5405c006a46b7(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=6ffd76b5405c006a46b7
Reported-by: syzbot+f1b20958f93d2d250727(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f1b20958f93d2d250727
Fixes: 8bc4a9ed8504 ("[media] hackrf: add support for transmitter")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
---
v2: Fix incorrect patch description style and CC stable mailing list
- Link to v1: https://lore.kernel.org/all/20250822142729.1156816-1-aha310510@gmail.com/
---
drivers/media/usb/hackrf/hackrf.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/media/usb/hackrf/hackrf.c b/drivers/media/usb/hackrf/hackrf.c
index 0b50de8775a3..d7a84422193d 100644
--- a/drivers/media/usb/hackrf/hackrf.c
+++ b/drivers/media/usb/hackrf/hackrf.c
@@ -1515,6 +1515,8 @@ static int hackrf_probe(struct usb_interface *intf,
video_unregister_device(&dev->rx_vdev);
err_v4l2_device_unregister:
v4l2_device_unregister(&dev->v4l2_dev);
+ dev_dbg(&intf->dev, "failed=%d\n", ret);
+ return ret;
err_v4l2_ctrl_handler_free_tx:
v4l2_ctrl_handler_free(&dev->tx_ctrl_handler);
err_v4l2_ctrl_handler_free_rx:
--
The quilt patch titled
Subject: s390: kexec: initialize kexec_buf struct
has been removed from the -mm tree. Its filename was
s390-kexec-initialize-kexec_buf-struct.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Breno Leitao <leitao(a)debian.org>
Subject: s390: kexec: initialize kexec_buf struct
Date: Wed, 27 Aug 2025 03:42:23 -0700
The kexec_buf structure was previously declared without initialization.
commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
added a field that is always read but not consistently populated by all
architectures. This uninitialized field will contain garbage.
This is also triggering a UBSAN warning when the uninitialized data was
accessed:
------------[ cut here ]------------
UBSAN: invalid-load in ./include/linux/kexec.h:210:10
load of value 252 is not a valid value for type '_Bool'
Zero-initializing kexec_buf at declaration ensures all fields are
cleanly set, preventing future instances of uninitialized memory being
used.
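As an illustration, here is a minimal user-space sketch (the struct and field names below are hypothetical stand-ins, not the real struct kexec_buf) of why zero-initializing at the declaration matters once a structure grows a field that common code always reads:
```c
#include <stdio.h>
#include <stdbool.h>

struct kbuf_like {
	void *image;
	unsigned long mem;
	bool random;	/* newer field: read by common code, not set by every caller */
};

int main(void)
{
	struct kbuf_like uninit;	/* .random holds stack garbage; reading it is
					 * exactly the invalid bool load UBSAN reported */
	struct kbuf_like zeroed = {0};	/* user-space equivalent of the kernel's "= {}":
					 * every field, including .random, is zero */

	(void)uninit;
	printf("zeroed.random = %d\n", zeroed.random);	/* always prints 0 */
	return 0;
}
```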
Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-3-1df9882bb01a@debian.org
Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Cc: Albert Ou <aou(a)eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev(a)linux.ibm.com>
Cc: Alexandre Ghiti <alex(a)ghiti.fr>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Coiby Xu <coxu(a)redhat.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Palmer Dabbelt <palmer(a)dabbelt.com>
Cc: Paul Walmsley <paul.walmsley(a)sifive.com>
Cc: Sven Schnelle <svens(a)linux.ibm.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/s390/kernel/kexec_elf.c | 2 +-
arch/s390/kernel/kexec_image.c | 2 +-
arch/s390/kernel/machine_kexec_file.c | 6 +++---
3 files changed, 5 insertions(+), 5 deletions(-)
--- a/arch/s390/kernel/kexec_elf.c~s390-kexec-initialize-kexec_buf-struct
+++ a/arch/s390/kernel/kexec_elf.c
@@ -16,7 +16,7 @@
static int kexec_file_add_kernel_elf(struct kimage *image,
struct s390_load_data *data)
{
- struct kexec_buf buf;
+ struct kexec_buf buf = {};
const Elf_Ehdr *ehdr;
const Elf_Phdr *phdr;
Elf_Addr entry;
--- a/arch/s390/kernel/kexec_image.c~s390-kexec-initialize-kexec_buf-struct
+++ a/arch/s390/kernel/kexec_image.c
@@ -16,7 +16,7 @@
static int kexec_file_add_kernel_image(struct kimage *image,
struct s390_load_data *data)
{
- struct kexec_buf buf;
+ struct kexec_buf buf = {};
buf.image = image;
--- a/arch/s390/kernel/machine_kexec_file.c~s390-kexec-initialize-kexec_buf-struct
+++ a/arch/s390/kernel/machine_kexec_file.c
@@ -129,7 +129,7 @@ static int kexec_file_update_purgatory(s
static int kexec_file_add_purgatory(struct kimage *image,
struct s390_load_data *data)
{
- struct kexec_buf buf;
+ struct kexec_buf buf = {};
int ret;
buf.image = image;
@@ -152,7 +152,7 @@ static int kexec_file_add_purgatory(stru
static int kexec_file_add_initrd(struct kimage *image,
struct s390_load_data *data)
{
- struct kexec_buf buf;
+ struct kexec_buf buf = {};
int ret;
buf.image = image;
@@ -184,7 +184,7 @@ static int kexec_file_add_ipl_report(str
{
__u32 *lc_ipl_parmblock_ptr;
unsigned int len, ncerts;
- struct kexec_buf buf;
+ struct kexec_buf buf = {};
unsigned long addr;
void *ptr, *end;
int ret;
_
Patches currently in -mm which might be from leitao(a)debian.org are
The quilt patch titled
Subject: riscv: kexec: initialize kexec_buf struct
has been removed from the -mm tree. Its filename was
riscv-kexec-initialize-kexec_buf-struct.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Breno Leitao <leitao(a)debian.org>
Subject: riscv: kexec: initialize kexec_buf struct
Date: Wed, 27 Aug 2025 03:42:22 -0700
The kexec_buf structure was previously declared without initialization.
commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
added a field that is always read but not consistently populated by all
architectures. This uninitialized field will contain garbage.
This is also triggering a UBSAN warning when the uninitialized data was
accessed:
------------[ cut here ]------------
UBSAN: invalid-load in ./include/linux/kexec.h:210:10
load of value 252 is not a valid value for type '_Bool'
Zero-initializing kexec_buf at declaration ensures all fields are
cleanly set, preventing future instances of uninitialized memory being
used.
Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-2-1df9882bb01a@debian.org
Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Cc: Albert Ou <aou(a)eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev(a)linux.ibm.com>
Cc: Alexandre Ghiti <alex(a)ghiti.fr>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Coiby Xu <coxu(a)redhat.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Palmer Dabbelt <palmer(a)dabbelt.com>
Cc: Paul Walmsley <paul.walmsley(a)sifive.com>
Cc: Sven Schnelle <svens(a)linux.ibm.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/riscv/kernel/kexec_elf.c | 4 ++--
arch/riscv/kernel/kexec_image.c | 2 +-
arch/riscv/kernel/machine_kexec_file.c | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)
--- a/arch/riscv/kernel/kexec_elf.c~riscv-kexec-initialize-kexec_buf-struct
+++ a/arch/riscv/kernel/kexec_elf.c
@@ -28,7 +28,7 @@ static int riscv_kexec_elf_load(struct k
int i;
int ret = 0;
size_t size;
- struct kexec_buf kbuf;
+ struct kexec_buf kbuf = {};
const struct elf_phdr *phdr;
kbuf.image = image;
@@ -66,7 +66,7 @@ static int elf_find_pbase(struct kimage
{
int i;
int ret;
- struct kexec_buf kbuf;
+ struct kexec_buf kbuf = {};
const struct elf_phdr *phdr;
unsigned long lowest_paddr = ULONG_MAX;
unsigned long lowest_vaddr = ULONG_MAX;
--- a/arch/riscv/kernel/kexec_image.c~riscv-kexec-initialize-kexec_buf-struct
+++ a/arch/riscv/kernel/kexec_image.c
@@ -41,7 +41,7 @@ static void *image_load(struct kimage *i
struct riscv_image_header *h;
u64 flags;
bool be_image, be_kernel;
- struct kexec_buf kbuf;
+ struct kexec_buf kbuf = {};
int ret;
/* Check Image header */
--- a/arch/riscv/kernel/machine_kexec_file.c~riscv-kexec-initialize-kexec_buf-struct
+++ a/arch/riscv/kernel/machine_kexec_file.c
@@ -261,7 +261,7 @@ int load_extra_segments(struct kimage *i
int ret;
void *fdt;
unsigned long initrd_pbase = 0UL;
- struct kexec_buf kbuf;
+ struct kexec_buf kbuf = {};
char *modified_cmdline = NULL;
kbuf.image = image;
_
Patches currently in -mm which might be from leitao(a)debian.org are
The quilt patch titled
Subject: arm64: kexec: initialize kexec_buf struct in load_other_segments()
has been removed from the -mm tree. Its filename was
arm64-kexec-initialize-kexec_buf-struct-in-load_other_segments.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Breno Leitao <leitao(a)debian.org>
Subject: arm64: kexec: initialize kexec_buf struct in load_other_segments()
Date: Wed, 27 Aug 2025 03:42:21 -0700
Patch series "kexec: Fix invalid field access".
The kexec_buf structure was previously declared without initialization.
commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
added a field that is always read but not consistently populated by all
architectures. This uninitialized field will contain garbage.
This is also triggering a UBSAN warning when the uninitialized data was
accessed:
------------[ cut here ]------------
UBSAN: invalid-load in ./include/linux/kexec.h:210:10
load of value 252 is not a valid value for type '_Bool'
Zero-initializing kexec_buf at declaration ensures all fields are cleanly
set, preventing future instances of uninitialized memory being used.
An initial fix was already landed for arm64[0], and this patchset fixes
the problem on the remaining arm64 code and on riscv, as raised by Mark.
Discussions about this problem could be found at[1][2].
This patch (of 3):
The kexec_buf structure was previously declared without initialization.
commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
added a field that is always read but not consistently populated by all
architectures. This uninitialized field will contain garbage.
This is also triggering a UBSAN warning when the uninitialized data was
accessed:
------------[ cut here ]------------
UBSAN: invalid-load in ./include/linux/kexec.h:210:10
load of value 252 is not a valid value for type '_Bool'
Zero-initializing kexec_buf at declaration ensures all fields are
cleanly set, preventing future instances of uninitialized memory being
used.
Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-0-1df9882bb01a@debian.org
Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-1-1df9882bb01a@debian.org
Link: https://lore.kernel.org/all/20250826180742.f2471131255ec1c43683ea07@linux-f… [0]
Link: https://lore.kernel.org/all/oninomspajhxp4omtdapxnckxydbk2nzmrix7rggmpukpnz… [1]
Link: https://lore.kernel.org/all/20250826-akpm-v1-1-3c831f0e3799@debian.org/ [2]
Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Acked-by: Baoquan He <bhe(a)redhat.com>
Cc: Albert Ou <aou(a)eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev(a)linux.ibm.com>
Cc: Alexandre Ghiti <alex(a)ghiti.fr>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Coiby Xu <coxu(a)redhat.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Palmer Dabbelt <palmer(a)dabbelt.com>
Cc: Paul Walmsley <paul.walmsley(a)sifive.com>
Cc: Sven Schnelle <svens(a)linux.ibm.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/arm64/kernel/machine_kexec_file.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/kernel/machine_kexec_file.c~arm64-kexec-initialize-kexec_buf-struct-in-load_other_segments
+++ a/arch/arm64/kernel/machine_kexec_file.c
@@ -94,7 +94,7 @@ int load_other_segments(struct kimage *i
char *initrd, unsigned long initrd_len,
char *cmdline)
{
- struct kexec_buf kbuf;
+ struct kexec_buf kbuf = {};
void *dtb = NULL;
unsigned long initrd_load_addr = 0, dtb_len,
orig_segments = image->nr_segments;
_
Patches currently in -mm which might be from leitao(a)debian.org are
The quilt patch titled
Subject: mm/damon/reclaim: avoid divide-by-zero in damon_reclaim_apply_parameters()
has been removed from the -mm tree. Its filename was
mm-damon-reclaim-avoid-divide-by-zero-in-damon_reclaim_apply_parameters.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Quanmin Yan <yanquanmin1(a)huawei.com>
Subject: mm/damon/reclaim: avoid divide-by-zero in damon_reclaim_apply_parameters()
Date: Wed, 27 Aug 2025 19:58:58 +0800
When creating a new scheme of DAMON_RECLAIM, the calculation of
'min_age_region' uses 'aggr_interval' as the divisor, which may lead to
division-by-zero errors. Fix it by directly returning -EINVAL when such a
case occurs.
Link: https://lkml.kernel.org/r/20250827115858.1186261-3-yanquanmin1@huawei.com
Fixes: f5a79d7c0c87 ("mm/damon: introduce struct damos_access_pattern")
Signed-off-by: Quanmin Yan <yanquanmin1(a)huawei.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: ze zuo <zuoze1(a)huawei.com>
Cc: <stable(a)vger.kernel.org> [6.1+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/reclaim.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/mm/damon/reclaim.c~mm-damon-reclaim-avoid-divide-by-zero-in-damon_reclaim_apply_parameters
+++ a/mm/damon/reclaim.c
@@ -194,6 +194,11 @@ static int damon_reclaim_apply_parameter
if (err)
return err;
+ if (!damon_reclaim_mon_attrs.aggr_interval) {
+ err = -EINVAL;
+ goto out;
+ }
+
err = damon_set_attrs(param_ctx, &damon_reclaim_mon_attrs);
if (err)
goto out;
_
Patches currently in -mm which might be from yanquanmin1(a)huawei.com are
mm-damon-add-damon_ctx-min_sz_region.patch
The quilt patch titled
Subject: mm/damon/lru_sort: avoid divide-by-zero in damon_lru_sort_apply_parameters()
has been removed from the -mm tree. Its filename was
mm-damon-lru_sort-avoid-divide-by-zero-in-damon_lru_sort_apply_parameters.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Quanmin Yan <yanquanmin1(a)huawei.com>
Subject: mm/damon/lru_sort: avoid divide-by-zero in damon_lru_sort_apply_parameters()
Date: Wed, 27 Aug 2025 19:58:57 +0800
Patch series "mm/damon: avoid divide-by-zero in DAMON module's parameters
application".
DAMON's RECLAIM and LRU_SORT modules perform no validation on
user-configured parameters during application, which may lead to
division-by-zero errors.
Avoid the divide-by-zero by adding validation checks when DAMON modules
attempt to apply the parameters.
This patch (of 2):
During the calculation of 'hot_thres' and 'cold_thres', either
'sample_interval' or 'aggr_interval' is used as the divisor, which may
lead to division-by-zero errors. Fix it by directly returning -EINVAL
when such a case occurs. Additionally, since 'aggr_interval' is already
required to be set no smaller than 'sample_interval' in damon_set_attrs(),
only the case where 'sample_interval' is zero needs to be checked.
Link: https://lkml.kernel.org/r/20250827115858.1186261-2-yanquanmin1@huawei.com
Fixes: 40e983cca927 ("mm/damon: introduce DAMON-based LRU-lists Sorting")
Signed-off-by: Quanmin Yan <yanquanmin1(a)huawei.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: ze zuo <zuoze1(a)huawei.com>
Cc: <stable(a)vger.kernel.org> [6.0+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/lru_sort.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/mm/damon/lru_sort.c~mm-damon-lru_sort-avoid-divide-by-zero-in-damon_lru_sort_apply_parameters
+++ a/mm/damon/lru_sort.c
@@ -198,6 +198,11 @@ static int damon_lru_sort_apply_paramete
if (err)
return err;
+ if (!damon_lru_sort_mon_attrs.sample_interval) {
+ err = -EINVAL;
+ goto out;
+ }
+
err = damon_set_attrs(ctx, &damon_lru_sort_mon_attrs);
if (err)
goto out;
_
Patches currently in -mm which might be from yanquanmin1(a)huawei.com are
mm-damon-add-damon_ctx-min_sz_region.patch
The quilt patch titled
Subject: mm/damon/core: set quota->charged_from to jiffies at first charge window
has been removed from the -mm tree. Its filename was
mm-damon-core-set-quota-charged_from-to-jiffies-at-first-charge-window.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Sang-Heon Jeon <ekffu200098(a)gmail.com>
Subject: mm/damon/core: set quota->charged_from to jiffies at first charge window
Date: Fri, 22 Aug 2025 11:50:57 +0900
The kernel initializes the "jiffies" timer to 5 minutes below zero (that is, 5 minutes before the 32-bit value wraps), as shown in include/linux/jiffies.h:
/*
* Have the 32 bit jiffies value wrap 5 minutes after boot
* so jiffies wrap bugs show up earlier.
*/
#define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))
The jiffies comparison helper functions cast the unsigned values to signed to cover wraparound:
#define time_after_eq(a,b) \
(typecheck(unsigned long, a) && \
typecheck(unsigned long, b) && \
((long)((a) - (b)) >= 0))
When quota->charged_from is initialized to 0, time_after_eq() can
incorrectly return FALSE even after reset_interval has elapsed. This
occurs when (jiffies - reset_interval) produces a value with MSB=1, which
is interpreted as negative in signed arithmetic.
This issue primarily affects 32-bit systems.
On 64-bit systems, MSB=1 values occur only after ~292 million years of uptime (assuming HZ=1000), which is practically impossible.
On 32-bit systems, MSB=1 values occur during the first 5 minutes after boot and during the second half of every jiffies wraparound cycle, starting from day 25 (assuming HZ=1000).
When this unexpected FALSE return from time_after_eq() occurs, the charging window does not reset. The user impact depends on the esz value at that time: if esz is 0, the scheme ignores the configured quotas and runs without any limits; if esz is not 0, the scheme stops working once the quota is exhausted and stays stopped until the charging window finally resets.
So, set quota->charged_from to jiffies in damos_adjust_quota() when it is considered the first charge window. With this change, the unexpected FALSE return from time_after_eq() is avoided.
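The effect can be reproduced in plain C by emulating a 32-bit jiffies counter together with the time_after_eq() cast shown above (a standalone sketch, not kernel code):
#include <stdio.h>
#include <stdint.h>

/* Emulate a 32-bit "unsigned long" and the time_after_eq() cast shown above. */
#define TIME_AFTER_EQ32(a, b)	((int32_t)((uint32_t)(a) - (uint32_t)(b)) >= 0)

int main(void)
{
	const uint32_t HZ = 1000;
	const uint32_t initial_jiffies = (uint32_t)(-300 * (int32_t)HZ);	/* 5 min before wrap */
	const uint32_t reset_interval = 10 * HZ;	/* 10 second quota window */

	/* One minute of uptime: jiffies still has its MSB set on 32-bit. */
	uint32_t jiffies = initial_jiffies + 60 * HZ;
	uint32_t charged_from = 0;	/* never set to jiffies */

	/* 60s have elapsed, but the window is wrongly reported as not expired. */
	printf("expired (charged_from = 0): %d\n",
	       TIME_AFTER_EQ32(jiffies, charged_from + reset_interval));	/* 0 */

	/* With charged_from set to jiffies at the first charge, it behaves. */
	charged_from = initial_jiffies;
	printf("expired (charged_from = first charge): %d\n",
	       TIME_AFTER_EQ32(jiffies, charged_from + reset_interval));	/* 1 */
	return 0;
}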
Link: https://lkml.kernel.org/r/20250822025057.1740854-1-ekffu200098@gmail.com
Fixes: 2b8a248d5873 ("mm/damon/schemes: implement size quota for schemes application speed control") # 5.16
Signed-off-by: Sang-Heon Jeon <ekffu200098(a)gmail.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/core.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/mm/damon/core.c~mm-damon-core-set-quota-charged_from-to-jiffies-at-first-charge-window
+++ a/mm/damon/core.c
@@ -2111,6 +2111,10 @@ static void damos_adjust_quota(struct da
if (!quota->ms && !quota->sz && list_empty(&quota->goals))
return;
+ /* First charge window */
+ if (!quota->total_charged_sz && !quota->charged_from)
+ quota->charged_from = jiffies;
+
/* New charge window starts */
if (time_after_eq(jiffies, quota->charged_from +
msecs_to_jiffies(quota->reset_interval))) {
_
Patches currently in -mm which might be from ekffu200098(a)gmail.com are
mm-damon-update-expired-description-of-damos_action.patch
docs-mm-damon-design-fix-typo-s-sz_trtied-sz_tried.patch
selftests-damon-test-no-op-commit-broke-damon-status.patch
selftests-damon-test-no-op-commit-broke-damon-status-fix.patch
mm-damon-tests-core-kunit-add-damos_commit_filter-test.patch
The quilt patch titled
Subject: mm/hugetlb: add missing hugetlb_lock in __unmap_hugepage_range()
has been removed from the -mm tree. Its filename was
mm-hugetlb-add-missing-hugetlb_lock-in-__unmap_hugepage_range.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Jeongjun Park <aha310510(a)gmail.com>
Subject: mm/hugetlb: add missing hugetlb_lock in __unmap_hugepage_range()
Date: Sun, 24 Aug 2025 03:21:15 +0900
When restoring a reservation for an anonymous page, we need to check whether we are freeing a surplus page. However, __unmap_hugepage_range() causes a data race because it reads h->surplus_huge_pages without the protection of hugetlb_lock.
Also, adjust_reservation is a boolean variable that indicates whether the reservation for the anonymous page in each folio should be restored, so it should be initialized to false for each round of the loop. However, it is only initialized once, where it is defined outside the loop. This means that once adjust_reservation is set to true within the loop, reservations for anonymous pages are restored unconditionally in all subsequent rounds, regardless of the folio's state.
To fix this, add the missing hugetlb_lock, unlock the page_table_lock earlier so that hugetlb_lock is not taken while page_table_lock is held, and initialize adjust_reservation to false on each round of the loop.
Link: https://lkml.kernel.org/r/20250823182115.1193563-1-aha310510@gmail.com
Fixes: df7a6d1f6405 ("mm/hugetlb: restore the reservation if needed")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
Reported-by: syzbot+417aeb05fd190f3a6da9(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=417aeb05fd190f3a6da9
Reviewed-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: Breno Leitao <leitao(a)debian.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-add-missing-hugetlb_lock-in-__unmap_hugepage_range
+++ a/mm/hugetlb.c
@@ -5851,7 +5851,7 @@ void __unmap_hugepage_range(struct mmu_g
spinlock_t *ptl;
struct hstate *h = hstate_vma(vma);
unsigned long sz = huge_page_size(h);
- bool adjust_reservation = false;
+ bool adjust_reservation;
unsigned long last_addr_mask;
bool force_flush = false;
@@ -5944,6 +5944,7 @@ void __unmap_hugepage_range(struct mmu_g
sz);
hugetlb_count_sub(pages_per_huge_page(h), mm);
hugetlb_remove_rmap(folio);
+ spin_unlock(ptl);
/*
* Restore the reservation for anonymous page, otherwise the
@@ -5951,14 +5952,16 @@ void __unmap_hugepage_range(struct mmu_g
* If there we are freeing a surplus, do not set the restore
* reservation bit.
*/
+ adjust_reservation = false;
+
+ spin_lock_irq(&hugetlb_lock);
if (!h->surplus_huge_pages && __vma_private_lock(vma) &&
folio_test_anon(folio)) {
folio_set_hugetlb_restore_reserve(folio);
/* Reservation to be adjusted after the spin lock */
adjust_reservation = true;
}
-
- spin_unlock(ptl);
+ spin_unlock_irq(&hugetlb_lock);
/*
* Adjust the reservation for the region that will have the
_
Patches currently in -mm which might be from aha310510(a)gmail.com are
The quilt patch titled
Subject: mm/khugepaged: fix the address passed to notifier on testing young
has been removed from the -mm tree. Its filename was
mm-khugepaged-fix-the-address-passed-to-notifier-on-testing-young.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Wei Yang <richard.weiyang(a)gmail.com>
Subject: mm/khugepaged: fix the address passed to notifier on testing young
Date: Fri, 22 Aug 2025 06:33:18 +0000
Commit 8ee53820edfd ("thp: mmu_notifier_test_young") introduced
mmu_notifier_test_young(), but we are passing the wrong address.
In xxx_scan_pmd(), the actual iteration address is "_address", not "address". We have seemingly misused the variable from the very beginning. Change it to the right one.
[akpm(a)linux-foundation.org fix whitespace, per everyone]
Link: https://lkml.kernel.org/r/20250822063318.11644-1-richard.weiyang@gmail.com
Fixes: 8ee53820edfd ("thp: mmu_notifier_test_young")
Signed-off-by: Wei Yang <richard.weiyang(a)gmail.com>
Reviewed-by: Dev Jain <dev.jain(a)arm.com>
Reviewed-by: Zi Yan <ziy(a)nvidia.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Nico Pache <npache(a)redhat.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Barry Song <baohua(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/khugepaged.c~mm-khugepaged-fix-the-address-passed-to-notifier-on-testing-young
+++ a/mm/khugepaged.c
@@ -1417,8 +1417,8 @@ static int hpage_collapse_scan_pmd(struc
*/
if (cc->is_khugepaged &&
(pte_young(pteval) || folio_test_young(folio) ||
- folio_test_referenced(folio) || mmu_notifier_test_young(vma->vm_mm,
- address)))
+ folio_test_referenced(folio) ||
+ mmu_notifier_test_young(vma->vm_mm, _address)))
referenced++;
}
if (!writable) {
_
Patches currently in -mm which might be from richard.weiyang(a)gmail.com are
mm-rmap-do-__folio_mod_stat-in-__folio_add_rmap.patch
selftests-mm-do-check_huge_anon-with-a-number-been-passed-in.patch
mm-rmap-not-necessary-to-mask-off-folio_pages_mapped.patch
mm-rmap-use-folio_large_nr_pages-when-we-are-sure-it-is-a-large-folio.patch
selftests-mm-put-general-ksm-operation-into-vm_util.patch
selftests-mm-test-that-rmap-behave-as-expected.patch
mm-khugepaged-use-list_xxx-helper-to-improve-readability.patch
mm-page_alloc-use-xxx_pageblock_isolate-for-better-reading.patch
mm-pageblock-flags-remove-pb_migratetype_bits-pb_migrate_end.patch
mm-page_alloc-find_large_buddy-from-start_pfn-aligned-order.patch
mm-page_alloc-find_large_buddy-from-start_pfn-aligned-order-v2.patch
The quilt patch titled
Subject: arm64: kexec: initialize kexec_buf struct in image_load()
has been removed from the -mm tree. Its filename was
arm64-kexec-initialize-kexec_buf-struct-in-image_load.patch
This patch was dropped because an alternative patch was or shall be merged
------------------------------------------------------
From: Breno Leitao <leitao(a)debian.org>
Subject: arm64: kexec: initialize kexec_buf struct in image_load()
Date: Tue, 26 Aug 2025 05:08:51 -0700
The kexec_buf structure was previously declared without initialization in
image_load(). This led to a UBSAN warning when the structure was expanded
and uninitialized fields were accessed [1].
Zero-initializing kexec_buf at declaration ensures all fields are cleanly
set, preventing future instances of uninitialized memory being used.
Fixes this UBSAN warning:
[ 32.362488] UBSAN: invalid-load in ./include/linux/kexec.h:210:10
[ 32.362649] load of value 252 is not a valid value for type '_Bool'
Andrew Morton suggested that this function is only called 3x a week [2], thus the cost of the memset() is negligible.
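For illustration, a minimal userspace sketch of the idea behind the one-line fix (the struct below is made up and is not the kernel's struct kexec_buf; the empty initializer is a GNU C/C23 extension, use '= { 0 }' in strict older ISO C):
#include <assert.h>
#include <stdbool.h>

/* Made-up struct for the sketch; not the kernel's struct kexec_buf. */
struct buf_sketch {
	unsigned long mem;
	bool top_down;
	bool random;	/* imagine this member was added long after the code using it */
};

int main(void)
{
	struct buf_sketch kbuf = {};	/* every member starts out as zero */

	assert(kbuf.mem == 0);
	assert(!kbuf.top_down && !kbuf.random);
	return 0;
}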
Link: https://lore.kernel.org/all/oninomspajhxp4omtdapxnckxydbk2nzmrix7rggmpukpnz… [1]
Link: https://lore.kernel.org/all/20250825180531.94bfb86a26a43127c0a1296f@linux-f… [2]
Link: https://lkml.kernel.org/r/20250826-akpm-v1-1-3c831f0e3799@debian.org
Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Suggested-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Coiby Xu <coxu(a)redhat.com>
Cc: "Daniel P. Berrange" <berrange(a)redhat.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Dave Young <dyoung(a)redhat.com>
Cc: Kairui Song <ryncsn(a)gmail.com>
Cc: Liu Pingfan <kernelfans(a)gmail.com>
Cc: Milan Broz <gmazyland(a)gmail.com>
Cc: Ondrej Kozina <okozina(a)redhat.com>
Cc: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/arm64/kernel/kexec_image.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/kernel/kexec_image.c~arm64-kexec-initialize-kexec_buf-struct-in-image_load
+++ a/arch/arm64/kernel/kexec_image.c
@@ -41,7 +41,7 @@ static void *image_load(struct kimage *i
struct arm64_image_header *h;
u64 flags, value;
bool be_image, be_kernel;
- struct kexec_buf kbuf;
+ struct kexec_buf kbuf = {};
unsigned long text_offset, kernel_segment_number;
struct kexec_segment *kernel_segment;
int ret;
_
Patches currently in -mm which might be from leitao(a)debian.org are
arm64-kexec-initialize-kexec_buf-struct-in-load_other_segments.patch
riscv-kexec-initialize-kexec_buf-struct.patch
s390-kexec-initialize-kexec_buf-struct.patch
Remove the SMBus Quick operation from this driver because it is not
natively supported by the hardware and is wrongly implemented in the
driver.
The I2C controllers in Realtek RTL9300 and RTL9310 are SMBus-compliant
but there doesn't seem to be native support for the SMBus Quick
operation. It is not explicitly mentioned in the documentation but
looking at the registers which configure an SMBus transaction, one can
see that the data length cannot be set to 0. This suggests that the
hardware doesn't allow any SMBus message without data bytes (except for
those it does on its own; see SMBus Block Read).
The current implementation of the SMBus Quick operation passes a length of 0 (which is actually invalid). Before the bug fix in a previous commit, this led to a 16-byte read operation from an arbitrary register (the one left over from a former transaction, or any other value).
This caused issues like soft-bricked SFP modules after a simple probe with i2cdetect, which uses Quick by default. When run against SFP modules whose EEPROM isn't write-protected, some of the initial bytes get overwritten because a 16-byte write operation is executed instead of a Quick Write. (This temporarily soft-bricked one of my DAC cables.)
Because the SMBus Quick operation is obviously not supported on these controllers (a length of 0 cannot be set, even when no register address is set), remove it instead of claiming there is support. There also shouldn't be any kind of emulated 'Quick' which just performs another kind of operation in the background. Otherwise, specific issues occur in the case of a 'Quick' Write, which actually writes unknown data to an unknown register.
Fixes: c366be720235 ("i2c: Add driver for the RTL9300 I2C controller")
Cc: <stable(a)vger.kernel.org> # v6.13+
Signed-off-by: Jonas Jelonek <jelonek.jonas(a)gmail.com>
Tested-by: Sven Eckelmann <sven(a)narfation.org>
Reviewed-by: Chris Packham <chris.packham(a)alliedtelesis.co.nz>
Tested-by: Chris Packham <chris.packham(a)alliedtelesis.co.nz> # On RTL9302C based board
Tested-by: Markus Stockhausen <markus.stockhausen(a)gmx.de>
---
drivers/i2c/busses/i2c-rtl9300.c | 15 +++------------
1 file changed, 3 insertions(+), 12 deletions(-)
diff --git a/drivers/i2c/busses/i2c-rtl9300.c b/drivers/i2c/busses/i2c-rtl9300.c
index ebd4a85e1bde..9e6232075137 100644
--- a/drivers/i2c/busses/i2c-rtl9300.c
+++ b/drivers/i2c/busses/i2c-rtl9300.c
@@ -235,15 +235,6 @@ static int rtl9300_i2c_smbus_xfer(struct i2c_adapter *adap, u16 addr, unsigned s
}
switch (size) {
- case I2C_SMBUS_QUICK:
- ret = rtl9300_i2c_config_xfer(i2c, chan, addr, 0);
- if (ret)
- goto out_unlock;
- ret = rtl9300_i2c_reg_addr_set(i2c, 0, 0);
- if (ret)
- goto out_unlock;
- break;
-
case I2C_SMBUS_BYTE:
if (read_write == I2C_SMBUS_WRITE) {
ret = rtl9300_i2c_config_xfer(i2c, chan, addr, 0);
@@ -344,9 +335,9 @@ static int rtl9300_i2c_smbus_xfer(struct i2c_adapter *adap, u16 addr, unsigned s
static u32 rtl9300_i2c_func(struct i2c_adapter *a)
{
- return I2C_FUNC_SMBUS_QUICK | I2C_FUNC_SMBUS_BYTE |
- I2C_FUNC_SMBUS_BYTE_DATA | I2C_FUNC_SMBUS_WORD_DATA |
- I2C_FUNC_SMBUS_BLOCK_DATA | I2C_FUNC_SMBUS_I2C_BLOCK;
+ return I2C_FUNC_SMBUS_BYTE | I2C_FUNC_SMBUS_BYTE_DATA |
+ I2C_FUNC_SMBUS_WORD_DATA | I2C_FUNC_SMBUS_BLOCK_DATA |
+ I2C_FUNC_SMBUS_I2C_BLOCK;
}
static const struct i2c_algorithm rtl9300_i2c_algo = {
--
2.48.1
When building powerpc configurations in linux-5.4.y with binutils 2.43
or newer, there is an assembler error in arch/powerpc/boot/util.S:
arch/powerpc/boot/util.S: Assembler messages:
arch/powerpc/boot/util.S:44: Error: junk at end of line, first unrecognized character is `0'
arch/powerpc/boot/util.S:49: Error: syntax error; found `b', expected `,'
arch/powerpc/boot/util.S:49: Error: junk at end of line: `b'
binutils 2.43 contains stricter parsing of certain labels [1].
Remove the unnecessary leading zero to fix the build. This is only
needed in linux-5.4.y because commit 8b14e1dff067 ("powerpc: Remove
support for PowerPC 601") removed this code altogether in 5.10.
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=226749d5a6ff0d5c6… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
arch/powerpc/boot/util.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/boot/util.S b/arch/powerpc/boot/util.S
index f11f0589a669..5ab2bc864e66 100644
--- a/arch/powerpc/boot/util.S
+++ b/arch/powerpc/boot/util.S
@@ -41,12 +41,12 @@ udelay:
srwi r4,r4,16
cmpwi 0,r4,1 /* 601 ? */
bne .Ludelay_not_601
-00: li r0,86 /* Instructions / microsecond? */
+0: li r0,86 /* Instructions / microsecond? */
mtctr r0
10: addi r0,r0,0 /* NOP */
bdnz 10b
subic. r3,r3,1
- bne 00b
+ bne 0b
blr
.Ludelay_not_601:
base-commit: c25f780e491e4734eb27d65aa58e0909fd78ad9f
--
2.51.0
A new warning in clang [1] points out a few places in s5p_mfc_cmd_v6.c
where an uninitialized variable is passed as a const pointer:
drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c:45:7: error: variable 'h2r_args' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
45 | &h2r_args);
| ^~~~~~~~
drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c:133:7: error: variable 'h2r_args' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
133 | &h2r_args);
| ^~~~~~~~
drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c:148:7: error: variable 'h2r_args' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
148 | &h2r_args);
| ^~~~~~~~
The args parameter in s5p_mfc_cmd_host2risc_v6() is never actually used,
so just pass NULL to it in the places where h2r_args is currently
passed, clearing up the warning and not changing the functionality of
the code.
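For reference, the warning class can be reproduced with a few lines of standalone C, assuming a clang release that carries the -Wuninitialized-const-pointer diagnostic (the names below are made up and do not come from the driver):
/* Made-up names; compile with: clang -Wuninitialized-const-pointer repro.c */
struct args_sketch {
	int data[4];
};

static int send_cmd(int cmd, const struct args_sketch *args)
{
	(void)args;	/* the parameter is never read, mirroring the driver */
	return cmd;
}

int main(void)
{
	struct args_sketch a;	/* never initialized */

	/*
	 * clang: variable 'a' is uninitialized when passed as a const
	 * pointer argument here [-Wuninitialized-const-pointer]
	 */
	return send_cmd(0, &a);
}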
Cc: stable(a)vger.kernel.org
Fixes: f96f3cfa0bb8 ("[media] s5p-mfc: Update MFC v4l2 driver to support MFC6.x")
Link: https://github.com/llvm/llvm-project/commit/00dacf8c22f065cb52efb14cd091d44… [1]
Closes: https://github.com/ClangBuiltLinux/linux/issues/2103
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
From what I can tell, it seems like ->cmd_host2risc() is only ever called from v6 code, which always passes NULL after this change. It seems like it should be possible to just drop .cmd_host2risc on the v5 side, then update .cmd_host2risc to only take two parameters? If so, I can send a follow-up as a cleanup, so that this patch can go back relatively conflict-free.
---
.../platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c | 22 +++++-----------------
1 file changed, 5 insertions(+), 17 deletions(-)
diff --git a/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c b/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c
index 47bc3014b5d8..735471c50dbb 100644
--- a/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c
+++ b/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c
@@ -31,7 +31,6 @@ static int s5p_mfc_cmd_host2risc_v6(struct s5p_mfc_dev *dev, int cmd,
static int s5p_mfc_sys_init_cmd_v6(struct s5p_mfc_dev *dev)
{
- struct s5p_mfc_cmd_args h2r_args;
const struct s5p_mfc_buf_size_v6 *buf_size = dev->variant->buf_size->priv;
int ret;
@@ -41,33 +40,23 @@ static int s5p_mfc_sys_init_cmd_v6(struct s5p_mfc_dev *dev)
mfc_write(dev, dev->ctx_buf.dma, S5P_FIMV_CONTEXT_MEM_ADDR_V6);
mfc_write(dev, buf_size->dev_ctx, S5P_FIMV_CONTEXT_MEM_SIZE_V6);
- return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SYS_INIT_V6,
- &h2r_args);
+ return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SYS_INIT_V6, NULL);
}
static int s5p_mfc_sleep_cmd_v6(struct s5p_mfc_dev *dev)
{
- struct s5p_mfc_cmd_args h2r_args;
-
- memset(&h2r_args, 0, sizeof(struct s5p_mfc_cmd_args));
- return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SLEEP_V6,
- &h2r_args);
+ return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SLEEP_V6, NULL);
}
static int s5p_mfc_wakeup_cmd_v6(struct s5p_mfc_dev *dev)
{
- struct s5p_mfc_cmd_args h2r_args;
-
- memset(&h2r_args, 0, sizeof(struct s5p_mfc_cmd_args));
- return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_WAKEUP_V6,
- &h2r_args);
+ return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_WAKEUP_V6, NULL);
}
/* Open a new instance and get its number */
static int s5p_mfc_open_inst_cmd_v6(struct s5p_mfc_ctx *ctx)
{
struct s5p_mfc_dev *dev = ctx->dev;
- struct s5p_mfc_cmd_args h2r_args;
int codec_type;
mfc_debug(2, "Requested codec mode: %d\n", ctx->codec_mode);
@@ -130,14 +119,13 @@ static int s5p_mfc_open_inst_cmd_v6(struct s5p_mfc_ctx *ctx)
mfc_write(dev, 0, S5P_FIMV_D_CRC_CTRL_V6); /* no crc */
return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_OPEN_INSTANCE_V6,
- &h2r_args);
+ NULL);
}
/* Close instance */
static int s5p_mfc_close_inst_cmd_v6(struct s5p_mfc_ctx *ctx)
{
struct s5p_mfc_dev *dev = ctx->dev;
- struct s5p_mfc_cmd_args h2r_args;
int ret = 0;
dev->curr_ctx = ctx->num;
@@ -145,7 +133,7 @@ static int s5p_mfc_close_inst_cmd_v6(struct s5p_mfc_ctx *ctx)
mfc_write(dev, ctx->inst_no, S5P_FIMV_INSTANCE_ID_V6);
ret = s5p_mfc_cmd_host2risc_v6(dev,
S5P_FIMV_H2R_CMD_CLOSE_INSTANCE_V6,
- &h2r_args);
+ NULL);
} else {
ret = -EINVAL;
}
---
base-commit: 347e9f5043c89695b01e66b3ed111755afcf1911
change-id: 20250715-media-s5p-mfc-fix-uninit-const-pointer-cbf944ae4b4b
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
This is the start of the stable review cycle for the 5.4.298 release.
There are 23 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 04 Sep 2025 13:19:14 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.298-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.298-rc1
Imre Deak <imre.deak(a)intel.com>
Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"
Fabio Porcedda <fabio.porcedda(a)gmail.com>
net: usb: qmi_wwan: add Telit Cinterion LE910C4-WWX new compositions
Alex Deucher <alexander.deucher(a)amd.com>
Revert "drm/amdgpu: fix incorrect vm flags to map bo"
Minjong Kim <minbell.kim(a)samsung.com>
HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()
Ping Cheng <pinglinux(a)gmail.com>
HID: wacom: Add a new Art Pen 2
Qasim Ijaz <qasdev00(a)gmail.com>
HID: asus: fix UAF via HID_CLAIMED_INPUT validation
Thijs Raymakers <thijs(a)raymakers.nl>
KVM: x86: use array_index_nospec with indices that come from guest
Li Nan <linan122(a)huawei.com>
efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare
Eric Dumazet <edumazet(a)google.com>
sctp: initialize more fields in sctp_v6_from_sk()
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Set local Xoff after FW update
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon port speed set
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon MTU set
Yeounsu Moon <yyyynoom(a)gmail.com>
net: dlink: fix multicast stats being counted incorrectly
Kuniyuki Iwashima <kuniyu(a)google.com>
atm: atmtcp: Prevent arbitrary write in atmtcp_recv_control().
Christoph Hellwig <hch(a)lst.de>
net/atm: remove the atmdev_ops {get, set}sockopt methods
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_event: Detect if HCI_EV_NUM_COMP_PKTS is unbalanced
Madhavan Srinivasan <maddy(a)linux.ibm.com>
powerpc/kvm: Fix ifdef to remove build warning
Oscar Maes <oscmaes92(a)gmail.com>
net: ipv4: fix regression in local-broadcast routes
Nikolay Kuratov <kniv(a)yandex-team.ru>
vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()
Damien Le Moal <dlemoal(a)kernel.org>
scsi: core: sysfs: Correct sysfs attributes access rights
Tengda Wu <wutengda(a)huaweicloud.com>
ftrace: Fix potential warning in trace_printk_seq during ftrace_dump
Randy Dunlap <rdunlap(a)infradead.org>
pinctrl: STMFX: add missing HAS_IOMEM dependency
-------------
Diffstat:
Makefile | 4 +--
arch/powerpc/kernel/kvm.c | 8 ++---
arch/x86/kvm/lapic.c | 3 ++
arch/x86/kvm/x86.c | 7 ++--
drivers/atm/atmtcp.c | 17 +++++++--
drivers/atm/eni.c | 17 ---------
drivers/atm/firestream.c | 2 --
drivers/atm/fore200e.c | 27 ---------------
drivers/atm/horizon.c | 40 ----------------------
drivers/atm/iphase.c | 16 ---------
drivers/atm/lanai.c | 2 --
drivers/atm/solos-pci.c | 2 --
drivers/atm/zatm.c | 16 ---------
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 +--
drivers/gpu/drm/drm_dp_helper.c | 2 +-
drivers/hid/hid-asus.c | 8 ++++-
drivers/hid/hid-ntrig.c | 3 ++
drivers/hid/wacom_wac.c | 1 +
drivers/net/ethernet/dlink/dl2k.c | 2 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.c | 3 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.h | 12 +++++++
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 19 +++++++++-
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 4 ---
drivers/net/usb/qmi_wwan.c | 3 ++
drivers/pinctrl/Kconfig | 1 +
drivers/scsi/scsi_sysfs.c | 4 +--
drivers/vhost/net.c | 9 +++--
fs/efivarfs/super.c | 4 +++
include/linux/atmdev.h | 10 +-----
kernel/trace/trace.c | 4 +--
net/atm/common.c | 29 ++++++++--------
net/bluetooth/hci_event.c | 12 ++++++-
net/ipv4/route.c | 10 ++++--
net/sctp/ipv6.c | 2 ++
34 files changed, 129 insertions(+), 178 deletions(-)
The function calls of_parse_phandle(), which returns a device node with an incremented reference count. When the bonded device is not available, the function returns NULL without releasing the reference, causing a reference leak.
Add of_node_put(np) to release the device node reference. of_node_put() handles NULL pointers.
Found through static analysis by reviewing the documentation of of_parse_phandle() and cross-checking its usage patterns across the codebase.
Fixes: 7625ee981af1 ("[media] media: platform: rcar_drif: Add DRIF support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/media/platform/renesas/rcar_drif.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/media/platform/renesas/rcar_drif.c b/drivers/media/platform/renesas/rcar_drif.c
index fc8b6bbef793..c5d676eb1091 100644
--- a/drivers/media/platform/renesas/rcar_drif.c
+++ b/drivers/media/platform/renesas/rcar_drif.c
@@ -1246,6 +1246,7 @@ static struct device_node *rcar_drif_bond_enabled(struct platform_device *p)
if (np && of_device_is_available(np))
return np;
+ of_node_put(np);
return NULL;
}
--
2.35.1
Hi Sasha and Greg,
Salvatore Bonaccorso <carnil(a)debian.org> from the Debian Kernel Team has already included this regression fix:
Upstream commit 5189446ba995556eaa3755a6e875bc06675b88bd
"net: ipv4: fix regression in local-broadcast routes"
As far as I can see, this should be included in stable-6.16 and LTS-6.12 (I simply have no interest in other stable branches - please double-check).
I am sure Sasha's new kernel-patch-AI tool has caught this already - I am just kindly informing you.
Thanks.
Best regards,
-Sedat-
https://salsa.debian.org/kernel-team/linux/-/commit/194de383c5cd5e8c22cadfc…https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
Hi all,
Here's a collection of fixes that I *think* are bugs in fuse, along with
some scattered improvements.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
This has been running on the djcloud for months with no problems. Enjoy!
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fu…
---
Commits in this patchset:
* fuse: fix livelock in synchronous file put from fuseblk workers
* fuse: flush pending fuse events before aborting the connection
* fuse: capture the unique id of fuse commands being sent
* fuse: implement file attributes mask for statx
* fuse: update file mode when updating acls
* fuse: propagate default and file acls on creation
* fuse: enable FUSE_SYNCFS for all servers
---
fs/fuse/fuse_i.h | 14 +++++++
fs/fuse/acl.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/dev.c | 44 +++++++++++++++++++++++
fs/fuse/dev_uring.c | 8 ++++
fs/fuse/dir.c | 96 +++++++++++++++++++++++++++++++++++++++------------
fs/fuse/file.c | 10 +++++
fs/fuse/inode.c | 5 +++
7 files changed, 245 insertions(+), 27 deletions(-)
Add missing of_node_put() and put_device() calls to release references. The function calls of_parse_phandle() and of_find_device_by_node() but fails to release the references they return. The documentation of both functions mentions that the returned references must be dropped after use.
Found through static analysis by reviewing the documentation and cross-checking their usage patterns.
Fixes: 2d1021487273 ("phy: tegra: xusb: Add wake/sleepwalk for Tegra210")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/phy/tegra/xusb-tegra210.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/phy/tegra/xusb-tegra210.c b/drivers/phy/tegra/xusb-tegra210.c
index ebc8a7e21a31..cbfca51a53bc 100644
--- a/drivers/phy/tegra/xusb-tegra210.c
+++ b/drivers/phy/tegra/xusb-tegra210.c
@@ -3164,18 +3164,23 @@ tegra210_xusb_padctl_probe(struct device *dev,
}
pdev = of_find_device_by_node(np);
+ of_node_put(np);
if (!pdev) {
dev_warn(dev, "PMC device is not available\n");
goto out;
}
- if (!platform_get_drvdata(pdev))
+ if (!platform_get_drvdata(pdev)) {
+ put_device(&pdev->dev);
return ERR_PTR(-EPROBE_DEFER);
+ }
padctl->regmap = dev_get_regmap(&pdev->dev, "usb_sleepwalk");
if (!padctl->regmap)
dev_info(dev, "failed to find PMC regmap\n");
+ put_device(&pdev->dev);
+
out:
return &padctl->base;
}
--
2.35.1
The cdnsp-pci driver uses pcim_enable_device() to enable a PCI device,
which means the device will be automatically disabled on driver detach
through the managed device framework. The manual pci_disable_device()
call in the error path is therefore redundant.
Found via static analysis; this is similar to commit 99ca0b57e49f
("thermal: intel: int340x: processor: Fix warning during module unload").
Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/usb/cdns3/cdnsp-pci.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/usb/cdns3/cdnsp-pci.c b/drivers/usb/cdns3/cdnsp-pci.c
index 8c361b8394e9..5e7b88ca8b96 100644
--- a/drivers/usb/cdns3/cdnsp-pci.c
+++ b/drivers/usb/cdns3/cdnsp-pci.c
@@ -85,7 +85,7 @@ static int cdnsp_pci_probe(struct pci_dev *pdev,
cdnsp = kzalloc(sizeof(*cdnsp), GFP_KERNEL);
if (!cdnsp) {
ret = -ENOMEM;
- goto disable_pci;
+ goto put_pci;
}
}
@@ -168,9 +168,6 @@ static int cdnsp_pci_probe(struct pci_dev *pdev,
if (!pci_is_enabled(func))
kfree(cdnsp);
-disable_pci:
- pci_disable_device(pdev);
-
put_pci:
pci_dev_put(func);
--
2.35.1
This is the start of the stable review cycle for the 6.1.150 release.
There are 50 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 04 Sep 2025 13:19:14 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.150-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.1.150-rc1
Eric Sandeen <sandeen(a)redhat.com>
xfs: do not propagate ENODATA disk errors into xattr code
Imre Deak <imre.deak(a)intel.com>
Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Handle reads greater than 60 bytes
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Don't set bus speed on every transfer
Eric Dumazet <edumazet(a)google.com>
net: rose: fix a typo in rose_clear_routes()
James Jones <jajones(a)nvidia.com>
drm/nouveau/disp: Always accept linear modifier
Steve French <stfrench(a)microsoft.com>
smb3 client: fix return code mapping of remap_file_range
Fabio Porcedda <fabio.porcedda(a)gmail.com>
net: usb: qmi_wwan: add Telit Cinterion LE910C4-WWX new compositions
Shuhao Fu <sfual(a)cse.ust.hk>
fs/smb: Fix inconsistent refcnt update
Shanker Donthineni <sdonthineni(a)nvidia.com>
dma/pool: Ensure DMA_DIRECT_REMAP allocations are decrypted
Alex Deucher <alexander.deucher(a)amd.com>
Revert "drm/amdgpu: fix incorrect vm flags to map bo"
Minjong Kim <minbell.kim(a)samsung.com>
HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()
Ping Cheng <pinglinux(a)gmail.com>
HID: wacom: Add a new Art Pen 2
Qasim Ijaz <qasdev00(a)gmail.com>
HID: multitouch: fix slab out-of-bounds access in mt_report_fixup()
Qasim Ijaz <qasdev00(a)gmail.com>
HID: asus: fix UAF via HID_CLAIMED_INPUT validation
Thijs Raymakers <thijs(a)raymakers.nl>
KVM: x86: use array_index_nospec with indices that come from guest
Li Nan <linan122(a)huawei.com>
efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare
Eric Dumazet <edumazet(a)google.com>
sctp: initialize more fields in sctp_v6_from_sk()
Takamitsu Iwai <takamitz(a)amazon.co.jp>
net: rose: include node references in rose_neigh refcount
Takamitsu Iwai <takamitz(a)amazon.co.jp>
net: rose: convert 'use' field to refcount_t
Takamitsu Iwai <takamitz(a)amazon.co.jp>
net: rose: split remove and free operations in rose_remove_neigh()
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Set local Xoff after FW update
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon port speed set
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon MTU set
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Reload auxiliary drivers on fw_activate
Horatiu Vultur <horatiu.vultur(a)microchip.com>
phy: mscc: Fix when PTP clock is register and unregister
Yeounsu Moon <yyyynoom(a)gmail.com>
net: dlink: fix multicast stats being counted incorrectly
Kuniyuki Iwashima <kuniyu(a)google.com>
atm: atmtcp: Prevent arbitrary write in atmtcp_recv_control().
Pavel Shpakovskiy <pashpakovskii(a)salutedevices.com>
Bluetooth: hci_sync: fix set_local_name race condition
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_event: Detect if HCI_EV_NUM_COMP_PKTS is unbalanced
Ludovico de Nittis <ludovico.denittis(a)collabora.com>
Bluetooth: hci_event: Mark connection as closed during suspend disconnect
Ludovico de Nittis <ludovico.denittis(a)collabora.com>
Bluetooth: hci_event: Treat UNKNOWN_CONN_ID on disconnect as success
José Expósito <jose.exposito89(a)gmail.com>
HID: input: report battery status changes immediately
José Expósito <jose.exposito89(a)gmail.com>
HID: input: rename hidinput_set_battery_charge_status()
Madhavan Srinivasan <maddy(a)linux.ibm.com>
powerpc/kvm: Fix ifdef to remove build warning
Rob Clark <robin.clark(a)oss.qualcomm.com>
drm/msm: Defer fd_install in SUBMIT ioctl
Oscar Maes <oscmaes92(a)gmail.com>
net: ipv4: fix regression in local-broadcast routes
Nikolay Kuratov <kniv(a)yandex-team.ru>
vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFS: Fix a race when updating an existing write
Christoph Hellwig <hch(a)lst.de>
nfs: fold nfs_page_group_lock_subrequests into nfs_lock_and_join_requests
Werner Sembach <wse(a)tuxedocomputers.com>
ACPI: EC: Add device to acpi_ec_no_wakeup[] qurik list
Alexey Klimov <alexey.klimov(a)linaro.org>
ASoC: codecs: tx-macro: correct tx_macro_component_drv name
Paulo Alcantara <pc(a)manguebit.org>
smb: client: fix race with concurrent opens in rename(2)
Paulo Alcantara <pc(a)manguebit.org>
smb: client: fix race with concurrent opens in unlink(2)
Damien Le Moal <dlemoal(a)kernel.org>
scsi: core: sysfs: Correct sysfs attributes access rights
Tengda Wu <wutengda(a)huaweicloud.com>
ftrace: Fix potential warning in trace_printk_seq during ftrace_dump
Aleksander Jan Bajkowski <olek2(a)wp.pl>
mips: lantiq: xway: sysctrl: rename the etop node
Aleksander Jan Bajkowski <olek2(a)wp.pl>
mips: dts: lantiq: danube: add missing burst length property
Randy Dunlap <rdunlap(a)infradead.org>
pinctrl: STMFX: add missing HAS_IOMEM dependency
-------------
Diffstat:
Makefile | 4 +-
arch/mips/boot/dts/lantiq/danube_easy50712.dts | 5 +-
arch/mips/lantiq/xway/sysctrl.c | 10 +-
arch/powerpc/kernel/kvm.c | 8 +-
arch/x86/kvm/lapic.c | 2 +
arch/x86/kvm/x86.c | 7 +-
drivers/acpi/ec.c | 6 +
drivers/atm/atmtcp.c | 17 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 +-
drivers/gpu/drm/display/drm_dp_helper.c | 2 +-
drivers/gpu/drm/msm/msm_gem_submit.c | 14 +-
drivers/gpu/drm/nouveau/dispnv50/wndw.c | 4 +
drivers/hid/hid-asus.c | 8 +-
drivers/hid/hid-input-test.c | 10 +-
drivers/hid/hid-input.c | 51 ++++----
drivers/hid/hid-mcp2221.c | 71 +++++++----
drivers/hid/hid-multitouch.c | 8 ++
drivers/hid/hid-ntrig.c | 3 +
drivers/hid/wacom_wac.c | 1 +
drivers/net/ethernet/dlink/dl2k.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 2 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.c | 3 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.h | 12 ++
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 19 ++-
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 4 -
drivers/net/phy/mscc/mscc.h | 4 +
drivers/net/phy/mscc/mscc_main.c | 4 +-
drivers/net/phy/mscc/mscc_ptp.c | 34 +++--
drivers/net/usb/qmi_wwan.c | 3 +
drivers/pinctrl/Kconfig | 1 +
drivers/scsi/scsi_sysfs.c | 4 +-
drivers/vhost/net.c | 9 +-
fs/efivarfs/super.c | 4 +
fs/nfs/pagelist.c | 86 +------------
fs/nfs/write.c | 142 +++++++++++++--------
fs/smb/client/cifsfs.c | 14 ++
fs/smb/client/inode.c | 34 ++++-
fs/smb/client/smb2inode.c | 7 +-
fs/xfs/libxfs/xfs_attr_remote.c | 7 +
fs/xfs/libxfs/xfs_da_btree.c | 6 +
include/linux/atmdev.h | 1 +
include/linux/nfs_page.h | 2 +-
include/net/bluetooth/hci_sync.h | 2 +-
include/net/rose.h | 18 ++-
kernel/dma/pool.c | 4 +-
kernel/trace/trace.c | 4 +-
net/atm/common.c | 15 ++-
net/bluetooth/hci_event.c | 20 ++-
net/bluetooth/hci_sync.c | 6 +-
net/bluetooth/mgmt.c | 5 +-
net/ipv4/route.c | 10 +-
net/rose/af_rose.c | 13 +-
net/rose/rose_in.c | 12 +-
net/rose/rose_route.c | 62 +++++----
net/rose/rose_timer.c | 2 +-
net/sctp/ipv6.c | 2 +
sound/soc/codecs/lpass-tx-macro.c | 2 +-
57 files changed, 514 insertions(+), 302 deletions(-)
This converts the vexpress-sysreg MFD driver to the new generic GPIO chip API, but first fixes an issue with an unchecked return value of devm_gpiochip_add_data().
Lee: Please, create an immutable branch containing these commits after
you pick them up, as I'd like to merge it into the GPIO tree and remove
the legacy interface in this cycle.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
---
Bartosz Golaszewski (2):
mfd: vexpress-sysreg: check the return value of devm_gpiochip_add_data()
mfd: vexpress-sysreg: use new generic GPIO chip API
drivers/mfd/vexpress-sysreg.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250728-gpio-mmio-mfd-conv-d27c2cfbccfe
Best regards,
--
Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
From: SujanaSubr <sujana.subramaniam(a)sap.com>
[ Upstream commit 1ce840c7a659aa53a31ef49f0271b4fd0dc10296 ]
Currently, when a firmware failure occurs during the matcher disconnect flow, the error path of the function reconnects the matcher and returns an error, after which the calling function keeps running and eventually frees the matcher that is being disconnected.
This leads to a case where we have a freed matcher on the matchers list, which in turn leads to a use-after-free and an eventual crash.
This patch fixes that by not trying to reconnect the matcher when some FW command fails during disconnect.
Note that we're dealing here with a FW error. We can't overcome this problem. It might lead to a bad steering state (e.g. a wrong connection between matchers) and will also lead to resource leakage, as is the case with any other error handling during resource destruction. However, the goal here is to allow the driver to continue and not crash the machine with a use-after-free error.
Signed-off-by: Yevgeny Kliteynik <kliteyn(a)nvidia.com>
Signed-off-by: Itamar Gozlan <igozlan(a)nvidia.com>
Reviewed-by: Mark Bloch <mbloch(a)nvidia.com>
Signed-off-by: Tariq Toukan <tariqt(a)nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
Signed-off-by: Akendo <akendo(a)akendo.eu>
Signed-off-by: SujanaSubr <sujana.subramaniam(a)sap.com>
---
.../mlx5/core/steering/hws/mlx5hws_matcher.c | 24 +++++++------------
1 file changed, 8 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c
index 61a1155d4b4f..ce541c60c5b4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c
@@ -165,14 +165,14 @@ static int hws_matcher_disconnect(struct mlx5hws_matcher *matcher)
next->match_ste.rtc_0_id,
next->match_ste.rtc_1_id);
if (ret) {
- mlx5hws_err(tbl->ctx, "Failed to disconnect matcher\n");
- goto matcher_reconnect;
+ mlx5hws_err(tbl->ctx, "Fatal error, failed to disconnect matcher\n");
+ return ret;
}
} else {
ret = mlx5hws_table_connect_to_miss_table(tbl, tbl->default_miss.miss_tbl);
if (ret) {
- mlx5hws_err(tbl->ctx, "Failed to disconnect last matcher\n");
- goto matcher_reconnect;
+ mlx5hws_err(tbl->ctx, "Fatal error, failed to disconnect last matcher\n");
+ return ret;
}
}
@@ -180,27 +180,19 @@ static int hws_matcher_disconnect(struct mlx5hws_matcher *matcher)
if (prev_ft_id == tbl->ft_id) {
ret = mlx5hws_table_update_connected_miss_tables(tbl);
if (ret) {
- mlx5hws_err(tbl->ctx, "Fatal error, failed to update connected miss table\n");
- goto matcher_reconnect;
+ mlx5hws_err(tbl->ctx,
+ "Fatal error, failed to update connected miss table\n");
+ return ret;
}
}
ret = mlx5hws_table_ft_set_default_next_ft(tbl, prev_ft_id);
if (ret) {
mlx5hws_err(tbl->ctx, "Fatal error, failed to restore matcher ft default miss\n");
- goto matcher_reconnect;
+ return ret;
}
return 0;
-
-matcher_reconnect:
- if (list_empty(&tbl->matchers_list) || !prev)
- list_add(&matcher->list_node, &tbl->matchers_list);
- else
- /* insert after prev matcher */
- list_add(&matcher->list_node, &prev->list_node);
-
- return ret;
}
static void hws_matcher_set_rtc_attr_sz(struct mlx5hws_matcher *matcher,
--
2.39.5 (Apple Git-154)
Problem: when pinctrl core binds pins to a consumer device and the
pinmux ops of the underlying driver are marked as strict, the pin in
question can no longer be requested as a GPIO using the GPIO descriptor
API. It will result in the following error:
[ 5.095688] sc8280xp-tlmm f100000.pinctrl: pin GPIO_25 already requested by regulator-edp-3p3; cannot claim for f100000.pinctrl:570
[ 5.107822] sc8280xp-tlmm f100000.pinctrl: error -EINVAL: pin-25 (f100000.pinctrl:570)
This typically makes sense except when the pins are muxed to a function
that actually says "GPIO". Of course, the function name is just a string
so it has no meaning to the pinctrl subsystem.
We have many Qualcomm SoCs (and I can imagine it's a common pattern on other platforms as well) where we mux a pin to the "gpio" function using the `pinctrl-X` property in order to configure bias or drive-strength and then access it using the gpiod API. This makes it impossible to mark the pin controller module as "strict".
This series proposes to introduce the concept of a sub-category of pin functions: GPIO functions, for which the above is not true and a pin muxed as a GPIO can still be accessed via the GPIO consumer API even with strict pinmuxers.
To that end: we first clean up the drivers that use struct function_desc
and make them use the smaller struct pinfunction instead - which is the
correct structure for drivers to describe their pin functions with. We
also rework pinmux core to not duplicate memory used to store the
pinfunctions unless they're allocated dynamically.
First: provide the kmemdup_const() helper which only duplicates memory
if it's not in the .rodata section. Then rework all pinctrl drivers that
instantiate objects of type struct function_desc as they should only be
created by pinmux core. Next constify the return value of the accessor
used to expose these structures to users and finally convert the
pinfunction object within struct function_desc to a pointer and use
kmemdup_const() to assign it. With this done proceed to add
infrastructure for the GPIO pin function category and use it in Qualcomm
drivers. At the very end: make the Qualcomm pinmuxer strict.
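As a rough, self-contained illustration of the proposed concept (plain C, not the actual pinctrl/pinmux API; the structure and helper names below are made up for this sketch):
#include <stdbool.h>
#include <stdio.h>

/* Made-up types; the real series expresses this through the pinmux core. */
struct pinfunction_sketch {
	const char *name;
	bool requestable_as_gpio;	/* the proposed sub-category */
};

static bool strict_allows_gpio_request(const struct pinfunction_sketch *muxed)
{
	/* A free pin can always be requested. */
	if (!muxed)
		return true;
	/*
	 * A strict pinmuxer normally refuses because the pin is already
	 * owned by a function - unless that function is a GPIO function.
	 */
	return muxed->requestable_as_gpio;
}

int main(void)
{
	struct pinfunction_sketch gpio = { "gpio", true };
	struct pinfunction_sketch uart = { "qup14", false };

	printf("muxed to gpio  -> gpiod request allowed: %d\n",
	       strict_allows_gpio_request(&gpio));	/* 1 */
	printf("muxed to qup14 -> gpiod request allowed: %d\n",
	       strict_allows_gpio_request(&uart));	/* 0 */
	return 0;
}
The real mechanism lives in the pinmux core and the converted drivers listed in the patch summary below.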
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
---
Changes in v7:
- Add a patch checking the return value of the get_function_name()
callback in pinmux_func_name_to_selector(). This fixes a NULL-pointer
dereference on IMX platforms
- Don't assign the number of functions in pinctrl device in the IMX
driver as it's done automatically when adding the pinfunctions using
the provided API. This fixes a warning from pinctrl core on IMX
platforms triggered by the conversion from accessing the radix tree
directly
- Link to v6: https://lore.kernel.org/r/20250828-pinctrl-gpio-pinfuncs-v6-0-c9abb6bdb689@…
Changes in v6:
- Select GENERIC_PINMUX_FUNCTIONS when using generic pinmux helpers in
qcom pinctrl drivers to fix build on ARM 32-bit platforms
- Assume that a pin can be requested in pin_request() if it has no
mux_setting assigned
- Also check if a function is a GPIO for pins within GPIO ranges
- Fix an issue with the imx pinctrl driver where the conversion patch
confused the function and pin group radix trees
- Add a FIXME to the imx driver mentioning the need to switch to the
provided helpers for accessing the group radix tree
- Link to v5: https://lore.kernel.org/r/20250815-pinctrl-gpio-pinfuncs-v5-0-955de9fd91db@…
Changes in v5:
- Fix a potential NULL-pointer dereference in
pinmux_can_be_used_for_gpio()
- Use PINCTRL_PINFUNCTION() in pinctrl-airoha
- Link to v4: https://lore.kernel.org/r/20250812-pinctrl-gpio-pinfuncs-v4-0-bb3906c55e64@…
Changes in v4:
- Update the GPIO pin function definitions to include the new qcom
driver (milos)
- Provide devm_kmemdup_const() instead of a non-managed kmemdup_const()
as a way to avoid casting out the 'const' modifier when passing the
const pointer to devm_add_action_or_reset()
- Use devm_krealloc_array() where applicable instead of devm_krealloc()
- Fix typos
- Fix kerneldocs
- Improve commit messages
- Small tweaks as pointed out by Andy
- Rebased on top of v6.17-rc1
- Link to v3: https://lore.kernel.org/r/20250724-pinctrl-gpio-pinfuncs-v3-0-af4db9302de4@…
Changes in v3:
- Add more patches in front: convert pinctrl drivers to stop defining
their own struct function_desc objects and make pinmux core not
duplicate .rodata memory in which struct pinfunction objects are
stored.
- Add a patch constifying pinmux_generic_get_function().
- Drop patches that were applied upstream.
- Link to v2: https://lore.kernel.org/r/20250709-pinctrl-gpio-pinfuncs-v2-0-b6135149c0d9@…
Changes in v2:
- Extend the series with providing pinmux_generic_add_pinfunction(),
using it in several drivers and converting pinctrl-msm to using
generic pinmux helpers
- Add a generic function_is_gpio() callback for pinmux_ops
- Convert all qualcomm drivers to using the new GPIO pin category so
that we can actually enable the strict flag
- Link to v1: https://lore.kernel.org/r/20250702-pinctrl-gpio-pinfuncs-v1-0-ed2bd0f9468d@…
---
Bartosz Golaszewski (16):
pinctrl: check the return value of pinmux_ops::get_function_name()
devres: provide devm_kmemdup_const()
pinctrl: ingenic: use struct pinfunction instead of struct function_desc
pinctrl: airoha: replace struct function_desc with struct pinfunction
pinctrl: mediatek: mt7988: use PINCTRL_PIN_FUNCTION()
pinctrl: mediatek: moore: replace struct function_desc with struct pinfunction
pinctrl: imx: don't access the pin function radix tree directly
pinctrl: keembay: release allocated memory in detach path
pinctrl: keembay: use a dedicated structure for the pinfunction description
pinctrl: constify pinmux_generic_get_function()
pinctrl: make struct pinfunction a pointer in struct function_desc
pinctrl: qcom: use generic pin function helpers
pinctrl: allow to mark pin functions as requestable GPIOs
pinctrl: qcom: add infrastructure for marking pin functions as GPIOs
pinctrl: qcom: mark the `gpio` and `egpio` pins function as non-strict functions
pinctrl: qcom: make the pinmuxing strict
drivers/base/devres.c | 21 +++++++
drivers/pinctrl/freescale/pinctrl-imx.c | 45 +++++++--------
drivers/pinctrl/mediatek/pinctrl-airoha.c | 19 +++----
drivers/pinctrl/mediatek/pinctrl-moore.c | 10 ++--
drivers/pinctrl/mediatek/pinctrl-moore.h | 7 +--
drivers/pinctrl/mediatek/pinctrl-mt7622.c | 2 +-
drivers/pinctrl/mediatek/pinctrl-mt7623.c | 2 +-
drivers/pinctrl/mediatek/pinctrl-mt7629.c | 2 +-
drivers/pinctrl/mediatek/pinctrl-mt7981.c | 2 +-
drivers/pinctrl/mediatek/pinctrl-mt7986.c | 2 +-
drivers/pinctrl/mediatek/pinctrl-mt7988.c | 44 ++++++---------
drivers/pinctrl/mediatek/pinctrl-mtk-common-v2.h | 2 +-
drivers/pinctrl/pinctrl-equilibrium.c | 2 +-
drivers/pinctrl/pinctrl-ingenic.c | 49 ++++++++---------
drivers/pinctrl/pinctrl-keembay.c | 26 ++++++---
drivers/pinctrl/pinctrl-single.c | 4 +-
drivers/pinctrl/pinmux.c | 70 ++++++++++++++++++++----
drivers/pinctrl/pinmux.h | 9 ++-
drivers/pinctrl/qcom/Kconfig | 1 +
drivers/pinctrl/qcom/pinctrl-ipq5018.c | 2 +-
drivers/pinctrl/qcom/pinctrl-ipq5332.c | 2 +-
drivers/pinctrl/qcom/pinctrl-ipq5424.c | 2 +-
drivers/pinctrl/qcom/pinctrl-ipq6018.c | 2 +-
drivers/pinctrl/qcom/pinctrl-ipq8074.c | 2 +-
drivers/pinctrl/qcom/pinctrl-ipq9574.c | 2 +-
drivers/pinctrl/qcom/pinctrl-mdm9607.c | 2 +-
drivers/pinctrl/qcom/pinctrl-mdm9615.c | 2 +-
drivers/pinctrl/qcom/pinctrl-milos.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm.c | 45 +++++----------
drivers/pinctrl/qcom/pinctrl-msm.h | 5 ++
drivers/pinctrl/qcom/pinctrl-msm8226.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8660.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8909.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8916.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8917.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8953.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8960.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8976.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8994.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8996.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8998.c | 2 +-
drivers/pinctrl/qcom/pinctrl-msm8x74.c | 2 +-
drivers/pinctrl/qcom/pinctrl-qcm2290.c | 4 +-
drivers/pinctrl/qcom/pinctrl-qcs404.c | 2 +-
drivers/pinctrl/qcom/pinctrl-qcs615.c | 2 +-
drivers/pinctrl/qcom/pinctrl-qcs8300.c | 4 +-
drivers/pinctrl/qcom/pinctrl-qdu1000.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sa8775p.c | 4 +-
drivers/pinctrl/qcom/pinctrl-sar2130p.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sc7180.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sc7280.c | 4 +-
drivers/pinctrl/qcom/pinctrl-sc8180x.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sc8280xp.c | 4 +-
drivers/pinctrl/qcom/pinctrl-sdm660.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sdm670.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sdm845.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sdx55.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sdx65.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sdx75.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm4450.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm6115.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm6125.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm6350.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm6375.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm7150.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm8150.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm8250.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm8350.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm8450.c | 4 +-
drivers/pinctrl/qcom/pinctrl-sm8550.c | 2 +-
drivers/pinctrl/qcom/pinctrl-sm8650.c | 4 +-
drivers/pinctrl/qcom/pinctrl-sm8750.c | 4 +-
drivers/pinctrl/qcom/pinctrl-x1e80100.c | 2 +-
drivers/pinctrl/renesas/pinctrl-rza1.c | 2 +-
drivers/pinctrl/renesas/pinctrl-rza2.c | 2 +-
drivers/pinctrl/renesas/pinctrl-rzg2l.c | 2 +-
drivers/pinctrl/renesas/pinctrl-rzv2m.c | 2 +-
include/linux/device/devres.h | 2 +
include/linux/pinctrl/pinctrl.h | 14 +++++
include/linux/pinctrl/pinmux.h | 2 +
80 files changed, 288 insertions(+), 227 deletions(-)
---
base-commit: b320789d6883cc00ac78ce83bccbfe7ed58afcf0
change-id: 20250701-pinctrl-gpio-pinfuncs-de82bd9aac43
Best regards,
--
Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
This is the start of the stable review cycle for the 6.6.104 release.
There are 75 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 04 Sep 2025 13:19:14 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.104-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.6.104-rc1
Eric Sandeen <sandeen(a)redhat.com>
xfs: do not propagate ENODATA disk errors into xattr code
Imre Deak <imre.deak(a)intel.com>
Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Handle reads greater than 60 bytes
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Don't set bus speed on every transfer
Chris Mi <cmi(a)nvidia.com>
net/mlx5: SF, Fix add port error handling
Eric Dumazet <edumazet(a)google.com>
net: rose: fix a typo in rose_clear_routes()
James Jones <jajones(a)nvidia.com>
drm/nouveau/disp: Always accept linear modifier
Steve French <stfrench(a)microsoft.com>
smb3 client: fix return code mapping of remap_file_range
Fabio Porcedda <fabio.porcedda(a)gmail.com>
net: usb: qmi_wwan: add Telit Cinterion LE910C4-WWX new compositions
Shuhao Fu <sfual(a)cse.ust.hk>
fs/smb: Fix inconsistent refcnt update
Shanker Donthineni <sdonthineni(a)nvidia.com>
dma/pool: Ensure DMA_DIRECT_REMAP allocations are decrypted
Alex Deucher <alexander.deucher(a)amd.com>
Revert "drm/amdgpu: fix incorrect vm flags to map bo"
Minjong Kim <minbell.kim(a)samsung.com>
HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()
Ping Cheng <pinglinux(a)gmail.com>
HID: wacom: Add a new Art Pen 2
Matt Coffin <mcoffin13(a)gmail.com>
HID: logitech: Add ids for G PRO 2 LIGHTSPEED
Antheas Kapenekakis <lkml(a)antheas.dev>
HID: quirks: add support for Legion Go dual dinput modes
Qasim Ijaz <qasdev00(a)gmail.com>
HID: multitouch: fix slab out-of-bounds access in mt_report_fixup()
Qasim Ijaz <qasdev00(a)gmail.com>
HID: asus: fix UAF via HID_CLAIMED_INPUT validation
Borislav Petkov (AMD) <bp(a)alien8.de>
x86/microcode/AMD: Handle the case of no BIOS microcode
Thijs Raymakers <thijs(a)raymakers.nl>
KVM: x86: use array_index_nospec with indices that come from guest
Li Nan <linan122(a)huawei.com>
efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare
Eric Dumazet <edumazet(a)google.com>
sctp: initialize more fields in sctp_v6_from_sk()
Takamitsu Iwai <takamitz(a)amazon.co.jp>
net: rose: include node references in rose_neigh refcount
Takamitsu Iwai <takamitz(a)amazon.co.jp>
net: rose: convert 'use' field to refcount_t
Takamitsu Iwai <takamitz(a)amazon.co.jp>
net: rose: split remove and free operations in rose_remove_neigh()
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: Set CIC bit only for TX queues with COE
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: xgmac: Correct supported speed modes
Serge Semin <fancer.lancer(a)gmail.com>
net: stmmac: Rename phylink_get_caps() callback to update_caps()
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Set local Xoff after FW update
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon port speed set
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon MTU set
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Nack sync reset when SFs are present
Jiri Pirko <jiri(a)resnulli.us>
net/mlx5: Convert SF port_indices xarray to function_ids xarray
Jiri Pirko <jiri(a)resnulli.us>
net/mlx5: Use devlink port pointer to get the pointer of container SF struct
Jiri Pirko <jiri(a)resnulli.us>
net/mlx5: Call mlx5_sf_id_erase() once in mlx5_sf_dealloc()
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Fix lockdep assertion on sync reset unload event
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Add support for sync reset using hot reset
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Add device cap for supporting hot reset in sync reset flow
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Reload auxiliary drivers on fw_activate
Horatiu Vultur <horatiu.vultur(a)microchip.com>
phy: mscc: Fix when PTP clock is register and unregister
Yeounsu Moon <yyyynoom(a)gmail.com>
net: dlink: fix multicast stats being counted incorrectly
Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com>
dt-bindings: display/msm: qcom,mdp5: drop lut clock
Michal Kubiak <michal.kubiak(a)intel.com>
ice: fix incorrect counter for buffer allocation failures
Maciej Fijalkowski <maciej.fijalkowski(a)intel.com>
ice: stop storing XDP verdict within ice_rx_buf
Maciej Fijalkowski <maciej.fijalkowski(a)intel.com>
ice: gather page_count()'s of each frag right before XDP prog call
Larysa Zaremba <larysa.zaremba(a)intel.com>
ice: Introduce ice_xdp_buff
Timur Tabi <ttabi(a)nvidia.com>
drm/nouveau: remove unused memory target test
Timur Tabi <ttabi(a)nvidia.com>
drm/nouveau: remove unused increment in gm200_flcn_pio_imem_wr
Kuniyuki Iwashima <kuniyu(a)google.com>
atm: atmtcp: Prevent arbitrary write in atmtcp_recv_control().
Pavel Shpakovskiy <pashpakovskii(a)salutedevices.com>
Bluetooth: hci_sync: fix set_local_name race condition
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_event: Detect if HCI_EV_NUM_COMP_PKTS is unbalanced
Ludovico de Nittis <ludovico.denittis(a)collabora.com>
Bluetooth: hci_event: Mark connection as closed during suspend disconnect
Ludovico de Nittis <ludovico.denittis(a)collabora.com>
Bluetooth: hci_event: Treat UNKNOWN_CONN_ID on disconnect as success
José Expósito <jose.exposito89(a)gmail.com>
HID: input: report battery status changes immediately
José Expósito <jose.exposito89(a)gmail.com>
HID: input: rename hidinput_set_battery_charge_status()
Madhavan Srinivasan <maddy(a)linux.ibm.com>
powerpc/kvm: Fix ifdef to remove build warning
Rob Clark <robin.clark(a)oss.qualcomm.com>
drm/msm: Defer fd_install in SUBMIT ioctl
Oscar Maes <oscmaes92(a)gmail.com>
net: ipv4: fix regression in local-broadcast routes
Nikolay Kuratov <kniv(a)yandex-team.ru>
vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFS: Fix a race when updating an existing write
Christoph Hellwig <hch(a)lst.de>
nfs: fold nfs_page_group_lock_subrequests into nfs_lock_and_join_requests
Werner Sembach <wse(a)tuxedocomputers.com>
ACPI: EC: Add device to acpi_ec_no_wakeup[] quirk list
Junli Liu <liujunli(a)lixiang.com>
erofs: fix atomic context detection when !CONFIG_DEBUG_LOCK_ALLOC
Alexey Klimov <alexey.klimov(a)linaro.org>
ASoC: codecs: tx-macro: correct tx_macro_component_drv name
Paulo Alcantara <pc(a)manguebit.org>
smb: client: fix race with concurrent opens in rename(2)
Paulo Alcantara <pc(a)manguebit.org>
smb: client: fix race with concurrent opens in unlink(2)
Damien Le Moal <dlemoal(a)kernel.org>
scsi: core: sysfs: Correct sysfs attributes access rights
Tengda Wu <wutengda(a)huaweicloud.com>
ftrace: Fix potential warning in trace_printk_seq during ftrace_dump
Dan Carpenter <dan.carpenter(a)linaro.org>
of: dynamic: Fix use after free in of_changeset_add_prop_helper()
Rob Herring <robh(a)kernel.org>
of: Add a helper to free property struct
Aleksander Jan Bajkowski <olek2(a)wp.pl>
mips: lantiq: xway: sysctrl: rename the etop node
Aleksander Jan Bajkowski <olek2(a)wp.pl>
mips: dts: lantiq: danube: add missing burst length property
Randy Dunlap <rdunlap(a)infradead.org>
pinctrl: STMFX: add missing HAS_IOMEM dependency
Lizhi Hou <lizhi.hou(a)amd.com>
of: dynamic: Fix memleak when of_pci_add_properties() failed
-------------
Diffstat:
.../devicetree/bindings/display/msm/qcom,mdp5.yaml | 1 -
Makefile | 4 +-
arch/mips/boot/dts/lantiq/danube_easy50712.dts | 5 +-
arch/mips/lantiq/xway/sysctrl.c | 10 +-
arch/powerpc/kernel/kvm.c | 8 +-
arch/x86/kernel/cpu/microcode/amd.c | 22 ++-
arch/x86/kvm/lapic.c | 2 +
arch/x86/kvm/x86.c | 7 +-
drivers/acpi/ec.c | 6 +
drivers/atm/atmtcp.c | 17 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 +-
drivers/gpu/drm/display/drm_dp_helper.c | 2 +-
drivers/gpu/drm/msm/msm_gem_submit.c | 14 +-
drivers/gpu/drm/nouveau/dispnv50/wndw.c | 4 +
drivers/gpu/drm/nouveau/nvkm/falcon/gm200.c | 15 +-
drivers/hid/hid-asus.c | 8 +-
drivers/hid/hid-ids.h | 3 +
drivers/hid/hid-input-test.c | 10 +-
drivers/hid/hid-input.c | 51 +++---
drivers/hid/hid-logitech-dj.c | 4 +
drivers/hid/hid-logitech-hidpp.c | 2 +
drivers/hid/hid-mcp2221.c | 71 +++++---
drivers/hid/hid-multitouch.c | 8 +
drivers/hid/hid-ntrig.c | 3 +
drivers/hid/hid-quirks.c | 2 +
drivers/hid/wacom_wac.c | 1 +
drivers/net/ethernet/dlink/dl2k.c | 2 +-
drivers/net/ethernet/intel/ice/ice_txrx.c | 94 +++++++---
drivers/net/ethernet/intel/ice/ice_txrx.h | 19 +-
drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 53 ++----
drivers/net/ethernet/mellanox/mlx5/core/devlink.c | 2 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.c | 3 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.h | 12 ++
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 19 +-
drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 200 ++++++++++++++-------
drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h | 1 +
drivers/net/ethernet/mellanox/mlx5/core/main.c | 3 +
.../net/ethernet/mellanox/mlx5/core/sf/devlink.c | 87 ++++-----
drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h | 6 +
drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 8 +-
.../net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 13 +-
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 9 +-
drivers/net/ethernet/stmicro/stmmac/hwif.h | 8 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 12 +-
drivers/net/phy/mscc/mscc.h | 4 +
drivers/net/phy/mscc/mscc_main.c | 4 +-
drivers/net/phy/mscc/mscc_ptp.c | 34 ++--
drivers/net/usb/qmi_wwan.c | 3 +
drivers/of/dynamic.c | 29 +--
drivers/of/of_private.h | 1 +
drivers/of/overlay.c | 11 +-
drivers/of/unittest.c | 12 +-
drivers/pinctrl/Kconfig | 1 +
drivers/scsi/scsi_sysfs.c | 4 +-
drivers/vhost/net.c | 9 +-
fs/efivarfs/super.c | 4 +
fs/erofs/zdata.c | 13 +-
fs/nfs/pagelist.c | 86 +--------
fs/nfs/write.c | 148 +++++++++------
fs/smb/client/cifsfs.c | 14 ++
fs/smb/client/inode.c | 34 +++-
fs/smb/client/smb2inode.c | 7 +-
fs/xfs/libxfs/xfs_attr_remote.c | 7 +
fs/xfs/libxfs/xfs_da_btree.c | 6 +
include/linux/atmdev.h | 1 +
include/linux/mlx5/mlx5_ifc.h | 11 +-
include/linux/nfs_page.h | 2 +-
include/net/bluetooth/hci_sync.h | 2 +-
include/net/rose.h | 18 +-
kernel/dma/pool.c | 4 +-
kernel/trace/trace.c | 4 +-
net/atm/common.c | 15 +-
net/bluetooth/hci_event.c | 20 ++-
net/bluetooth/hci_sync.c | 6 +-
net/bluetooth/mgmt.c | 5 +-
net/ipv4/route.c | 10 +-
net/rose/af_rose.c | 13 +-
net/rose/rose_in.c | 12 +-
net/rose/rose_route.c | 62 ++++---
net/rose/rose_timer.c | 2 +-
net/sctp/ipv6.c | 2 +
sound/soc/codecs/lpass-tx-macro.c | 2 +-
82 files changed, 897 insertions(+), 560 deletions(-)
This is the start of the stable review cycle for the 5.15.191 release.
There are 33 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 04 Sep 2025 13:19:14 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.191-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.191-rc1
Eric Sandeen <sandeen(a)redhat.com>
xfs: do not propagate ENODATA disk errors into xattr code
Imre Deak <imre.deak(a)intel.com>
Revert "drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Handle reads greater than 60 bytes
Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
HID: mcp2221: Don't set bus speed on every transfer
James Jones <jajones(a)nvidia.com>
drm/nouveau/disp: Always accept linear modifier
Fabio Porcedda <fabio.porcedda(a)gmail.com>
net: usb: qmi_wwan: add Telit Cinterion LE910C4-WWX new compositions
Shanker Donthineni <sdonthineni(a)nvidia.com>
dma/pool: Ensure DMA_DIRECT_REMAP allocations are decrypted
Alex Deucher <alexander.deucher(a)amd.com>
Revert "drm/amdgpu: fix incorrect vm flags to map bo"
Minjong Kim <minbell.kim(a)samsung.com>
HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()
Ping Cheng <pinglinux(a)gmail.com>
HID: wacom: Add a new Art Pen 2
Qasim Ijaz <qasdev00(a)gmail.com>
HID: multitouch: fix slab out-of-bounds access in mt_report_fixup()
Qasim Ijaz <qasdev00(a)gmail.com>
HID: asus: fix UAF via HID_CLAIMED_INPUT validation
Thijs Raymakers <thijs(a)raymakers.nl>
KVM: x86: use array_index_nospec with indices that come from guest
Li Nan <linan122(a)huawei.com>
efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare
Eric Dumazet <edumazet(a)google.com>
sctp: initialize more fields in sctp_v6_from_sk()
Rohan G Thomas <rohan.g.thomas(a)altera.com>
net: stmmac: xgmac: Do not enable RX FIFO Overflow interrupts
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Set local Xoff after FW update
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon port speed set
Alexei Lazar <alazar(a)nvidia.com>
net/mlx5e: Update and set Xon/Xoff upon MTU set
Horatiu Vultur <horatiu.vultur(a)microchip.com>
phy: mscc: Fix when PTP clock is register and unregister
Yeounsu Moon <yyyynoom(a)gmail.com>
net: dlink: fix multicast stats being counted incorrectly
Kuniyuki Iwashima <kuniyu(a)google.com>
atm: atmtcp: Prevent arbitrary write in atmtcp_recv_control().
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_event: Detect if HCI_EV_NUM_COMP_PKTS is unbalanced
Madhavan Srinivasan <maddy(a)linux.ibm.com>
powerpc/kvm: Fix ifdef to remove build warning
Oscar Maes <oscmaes92(a)gmail.com>
net: ipv4: fix regression in local-broadcast routes
Jan Kara <jack(a)suse.cz>
udf: Fix directory iteration for longer tail extents
Nikolay Kuratov <kniv(a)yandex-team.ru>
vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFS: Fix a race when updating an existing write
Christoph Hellwig <hch(a)lst.de>
nfs: fold nfs_page_group_lock_subrequests into nfs_lock_and_join_requests
Alexey Klimov <alexey.klimov(a)linaro.org>
ASoC: codecs: tx-macro: correct tx_macro_component_drv name
Damien Le Moal <dlemoal(a)kernel.org>
scsi: core: sysfs: Correct sysfs attributes access rights
Tengda Wu <wutengda(a)huaweicloud.com>
ftrace: Fix potential warning in trace_printk_seq during ftrace_dump
Randy Dunlap <rdunlap(a)infradead.org>
pinctrl: STMFX: add missing HAS_IOMEM dependency
-------------
Diffstat:
Makefile | 4 +-
arch/powerpc/kernel/kvm.c | 8 +-
arch/x86/kvm/lapic.c | 2 +
arch/x86/kvm/x86.c | 7 +-
drivers/atm/atmtcp.c | 17 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 +-
drivers/gpu/drm/drm_dp_helper.c | 2 +-
drivers/gpu/drm/nouveau/dispnv50/wndw.c | 4 +
drivers/hid/hid-asus.c | 8 +-
drivers/hid/hid-mcp2221.c | 71 +++++++----
drivers/hid/hid-multitouch.c | 8 ++
drivers/hid/hid-ntrig.c | 3 +
drivers/hid/wacom_wac.c | 1 +
drivers/net/ethernet/dlink/dl2k.c | 2 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.c | 3 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.h | 12 ++
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 19 ++-
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 4 -
drivers/net/phy/mscc/mscc.h | 4 +
drivers/net/phy/mscc/mscc_main.c | 4 +-
drivers/net/phy/mscc/mscc_ptp.c | 34 +++--
drivers/net/usb/qmi_wwan.c | 3 +
drivers/pinctrl/Kconfig | 1 +
drivers/scsi/scsi_sysfs.c | 4 +-
drivers/vhost/net.c | 9 +-
fs/efivarfs/super.c | 4 +
fs/nfs/pagelist.c | 86 +------------
fs/nfs/write.c | 142 +++++++++++++--------
fs/udf/directory.c | 2 +-
fs/xfs/libxfs/xfs_attr_remote.c | 7 +
fs/xfs/libxfs/xfs_da_btree.c | 6 +
include/linux/atmdev.h | 1 +
include/linux/nfs_page.h | 2 +-
kernel/dma/pool.c | 4 +-
kernel/trace/trace.c | 4 +-
net/atm/common.c | 15 ++-
net/bluetooth/hci_event.c | 12 +-
net/ipv4/route.c | 10 +-
net/sctp/ipv6.c | 2 +
sound/soc/codecs/lpass-tx-macro.c | 2 +-
40 files changed, 328 insertions(+), 209 deletions(-)
If KASAN is enabled, and one runs in a clean repository e.g.:
make LLVM=1 prepare
make LLVM=1 prepare
Then the Rust code gets rebuilt, which should not happen.
The reason is some of the LLVM KASAN `rustc` flags are added in the
second run:
-Cllvm-args=-asan-instrumentation-with-call-threshold=10000
-Cllvm-args=-asan-stack=0
-Cllvm-args=-asan-globals=1
-Cllvm-args=-asan-kernel-mem-intrinsic-prefix=1
Further runs do not rebuild Rust because the flags do not change anymore.
Rebuilding like that in the second run is bad, even if this just happens
with KASAN enabled, but missing flags in the first one is even worse.
The root issue is that we pass, for some architectures and for the moment,
a generated `target.json` file. That file is not ready by the time `rustc`
gets called for the flag test, and thus the flag test fails just because
the file is not available, e.g.:
$ ... --target=./scripts/target.json ... -Cllvm-args=...
error: target file "./scripts/target.json" does not exist
There are a few approaches we could take here to solve this. For instance,
we could ensure that every time that the config is rebuilt, we regenerate
the file and recompute the flags. Or we could use the LLVM version to
check for these flags, instead of testing the flag (which may have other
advantages, such as allowing us to detect renames on the LLVM side).
However, it may be easier than that: `rustc` is aware of the `-Cllvm-args`
regardless of the `--target` (e.g. I checked that the list printed
is the same, plus that I can check for these flags even if I pass
a completely unrelated target), and thus we can just eliminate the
dependency completely.
Thus filter out the target.
This does mean that `rustc-option` cannot be used to test a flag that
requires the right target, but we don't have other users yet, it is a
minimal change and we want to get rid of custom targets in the future.
We could only filter in the case `target.json` is used, to make it work
in more cases, but then it would be harder to notice that it may not
work in a couple architectures.
Cc: Matthew Maurer <mmaurer(a)google.com>
Cc: Sami Tolvanen <samitolvanen(a)google.com>
Cc: stable(a)vger.kernel.org
Fixes: e3117404b411 ("kbuild: rust: Enable KASAN support")
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
---
By the way, I noticed that we are not getting `asan-instrument-allocas` enabled
in either C or Rust -- upstream LLVM renamed it in commit 8176ee9b5dda ("[asan]
Rename asan-instrument-allocas -> asan-instrument-dynamic-allocas"). But that
happened a very long time ago (9 years ago), and the addition in the kernel
is fairly old too, in 342061ee4ef3 ("kasan: support alloca() poisoning").
I assume it should either be renamed or removed? Happy to send a patch if so.
scripts/Makefile.compiler | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/Makefile.compiler b/scripts/Makefile.compiler
index 8956587b8547..7ed7f92a7daa 100644
--- a/scripts/Makefile.compiler
+++ b/scripts/Makefile.compiler
@@ -80,7 +80,7 @@ ld-option = $(call try-run, $(LD) $(KBUILD_LDFLAGS) $(1) -v,$(1),$(2),$(3))
# TODO: remove RUSTC_BOOTSTRAP=1 when we raise the minimum GNU Make version to 4.4
__rustc-option = $(call try-run,\
echo '#![allow(missing_docs)]#![feature(no_core)]#![no_core]' | RUSTC_BOOTSTRAP=1\
- $(1) --sysroot=/dev/null $(filter-out --sysroot=/dev/null,$(2)) $(3)\
+ $(1) --sysroot=/dev/null $(filter-out --sysroot=/dev/null --target=%,$(2)) $(3)\
--crate-type=rlib --out-dir=$(TMPOUT) --emit=obj=- - >/dev/null,$(3),$(4))
# rustc-option
base-commit: 0af2f6be1b4281385b618cb86ad946eded089ac8
--
2.49.0
When a CPU calls push_dl_task and picks a task to push to another
CPU's runqueue, it will call the find_lock_later_rq() method, which
takes a double lock on both CPUs' runqueues. If one of the locks isn't
readily available, this may lead to dropping the current runqueue lock
and reacquiring both locks at once. During this window it is possible
that the task has already been migrated and is running on some other
CPU. These cases are already handled. However, if the task has been
migrated, has already been executed, and another CPU is now trying to
wake it up (ttwu) such that it is queued again on the runqueue
(on_rq is 1), and the task was run by the same CPU, then the current
checks will pass even though the task was migrated out and is no
longer in the pushable tasks list.
Please go through the original rt change for more details on the issue.
To fix this, after the lock is obtained inside find_lock_later_rq, the
fix ensures that the task is still at the head of the pushable tasks
list. Some checks that are no longer needed with the addition of this
new check are also removed.
However, the new check of pushable tasks list only applies when
find_lock_later_rq is called by push_dl_task. For the other caller i.e.
dl_task_offline_migration, existing checks are used.
Signed-off-by: Harshit Agarwal <harshit(a)nutanix.com>
Cc: stable(a)vger.kernel.org
---
Changes in v3:
- Incorporated review comments from Juri around the commit message as
well as around the comment regarding checks in find_lock_later_rq.
- Link to v2:
https://lore.kernel.org/stable/20250317022325.52791-1-harshit@nutanix.com/
Changes in v2:
- As per Juri's suggestion, moved the check inside find_lock_later_rq
similar to rt change. Here we distinguish among the push_dl_task
caller vs dl_task_offline_migration by checking if the task is
throttled or not.
- Fixed the commit message to refer to the rt change by title.
- Link to v1:
https://lore.kernel.org/lkml/20250307204255.60640-1-harshit@nutanix.com/
---
kernel/sched/deadline.c | 73 +++++++++++++++++++++++++++--------------
1 file changed, 49 insertions(+), 24 deletions(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 38e4537790af..e0c95f33e1ed 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2621,6 +2621,25 @@ static int find_later_rq(struct task_struct *task)
return -1;
}
+static struct task_struct *pick_next_pushable_dl_task(struct rq *rq)
+{
+ struct task_struct *p;
+
+ if (!has_pushable_dl_tasks(rq))
+ return NULL;
+
+ p = __node_2_pdl(rb_first_cached(&rq->dl.pushable_dl_tasks_root));
+
+ WARN_ON_ONCE(rq->cpu != task_cpu(p));
+ WARN_ON_ONCE(task_current(rq, p));
+ WARN_ON_ONCE(p->nr_cpus_allowed <= 1);
+
+ WARN_ON_ONCE(!task_on_rq_queued(p));
+ WARN_ON_ONCE(!dl_task(p));
+
+ return p;
+}
+
/* Locks the rq it finds */
static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
{
@@ -2648,12 +2667,37 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
/* Retry if something changed. */
if (double_lock_balance(rq, later_rq)) {
- if (unlikely(task_rq(task) != rq ||
+ /*
+ * double_lock_balance had to release rq->lock, in the
+ * meantime, task may no longer be fit to be migrated.
+ * Check the following to ensure that the task is
+ * still suitable for migration:
+ * 1. It is possible the task was scheduled,
+ * migrate_disabled was set and then got preempted,
+ * so we must check the task migration disable
+ * flag.
+ * 2. The CPU picked is in the task's affinity.
+ * 3. For throttled task (dl_task_offline_migration),
+ * check the following:
+ * - the task is not on the rq anymore (it was
+ * migrated)
+ * - the task is not on CPU anymore
+ * - the task is still a dl task
+ * - the task is not queued on the rq anymore
+ * 4. For the non-throttled task (push_dl_task), the
+ * check to ensure that this task is still at the
+ * head of the pushable tasks list is enough.
+ */
+ if (unlikely(is_migration_disabled(task) ||
!cpumask_test_cpu(later_rq->cpu, &task->cpus_mask) ||
- task_on_cpu(rq, task) ||
- !dl_task(task) ||
- is_migration_disabled(task) ||
- !task_on_rq_queued(task))) {
+ (task->dl.dl_throttled &&
+ (task_rq(task) != rq ||
+ task_on_cpu(rq, task) ||
+ !dl_task(task) ||
+ !task_on_rq_queued(task))) ||
+ (!task->dl.dl_throttled &&
+ task != pick_next_pushable_dl_task(rq)))) {
+
double_unlock_balance(rq, later_rq);
later_rq = NULL;
break;
@@ -2676,25 +2720,6 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
return later_rq;
}
-static struct task_struct *pick_next_pushable_dl_task(struct rq *rq)
-{
- struct task_struct *p;
-
- if (!has_pushable_dl_tasks(rq))
- return NULL;
-
- p = __node_2_pdl(rb_first_cached(&rq->dl.pushable_dl_tasks_root));
-
- WARN_ON_ONCE(rq->cpu != task_cpu(p));
- WARN_ON_ONCE(task_current(rq, p));
- WARN_ON_ONCE(p->nr_cpus_allowed <= 1);
-
- WARN_ON_ONCE(!task_on_rq_queued(p));
- WARN_ON_ONCE(!dl_task(p));
-
- return p;
-}
-
/*
* See if the non running -deadline tasks on this rq
* can be sent to some other CPU where they can preempt
--
2.49.0.111.g5b97a56fa0
From: Youling Tang <tangyouling(a)kylinos.cn>
Automatically disable kaslr when the kernel loads from kexec_file.
kexec_file loads the secondary kernel image to a non-linked address,
inherently providing KASLR-like randomization.
However, on LoongArch where System RAM may be non-contiguous, enabling
KASLR for the second kernel could relocate it to an invalid memory
region and cause boot failure. Thus, we disable KASLR when
"kexec_file" is detected in the command line.
To ensure compatibility with older kernels loaded via kexec_file,
this patch needs to be backported to stable branches.
Cc: stable(a)vger.kernel.org
Signed-off-by: Youling Tang <tangyouling(a)kylinos.cn>
---
arch/loongarch/kernel/relocate.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/loongarch/kernel/relocate.c b/arch/loongarch/kernel/relocate.c
index 50c469067f3a..4c097532cb88 100644
--- a/arch/loongarch/kernel/relocate.c
+++ b/arch/loongarch/kernel/relocate.c
@@ -140,6 +140,10 @@ static inline __init bool kaslr_disabled(void)
if (str == boot_command_line || (str > boot_command_line && *(str - 1) == ' '))
return true;
+ str = strstr(boot_command_line, "kexec_file");
+ if (str == boot_command_line || (str > boot_command_line && *(str - 1) == ' '))
+ return true;
+
#ifdef CONFIG_HIBERNATION
str = strstr(builtin_cmdline, "nohibernate");
if (str == builtin_cmdline || (str > builtin_cmdline && *(str - 1) == ' '))
--
2.43.0
When SRIOV is enabled, RSS is also supported for the functions. There is
an incorrect flag check, which causes RSS to fail to be enabled.
Fixes: c52d4b898901 ("net: libwx: Redesign flow when sriov is enabled")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu(a)trustnetic.com>
---
drivers/net/ethernet/wangxun/libwx/wx_hw.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_hw.c b/drivers/net/ethernet/wangxun/libwx/wx_hw.c
index bcd07a715752..5cb353a97d6d 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_hw.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_hw.c
@@ -2078,10 +2078,6 @@ static void wx_setup_mrqc(struct wx *wx)
{
u32 rss_field = 0;
- /* VT, and RSS do not coexist at the same time */
- if (test_bit(WX_FLAG_VMDQ_ENABLED, wx->flags))
- return;
-
/* Disable indicating checksum in descriptor, enables RSS hash */
wr32m(wx, WX_PSR_CTL, WX_PSR_CTL_PCSD, WX_PSR_CTL_PCSD);
--
2.48.1
Fix multiple fwnode reference leaks:
1. The function calls fwnode_get_named_child_node() to get the "leds" node,
but never calls fwnode_handle_put(leds) to release this reference.
2. Within the fwnode_for_each_child_node() loop, the early return
paths don't release the "led" fwnode reference.
This fix follows the same pattern as commit d029edefed39
("net dsa: qca8k: fix usages of device_get_named_child_node()")
Fixes: 94a2a84f5e9e ("net: dsa: mv88e6xxx: Support LED control")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
changes in v2:
- use goto for cleanup in error paths
- v1: https://lore.kernel.org/all/20250830085508.2107507-1-linmq006@gmail.com/
---
drivers/net/dsa/mv88e6xxx/leds.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx/leds.c b/drivers/net/dsa/mv88e6xxx/leds.c
index 1c88bfaea46b..ab3bc645da56 100644
--- a/drivers/net/dsa/mv88e6xxx/leds.c
+++ b/drivers/net/dsa/mv88e6xxx/leds.c
@@ -779,7 +779,8 @@ int mv88e6xxx_port_setup_leds(struct mv88e6xxx_chip *chip, int port)
continue;
if (led_num > 1) {
dev_err(dev, "invalid LED specified port %d\n", port);
- return -EINVAL;
+ ret = -EINVAL;
+ goto err_put_led;
}
if (led_num == 0)
@@ -823,17 +824,25 @@ int mv88e6xxx_port_setup_leds(struct mv88e6xxx_chip *chip, int port)
init_data.devname_mandatory = true;
init_data.devicename = kasprintf(GFP_KERNEL, "%s:0%d:0%d", chip->info->name,
port, led_num);
- if (!init_data.devicename)
- return -ENOMEM;
+ if (!init_data.devicename) {
+ ret = -ENOMEM;
+ goto err_put_led;
+ }
ret = devm_led_classdev_register_ext(dev, l, &init_data);
kfree(init_data.devicename);
if (ret) {
dev_err(dev, "Failed to init LED %d for port %d", led_num, port);
- return ret;
+ goto err_put_led;
}
}
+ fwnode_handle_put(leds);
return 0;
+
+err_put_led:
+ fwnode_handle_put(led);
+ fwnode_handle_put(leds);
+ return ret;
}
--
2.35.1
commit efa95b01da18 ("netpoll: fix use after free") incorrectly
ignored the refcount and prematurely set dev->npinfo to NULL during
netpoll cleanup, leading to improper behavior and memory leaks.
Scenario causing lack of proper cleanup:
1) A netpoll is associated with a NIC (e.g., eth0) and netdev->npinfo is
allocated, and refcnt = 1
- Keep in mind that npinfo is shared among all netpoll instances. In
this case, there is just one.
2) Another netpoll is also associated with the same NIC and
npinfo->refcnt += 1.
- Now dev->npinfo->refcnt = 2;
- There is just one npinfo associated to the netdev.
3) When the first netpoll goes to clean up:
- The first cleanup succeeds and clears np->dev->npinfo, ignoring the
refcnt.
- It basically calls `RCU_INIT_POINTER(np->dev->npinfo, NULL);`
- dev->npinfo is set to NULL without proper cleanup
- ->ndo_netpoll_cleanup() is not called either
4) Now the second target tries to clean up
- The second cleanup fails because np->dev->npinfo is already NULL.
* In this case, ops->ndo_netpoll_cleanup() was never called, and
the skb pool is not cleaned up either (for the second netpoll
instance)
- This leaks npinfo and skbpool skbs, which is clearly reported by
kmemleak.
Revert commit efa95b01da18 ("netpoll: fix use after free") and add
clarifying comments emphasizing that npinfo cleanup should only happen
once the refcount reaches zero, ensuring stable and correct netpoll
behavior.
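For illustration, a minimal userspace sketch of the intended semantics
(names and types are invented, this is not the kernel code): the shared
npinfo must only be torn down, and the device pointer cleared, when the
last netpoll instance drops its reference.
#include <stdio.h>
#include <stdlib.h>
struct npinfo { int refcnt; };
static struct npinfo *shared;	/* stands in for dev->npinfo */
static void netpoll_setup_sketch(void)
{
	if (!shared) {
		shared = calloc(1, sizeof(*shared));
		shared->refcnt = 1;
	} else {
		shared->refcnt++;
	}
}
static void netpoll_cleanup_sketch(void)
{
	/* Tear down only when the refcount drops to zero. */
	if (--shared->refcnt == 0) {
		free(shared);
		shared = NULL;	/* only now may dev->npinfo be cleared */
	}
}
int main(void)
{
	netpoll_setup_sketch();		/* first instance,  refcnt = 1 */
	netpoll_setup_sketch();		/* second instance, refcnt = 2 */
	netpoll_cleanup_sketch();	/* refcnt = 1, shared stays valid */
	netpoll_cleanup_sketch();	/* refcnt = 0, freed and cleared  */
	printf("shared=%p\n", (void *)shared);
	return 0;
}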
Cc: stable(a)vger.kernel.org
Cc: jv(a)jvosburgh.net
Fixes: efa95b01da18 ("netpoll: fix use after free")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
I have a selftest that shows the memory leak when kmemleak is enabled
and I will be submitting to net-next.
Also, given that I am reverting commit efa95b01da18 ("netpoll: fix use
after free"), which was supposed to fix a problem in bonding, I am
copying Jay.
---
net/core/netpoll.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 5f65b62346d4e..19676cd379640 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -815,6 +815,10 @@ static void __netpoll_cleanup(struct netpoll *np)
if (!npinfo)
return;
+ /* At this point, there is a single npinfo instance per netdevice, and
+ * its refcnt tracks how many netpoll structures are linked to it. We
+ * only perform npinfo cleanup when the refcnt decrements to zero.
+ */
if (refcount_dec_and_test(&npinfo->refcnt)) {
const struct net_device_ops *ops;
@@ -824,8 +828,7 @@ static void __netpoll_cleanup(struct netpoll *np)
RCU_INIT_POINTER(np->dev->npinfo, NULL);
call_rcu(&npinfo->rcu, rcu_cleanup_netpoll_info);
- } else
- RCU_INIT_POINTER(np->dev->npinfo, NULL);
+ }
skb_pool_flush(np);
}
---
base-commit: 864ecc4a6dade82d3f70eab43dad0e277aa6fc78
change-id: 20250901-netpoll_memleak-90d0d4bc772c
Best regards,
--
Breno Leitao <leitao(a)debian.org>
The patch titled
Subject: compiler-clang.h: define __SANITIZE_*__ macros only when undefined
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
compiler-clangh-define-__sanitize___-macros-only-when-undefined.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Nathan Chancellor <nathan(a)kernel.org>
Subject: compiler-clang.h: define __SANITIZE_*__ macros only when undefined
Date: Tue, 02 Sep 2025 15:49:26 -0700
Clang 22 recently added support for defining __SANITIZE_*__ macros similar
to GCC [1], which causes warnings (or errors with CONFIG_WERROR=y or W=e)
with the existing defines that the kernel creates to emulate this behavior
with existing clang versions.
In file included from <built-in>:3:
In file included from include/linux/compiler_types.h:171:
include/linux/compiler-clang.h:37:9: error: '__SANITIZE_THREAD__' macro redefined [-Werror,-Wmacro-redefined]
37 | #define __SANITIZE_THREAD__
| ^
<built-in>:352:9: note: previous definition is here
352 | #define __SANITIZE_THREAD__ 1
| ^
Refactor compiler-clang.h to only define the sanitizer macros when they
are undefined and adjust the rest of the code to use these macros for
checking if the sanitizers are enabled, clearing up the warnings and
allowing the kernel to easily drop these defines when the minimum
supported version of LLVM for building the kernel becomes 22.0.0 or newer.
Link: https://lkml.kernel.org/r/20250902-clang-update-sanitize-defines-v1-1-cf370…
Link: https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444da… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
Reviewed-by: Justin Stitt <justinstitt(a)google.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Bill Wendling <morbo(a)google.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/compiler-clang.h | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)
--- a/include/linux/compiler-clang.h~compiler-clangh-define-__sanitize___-macros-only-when-undefined
+++ a/include/linux/compiler-clang.h
@@ -18,23 +18,42 @@
#define KASAN_ABI_VERSION 5
/*
+ * Clang 22 added preprocessor macros to match GCC, in hopes of eventually
+ * dropping __has_feature support for sanitizers:
+ * https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444da…
+ * Create these macros for older versions of clang so that it is easy to clean
+ * up once the minimum supported version of LLVM for building the kernel always
+ * creates these macros.
+ *
* Note: Checking __has_feature(*_sanitizer) is only true if the feature is
* enabled. Therefore it is not required to additionally check defined(CONFIG_*)
* to avoid adding redundant attributes in other configurations.
*/
+#if __has_feature(address_sanitizer) && !defined(__SANITIZE_ADDRESS__)
+#define __SANITIZE_ADDRESS__
+#endif
+#if __has_feature(hwaddress_sanitizer) && !defined(__SANITIZE_HWADDRESS__)
+#define __SANITIZE_HWADDRESS__
+#endif
+#if __has_feature(thread_sanitizer) && !defined(__SANITIZE_THREAD__)
+#define __SANITIZE_THREAD__
+#endif
-#if __has_feature(address_sanitizer) || __has_feature(hwaddress_sanitizer)
-/* Emulate GCC's __SANITIZE_ADDRESS__ flag */
+/*
+ * Treat __SANITIZE_HWADDRESS__ the same as __SANITIZE_ADDRESS__ in the kernel.
+ */
+#ifdef __SANITIZE_HWADDRESS__
#define __SANITIZE_ADDRESS__
+#endif
+
+#ifdef __SANITIZE_ADDRESS__
#define __no_sanitize_address \
__attribute__((no_sanitize("address", "hwaddress")))
#else
#define __no_sanitize_address
#endif
-#if __has_feature(thread_sanitizer)
-/* emulate gcc's __SANITIZE_THREAD__ flag */
-#define __SANITIZE_THREAD__
+#ifdef __SANITIZE_THREAD__
#define __no_sanitize_thread \
__attribute__((no_sanitize("thread")))
#else
_
Patches currently in -mm which might be from nathan(a)kernel.org are
compiler-clangh-define-__sanitize___-macros-only-when-undefined.patch
mm-rmap-convert-enum-rmap_level-to-enum-pgtable_level-fix.patch
Clang 22 recently added support for defining __SANITIZE_*__ macros similar
to GCC [1], which causes warnings (or errors with CONFIG_WERROR=y or
W=e) with the existing defines that the kernel creates to emulate this
behavior with existing clang versions.
In file included from <built-in>:3:
In file included from include/linux/compiler_types.h:171:
include/linux/compiler-clang.h:37:9: error: '__SANITIZE_THREAD__' macro redefined [-Werror,-Wmacro-redefined]
37 | #define __SANITIZE_THREAD__
| ^
<built-in>:352:9: note: previous definition is here
352 | #define __SANITIZE_THREAD__ 1
| ^
Refactor compiler-clang.h to only define the sanitizer macros when they
are undefined and adjust the rest of the code to use these macros for
checking if the sanitizers are enabled, clearing up the warnings and
allowing the kernel to easily drop these defines when the minimum
supported version of LLVM for building the kernel becomes 22.0.0 or
newer.
Cc: stable(a)vger.kernel.org
Link: https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444da… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
Andrew, would it be possible to take this via mm-hotfixes?
---
include/linux/compiler-clang.h | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index fa4ffe037bc7..8720a0705900 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -18,23 +18,42 @@
#define KASAN_ABI_VERSION 5
/*
+ * Clang 22 added preprocessor macros to match GCC, in hopes of eventually
+ * dropping __has_feature support for sanitizers:
+ * https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444da…
+ * Create these macros for older versions of clang so that it is easy to clean
+ * up once the minimum supported version of LLVM for building the kernel always
+ * creates these macros.
+ *
* Note: Checking __has_feature(*_sanitizer) is only true if the feature is
* enabled. Therefore it is not required to additionally check defined(CONFIG_*)
* to avoid adding redundant attributes in other configurations.
*/
+#if __has_feature(address_sanitizer) && !defined(__SANITIZE_ADDRESS__)
+#define __SANITIZE_ADDRESS__
+#endif
+#if __has_feature(hwaddress_sanitizer) && !defined(__SANITIZE_HWADDRESS__)
+#define __SANITIZE_HWADDRESS__
+#endif
+#if __has_feature(thread_sanitizer) && !defined(__SANITIZE_THREAD__)
+#define __SANITIZE_THREAD__
+#endif
-#if __has_feature(address_sanitizer) || __has_feature(hwaddress_sanitizer)
-/* Emulate GCC's __SANITIZE_ADDRESS__ flag */
+/*
+ * Treat __SANITIZE_HWADDRESS__ the same as __SANITIZE_ADDRESS__ in the kernel.
+ */
+#ifdef __SANITIZE_HWADDRESS__
#define __SANITIZE_ADDRESS__
+#endif
+
+#ifdef __SANITIZE_ADDRESS__
#define __no_sanitize_address \
__attribute__((no_sanitize("address", "hwaddress")))
#else
#define __no_sanitize_address
#endif
-#if __has_feature(thread_sanitizer)
-/* emulate gcc's __SANITIZE_THREAD__ flag */
-#define __SANITIZE_THREAD__
+#ifdef __SANITIZE_THREAD__
#define __no_sanitize_thread \
__attribute__((no_sanitize("thread")))
#else
---
base-commit: b320789d6883cc00ac78ce83bccbfe7ed58afcf0
change-id: 20250902-clang-update-sanitize-defines-845000c29d2c
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
Hi all,
A few months ago, the multi-fsblock untorn writes patchset added a bunch
of log intent item helper functions to estimate the number of intent
items that could be added to a particular transaction. Those helpers
enabled us to compute a safe upper bound on the number of blocks that
could be written in an untorn fashion with filesystem-provided out of
place writes.
Currently, the online fsck code employs static limits on the number of
intent items that it's willing to accrue to a single transaction when
it's trying to reap what it thinks are the old blocks from a corrupt
structure. There have been no problems reported with this approach
after years of testing, but static limits are scary and gross because
overestimating the intent item limit could result in transaction
overflows and dead filesystems; and underestimating causes unnecessary
overhead.
This series uses the new log intent item size helpers to estimate the
limits dynamically based on worst-case per-block repair work vs. the
size of the scrub transaction. After several months of testing this,
there don't seem to be any problems here either.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
This has been running on the djcloud for months with no problems. Enjoy!
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fi…
---
Commits in this patchset:
* xfs: prepare reaping code for dynamic limits
* xfs: convert the ifork reap code to use xreap_state
* xfs: use deferred intent items for reaping crosslinked blocks
* xfs: compute per-AG extent reap limits dynamically
* xfs: compute data device CoW staging extent reap limits dynamically
* xfs: compute realtime device CoW staging extent reap limits dynamically
* xfs: compute file mapping reap limits dynamically
* xfs: remove static reap limits
* xfs: use deferred reaping for data device cow extents
---
fs/xfs/scrub/repair.h | 8 -
fs/xfs/scrub/trace.h | 45 ++++
fs/xfs/scrub/newbt.c | 7 +
fs/xfs/scrub/reap.c | 622 +++++++++++++++++++++++++++++++++++++++----------
fs/xfs/scrub/trace.c | 1
5 files changed, 552 insertions(+), 131 deletions(-)
This reverts commit 71598a5a7797f0052aaa7bcff0b8d4b8f20f1441.
This commit introduced a regression, however the fix for the
regression:
aa5fc4362fac ("drm/amdgpu: fix task hang from failed job submission during process kill")
depends on things not yet present in 6.12.y and older kernels. Since
this commit is more of an optimization, just revert it for
6.12.y and older stable kernels.
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org # 6.1.x - 6.12.x
---
Please apply this revert to 6.1.x to 6.12.x stable trees. The newer
stable trees and Linus' tree already have the regression fix.
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 0adb106e2c42..37d53578825b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2292,11 +2292,13 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t min_vm_size,
*/
long amdgpu_vm_wait_idle(struct amdgpu_vm *vm, long timeout)
{
- timeout = drm_sched_entity_flush(&vm->immediate, timeout);
+ timeout = dma_resv_wait_timeout(vm->root.bo->tbo.base.resv,
+ DMA_RESV_USAGE_BOOKKEEP,
+ true, timeout);
if (timeout <= 0)
return timeout;
- return drm_sched_entity_flush(&vm->delayed, timeout);
+ return dma_fence_wait_timeout(vm->last_unlocked, true, timeout);
}
static void amdgpu_vm_destroy_task_info(struct kref *kref)
--
2.51.0
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 1f403699c40f0806a707a9a6eed3b8904224021a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025090146-playback-kinsman-373c@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1f403699c40f0806a707a9a6eed3b8904224021a Mon Sep 17 00:00:00 2001
From: Ma Ke <make24(a)iscas.ac.cn>
Date: Tue, 12 Aug 2025 15:19:32 +0800
Subject: [PATCH] drm/mediatek: Fix device/node reference count leaks in
mtk_drm_get_all_drm_priv
Using device_find_child() and of_find_device_by_node() to locate
devices could cause an imbalance in the device's reference count.
device_find_child() and of_find_device_by_node() both call
get_device() to increment the reference count of the found device
before returning the pointer. In mtk_drm_get_all_drm_priv(), these
references are never released through put_device(), resulting in
permanent reference count increments. Additionally, the
for_each_child_of_node() iterator fails to release node references in
all code paths. This leaks device node references when loop
termination occurs before reaching MAX_CRTC. These reference count
leaks may prevent device/node resources from being properly released
during driver unbind operations.
As comment of device_find_child() says, 'NOTE: you will need to drop
the reference with put_device() after use'.
Cc: stable(a)vger.kernel.org
Fixes: 1ef7ed48356c ("drm/mediatek: Modify mediatek-drm for mt8195 multi mmsys support")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
Reviewed-by: CK Hu <ck.hu(a)mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20250812071932.471730-…
Signed-off-by: Chun-Kuang Hu <chunkuang.hu(a)kernel.org>
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index d5e6bab36414..f8a817689e16 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -387,19 +387,19 @@ static bool mtk_drm_get_all_drm_priv(struct device *dev)
of_id = of_match_node(mtk_drm_of_ids, node);
if (!of_id)
- continue;
+ goto next_put_node;
pdev = of_find_device_by_node(node);
if (!pdev)
- continue;
+ goto next_put_node;
drm_dev = device_find_child(&pdev->dev, NULL, mtk_drm_match);
if (!drm_dev)
- continue;
+ goto next_put_device_pdev_dev;
temp_drm_priv = dev_get_drvdata(drm_dev);
if (!temp_drm_priv)
- continue;
+ goto next_put_device_drm_dev;
if (temp_drm_priv->data->main_len)
all_drm_priv[CRTC_MAIN] = temp_drm_priv;
@@ -411,10 +411,17 @@ static bool mtk_drm_get_all_drm_priv(struct device *dev)
if (temp_drm_priv->mtk_drm_bound)
cnt++;
- if (cnt == MAX_CRTC) {
- of_node_put(node);
+next_put_device_drm_dev:
+ put_device(drm_dev);
+
+next_put_device_pdev_dev:
+ put_device(&pdev->dev);
+
+next_put_node:
+ of_node_put(node);
+
+ if (cnt == MAX_CRTC)
break;
- }
}
if (drm_priv->data->mmsys_dev_num == cnt) {
The current implementation of CXL memory hotplug notifier gets called
before the HMAT memory hotplug notifier. The CXL driver calculates the
access coordinates (bandwidth and latency values) for the CXL end to
end path (i.e. CPU to endpoint). When the CXL region is onlined, the CXL
memory hotplug notifier writes the access coordinates to the HMAT target
structs. Then the HMAT memory hotplug notifier is called and it creates
the access coordinates for the node sysfs attributes.
During testing on an Intel platform, it was found that although the
newly calculated coordinates were pushed to sysfs, the sysfs attributes for
the access coordinates showed up with the wrong initiator. The system has
4 nodes (0, 1, 2, 3) where node 0 and 1 are CPU nodes and node 2 and 3 are
CXL nodes. The expectation is that node 2 would show up as a target to node
0:
/sys/devices/system/node/node2/access0/initiators/node0
However it was observed that node 2 showed up as a target under node 1:
/sys/devices/system/node/node2/access0/initiators/node1
The original intent of the 'ext_updated' flag in HMAT handling code was to
stop HMAT memory hotplug callback from clobbering the access coordinates
after CXL has injected its calculated coordinates and replaced the generic
target access coordinates provided by the HMAT table in the HMAT target
structs. However the flag is hacky at best and blocks the updates from
other CXL regions that are onlined in the same node later on. Remove the
'ext_updated' flag usage and just update the access coordinates for the
nodes directly without touching HMAT target data.
The hotplug memory callback ordering is changed. Instead of changing CXL,
move HMAT back so there's room for the levels rather than having CXL share
the same level as SLAB_CALLBACK_PRI. The change results in the CXL
callback being executed after the HMAT callback.
With the change, the CXL hotplug memory notifier runs after the HMAT
callback. The HMAT callback will create the node sysfs attributes for
access coordinates. The CXL callback will write the access coordinates to
the now created node sysfs attributes directly and will not pollute the
HMAT target values.
A nodemask is introduced to keep track if a node has been updated and
prevents further updates.
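For illustration, a minimal userspace sketch of the gating this adds (all
names invented): a per-node "already updated" bit is tested and set in a
single step, so only the first region onlined in a node writes the access
coordinates; later regions in the same node are skipped.
#include <stdbool.h>
#include <stdio.h>
static unsigned long long seen_mask;	/* stands in for nodemask_region_seen */
static bool node_test_and_set_sketch(int nid)
{
	bool was_set = seen_mask & (1ULL << nid);
	seen_mask |= 1ULL << nid;
	return was_set;
}
static void region_online_sketch(int nid)
{
	if (node_test_and_set_sketch(nid)) {
		printf("node %d: already updated, skip\n", nid);
		return;
	}
	printf("node %d: write access coordinates to sysfs\n", nid);
}
int main(void)
{
	region_online_sketch(2);	/* first CXL region in node 2: updates */
	region_online_sketch(2);	/* second region in node 2: skipped    */
	return 0;
}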
Fixes: 067353a46d8c ("cxl/region: Add memory hotplug notifier for cxl region")
Cc: stable(a)vger.kernel.org
Tested-by: Marc Herbert <marc.herbert(a)linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Dave Jiang <dave.jiang(a)intel.com>
---
v3:
- Use nodemask instead of xarray to keep track of node updates (Jonathan)
---
drivers/acpi/numa/hmat.c | 6 ------
drivers/cxl/core/cdat.c | 5 -----
drivers/cxl/core/core.h | 1 -
drivers/cxl/core/region.c | 20 ++++++++++++--------
include/linux/memory.h | 2 +-
5 files changed, 13 insertions(+), 21 deletions(-)
diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 4958301f5417..5d32490dc4ab 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -74,7 +74,6 @@ struct memory_target {
struct node_cache_attrs cache_attrs;
u8 gen_port_device_handle[ACPI_SRAT_DEVICE_HANDLE_SIZE];
bool registered;
- bool ext_updated; /* externally updated */
};
struct memory_initiator {
@@ -391,7 +390,6 @@ int hmat_update_target_coordinates(int nid, struct access_coordinate *coord,
coord->read_bandwidth, access);
hmat_update_target_access(target, ACPI_HMAT_WRITE_BANDWIDTH,
coord->write_bandwidth, access);
- target->ext_updated = true;
return 0;
}
@@ -773,10 +771,6 @@ static void hmat_update_target_attrs(struct memory_target *target,
u32 best = 0;
int i;
- /* Don't update if an external agent has changed the data. */
- if (target->ext_updated)
- return;
-
/* Don't update for generic port if there's no device handle */
if ((access == NODE_ACCESS_CLASS_GENPORT_SINK_LOCAL ||
access == NODE_ACCESS_CLASS_GENPORT_SINK_CPU) &&
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index c0af645425f4..c891fd618cfd 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -1081,8 +1081,3 @@ int cxl_update_hmat_access_coordinates(int nid, struct cxl_region *cxlr,
{
return hmat_update_target_coordinates(nid, &cxlr->coord[access], access);
}
-
-bool cxl_need_node_perf_attrs_update(int nid)
-{
- return !acpi_node_backed_by_real_pxm(nid);
-}
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 2669f251d677..a253d308f3c9 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -139,7 +139,6 @@ long cxl_pci_get_latency(struct pci_dev *pdev);
int cxl_pci_get_bandwidth(struct pci_dev *pdev, struct access_coordinate *c);
int cxl_update_hmat_access_coordinates(int nid, struct cxl_region *cxlr,
enum access_coordinate_class access);
-bool cxl_need_node_perf_attrs_update(int nid);
int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
struct access_coordinate *c);
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 71cc42d05248..0ed95cbc5d5b 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -30,6 +30,12 @@
* 3. Decoder targets
*/
+/*
+ * nodemask that sets per node when the access_coordinates for the node has
+ * been updated by the CXL memory hotplug notifier.
+ */
+static nodemask_t nodemask_region_seen = NODE_MASK_NONE;
+
static struct cxl_region *to_cxl_region(struct device *dev);
#define __ACCESS_ATTR_RO(_level, _name) { \
@@ -2442,14 +2448,8 @@ static bool cxl_region_update_coordinates(struct cxl_region *cxlr, int nid)
for (int i = 0; i < ACCESS_COORDINATE_MAX; i++) {
if (cxlr->coord[i].read_bandwidth) {
- rc = 0;
- if (cxl_need_node_perf_attrs_update(nid))
- node_set_perf_attrs(nid, &cxlr->coord[i], i);
- else
- rc = cxl_update_hmat_access_coordinates(nid, cxlr, i);
-
- if (rc == 0)
- cset++;
+ node_update_perf_attrs(nid, &cxlr->coord[i], i);
+ cset++;
}
}
@@ -2487,6 +2487,10 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
if (nid != region_nid)
return NOTIFY_DONE;
+ /* No action needed if node bit already set */
+ if (node_test_and_set(nid, nodemask_region_seen))
+ return NOTIFY_DONE;
+
if (!cxl_region_update_coordinates(cxlr, nid))
return NOTIFY_DONE;
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 1305102688d0..0b755d1ef1ec 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -120,8 +120,8 @@ struct mem_section;
*/
#define DEFAULT_CALLBACK_PRI 0
#define SLAB_CALLBACK_PRI 1
-#define HMAT_CALLBACK_PRI 2
#define CXL_CALLBACK_PRI 5
+#define HMAT_CALLBACK_PRI 6
#define MM_COMPUTE_BATCH_PRI 10
#define CPUSET_CALLBACK_PRI 10
#define MEMTIER_HOTPLUG_PRI 100
--
2.50.1
VIRQs come in 3 flavors: per-VCPU, per-domain, and global, and the VIRQs
are tracked in per-cpu virq_to_irq arrays.
Per-domain and global VIRQs must be bound on CPU 0, and
bind_virq_to_irq() sets the per_cpu virq_to_irq at registration time.
Later, the interrupt can migrate, and info->cpu is updated. When
calling __unbind_from_irq(), the per-cpu virq_to_irq is then cleared for a
different CPU. If bind_virq_to_irq() is called again with CPU 0, the
stale irq is returned. There won't be any irq_info for that irq, so
things break.
Make xen_rebind_evtchn_to_cpu() update the per_cpu virq_to_irq mappings
to keep them up to date with the current CPU. This ensures the
correct virq_to_irq entry is cleared in __unbind_from_irq().
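For illustration, a minimal userspace sketch of the bookkeeping this patch
adds (all names and sizes invented): when the event channel is rebound to
another CPU, the virq-to-irq entry moves from the old CPU's table to the
new one, so no stale mapping is left behind for a later lookup on CPU 0.
#include <stdio.h>
#define NR_CPUS  4
#define NR_VIRQS 8
static int virq_to_irq[NR_CPUS][NR_VIRQS];
static void rebind_virq_sketch(int virq, int old_cpu, int new_cpu)
{
	int irq = virq_to_irq[old_cpu][virq];
	virq_to_irq[old_cpu][virq] = -1;	/* clear the old CPU's entry */
	virq_to_irq[new_cpu][virq] = irq;	/* track it on the new CPU   */
}
int main(void)
{
	for (int c = 0; c < NR_CPUS; c++)
		for (int v = 0; v < NR_VIRQS; v++)
			virq_to_irq[c][v] = -1;
	virq_to_irq[0][3] = 42;			/* bound on CPU 0 at registration */
	rebind_virq_sketch(3, 0, 2);		/* interrupt migrates to CPU 2    */
	printf("cpu0: %d, cpu2: %d\n", virq_to_irq[0][3], virq_to_irq[2][3]);
	return 0;
}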
Fixes: e46cdb66c8fc ("xen: event channels")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jason Andryuk <jason.andryuk(a)amd.com>
---
v3:
Kernel style brace placement
Delay setting old_cpu and tighten scope of variable
v2:
Different approach changing virq_to_irq
---
drivers/xen/events/events_base.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index b060b5a95f45..9478fae014e5 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1797,9 +1797,20 @@ static int xen_rebind_evtchn_to_cpu(struct irq_info *info, unsigned int tcpu)
* virq or IPI channel, which don't actually need to be rebound. Ignore
* it, but don't do the xenlinux-level rebind in that case.
*/
- if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
+ if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0) {
+ int old_cpu = info->cpu;
+
bind_evtchn_to_cpu(info, tcpu, false);
+ if (info->type == IRQT_VIRQ) {
+ int virq = info->u.virq;
+ int irq = per_cpu(virq_to_irq, old_cpu)[virq];
+
+ per_cpu(virq_to_irq, old_cpu)[virq] = -1;
+ per_cpu(virq_to_irq, tcpu)[virq] = irq;
+ }
+ }
+
do_unmask(info, EVT_MASK_REASON_TEMPORARY);
return 0;
--
2.34.1
Change find_virq() to return -EEXIST when a VIRQ is bound to a
different CPU than the one passed in. With that, remove the BUG_ON()
from bind_virq_to_irq() to propagate the error upwards.
Some VIRQs are per-cpu, but others are per-domain or global. Those must
be bound to CPU 0 and can then migrate elsewhere. The lookup for
per-domain and global VIRQs will probably fail once they have migrated off
CPU 0, especially when the current CPU is tracked. This now returns -EEXIST
instead of BUG_ON().
A second call to bind a per-domain or global VIRQ is not expected, but
make it non-fatal to avoid trying to look up the irq, since we don't
know which per_cpu(virq_to_irq) it will be in.
Cc: stable(a)vger.kernel.org
Signed-off-by: Jason Andryuk <jason.andryuk(a)amd.com>
---
v3:
Cc: stable as a pre-req for the subsequent virq tracking change
Call __unbind_from_irq() on error to avoid leaking info
v2:
New
---
drivers/xen/events/events_base.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 374231d84e4f..b060b5a95f45 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1314,10 +1314,12 @@ int bind_interdomain_evtchn_to_irq_lateeoi(struct xenbus_device *dev,
}
EXPORT_SYMBOL_GPL(bind_interdomain_evtchn_to_irq_lateeoi);
-static int find_virq(unsigned int virq, unsigned int cpu, evtchn_port_t *evtchn)
+static int find_virq(unsigned int virq, unsigned int cpu, evtchn_port_t *evtchn,
+ bool percpu)
{
struct evtchn_status status;
evtchn_port_t port;
+ bool exists = false;
memset(&status, 0, sizeof(status));
for (port = 0; port < xen_evtchn_max_channels(); port++) {
@@ -1330,12 +1332,16 @@ static int find_virq(unsigned int virq, unsigned int cpu, evtchn_port_t *evtchn)
continue;
if (status.status != EVTCHNSTAT_virq)
continue;
- if (status.u.virq == virq && status.vcpu == xen_vcpu_nr(cpu)) {
+ if (status.u.virq != virq)
+ continue;
+ if (status.vcpu == xen_vcpu_nr(cpu)) {
*evtchn = port;
return 0;
+ } else if (!percpu) {
+ exists = true;
}
}
- return -ENOENT;
+ return exists ? -EEXIST : -ENOENT;
}
/**
@@ -1382,8 +1388,11 @@ int bind_virq_to_irq(unsigned int virq, unsigned int cpu, bool percpu)
evtchn = bind_virq.port;
else {
if (ret == -EEXIST)
- ret = find_virq(virq, cpu, &evtchn);
- BUG_ON(ret < 0);
+ ret = find_virq(virq, cpu, &evtchn, percpu);
+ if (ret) {
+ __unbind_from_irq(info, info->irq);
+ goto out;
+ }
}
ret = xen_irq_info_virq_setup(info, cpu, evtchn, virq);
--
2.34.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x ae668cd567a6a7622bc813ee0bb61c42bed61ba7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025090112-skillet-muskiness-5948@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ae668cd567a6a7622bc813ee0bb61c42bed61ba7 Mon Sep 17 00:00:00 2001
From: Eric Sandeen <sandeen(a)redhat.com>
Date: Fri, 22 Aug 2025 12:55:56 -0500
Subject: [PATCH] xfs: do not propagate ENODATA disk errors into xattr code
ENODATA (aka ENOATTR) has a very specific meaning in the xfs xattr code;
namely, that the requested attribute name could not be found.
However, a medium error from disk may also return ENODATA. At best,
this medium error may escape to userspace as "attribute not found"
when in fact it's an IO (disk) error.
At worst, we may oops in xfs_attr_leaf_get() when we do:
error = xfs_attr_leaf_hasname(args, &bp);
if (error == -ENOATTR) {
xfs_trans_brelse(args->trans, bp);
return error;
}
because an ENODATA/ENOATTR error from disk leaves us with a null bp,
and the xfs_trans_brelse will then null-deref it.
As discussed on the list, we really need to modify the lower level
IO functions to trap all disk errors and ensure that we don't let
unique errors like this leak up into higher xfs functions - many
like this should be remapped to EIO.
However, this patch directly addresses a reported bug in the xattr
code, and should be safe to backport to stable kernels. A larger-scope
patch to handle more unique errors at lower levels can follow later.
(Note, prior to 07120f1abdff we did not oops, but we did return the
wrong error code to userspace.)
Signed-off-by: Eric Sandeen <sandeen(a)redhat.com>
Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
Cc: stable(a)vger.kernel.org # v5.9+
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Carlos Maiolino <cem(a)kernel.org>
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 4c44ce1c8a64..bff3dc226f81 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -435,6 +435,13 @@ xfs_attr_rmtval_get(
0, &bp, &xfs_attr3_rmt_buf_ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(args->dp, XFS_ATTR_FORK);
+ /*
+ * ENODATA from disk implies a disk medium failure;
+ * ENODATA for xattrs means attribute not found, so
+ * disambiguate that here.
+ */
+ if (error == -ENODATA)
+ error = -EIO;
if (error)
return error;
diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
index 17d9e6154f19..723a0643b838 100644
--- a/fs/xfs/libxfs/xfs_da_btree.c
+++ b/fs/xfs/libxfs/xfs_da_btree.c
@@ -2833,6 +2833,12 @@ xfs_da_read_buf(
&bp, ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(dp, whichfork);
+ /*
+ * ENODATA from disk implies a disk medium failure; ENODATA for
+ * xattrs means attribute not found, so disambiguate that here.
+ */
+ if (error == -ENODATA && whichfork == XFS_ATTR_FORK)
+ error = -EIO;
if (error)
goto out_free;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x ae668cd567a6a7622bc813ee0bb61c42bed61ba7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025090111-acorn-ammonia-8c45@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ae668cd567a6a7622bc813ee0bb61c42bed61ba7 Mon Sep 17 00:00:00 2001
From: Eric Sandeen <sandeen(a)redhat.com>
Date: Fri, 22 Aug 2025 12:55:56 -0500
Subject: [PATCH] xfs: do not propagate ENODATA disk errors into xattr code
ENODATA (aka ENOATTR) has a very specific meaning in the xfs xattr code;
namely, that the requested attribute name could not be found.
However, a medium error from disk may also return ENODATA. At best,
this medium error may escape to userspace as "attribute not found"
when in fact it's an IO (disk) error.
At worst, we may oops in xfs_attr_leaf_get() when we do:
error = xfs_attr_leaf_hasname(args, &bp);
if (error == -ENOATTR) {
xfs_trans_brelse(args->trans, bp);
return error;
}
because an ENODATA/ENOATTR error from disk leaves us with a null bp,
and the xfs_trans_brelse will then null-deref it.
As discussed on the list, we really need to modify the lower level
IO functions to trap all disk errors and ensure that we don't let
unique errors like this leak up into higher xfs functions - many
like this should be remapped to EIO.
However, this patch directly addresses a reported bug in the xattr
code, and should be safe to backport to stable kernels. A larger-scope
patch to handle more unique errors at lower levels can follow later.
(Note, prior to 07120f1abdff we did not oops, but we did return the
wrong error code to userspace.)
Signed-off-by: Eric Sandeen <sandeen(a)redhat.com>
Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
Cc: stable(a)vger.kernel.org # v5.9+
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Carlos Maiolino <cem(a)kernel.org>
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 4c44ce1c8a64..bff3dc226f81 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -435,6 +435,13 @@ xfs_attr_rmtval_get(
0, &bp, &xfs_attr3_rmt_buf_ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(args->dp, XFS_ATTR_FORK);
+ /*
+ * ENODATA from disk implies a disk medium failure;
+ * ENODATA for xattrs means attribute not found, so
+ * disambiguate that here.
+ */
+ if (error == -ENODATA)
+ error = -EIO;
if (error)
return error;
diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
index 17d9e6154f19..723a0643b838 100644
--- a/fs/xfs/libxfs/xfs_da_btree.c
+++ b/fs/xfs/libxfs/xfs_da_btree.c
@@ -2833,6 +2833,12 @@ xfs_da_read_buf(
&bp, ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(dp, whichfork);
+ /*
+ * ENODATA from disk implies a disk medium failure; ENODATA for
+ * xattrs means attribute not found, so disambiguate that here.
+ */
+ if (error == -ENODATA && whichfork == XFS_ATTR_FORK)
+ error = -EIO;
if (error)
goto out_free;
From: "Mario Limonciello (AMD)" <superm1(a)kernel.org>
[ Upstream commit 23800ad1265f10c2bc6f42154ce4d20e59f2900e ]
The ASUS ProArt PX13 has a spurious wakeup event from the touchpad
a few moments after entering hardware sleep. This can be avoided
by preventing the touchpad from being a wake source.
Add to the wakeup ignore list.
Reported-by: Amit Chaudhari <amitchaudhari(a)mac.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4482
Tested-by: Amit Chaudhari <amitchaudhari(a)mac.com>
Signed-off-by: Mario Limonciello (AMD) <superm1(a)kernel.org>
Reviewed-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Link: https://lore.kernel.org/20250814183430.3887973-1-superm1@kernel.org
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my comprehensive analysis of this commit and the context, here
is my assessment:
**Backport Status: YES**
## Extensive Analysis:
### 1. **Nature of the Fix**
This commit fixes a real hardware bug - spurious wakeup events from the
touchpad on the ASUS ProArt PX13 laptop. The code change adds a DMI-
based quirk entry to the `gpiolib_acpi_quirks` table in
`/home/sasha/linux/drivers/gpio/gpiolib-acpi-quirks.c:350-359`, which
instructs the GPIO subsystem to ignore wake events from the specific
touchpad GPIO pin (`ASCP1A00:00@8`).
### 2. **Meets Stable Kernel Rules**
According to `/home/sasha/linux/Documentation/process/stable-kernel-
rules.rst`:
- **Fixes a real bug**: Yes - spurious wakeups are a real hardware issue
that impacts users' ability to use sleep/suspend effectively (lines
18-19 of rules)
- **Obviously correct and tested**: Yes - the fix is straightforward
(adding a quirk entry), has been tested by the reporter (Amit
Chaudhari), and reviewed by Mika Westerberg
- **Size constraint**: The patch is only ~20 lines with context, well
under the 100-line limit
- **Already in mainline**: Yes - commit
23800ad1265f10c2bc6f42154ce4d20e59f2900e
### 3. **Historical Precedent**
Multiple similar commits for spurious wakeup quirks have been backported
to stable:
- Commit `805c74eac8cb3` (GPD G1619-04 touchpad wakeup) - explicitly
marked with `Cc: stable(a)vger.kernel.org`
- Commit `782eea0c89f7d` (Clevo NL5xNU) - marked with `Cc:
stable(a)vger.kernel.org`
- Commit `a69982c37cd05` (Clevo NH5xAx) - marked with `Cc:
<stable(a)vger.kernel.org> # v6.1+`
### 4. **Code Structure Analysis**
The change follows the exact same pattern as other quirk entries in the
file:
```c
.driver_data = &(struct acpi_gpiolib_dmi_quirk) {
.ignore_wake = "ASCP1A00:00@8",
},
```
This is a data-only addition to an existing quirk table - no logic
changes, no new code paths, minimal regression risk.
### 5. **User Impact**
The bug causes spurious wakeups "a few moments after entering hardware
sleep", which:
- Prevents proper system suspend/sleep functionality
- Drains battery life on laptops
- Disrupts user workflow
- Is a clear hardware-specific issue that cannot be worked around by
users
### 6. **Risk Assessment**
- **Extremely low risk**: The change only affects systems that match the
specific DMI strings (ASUSTeK COMPUTER INC. + ProArt PX13)
- **No impact on other systems**: DMI matching ensures this quirk only
applies to the affected hardware
- **Well-established mechanism**: The ignore_wake infrastructure is
mature and widely used
### 7. **Backporting Considerations**
While this specific commit wasn't explicitly marked with `Cc: stable`,
it follows the exact same pattern as commits that were. The commit has
already been picked up by Sasha Levin's stable tree (as shown in the `[
Upstream commit ]` tag in the local repository), suggesting the stable
maintainers recognize its importance.
The fix is self-contained, requires no prerequisites, and can be cleanly
applied to any kernel version that has the `gpiolib-acpi-quirks.c` file
structure (introduced in commit `92dc572852ddc`).
### Conclusion
This is a textbook example of a stable-appropriate fix: it addresses a
specific hardware bug affecting real users, uses a well-established
quirk mechanism, has zero impact on unaffected systems, and follows the
exact pattern of previously backported fixes for identical issues on
different hardware.
drivers/gpio/gpiolib-acpi-quirks.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/gpio/gpiolib-acpi-quirks.c b/drivers/gpio/gpiolib-acpi-quirks.c
index c13545dce3492..bfb04e67c4bc8 100644
--- a/drivers/gpio/gpiolib-acpi-quirks.c
+++ b/drivers/gpio/gpiolib-acpi-quirks.c
@@ -344,6 +344,20 @@ static const struct dmi_system_id gpiolib_acpi_quirks[] __initconst = {
.ignore_interrupt = "AMDI0030:00@8",
},
},
+ {
+ /*
+ * Spurious wakeups from TP_ATTN# pin
+ * Found in BIOS 5.35
+ * https://gitlab.freedesktop.org/drm/amd/-/issues/4482
+ */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+ DMI_MATCH(DMI_PRODUCT_FAMILY, "ProArt PX13"),
+ },
+ .driver_data = &(struct acpi_gpiolib_dmi_quirk) {
+ .ignore_wake = "ASCP1A00:00@8",
+ },
+ },
{} /* Terminating entry */
};
--
2.50.1
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x ae668cd567a6a7622bc813ee0bb61c42bed61ba7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025090111-poem-retold-4ac2@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ae668cd567a6a7622bc813ee0bb61c42bed61ba7 Mon Sep 17 00:00:00 2001
From: Eric Sandeen <sandeen(a)redhat.com>
Date: Fri, 22 Aug 2025 12:55:56 -0500
Subject: [PATCH] xfs: do not propagate ENODATA disk errors into xattr code
ENODATA (aka ENOATTR) has a very specific meaning in the xfs xattr code;
namely, that the requested attribute name could not be found.
However, a medium error from disk may also return ENODATA. At best,
this medium error may escape to userspace as "attribute not found"
when in fact it's an IO (disk) error.
At worst, we may oops in xfs_attr_leaf_get() when we do:
error = xfs_attr_leaf_hasname(args, &bp);
if (error == -ENOATTR) {
xfs_trans_brelse(args->trans, bp);
return error;
}
because an ENODATA/ENOATTR error from disk leaves us with a null bp,
and the xfs_trans_brelse will then null-deref it.
As discussed on the list, we really need to modify the lower level
IO functions to trap all disk errors and ensure that we don't let
unique errors like this leak up into higher xfs functions - many
like this should be remapped to EIO.
However, this patch directly addresses a reported bug in the xattr
code, and should be safe to backport to stable kernels. A larger-scope
patch to handle more unique errors at lower levels can follow later.
(Note, prior to 07120f1abdff we did not oops, but we did return the
wrong error code to userspace.)
Signed-off-by: Eric Sandeen <sandeen(a)redhat.com>
Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
Cc: stable(a)vger.kernel.org # v5.9+
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Carlos Maiolino <cem(a)kernel.org>
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 4c44ce1c8a64..bff3dc226f81 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -435,6 +435,13 @@ xfs_attr_rmtval_get(
0, &bp, &xfs_attr3_rmt_buf_ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(args->dp, XFS_ATTR_FORK);
+ /*
+ * ENODATA from disk implies a disk medium failure;
+ * ENODATA for xattrs means attribute not found, so
+ * disambiguate that here.
+ */
+ if (error == -ENODATA)
+ error = -EIO;
if (error)
return error;
diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
index 17d9e6154f19..723a0643b838 100644
--- a/fs/xfs/libxfs/xfs_da_btree.c
+++ b/fs/xfs/libxfs/xfs_da_btree.c
@@ -2833,6 +2833,12 @@ xfs_da_read_buf(
&bp, ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(dp, whichfork);
+ /*
+ * ENODATA from disk implies a disk medium failure; ENODATA for
+ * xattrs means attribute not found, so disambiguate that here.
+ */
+ if (error == -ENODATA && whichfork == XFS_ATTR_FORK)
+ error = -EIO;
if (error)
goto out_free;
From: Han Guangjiang <hanguangjiang(a)lixiang.com>
On repeated cold boots we occasionally hit a NULL pointer crash in
blk_should_throtl() when throttling is consulted before the throttle
policy is fully enabled for the queue. Checking only q->td != NULL is
insufficient during early initialization, so blkg_to_pd() for the
throttle policy can still return NULL and blkg_to_tg() becomes NULL,
which later gets dereferenced.
Tighten blk_throtl_activated() to also require that the throttle policy
bit is set on the queue:
return q->td != NULL &&
test_bit(blkcg_policy_throtl.plid, q->blkcg_pols);
This prevents blk_should_throtl() from accessing throttle group state
until policy data has been attached to blkgs.
Fixes: a3166c51702b ("blk-throttle: delay initialization until configuration")
Cc: stable(a)vger.kernel.org
Co-developed-by: Liang Jie <liangjie(a)lixiang.com>
Signed-off-by: Liang Jie <liangjie(a)lixiang.com>
Signed-off-by: Han Guangjiang <hanguangjiang(a)lixiang.com>
---
block/blk-throttle.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index 3b27755bfbff..9ca43dc56eda 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -156,7 +156,7 @@ void blk_throtl_cancel_bios(struct gendisk *disk);
static inline bool blk_throtl_activated(struct request_queue *q)
{
- return q->td != NULL;
+ return q->td != NULL && test_bit(blkcg_policy_throtl.plid, q->blkcg_pols);
}
static inline bool blk_should_throtl(struct bio *bio)
--
2.25.1
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x ae668cd567a6a7622bc813ee0bb61c42bed61ba7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025090105-strainer-glider-fdcd@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ae668cd567a6a7622bc813ee0bb61c42bed61ba7 Mon Sep 17 00:00:00 2001
From: Eric Sandeen <sandeen(a)redhat.com>
Date: Fri, 22 Aug 2025 12:55:56 -0500
Subject: [PATCH] xfs: do not propagate ENODATA disk errors into xattr code
ENODATA (aka ENOATTR) has a very specific meaning in the xfs xattr code;
namely, that the requested attribute name could not be found.
However, a medium error from disk may also return ENODATA. At best,
this medium error may escape to userspace as "attribute not found"
when in fact it's an IO (disk) error.
At worst, we may oops in xfs_attr_leaf_get() when we do:
error = xfs_attr_leaf_hasname(args, &bp);
if (error == -ENOATTR) {
xfs_trans_brelse(args->trans, bp);
return error;
}
because an ENODATA/ENOATTR error from disk leaves us with a null bp,
and the xfs_trans_brelse will then null-deref it.
As discussed on the list, we really need to modify the lower level
IO functions to trap all disk errors and ensure that we don't let
unique errors like this leak up into higher xfs functions - many
like this should be remapped to EIO.
However, this patch directly addresses a reported bug in the xattr
code, and should be safe to backport to stable kernels. A larger-scope
patch to handle more unique errors at lower levels can follow later.
(Note, prior to 07120f1abdff we did not oops, but we did return the
wrong error code to userspace.)
Signed-off-by: Eric Sandeen <sandeen(a)redhat.com>
Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
Cc: stable(a)vger.kernel.org # v5.9+
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Carlos Maiolino <cem(a)kernel.org>
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 4c44ce1c8a64..bff3dc226f81 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -435,6 +435,13 @@ xfs_attr_rmtval_get(
0, &bp, &xfs_attr3_rmt_buf_ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(args->dp, XFS_ATTR_FORK);
+ /*
+ * ENODATA from disk implies a disk medium failure;
+ * ENODATA for xattrs means attribute not found, so
+ * disambiguate that here.
+ */
+ if (error == -ENODATA)
+ error = -EIO;
if (error)
return error;
diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
index 17d9e6154f19..723a0643b838 100644
--- a/fs/xfs/libxfs/xfs_da_btree.c
+++ b/fs/xfs/libxfs/xfs_da_btree.c
@@ -2833,6 +2833,12 @@ xfs_da_read_buf(
&bp, ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(dp, whichfork);
+ /*
+ * ENODATA from disk implies a disk medium failure; ENODATA for
+ * xattrs means attribute not found, so disambiguate that here.
+ */
+ if (error == -ENODATA && whichfork == XFS_ATTR_FORK)
+ error = -EIO;
if (error)
goto out_free;
VRAM+TT bos that are evicted from VRAM to TT may remain in
TT also after a revalidation following eviction or suspend.
This manifests itself as applications becoming sluggish
after buffer objects get evicted or after a resume from
suspend or hibernation.
If the bo supports placement in both VRAM and TT, and
we are on DGFX, mark the TT placement as fallback. This means
that it is tried only after VRAM + eviction.
This flaw has probably been present since the xe module was
upstreamed but use a Fixes: commit below where backporting is
likely to be simple. For earlier versions we need to open-
code the fallback algorithm in the driver.
v2:
- Remove check for dgfx. (Matthew Auld)
- Update the xe_dma_buf kunit test for the new strategy (CI)
- Allow dma-buf to pin in current placement (CI)
- Make xe_bo_validate() for pinned bos a NOP.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5995
Fixes: a78a8da51b36 ("drm/ttm: replace busy placement with flags v6")
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.9+
Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld(a)intel.com> #v1
---
drivers/gpu/drm/xe/tests/xe_bo.c | 2 +-
drivers/gpu/drm/xe/tests/xe_dma_buf.c | 10 +---------
drivers/gpu/drm/xe/xe_bo.c | 16 ++++++++++++----
drivers/gpu/drm/xe/xe_bo.h | 2 +-
drivers/gpu/drm/xe/xe_dma_buf.c | 2 +-
5 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c
index bb469096d072..7b40cc8be1c9 100644
--- a/drivers/gpu/drm/xe/tests/xe_bo.c
+++ b/drivers/gpu/drm/xe/tests/xe_bo.c
@@ -236,7 +236,7 @@ static int evict_test_run_tile(struct xe_device *xe, struct xe_tile *tile, struc
}
xe_bo_lock(external, false);
- err = xe_bo_pin_external(external);
+ err = xe_bo_pin_external(external, false);
xe_bo_unlock(external);
if (err) {
KUNIT_FAIL(test, "external bo pin err=%pe\n",
diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
index 3c5ad8cc65a0..5baeab6b6fb7 100644
--- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
@@ -85,15 +85,7 @@ static void check_residency(struct kunit *test, struct xe_bo *exported,
return;
}
- /*
- * If on different devices, the exporter is kept in system if
- * possible, saving a migration step as the transfer is just
- * likely as fast from system memory.
- */
- if (params->mem_mask & XE_BO_FLAG_SYSTEM)
- KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, XE_PL_TT));
- else
- KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, mem_type));
+ KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, mem_type));
if (params->force_different_devices)
KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(imported, XE_PL_TT));
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 4faf15d5fa6d..870f43347281 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -188,6 +188,8 @@ static void try_add_system(struct xe_device *xe, struct xe_bo *bo,
bo->placements[*c] = (struct ttm_place) {
.mem_type = XE_PL_TT,
+ .flags = (bo_flags & XE_BO_FLAG_VRAM_MASK) ?
+ TTM_PL_FLAG_FALLBACK : 0,
};
*c += 1;
}
@@ -2322,6 +2324,7 @@ uint64_t vram_region_gpu_offset(struct ttm_resource *res)
/**
* xe_bo_pin_external - pin an external BO
* @bo: buffer object to be pinned
+ * @in_place: Pin in current placement, don't attempt to migrate.
*
* Pin an external (not tied to a VM, can be exported via dma-buf / prime FD)
* BO. Unique call compared to xe_bo_pin as this function has it own set of
@@ -2329,7 +2332,7 @@ uint64_t vram_region_gpu_offset(struct ttm_resource *res)
*
* Returns 0 for success, negative error code otherwise.
*/
-int xe_bo_pin_external(struct xe_bo *bo)
+int xe_bo_pin_external(struct xe_bo *bo, bool in_place)
{
struct xe_device *xe = xe_bo_device(bo);
int err;
@@ -2338,9 +2341,11 @@ int xe_bo_pin_external(struct xe_bo *bo)
xe_assert(xe, xe_bo_is_user(bo));
if (!xe_bo_is_pinned(bo)) {
- err = xe_bo_validate(bo, NULL, false);
- if (err)
- return err;
+ if (!in_place) {
+ err = xe_bo_validate(bo, NULL, false);
+ if (err)
+ return err;
+ }
spin_lock(&xe->pinned.lock);
list_add_tail(&bo->pinned_link, &xe->pinned.late.external);
@@ -2493,6 +2498,9 @@ int xe_bo_validate(struct xe_bo *bo, struct xe_vm *vm, bool allow_res_evict)
};
int ret;
+ if (xe_bo_is_pinned(bo))
+ return 0;
+
if (vm) {
lockdep_assert_held(&vm->lock);
xe_vm_assert_held(vm);
diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index 8cce413b5235..cfb1ec266a6d 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -200,7 +200,7 @@ static inline void xe_bo_unlock_vm_held(struct xe_bo *bo)
}
}
-int xe_bo_pin_external(struct xe_bo *bo);
+int xe_bo_pin_external(struct xe_bo *bo, bool in_place);
int xe_bo_pin(struct xe_bo *bo);
void xe_bo_unpin_external(struct xe_bo *bo);
void xe_bo_unpin(struct xe_bo *bo);
diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index 346f857f3837..af64baf872ef 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -72,7 +72,7 @@ static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
return ret;
}
- ret = xe_bo_pin_external(bo);
+ ret = xe_bo_pin_external(bo, true);
xe_assert(xe, !ret);
return 0;
--
2.50.1
On Wed, Aug 27, 2025 at 7:38 AM Stanislav Fort <disclosure(a)aisle.com> wrote:
>
> NET/ROM nr_rx_frame() dereferences the 5-byte transport header
> unconditionally. nr_route_frame() currently accepts frames as short as
> NR_NETWORK_LEN (15 bytes), which can lead to small out-of-bounds reads
> on short frames.
>
> Fix by using pskb_may_pull() in nr_rx_frame() to ensure the full
> NET/ROM network + transport header is present before accessing it, and
> guard the extra fields used by NR_CONNREQ (window, user address, and the
> optional BPQ timeout extension) with additional pskb_may_pull() checks.
>
> This aligns with recent fixes using pskb_may_pull() to validate header
> availability.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Reported-by: Stanislav Fort <disclosure(a)aisle.com>
> Cc: stable(a)vger.kernel.org
> Signed-off-by: Stanislav Fort <disclosure(a)aisle.com>
> ---
> net/netrom/af_netrom.c | 12 +++++++++++-
> net/netrom/nr_route.c | 2 +-
> 2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
> index 3331669d8e33..1fbaa161288a 100644
> --- a/net/netrom/af_netrom.c
> +++ b/net/netrom/af_netrom.c
> @@ -885,6 +885,10 @@ int nr_rx_frame(struct sk_buff *skb, struct net_device *dev)
> * skb->data points to the netrom frame start
> */
>
> + /* Ensure NET/ROM network + transport header are present */
> + if (!pskb_may_pull(skb, NR_NETWORK_LEN + NR_TRANSPORT_LEN))
> + return 0;
> +
> src = (ax25_address *)(skb->data + 0);
> dest = (ax25_address *)(skb->data + 7);
>
> @@ -961,6 +965,12 @@ int nr_rx_frame(struct sk_buff *skb, struct net_device *dev)
> return 0;
> }
>
> + /* Ensure NR_CONNREQ fields (window + user address) are present */
> + if (!pskb_may_pull(skb, 21 + AX25_ADDR_LEN)) {
If skb->head is reallocated by this pskb_may_pull(), dest variable
might point to a freed piece of memory
(old skb->head)
As far as netrom is concerned, I would force a full linearization of
the packet very early
It is also unclear if the bug even exists in the first place.
Can you show the stack trace leading to this function being called
from an arbitrary provider (like a packet being fed by malicious user
space)?
For instance, nr_rx_frame() can be called from net/netrom/nr_loopback.c
with a non-malicious packet.
For the remaining caller (nr_route_frame()), it is unclear to me.
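As a rough sketch of the full-linearization idea above (the helper name is
hypothetical and this is not the submitted patch), the length check and the
linearization could be done once, before any pointers into skb->data are
taken, so that a later head reallocation cannot invalidate them:
/* Hypothetical helper, sketch only: ensure the NET/ROM network and
 * transport headers are present and the frame is linear before any
 * pointers into skb->data are taken.
 */
static bool nr_frame_sane(struct sk_buff *skb)
{
        if (skb->len < NR_NETWORK_LEN + NR_TRANSPORT_LEN)
                return false;

        /* skb_linearize() returns 0 on success. */
        return skb_linearize(skb) == 0;
}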
This is a series of patches that address several memory bugs that occur
in the Exynos Virtual Display driver.
Jeongjun Park (3):
drm/exynos: vidi: use priv->vidi_dev for ctx lookup in vidi_connection_ioctl()
drm/exynos: vidi: fix to avoid directly dereferencing user pointer
drm/exynos: vidi: use ctx->lock to protect struct vidi_context member variables related to memory alloc/free
drivers/gpu/drm/exynos/exynos_drm_drv.h | 1 +
drivers/gpu/drm/exynos/exynos_drm_vidi.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++----------
2 files changed, 64 insertions(+), 11 deletions(-)
Fix reference leaks where PCI device references
obtained via pci_get_device() were not being released:
1. The while loop that iterates through 0xa00a devices was not
releasing the final device reference when the loop terminates.
2. Single device lookups for 0xa001 and 0xa009 devices were not
releasing their references after use.
Add missing pci_dev_put() calls to ensure all device references
are properly released.
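For context, a minimal sketch of the pci_get_device() iteration rule relied
on here (illustrative only, not part of the patch; MAX_MCE_REGS refers to the
driver's local limit): the function drops the reference on 'from' and returns
the next match with an elevated refcount, so whatever pointer remains when
iteration stops must be released, and pci_dev_put(NULL) is a no-op.
/* Sketch only: iterate 0xa00a devices and release the reference that
 * may still be held when the loop stops early on the register limit.
 */
static void pasemi_count_mce_devs(void)
{
        struct pci_dev *dev = NULL;
        int reg = 0;

        while ((dev = pci_get_device(PCI_VENDOR_ID_PASEMI, 0xa00a, dev)) &&
               reg < MAX_MCE_REGS)
                reg++;

        /* pci_dev_put(NULL) is a no-op, so this is safe on both exit paths. */
        pci_dev_put(dev);
}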
Fixes: cd7834167ffb ("[POWERPC] pasemi: Print more information at machine check")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
arch/powerpc/platforms/pasemi/setup.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/powerpc/platforms/pasemi/setup.c b/arch/powerpc/platforms/pasemi/setup.c
index d03b41336901..dafbee3afd86 100644
--- a/arch/powerpc/platforms/pasemi/setup.c
+++ b/arch/powerpc/platforms/pasemi/setup.c
@@ -169,6 +169,8 @@ static int __init pas_setup_mce_regs(void)
dev = pci_get_device(PCI_VENDOR_ID_PASEMI, 0xa00a, dev);
reg++;
}
+ /* Release the last device reference from the while loop */
+ pci_dev_put(dev);
dev = pci_get_device(PCI_VENDOR_ID_PASEMI, 0xa001, NULL);
if (dev && reg+4 < MAX_MCE_REGS) {
@@ -185,6 +187,7 @@ static int __init pas_setup_mce_regs(void)
mce_regs[reg].addr = pasemi_pci_getcfgaddr(dev, 0xc1c);
reg++;
}
+ pci_dev_put(dev);
dev = pci_get_device(PCI_VENDOR_ID_PASEMI, 0xa009, NULL);
if (dev && reg+2 < MAX_MCE_REGS) {
@@ -195,6 +198,7 @@ static int __init pas_setup_mce_regs(void)
mce_regs[reg].addr = pasemi_pci_getcfgaddr(dev, 0x214);
reg++;
}
+ pci_dev_put(dev);
num_mce_regs = reg;
--
2.35.1
Suspend-resume cycle test revealed a memory leak in 6.17-rc3
Turns out the slot_id race fix changes accidentally ends up calling
xhci_free_virt_device() with an incorrect vdev parameter.
The vdev variable was reused for temporary purposes right before calling
xhci_free_virt_device().
Fix this by passing the correct vdev parameter.
The slot_id race fix that caused this regression was targeted for stable,
so this needs to be applied there as well.
Fixes: 2eb03376151b ("usb: xhci: Fix slot_id resource race conflict")
Reported-by: David Wang <00107082(a)163.com>
Closes: https://lore.kernel.org/linux-usb/20250829181354.4450-1-00107082@163.com
Suggested-by: Michal Pecio <michal.pecio(a)gmail.com>
Suggested-by: David Wang <00107082(a)163.com>
Cc: stable(a)vger.kernel.org
Tested-by: David Wang <00107082(a)163.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-mem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 81eaad87a3d9..c4a6544aa107 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -962,7 +962,7 @@ static void xhci_free_virt_devices_depth_first(struct xhci_hcd *xhci, int slot_i
out:
/* we are now at a leaf device */
xhci_debugfs_remove_slot(xhci, slot_id);
- xhci_free_virt_device(xhci, vdev, slot_id);
+ xhci_free_virt_device(xhci, xhci->devs[slot_id], slot_id);
}
int xhci_alloc_virt_device(struct xhci_hcd *xhci, int slot_id,
--
2.43.0
Pending requests will be flushed on disconnect, and the corresponding
TRBs will be turned into No-op TRBs, which are ignored by the xHC
controller once it starts processing the ring.
If the USB debug cable repeatedly disconnects before ring is started
then the ring will eventually be filled with No-op TRBs.
No new transfers can be queued when the ring is full, and driver will
print the following error message:
"xhci_hcd 0000:00:14.0: failed to queue trbs"
This is a normal case for 'in' transfers where TRBs are always enqueued
in advance, ready to take on incoming data. If no data arrives and the
device is disconnected, then the ring dequeue pointer will remain at the
beginning of the ring while enqueue points to the first free TRB after
the last cancelled No-op TRB.
Solve this by reinitializing the rings when the debug cable disconnects
and DbC is leaving the configured state.
Clear the whole ring buffer and set enqueue and dequeue to the beginning
of ring, and set cycle bit to its initial state.
Cc: stable(a)vger.kernel.org
Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver")
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-dbgcap.c | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/host/xhci-dbgcap.c b/drivers/usb/host/xhci-dbgcap.c
index d0faff233e3e..63edf2d8f245 100644
--- a/drivers/usb/host/xhci-dbgcap.c
+++ b/drivers/usb/host/xhci-dbgcap.c
@@ -462,6 +462,25 @@ static void xhci_dbc_ring_init(struct xhci_ring *ring)
xhci_initialize_ring_info(ring);
}
+static int xhci_dbc_reinit_ep_rings(struct xhci_dbc *dbc)
+{
+ struct xhci_ring *in_ring = dbc->eps[BULK_IN].ring;
+ struct xhci_ring *out_ring = dbc->eps[BULK_OUT].ring;
+
+ if (!in_ring || !out_ring || !dbc->ctx) {
+ dev_warn(dbc->dev, "Can't re-init unallocated endpoints\n");
+ return -ENODEV;
+ }
+
+ xhci_dbc_ring_init(in_ring);
+ xhci_dbc_ring_init(out_ring);
+
+ /* set ep context enqueue, dequeue, and cycle to initial values */
+ xhci_dbc_init_ep_contexts(dbc);
+
+ return 0;
+}
+
static struct xhci_ring *
xhci_dbc_ring_alloc(struct device *dev, enum xhci_ring_type type, gfp_t flags)
{
@@ -885,7 +904,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
dev_info(dbc->dev, "DbC cable unplugged\n");
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
@@ -895,7 +914,7 @@ static enum evtreturn xhci_dbc_do_handle_events(struct xhci_dbc *dbc)
writel(portsc, &dbc->regs->portsc);
dbc->state = DS_ENABLED;
xhci_dbc_flush_requests(dbc);
-
+ xhci_dbc_reinit_ep_rings(dbc);
return EVT_DISC;
}
--
2.43.0
Testing has shown that reading multiple registers at once (for 10-bit
ADC values) does not work. Set the use_single_read regmap_config flag
to make regmap split these for us.
This should fix temperature opregion accesses done by
drivers/acpi/pmic/intel_pmic_chtdc_ti.c and is also necessary for
the upcoming drivers for the ADC and battery MFD cells.
Fixes: 6bac0606fdba ("mfd: Add support for Cherry Trail Dollar Cove TI PMIC")
Cc: stable(a)vger.kernel.org
Reviewed-by: Andy Shevchenko <andy(a)kernel.org>
Signed-off-by: Hans de Goede <hansg(a)kernel.org>
---
Changes in v3:
- Fix a few typos in the commit message
Changes in v2:
- Update comment to: "The hardware does not support reading multiple
registers at once"
---
drivers/mfd/intel_soc_pmic_chtdc_ti.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/mfd/intel_soc_pmic_chtdc_ti.c b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
index 4c1a68c9f575..6daf33e07ea0 100644
--- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c
+++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c
@@ -82,6 +82,8 @@ static const struct regmap_config chtdc_ti_regmap_config = {
.reg_bits = 8,
.val_bits = 8,
.max_register = 0xff,
+ /* The hardware does not support reading multiple registers at once */
+ .use_single_read = true,
};
static const struct regmap_irq chtdc_ti_irqs[] = {
--
2.49.0
When the host actively triggers SSR and collects coredump data,
the Bluetooth stack sends a reset command to the controller. However, due
to the inability to clear the QCA_SSR_TRIGGERED and QCA_IBS_DISABLED bits,
the reset command times out.
To address this, this patch clears the QCA_SSR_TRIGGERED and
QCA_IBS_DISABLED flags and adds a 50ms delay after SSR, but only when
HCI_QUIRK_NON_PERSISTENT_SETUP is not set. This ensures the controller
completes the SSR process when BT_EN is always high due to hardware.
For the purpose of HCI_QUIRK_NON_PERSISTENT_SETUP, please refer to
the comment in `include/net/bluetooth/hci.h`.
The HCI_QUIRK_NON_PERSISTENT_SETUP quirk is associated with BT_EN,
and its presence can be used to determine whether BT_EN is defined in DTS.
After SSR, host will not download the firmware, causing
controller to remain in the IBS_WAKE state. Host needs
to synchronize with the controller to maintain proper operation.
Multiple triggers of SSR only first generate coredump file,
due to memcoredump_flag no clear.
add clear coredump flag when ssr completed.
When the SSR duration exceeds 2 seconds, it triggers
host tx_idle_timeout, which sets host TX state to sleep. due to the
hardware pulling up bt_en, the firmware is not downloaded after the SSR.
As a result, the controller does not enter sleep mode. Consequently,
when the host sends a command afterward, it sends 0xFD to the controller,
but the controller does not respond, leading to a command timeout.
So reset tx_idle_timer after SSR to prevent host enter TX IBS_Sleep mode.
Changes since v6-7:
- Merge the changes into a single patch.
- Update commit.
Changes since v1-5:
- Add an explanation for HCI_QUIRK_NON_PERSISTENT_SETUP.
- Add commments for msleep(50).
- Update format and commit.
Signed-off-by: Shuai Zhang <quic_shuaz(a)quicinc.com>
---
drivers/bluetooth/hci_qca.c | 33 +++++++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index 4e56782b0..9dc59b002 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -1653,6 +1653,39 @@ static void qca_hw_error(struct hci_dev *hdev, u8 code)
skb_queue_purge(&qca->rx_memdump_q);
}
+ /*
+ * If the BT chip's bt_en pin is connected to a 3.3V power supply via
+ * hardware and always stays high, driver cannot control the bt_en pin.
+ * As a result, during SSR (SubSystem Restart), QCA_SSR_TRIGGERED and
+ * QCA_IBS_DISABLED flags cannot be cleared, which leads to a reset
+ * command timeout.
+ * Add an msleep delay to ensure controller completes the SSR process.
+ *
+ * Host will not download the firmware after SSR, controller to remain
+ * in the IBS_WAKE state, and the host needs to synchronize with it
+ *
+ * Since the bluetooth chip has been reset, clear the memdump state.
+ */
+ if (!test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
+ /*
+ * When the SSR (SubSystem Restart) duration exceeds 2 seconds,
+ * it triggers host tx_idle_delay, which sets host TX state
+ * to sleep. Reset tx_idle_timer after SSR to prevent
+ * host enter TX IBS_Sleep mode.
+ */
+ mod_timer(&qca->tx_idle_timer, jiffies +
+ msecs_to_jiffies(qca->tx_idle_delay));
+
+ /* Controller reset completion time is 50ms */
+ msleep(50);
+
+ clear_bit(QCA_SSR_TRIGGERED, &qca->flags);
+ clear_bit(QCA_IBS_DISABLED, &qca->flags);
+
+ qca->tx_ibs_state = HCI_IBS_TX_AWAKE;
+ qca->memdump_state = QCA_MEMDUMP_IDLE;
+ }
+
clear_bit(QCA_HW_ERROR_EVENT, &qca->flags);
}
--
2.34.1
From: Conor Dooley <conor.dooley(a)microchip.com>
In commit 13529647743d9 ("spi: microchip-core-qspi: Support per spi-mem
operation frequency switches") the logic for checking the viability of
op->max_freq in mchp_coreqspi_setup_clock() was copied into
mchp_coreqspi_supports_op(). Unfortunately, op->max_freq is not valid
when this function is called during probe but is instead zero.
Accordingly, baud_rate_val is calculated to be INT_MAX due to division
by zero, causing probe of the attached memory device to fail.
Seemingly spi-microchip-core-qspi was the only driver that had such a
modification made to its supports_op callback when the per_op_freq
capability was added, so just remove it to restore prior functionality.
CC: stable(a)vger.kernel.org
Reported-by: Valentina Fernandez <valentina.fernandezalanis(a)microchip.com>
Fixes: 13529647743d9 ("spi: microchip-core-qspi: Support per spi-mem operation frequency switches")
Signed-off-by: Conor Dooley <conor.dooley(a)microchip.com>
---
CC: Conor Dooley <conor.dooley(a)microchip.com>
CC: Daire McNamara <daire.mcnamara(a)microchip.com>
CC: Mark Brown <broonie(a)kernel.org>
CC: Miquel Raynal <miquel.raynal(a)bootlin.com>
CC: linux-spi(a)vger.kernel.org
CC: linux-kernel(a)vger.kernel.org
---
drivers/spi/spi-microchip-core-qspi.c | 12 ------------
1 file changed, 12 deletions(-)
diff --git a/drivers/spi/spi-microchip-core-qspi.c b/drivers/spi/spi-microchip-core-qspi.c
index d13a9b755c7f8..8dc98b17f77b5 100644
--- a/drivers/spi/spi-microchip-core-qspi.c
+++ b/drivers/spi/spi-microchip-core-qspi.c
@@ -531,10 +531,6 @@ static int mchp_coreqspi_exec_op(struct spi_mem *mem, const struct spi_mem_op *o
static bool mchp_coreqspi_supports_op(struct spi_mem *mem, const struct spi_mem_op *op)
{
- struct mchp_coreqspi *qspi = spi_controller_get_devdata(mem->spi->controller);
- unsigned long clk_hz;
- u32 baud_rate_val;
-
if (!spi_mem_default_supports_op(mem, op))
return false;
@@ -557,14 +553,6 @@ static bool mchp_coreqspi_supports_op(struct spi_mem *mem, const struct spi_mem_
return false;
}
- clk_hz = clk_get_rate(qspi->clk);
- if (!clk_hz)
- return false;
-
- baud_rate_val = DIV_ROUND_UP(clk_hz, 2 * op->max_freq);
- if (baud_rate_val > MAX_DIVIDER || baud_rate_val < MIN_DIVIDER)
- return false;
-
return true;
}
--
2.47.2
commit e3f1164fc9ee ("PM: EM: Support late CPUs booting and capacity
adjustment") added a mechanism to handle CPUs that come up late by
retrying when any of the `cpufreq_cpu_get()` call fails.
However, if there are holes in the CPU topology (offline CPUs, e.g.
nosmt), the first missing CPU causes the loop to break, preventing
subsequent online CPUs from being updated.
Instead of aborting on the first missing CPU policy, loop through all
and retry if any were missing.
Fixes: e3f1164fc9ee ("PM: EM: Support late CPUs booting and capacity adjustment")
Suggested-by: Kenneth Crudup <kenneth.crudup(a)gmail.com>
Reported-by: Kenneth Crudup <kenneth.crudup(a)gmail.com>
Closes: https://lore.kernel.org/linux-pm/40212796-734c-4140-8a85-854f72b8144d@panix…
Cc: stable(a)vger.kernel.org
Signed-off-by: Christian Loehle <christian.loehle(a)arm.com>
---
kernel/power/energy_model.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
index ea7995a25780..b63c2afc1379 100644
--- a/kernel/power/energy_model.c
+++ b/kernel/power/energy_model.c
@@ -778,7 +778,7 @@ void em_adjust_cpu_capacity(unsigned int cpu)
static void em_check_capacity_update(void)
{
cpumask_var_t cpu_done_mask;
- int cpu;
+ int cpu, failed_cpus = 0;
if (!zalloc_cpumask_var(&cpu_done_mask, GFP_KERNEL)) {
pr_warn("no free memory\n");
@@ -796,10 +796,8 @@ static void em_check_capacity_update(void)
policy = cpufreq_cpu_get(cpu);
if (!policy) {
- pr_debug("Accessing cpu%d policy failed\n", cpu);
- schedule_delayed_work(&em_update_work,
- msecs_to_jiffies(1000));
- break;
+ failed_cpus++;
+ continue;
}
cpufreq_cpu_put(policy);
@@ -814,6 +812,11 @@ static void em_check_capacity_update(void)
em_adjust_new_capacity(cpu, dev, pd);
}
+ if (failed_cpus) {
+ pr_debug("Accessing %d policies failed, retrying\n", failed_cpus);
+ schedule_delayed_work(&em_update_work, msecs_to_jiffies(1000));
+ }
+
free_cpumask_var(cpu_done_mask);
}
--
2.34.1
From: Stanislav Fort <stanislav.fort(a)aisle.com>
batadv_nc_skb_decode_packet() trusts coded_len and checks only against
skb->len. XOR starts at sizeof(struct batadv_unicast_packet), reducing
payload headroom, and the source skb length is not verified, allowing an
out-of-bounds read and a small out-of-bounds write.
Validate that coded_len fits within the payload area of both destination
and source sk_buffs before XORing.
Fixes: 2df5278b0267 ("batman-adv: network coding - receive coded packets and decode them")
Cc: stable(a)vger.kernel.org
Reported-by: Stanislav Fort <disclosure(a)aisle.com>
Signed-off-by: Stanislav Fort <stanislav.fort(a)aisle.com>
Signed-off-by: Sven Eckelmann <sven(a)narfation.org>
Signed-off-by: Simon Wunderlich <sw(a)simonwunderlich.de>
---
net/batman-adv/network-coding.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/batman-adv/network-coding.c b/net/batman-adv/network-coding.c
index 9f56308779cc3..af97d077369f9 100644
--- a/net/batman-adv/network-coding.c
+++ b/net/batman-adv/network-coding.c
@@ -1687,7 +1687,12 @@ batadv_nc_skb_decode_packet(struct batadv_priv *bat_priv, struct sk_buff *skb,
coding_len = ntohs(coded_packet_tmp.coded_len);
- if (coding_len > skb->len)
+ /* ensure dst buffer is large enough (payload only) */
+ if (coding_len + h_size > skb->len)
+ return NULL;
+
+ /* ensure src buffer is large enough (payload only) */
+ if (coding_len + h_size > nc_packet->skb->len)
return NULL;
/* Here the magic is reversed:
--
2.47.2
From: gyutrange <wlsrbwjd643(a)naver.com>
VNCR/TLBI VA reconstruction currently uses bit 48 as the sign bit,
but for 48-bit virtual addresses the correct sign bit is bit 47.
Using 48 can mis-canonicalize addresses in the negative half and may
cause missed invalidations.
Although VNCR_EL2 encodes other architectural fields (RESS, BADDR;
see Arm ARM D24.2.206), sign_extend64() interprets its second argument
as the index of the sign bit. Passing 48 prevents propagation of the
canonical sign bit for 48-bit VAs.
Impact:
- Incorrect canonicalization of VAs with bit47=1
- Potential stale VNCR pseudo-TLB entries after TLBI or MMU notifier
- Possible incorrect translation/permissions or DoS when combined
with other issues
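To make the off-by-one concrete (a minimal sketch, not part of the patch):
sign_extend64() takes the 0-based index of the sign bit, and bit 47 is the
top bit of a 48-bit VA.
/* Minimal sketch: assumes <linux/bitops.h> for sign_extend64(). */
static void vncr_sign_bit_demo(void)
{
        u64 va = 0x0000800000000000ULL;         /* bit 47 set: negative-half 48-bit VA */

        u64 good = sign_extend64(va, 47);       /* -> 0xffff800000000000, canonical */
        u64 bad  = sign_extend64(va, 48);       /* bit 48 is 0, stays 0x0000800000000000 */

        (void)good;
        (void)bad;
}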
Fixes: 667304740537 ("KVM: arm64: Mask out non-VA bits from TLBI VA* on VNCR invalidation")
Cc: stable(a)vger.kernel.org
Reported-by: DongHa Lee <gap-dev(a)example.com>
Reported-by: Gyujeong Jin <wlsrbwjd7232(a)gmail.com>
Reported-by: Daehyeon Ko <4ncient(a)example.com>
Reported-by: Geonha Lee <leegn4a(a)example.com>
Reported-by: Hyungyu Oh <dqpc_lover(a)example.com>
Reported-by: Jaewon Yang <r4mbb1(a)example.com>
Signed-off-by: Gyujeong Jin <wlsrbwjd7232(a)gmail.com>
---
arch/arm64/kvm/nested.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 77db81bae86f..eaa6dd9da086 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1169,7 +1169,7 @@ int kvm_vcpu_allocate_vncr_tlb(struct kvm_vcpu *vcpu)
static u64 read_vncr_el2(struct kvm_vcpu *vcpu)
{
- return (u64)sign_extend64(__vcpu_sys_reg(vcpu, VNCR_EL2), 48);
+ return (u64)sign_extend64(__vcpu_sys_reg(vcpu, VNCR_EL2), 47);
}
static int kvm_translate_vncr(struct kvm_vcpu *vcpu)
--
2.43.0
Correct the address in usb controller node to fix the following warning:
Warning (simple_bus_reg): /soc@0/usb@a6f8800: simple-bus unit address
format error, expected "a600000"
Fixes: c5a87e3a6b3e ("arm64: dts: qcom: sm8450: Flatten usb controller node")
Cc: stable(a)vger.kernel.org
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202508121834.953Mvah2-lkp@intel.com/
Signed-off-by: Krishna Kurapati <krishna.kurapati(a)oss.qualcomm.com>
---
This change was tested with W=1 and the reported issue is not seen.
Also didn't add RB Tag received from Neil Armstrong since there is a
change in commit text. This change is based on top of latest linux next.
Changes in v2:
Fixed the fixes tag.
Link to v1:
https://lore.kernel.org/all/20250813063840.2158792-1-krishna.kurapati@oss.q…
arch/arm64/boot/dts/qcom/sm8450.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
index 2baef6869ed7..38c91c3ec787 100644
--- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
@@ -5417,7 +5417,7 @@ opp-202000000 {
};
};
- usb_1: usb@a6f8800 {
+ usb_1: usb@a600000 {
compatible = "qcom,sm8450-dwc3", "qcom,snps-dwc3";
reg = <0 0x0a600000 0 0xfc100>;
status = "disabled";
--
2.34.1
There is a long standing bug which causes I2C communication not to
work on the Armada 3700 based boards. The first patch in the series
fixes that regression. The second patch improves recovery to make it
more robust which helps to avoid communication problems with certain
SFP modules.
Signed-off-by: Gabor Juhos <j4g8y7(a)gmail.com>
---
Changes in v3:
- rebase on tip of i2c/for-current
- remove Imre's tag from the cover letter, and replace his SoB tag to
Reviewed-by in the individual patches
- rework the second patch so it does not need changes in the I2C core code,
and drop the first one as it is not needed now
- Link to v2: https://lore.kernel.org/r/20250811-i2c-pxa-fix-i2c-communication-v2-0-ca42e…
Changes in v2:
- collect offered tags
- rebase and retest on tip of i2c/for-current
- Link to v1: https://lore.kernel.org/r/20250511-i2c-pxa-fix-i2c-communication-v1-0-e9097…
---
Gabor Juhos (2):
i2c: pxa: defer reset on Armada 3700 when recovery is used
i2c: pxa: handle 'Early Bus Busy' condition on Armada 3700
drivers/i2c/busses/i2c-pxa.c | 35 ++++++++++++++++++++++++++++-------
1 file changed, 28 insertions(+), 7 deletions(-)
---
base-commit: 3dd22078026c7cad4d4a3f32c5dc5452c7180de8
change-id: 20250510-i2c-pxa-fix-i2c-communication-3e6de1e3d0c6
Best regards,
--
Gabor Juhos <j4g8y7(a)gmail.com>
Commit under Fixes added support to build the driver as a loadable
module. However, it did not add MODULE_DEVICE_TABLE() which is required
for autoloading the driver when it is built as a loadable module.
Fix it.
Fixes: a2790bf81f0f ("PCI: j721e: Add support to build as a loadable module")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Siddharth Vadapalli <s-vadapalli(a)ti.com>
---
Hello,
This patch is based on commit
b320789d6883 Linux 6.17-rc4
of Mainline Linux.
Patch has been tested on J7200-EVM with the driver configs set to 'm'.
The pci-j721e.c driver is autoloaded as Linux boots and it doesn't need
to be manually probed. Logs:
https://gist.github.com/Siddharth-Vadapalli-at-TI/f64eed4df6fd4f714747b3c7e…
Regards,
Siddharth.
drivers/pci/controller/cadence/pci-j721e.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/controller/cadence/pci-j721e.c b/drivers/pci/controller/cadence/pci-j721e.c
index 6c93f39d0288..cfca13a4c840 100644
--- a/drivers/pci/controller/cadence/pci-j721e.c
+++ b/drivers/pci/controller/cadence/pci-j721e.c
@@ -440,6 +440,7 @@ static const struct of_device_id of_j721e_pcie_match[] = {
},
{},
};
+MODULE_DEVICE_TABLE(of, of_j721e_pcie_match);
static int j721e_pcie_probe(struct platform_device *pdev)
{
--
2.43.0
Hi maintainers,
Please consider backporting the following patches to the stable trees.
These patches fix a significant reading issue with mcp2221 on i2c eeprom.
I have confirmed that the patches applie cleanly and build successfully
against v6.6, v6.1, v5.15 and v5.10 stable branches.
I will later come back to you with other patches that might need larger
backport changes.
Thanks,
Romain
Hamish Martin (2):
HID: mcp2221: Don't set bus speed on every transfer
HID: mcp2221: Handle reads greater than 60 bytes
drivers/hid/hid-mcp2221.c | 71 +++++++++++++++++++++++++++------------
1 file changed, 49 insertions(+), 22 deletions(-)
--
2.48.1
Hello,
I would like to request backporting the following patch to 5.10 series:
590cfb082837cc6c0c595adf1711330197c86a58
ASoC: Intel: sof_rt5682: shrink platform_id names below 20 characters
The patch seems to be already present in 5.15 and newer branches, and
its absence seems to be causing an out-of-bounds read. I've hit it in the
wild while trying to install 5.10.241 on i686:
sh /var/tmp/portage/sys-kernel/gentoo-kernel-5.10.241/work/linux-5.10/scripts/depmod.sh depmod 5.10.241-gentoo-dist
depmod: FATAL: Module index: bad character '�'=0x80 - only 7-bit ASCII is supported:
platform:jsl_rt5682_max98360ax�
TIA.
--
Best regards,
Michał Górny
Support for parsing the topology on AMD/Hygon processors using CPUID
leaf 0xb was added in commit 3986a0a805e6 ("x86/CPU/AMD: Derive CPU
topology from CPUID function 0xB when available"). In an effort to keep
all the topology parsing bits in one place, this commit also introduced
a pseudo dependency on the TOPOEXT feature to parse the CPUID leaf 0xb.
TOPOEXT feature (CPUID 0x80000001 ECX[22]) advertises the support for
Cache Properties leaf 0x8000001d and the CPUID leaf 0x8000001e EAX for
"Extended APIC ID" however support for 0xb was introduced alongside the
x2APIC support not only on AMD [1], but also historically on x86 [2].
Similar to 0xb, the support for extended CPU topology leaf 0x80000026
too does not depend on the TOPOEXT feature.
The support for these leaves is expected to be confirmed by ensuring
"leaf <= {extended_}cpuid_level" and then parsing the level 0 of the
respective leaf to confirm EBX[15:0] (LogProcAtThisLevel) is non-zero as
stated in the definition of "CPUID_Fn0000000B_EAX_x00 [Extended Topology
Enumeration] (Core::X86::Cpuid::ExtTopEnumEax0)" in Processor
Programming Reference (PPR) for AMD Family 19h Model 01h Rev B1 Vol1 [3]
Sec. 2.1.15.1 "CPUID Instruction Functions".
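As an illustrative sketch of that presence check (hypothetical helper, not
the code being changed here): leaf 0xb is usable when it lies within the
reported CPUID level and subleaf 0 reports a non-zero EBX[15:0].
/* Sketch only: the real check is done in the common
 * cpu_parse_topology_ext() path, not in a helper like this.
 */
static bool leaf_0xb_usable(struct cpuinfo_x86 *c)
{
        unsigned int eax, ebx, ecx, edx;

        if (c->cpuid_level < 0x0b)
                return false;

        cpuid_count(0x0b, 0, &eax, &ebx, &ecx, &edx);

        /* EBX[15:0] is LogProcAtThisLevel; zero means the leaf is not
         * implemented even though the level is reachable.
         */
        return (ebx & 0xffff) != 0;
}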
This has not been a problem on baremetal platforms since support for
TOPOEXT (Fam 0x15 and later) predates the support for CPUID leaf 0xb
(Fam 0x17[Zen2] and later), however, for AMD guests on QEMU, "x2apic"
feature can be enabled independent of the "topoext" feature where QEMU
expects topology and the initial APICID to be parsed using the CPUID
leaf 0xb (especially when number of cores > 255) which is populated
independent of the "topoext" feature flag.
Unconditionally call cpu_parse_topology_ext() on AMD and Hygon
processors to first parse the topology using the XTOPOLOGY leaves
(0x80000026 / 0xb) before using the TOPOEXT leaf (0x8000001e).
Cc: stable(a)vger.kernel.org # Only v6.9 and above
Link: https://lore.kernel.org/lkml/1529686927-7665-1-git-send-email-suravee.suthi… [1]
Link: https://lore.kernel.org/lkml/20080818181435.523309000@linux-os.sc.intel.com/ [2]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 [3]
Suggested-by: Naveen N Rao (AMD) <naveen(a)kernel.org>
Fixes: 3986a0a805e6 ("x86/CPU/AMD: Derive CPU topology from CPUID function 0xB when available")
Signed-off-by: K Prateek Nayak <kprateek.nayak(a)amd.com>
---
Changelog v3..v4:
o Quoted relevant section of the PPR justifying the changes.
o Moved this patch up ahead.
o Cc'd stable and made a note that backports should target v6.9 and
above since this depends on the x86 topology rewrite.
---
arch/x86/kernel/cpu/topology_amd.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
index 827dd0dbb6e9..4e3134a5550c 100644
--- a/arch/x86/kernel/cpu/topology_amd.c
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -175,18 +175,14 @@ static void topoext_fixup(struct topo_scan *tscan)
static void parse_topology_amd(struct topo_scan *tscan)
{
- bool has_topoext = false;
-
/*
- * If the extended topology leaf 0x8000_001e is available
- * try to get SMT, CORE, TILE, and DIE shifts from extended
+ * Try to get SMT, CORE, TILE, and DIE shifts from extended
* CPUID leaf 0x8000_0026 on supported processors first. If
* extended CPUID leaf 0x8000_0026 is not supported, try to
* get SMT and CORE shift from leaf 0xb first, then try to
* get the CORE shift from leaf 0x8000_0008.
*/
- if (cpu_feature_enabled(X86_FEATURE_TOPOEXT))
- has_topoext = cpu_parse_topology_ext(tscan);
+ bool has_topoext = cpu_parse_topology_ext(tscan);
if (cpu_feature_enabled(X86_FEATURE_AMD_HTR_CORES))
tscan->c->topo.cpu_type = cpuid_ebx(0x80000026);
--
2.34.1
This series adds support for the clone3 system call to the nios2
architecture. This addresses the build-time warning "warning: clone3()
entry point is missing, please fix" introduced in 505d66d1abfb9
("clone3: drop __ARCH_WANT_SYS_CLONE3 macro"). The implementation passes
the relevant clone3 tests of kselftest when applied on top of
next-20250815:
./run_kselftest.sh
TAP version 13
1..4
# selftests: clone3: clone3
ok 1 selftests: clone3: clone3
# selftests: clone3: clone3_clear_sighand
ok 2 selftests: clone3: clone3_clear_sighand
# selftests: clone3: clone3_set_tid
ok 3 selftests: clone3: clone3_set_tid
# selftests: clone3: clone3_cap_checkpoint_restore
ok 4 selftests: clone3: clone3_cap_checkpoint_restore
The series also includes a small patch to kernel/fork.c that ensures
that clone_flags are passed correctly on architectures where unsigned
long is insufficient to store the u64 clone_flags. It is marked as a fix
for stable backporting.
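A minimal sketch of the truncation problem on 32-bit targets (hypothetical
helper names, for illustration only): any flag above bit 31, such as
CLONE_CLEAR_SIGHAND, is silently dropped when a u64 is narrowed to a 32-bit
unsigned long.
/* Assumes CLONE_CLEAR_SIGHAND (uapi sched.h) and struct kernel_clone_args
 * (linux/sched/task.h). Hypothetical callee still taking unsigned long:
 */
static bool wants_clear_sighand(unsigned long clone_flags)
{
        /* On 32-bit, CLONE_CLEAR_SIGHAND (0x100000000ULL, bit 32) has
         * already been truncated away, so this can never be true.
         */
        return clone_flags & CLONE_CLEAR_SIGHAND;
}

static bool demo(const struct kernel_clone_args *args)
{
        /* args->flags is u64; the implicit conversion drops the upper
         * 32 bits when unsigned long is 32 bits wide.
         */
        return wants_clear_sighand(args->flags);
}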
As requested, in v2, this series now further tries to correct this type
error throughout the whole code base. Thus, it now touches a larger
number of subsystems and all architectures.
Therefore, another test was performed for ARCH=x86_64 (as a
representative for 64-bit architectures). Here, the series builds cleanly
without warnings on defconfig with CONFIG_SECURITY_APPARMOR=y and
CONFIG_SECURITY_TOMOYO=y (to compile-check the LSM-related changes).
The resulting build also passes testing/selftests/clone3 (with the
patch from 20241105062948.1037011-1-zhouyuhang1010(a)163.com to make
clone3_cap_checkpoint_restore compatible with the newer libcap
version on my system).
Is there any option to further preflight check this patch series via
lkp/KernelCI/etc. for a broader test across architectures, or is this
degree of testing sufficient to eventually get the series merged?
N.B.: The series is not checkpatch clean right now:
- include/linux/cred.h, include/linux/mnt_namespace.h:
function definition arguments without identifier name
- include/trace/events/task.h:
space prohibited after that open parenthesis
I did not fix these warnings to keep my changes minimal and reviewable,
as the issues persist throughout the files and they were not introduced
by me; I only followed the existing code style and just replaced the
types. If desired, I'd be happy to make the changes in a potential v3,
though.
Signed-off-by: Simon Schuster <schuster.simon(a)siemens-energy.com>
---
Changes in v2:
- Introduce "Fixes:" and "Cc: stable(a)vger.kernel.org" where necessary
- Factor out "Fixes:" when adapting the datatype of clone_flags for
easier backports
- Fix additional instances where `unsigned long` clone_flags is used
- Reword commit message to make it clearer that any 32-bit arch is
affected by this bug
- Link to v1: https://lore.kernel.org/r/20250821-nios2-implement-clone3-v1-0-1bb24017376a…
---
Simon Schuster (4):
copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64)
copy_process: pass clone_flags as u64 across calltree
arch: copy_thread: pass clone_flags as u64
nios2: implement architecture-specific portion of sys_clone3
arch/alpha/kernel/process.c | 2 +-
arch/arc/kernel/process.c | 2 +-
arch/arm/kernel/process.c | 2 +-
arch/arm64/kernel/process.c | 2 +-
arch/csky/kernel/process.c | 2 +-
arch/hexagon/kernel/process.c | 2 +-
arch/loongarch/kernel/process.c | 2 +-
arch/m68k/kernel/process.c | 2 +-
arch/microblaze/kernel/process.c | 2 +-
arch/mips/kernel/process.c | 2 +-
arch/nios2/include/asm/syscalls.h | 1 +
arch/nios2/include/asm/unistd.h | 2 --
arch/nios2/kernel/entry.S | 6 ++++++
arch/nios2/kernel/process.c | 2 +-
arch/nios2/kernel/syscall_table.c | 1 +
arch/openrisc/kernel/process.c | 2 +-
arch/parisc/kernel/process.c | 2 +-
arch/powerpc/kernel/process.c | 2 +-
arch/riscv/kernel/process.c | 2 +-
arch/s390/kernel/process.c | 2 +-
arch/sh/kernel/process_32.c | 2 +-
arch/sparc/kernel/process_32.c | 2 +-
arch/sparc/kernel/process_64.c | 2 +-
arch/um/kernel/process.c | 2 +-
arch/x86/include/asm/fpu/sched.h | 2 +-
arch/x86/include/asm/shstk.h | 4 ++--
arch/x86/kernel/fpu/core.c | 2 +-
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/shstk.c | 2 +-
arch/xtensa/kernel/process.c | 2 +-
block/blk-ioc.c | 2 +-
fs/namespace.c | 2 +-
include/linux/cgroup.h | 4 ++--
include/linux/cred.h | 2 +-
include/linux/iocontext.h | 6 +++---
include/linux/ipc_namespace.h | 4 ++--
include/linux/lsm_hook_defs.h | 2 +-
include/linux/mnt_namespace.h | 2 +-
include/linux/nsproxy.h | 2 +-
include/linux/pid_namespace.h | 4 ++--
include/linux/rseq.h | 4 ++--
include/linux/sched/task.h | 2 +-
include/linux/security.h | 4 ++--
include/linux/sem.h | 4 ++--
include/linux/time_namespace.h | 4 ++--
include/linux/uprobes.h | 4 ++--
include/linux/user_events.h | 4 ++--
include/linux/utsname.h | 4 ++--
include/net/net_namespace.h | 4 ++--
include/trace/events/task.h | 6 +++---
ipc/namespace.c | 2 +-
ipc/sem.c | 2 +-
kernel/cgroup/namespace.c | 2 +-
kernel/cred.c | 2 +-
kernel/events/uprobes.c | 2 +-
kernel/fork.c | 10 +++++-----
kernel/nsproxy.c | 4 ++--
kernel/pid_namespace.c | 2 +-
kernel/sched/core.c | 4 ++--
kernel/sched/fair.c | 2 +-
kernel/sched/sched.h | 4 ++--
kernel/time/namespace.c | 2 +-
kernel/utsname.c | 2 +-
net/core/net_namespace.c | 2 +-
security/apparmor/lsm.c | 2 +-
security/security.c | 2 +-
security/selinux/hooks.c | 2 +-
security/tomoyo/tomoyo.c | 2 +-
68 files changed, 95 insertions(+), 89 deletions(-)
---
base-commit: 1357b2649c026b51353c84ddd32bc963e8999603
change-id: 20250818-nios2-implement-clone3-7f252c20860b
Best regards,
--
Simon Schuster <schuster.simon(a)siemens-energy.com>
Hello,
Henrik notified me that UDF in the 5.15 kernel is failing for him, likely
because commit 1ea1cd11c72d ("udf: Fix directory iteration for longer tail
extents") is missing from the 5.15 stable tree. Basically, wherever
d16076d9b684b got backported, 1ea1cd11c72d needs to be backported as well
(I'm sorry for forgetting to add a proper Fixes tag back then).
Honza
--
Jan Kara <jack(a)suse.com>
SUSE Labs, CR
When the xe pm_notifier evicts for suspend / hibernate, there might be
racing tasks trying to re-validate again. This can lead to suspend taking
excessive time or getting stuck in a live-lock. This behaviour becomes
much worse with the fix that actually makes re-validation bring back
bos to VRAM rather than letting them remain in TT.
Prevent that by having exec and the rebind worker wait for a completion
that the pm_notifier blocks (reinitializes) before suspend and signals
after resume / wakeup.
It's probably still possible to craft malicious applications that block
suspending. More work is pending to fix that.
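As a minimal sketch of the completion-gating pattern (simplified names, not
the actual driver code; the real changes are in the diff below):
#include <linux/completion.h>

static struct completion pm_block;

static void example_init(void)
{
	init_completion(&pm_block);
	complete_all(&pm_block);	/* runtime: validation allowed */
}

static void example_suspend_prepare(void)
{
	reinit_completion(&pm_block);	/* block new validations, then evict */
}

static void example_resume(void)
{
	complete_all(&pm_block);	/* let exec / rebind proceed again */
}

static int example_exec_path(void)
{
	/* -ERESTARTSYS here makes the IOCTL be rerun after the task thaws. */
	return wait_for_completion_interruptible(&pm_block);
}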
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4288
Fixes: c6a4d46ec1d7 ("drm/xe: evict user memory in PM notifier")
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.16+
Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
---
drivers/gpu/drm/xe/xe_device_types.h | 2 ++
drivers/gpu/drm/xe/xe_exec.c | 9 +++++++++
drivers/gpu/drm/xe/xe_pm.c | 4 ++++
drivers/gpu/drm/xe/xe_vm.c | 14 ++++++++++++++
4 files changed, 29 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 092004d14db2..6602bd678cbc 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -507,6 +507,8 @@ struct xe_device {
/** @pm_notifier: Our PM notifier to perform actions in response to various PM events. */
struct notifier_block pm_notifier;
+ /** @pm_block: Completion to block validating tasks on suspend / hibernate prepare */
+ struct completion pm_block;
/** @pmt: Support the PMT driver callback interface */
struct {
diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
index 44364c042ad7..374c831e691b 100644
--- a/drivers/gpu/drm/xe/xe_exec.c
+++ b/drivers/gpu/drm/xe/xe_exec.c
@@ -237,6 +237,15 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
goto err_unlock_list;
}
+ /*
+ * It's OK to block interruptible here with the vm lock held, since
+ * on task freezing during suspend / hibernate, the call will
+ * return -ERESTARTSYS and the IOCTL will be rerun.
+ */
+ err = wait_for_completion_interruptible(&xe->pm_block);
+ if (err)
+ goto err_unlock_list;
+
vm_exec.vm = &vm->gpuvm;
vm_exec.flags = DRM_EXEC_INTERRUPTIBLE_WAIT;
if (xe_vm_in_lr_mode(vm)) {
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index bee9aacd82e7..f4c7a84270ea 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -306,6 +306,7 @@ static int xe_pm_notifier_callback(struct notifier_block *nb,
switch (action) {
case PM_HIBERNATION_PREPARE:
case PM_SUSPEND_PREPARE:
+ reinit_completion(&xe->pm_block);
xe_pm_runtime_get(xe);
err = xe_bo_evict_all_user(xe);
if (err)
@@ -322,6 +323,7 @@ static int xe_pm_notifier_callback(struct notifier_block *nb,
break;
case PM_POST_HIBERNATION:
case PM_POST_SUSPEND:
+ complete_all(&xe->pm_block);
xe_bo_notifier_unprepare_all_pinned(xe);
xe_pm_runtime_put(xe);
break;
@@ -348,6 +350,8 @@ int xe_pm_init(struct xe_device *xe)
if (err)
return err;
+ init_completion(&xe->pm_block);
+ complete_all(&xe->pm_block);
/* For now suspend/resume is only allowed with GuC */
if (!xe_device_uc_enabled(xe))
return 0;
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 3ff3c67aa79d..edcdb0528f2a 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -394,6 +394,9 @@ static int xe_gpuvm_validate(struct drm_gpuvm_bo *vm_bo, struct drm_exec *exec)
list_move_tail(&gpuva_to_vma(gpuva)->combined_links.rebind,
&vm->rebind_list);
+ if (!try_wait_for_completion(&vm->xe->pm_block))
+ return -EAGAIN;
+
ret = xe_bo_validate(gem_to_xe_bo(vm_bo->obj), vm, false);
if (ret)
return ret;
@@ -494,6 +497,12 @@ static void preempt_rebind_work_func(struct work_struct *w)
xe_assert(vm->xe, xe_vm_in_preempt_fence_mode(vm));
trace_xe_vm_rebind_worker_enter(vm);
+ /*
+ * This blocks the wq during suspend / hibernate.
+ * Don't hold any locks.
+ */
+retry_pm:
+ wait_for_completion(&vm->xe->pm_block);
down_write(&vm->lock);
if (xe_vm_is_closed_or_banned(vm)) {
@@ -503,6 +512,11 @@ static void preempt_rebind_work_func(struct work_struct *w)
}
retry:
+ if (!try_wait_for_completion(&vm->xe->pm_block)) {
+ up_write(&vm->lock);
+ goto retry_pm;
+ }
+
if (xe_vm_userptr_check_repin(vm)) {
err = xe_vm_userptr_pin(vm);
if (err)
--
2.50.1
switch_to_super_page() assumes the memory range it's working on is aligned
to the target large page level. Unfortunately, __domain_mapping() doesn't
take this into account when using it, and will pass unaligned ranges
ultimately freeing a PTE range larger than expected.
Take for example a mapping with the following iov_pfn range [0x3fe400,
0x4c0600], which should be backed by the following mappings:
iov_pfn [0x3fe400, 0x3fffff] covered by 2MiB pages
iov_pfn [0x400000, 0x4bffff] covered by 1GiB pages
iov_pfn [0x4c0000, 0x4c05ff] covered by 2MiB pages
Under this circumstance, __domain_mapping() will pass [0x400000, 0x4c05ff]
to switch_to_super_page() at a 1 GiB granularity, which will in turn
free PTEs all the way to iov_pfn 0x4fffff.
Mitigate this by rounding down the iov_pfn range passed to
switch_to_super_page() in __domain_mapping() to the target large page
level. Additionally, add range alignment checks to switch_to_super_page().
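A user-space sketch of the end_pfn computation with the example range above
(round_down() written out for power-of-two sizes; the min_t() against
nr_pte_to_next_page() is omitted for brevity):
#include <stdio.h>

#define round_down(x, y)	((x) & ~((y) - 1))

int main(void)
{
	unsigned long iov_pfn = 0x400000, nr_pages = 0x4c0600 - 0x400000;
	unsigned long lvl_pages = 0x40000;	/* 1 GiB in 4 KiB pages */
	unsigned long unaligned_end = iov_pfn + nr_pages - 1;
	unsigned long aligned_end = iov_pfn + round_down(nr_pages, lvl_pages) - 1;

	/* prints 0x4c05ff vs 0x4bffff */
	printf("unaligned end_pfn: %#lx, aligned end_pfn: %#lx\n",
	       unaligned_end, aligned_end);
	return 0;
}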
Fixes: 9906b9352a35 ("iommu/vt-d: Avoid duplicate removing in __domain_mapping()")
Signed-off-by: Eugene Koira <eugkoira(a)amazon.com>
Cc: stable(a)vger.kernel.org
---
drivers/iommu/intel/iommu.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9c3ab9d9f69a..dff2d895b8ab 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1575,6 +1575,10 @@ static void switch_to_super_page(struct dmar_domain *domain,
unsigned long lvl_pages = lvl_to_nr_pages(level);
struct dma_pte *pte = NULL;
+ if (WARN_ON(!IS_ALIGNED(start_pfn, lvl_pages) ||
+ !IS_ALIGNED(end_pfn + 1, lvl_pages)))
+ return;
+
while (start_pfn <= end_pfn) {
if (!pte)
pte = pfn_to_dma_pte(domain, start_pfn, &level,
@@ -1650,7 +1654,8 @@ __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
unsigned long pages_to_remove;
pteval |= DMA_PTE_LARGE_PAGE;
- pages_to_remove = min_t(unsigned long, nr_pages,
+ pages_to_remove = min_t(unsigned long,
+ round_down(nr_pages, lvl_pages),
nr_pte_to_next_page(pte) * lvl_pages);
end_pfn = iov_pfn + pages_to_remove - 1;
switch_to_super_page(domain, iov_pfn, end_pfn, largepage_lvl);
--
2.47.3
Fix incorrect use of IS_ERR() to check devm_kzalloc() return value.
devm_kzalloc() returns NULL on failure, not an error pointer.
This issue was introduced by commit 4277f035d227
("coresight: trbe: Add a representative coresight_platform_data for TRBE")
which replaced the original function but didn't update the error check.
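A short sketch of the correct check (illustrative function, not the driver
code): devm_kzalloc() signals failure with NULL, which IS_ERR() does not
catch.
#include <linux/coresight.h>
#include <linux/device.h>
#include <linux/slab.h>

static int example_alloc_pdata(struct device *dev)
{
	struct coresight_platform_data *pdata;

	pdata = devm_kzalloc(dev, sizeof(*pdata), GFP_KERNEL);
	if (!pdata)		/* NULL on failure, never an ERR_PTR() */
		return -ENOMEM;

	return 0;
}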
Fixes: 4277f035d227 ("coresight: trbe: Add a representative coresight_platform_data for TRBE")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/hwtracing/coresight/coresight-trbe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 8267dd1a2130..caf873adfc3a 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -1279,7 +1279,7 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
* into the device for that purpose.
*/
desc.pdata = devm_kzalloc(dev, sizeof(*desc.pdata), GFP_KERNEL);
- if (IS_ERR(desc.pdata))
+ if (!desc.pdata)
goto cpu_clear;
desc.type = CORESIGHT_DEV_TYPE_SINK;
--
2.35.1
The patch titled
Subject: mm: fix folio_expected_ref_count() when PG_private_2
has been added to the -mm mm-new branch. Its filename is
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm: fix folio_expected_ref_count() when PG_private_2
Date: Sun, 31 Aug 2025 02:01:16 -0700 (PDT)
6.16's folio_expected_ref_count() is forgetting the PG_private_2 flag,
which (like PG_private, but not in addition to PG_private) counts for 1
more reference: it needs to be using folio_has_private() in place of
folio_test_private().
But this went wrong earlier: folio_expected_ref_count() was based on (and
replaced) mm/migrate.c's folio_expected_refs(), which has been using
folio_test_private() since the 6.0 conversion to folios: before that,
expected_page_refs() was correctly using page_has_private().
Just a few filesystems are still using PG_private_2 a.k.a. PG_fscache.
Potentially, this fix re-enables page migration on their folios; but it
would not be surprising to learn that in practice those folios are not
migratable for other reasons.
Link: https://lkml.kernel.org/r/f91ee36e-a8cb-e3a4-c23b-524ff3848da7@google.com
Fixes: 86ebd50224c0 ("mm: add folio_expected_ref_count() for reference count calculation")
Fixes: 108ca8358139 ("mm/migrate: Convert expected_page_refs() to folio_expected_refs()")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mm.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/mm.h~mm-fix-folio_expected_ref_count-when-pg_private_2
+++ a/include/linux/mm.h
@@ -2234,8 +2234,8 @@ static inline int folio_expected_ref_cou
} else {
/* One reference per page from the pagecache. */
ref_count += !!folio->mapping << order;
- /* One reference from PG_private. */
- ref_count += folio_test_private(folio);
+ /* One reference from PG_private or PG_private_2. */
+ ref_count += folio_has_private(folio);
}
/* One reference per page table mapping. */
_
Patches currently in -mm which might be from hughd(a)google.com are
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
mm-folio_may_be_cached-unless-folio_test_large.patch
mm-lru_add_drain_all-do-local-lru_add_drain-first.patch
Please backport the following commit into Linux v5.4 and newer stable
releases:
80dc18a0cba8dea42614f021b20a04354b213d86
The backport will likely depend on macro rename commit:
817f989700fddefa56e5e443e7d138018ca6709d
This part of the commit description clarifies why this is a fix:
"
As per PCIe r6.0, sec 6.6.1, a Downstream Port that supports Link speeds
greater than 5.0 GT/s, software must wait a minimum of 100 ms after Link
training completes before sending a Configuration Request.
"
In practice, this makes detection of PCIe Gen3 and Gen4 SSDs reliable on
Renesas R-Car V4H SoC. Without this commit, the SSDs sporadically do not
get detected, or sometimes they link up in Gen1 mode.
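As an illustrative sketch of what the required delay amounts to in code (this
is not the upstream commit itself; the exact placement and constant names
there may differ):
#include <linux/delay.h>
#include <linux/types.h>

/* Illustrative helper only. PCIe r6.0, sec 6.6.1: for Ports supporting
 * Link speeds > 5.0 GT/s, wait at least 100 ms after Link training
 * completes before sending a Configuration Request. */
static void example_post_link_up_delay(bool link_up)
{
	if (link_up)
		msleep(100);
}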
This fixes commit
886bc5ceb5cc ("PCI: designware: Add generic dw_pcie_wait_for_link()")
which is in v4.5-rc1-4-g886bc5ceb5cc3, so I think this fix should be
backported to all currently maintained stable releases, i.e. v5.4+.
Thank you
--
Best regards,
Marek Vasut
The patch titled
Subject: mm: folio_may_be_cached() unless folio_test_large()
has been added to the -mm mm-new branch. Its filename is
mm-folio_may_be_cached-unless-folio_test_large.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm: folio_may_be_cached() unless folio_test_large()
Date: Sun, 31 Aug 2025 02:16:25 -0700 (PDT)
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a
large folio is added: so collect_longterm_unpinnable_folios() just wastes
effort when calling lru_add_drain_all() on a large folio.
But although there is good reason not to batch up PMD-sized folios, we
might well benefit from batching a small number of low-order mTHPs (though
unclear how that "small number" limitation will be implemented).
So ask if folio_may_be_cached() rather than !folio_test_large(), to
insulate those particular checks from future change. Name preferred to
"folio_is_batchable" because large folios can well be put on a batch: it's
just the per-CPU LRU caches, drained much later, which need care.
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration".
Link: https://lkml.kernel.org/r/861c061c-51cd-b940-49df-9f55e1fee2c8@google.com
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/swap.h | 10 ++++++++++
mm/gup.c | 5 +++--
mm/mlock.c | 6 +++---
mm/swap.c | 2 +-
4 files changed, 17 insertions(+), 6 deletions(-)
--- a/include/linux/swap.h~mm-folio_may_be_cached-unless-folio_test_large
+++ a/include/linux/swap.h
@@ -381,6 +381,16 @@ void folio_add_lru_vma(struct folio *, s
void mark_page_accessed(struct page *);
void folio_mark_accessed(struct folio *);
+static inline bool folio_may_be_cached(struct folio *folio)
+{
+ /*
+ * Holding PMD-sized folios in per-CPU LRU cache unbalances accounting.
+ * Holding small numbers of low-order mTHP folios in per-CPU LRU cache
+ * will be sensible, but nobody has implemented and tested that yet.
+ */
+ return !folio_test_large(folio);
+}
+
extern atomic_t lru_disable_count;
static inline bool lru_cache_disabled(void)
--- a/mm/gup.c~mm-folio_may_be_cached-unless-folio_test_large
+++ a/mm/gup.c
@@ -2309,8 +2309,9 @@ static unsigned long collect_longterm_un
continue;
}
- if (drain_allow && folio_ref_count(folio) !=
- folio_expected_ref_count(folio) + 1) {
+ if (drain_allow && folio_may_be_cached(folio) &&
+ folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
drain_allow = false;
}
--- a/mm/mlock.c~mm-folio_may_be_cached-unless-folio_test_large
+++ a/mm/mlock.c
@@ -255,7 +255,7 @@ void mlock_folio(struct folio *folio)
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -278,7 +278,7 @@ void mlock_new_folio(struct folio *folio
folio_get(folio);
if (!folio_batch_add(fbatch, mlock_new(folio)) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
@@ -299,7 +299,7 @@ void munlock_folio(struct folio *folio)
*/
folio_get(folio);
if (!folio_batch_add(fbatch, folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_cached(folio) || lru_cache_disabled())
mlock_folio_batch(fbatch);
local_unlock(&mlock_fbatch.lock);
}
--- a/mm/swap.c~mm-folio_may_be_cached-unless-folio_test_large
+++ a/mm/swap.c
@@ -192,7 +192,7 @@ static void __folio_batch_add_and_move(s
local_lock(&cpu_fbatches.lock);
if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
- folio_test_large(folio) || lru_cache_disabled())
+ !folio_may_be_cached(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
_
Patches currently in -mm which might be from hughd(a)google.com are
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
mm-folio_may_be_cached-unless-folio_test_large.patch
mm-lru_add_drain_all-do-local-lru_add_drain-first.patch
The patch titled
Subject: mm: Revert "mm: vmscan.c: fix OOM on swap stress test"
has been added to the -mm mm-new branch. Its filename is
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm: Revert "mm: vmscan.c: fix OOM on swap stress test"
Date: Sun, 31 Aug 2025 02:13:55 -0700 (PDT)
This reverts commit 0885ef4705607936fc36a38fd74356e1c465b023: that
was a fix to the reverted 33dfe9204f29b415bbc0abb1a50642d1ba94f5e9.
Link: https://lkml.kernel.org/r/bd440614-3d2c-31d6-1b8f-9635c69cef7c@google.com
Fixes: 0885ef470560 ("mm: vmscan.c: fix OOM on swap stress test")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/vmscan.c~mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test
+++ a/mm/vmscan.c
@@ -4500,7 +4500,7 @@ static bool sort_folio(struct lruvec *lr
}
/* ineligible */
- if (!folio_test_lru(folio) || zone > sc->reclaim_idx) {
+ if (zone > sc->reclaim_idx) {
gen = folio_inc_gen(lruvec, folio, false);
list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
return true;
_
Patches currently in -mm which might be from hughd(a)google.com are
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
mm-folio_may_be_cached-unless-folio_test_large.patch
mm-lru_add_drain_all-do-local-lru_add_drain-first.patch
The patch titled
Subject: mm: Revert "mm/gup: clear the LRU flag of a page before adding to LRU batch"
has been added to the -mm mm-new branch. Its filename is
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm: Revert "mm/gup: clear the LRU flag of a page before adding to LRU batch"
Date: Sun, 31 Aug 2025 02:11:33 -0700 (PDT)
This reverts commit 33dfe9204f29b415bbc0abb1a50642d1ba94f5e9: now that
collect_longterm_unpinnable_folios() is checking ref_count instead of lru,
and mlock/munlock do not participate in the revised LRU flag clearing,
those changes are misleading, and enlarge the window during which
mlock/munlock may miss an mlock_count update.
It is possible (I'd hesitate to claim probable) that the greater
likelihood of missed mlock_count updates would explain the "Realtime
threads delayed due to kcompactd0" observed on 6.12 in the Link below. If
that is the case, this reversion will help; but a complete solution needs
also a further patch, beyond the scope of this series.
Included some 80-column cleanup around folio_batch_add_and_move().
The role of folio_test_clear_lru() (before taking per-memcg lru_lock) is
questionable since 6.13 removed mem_cgroup_move_account() etc; but perhaps
there are still some races which need it - not examined here.
Link: https://lore.kernel.org/linux-mm/DU0PR01MB10385345F7153F334100981888259A@DU…
Link: https://lkml.kernel.org/r/0215a42b-99cd-612a-95f7-56f8251d99ef@google.com
Fixes: 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding to LRU batch")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/swap.c | 50 ++++++++++++++++++++++++++------------------------
1 file changed, 26 insertions(+), 24 deletions(-)
--- a/mm/swap.c~mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch
+++ a/mm/swap.c
@@ -164,6 +164,10 @@ static void folio_batch_move_lru(struct
for (i = 0; i < folio_batch_count(fbatch); i++) {
struct folio *folio = fbatch->folios[i];
+ /* block memcg migration while the folio moves between lru */
+ if (move_fn != lru_add && !folio_test_clear_lru(folio))
+ continue;
+
folio_lruvec_relock_irqsave(folio, &lruvec, &flags);
move_fn(lruvec, folio);
@@ -176,14 +180,10 @@ static void folio_batch_move_lru(struct
}
static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
- struct folio *folio, move_fn_t move_fn,
- bool on_lru, bool disable_irq)
+ struct folio *folio, move_fn_t move_fn, bool disable_irq)
{
unsigned long flags;
- if (on_lru && !folio_test_clear_lru(folio))
- return;
-
folio_get(folio);
if (disable_irq)
@@ -191,8 +191,8 @@ static void __folio_batch_add_and_move(s
else
local_lock(&cpu_fbatches.lock);
- if (!folio_batch_add(this_cpu_ptr(fbatch), folio) || folio_test_large(folio) ||
- lru_cache_disabled())
+ if (!folio_batch_add(this_cpu_ptr(fbatch), folio) ||
+ folio_test_large(folio) || lru_cache_disabled())
folio_batch_move_lru(this_cpu_ptr(fbatch), move_fn);
if (disable_irq)
@@ -201,13 +201,13 @@ static void __folio_batch_add_and_move(s
local_unlock(&cpu_fbatches.lock);
}
-#define folio_batch_add_and_move(folio, op, on_lru) \
- __folio_batch_add_and_move( \
- &cpu_fbatches.op, \
- folio, \
- op, \
- on_lru, \
- offsetof(struct cpu_fbatches, op) >= offsetof(struct cpu_fbatches, lock_irq) \
+#define folio_batch_add_and_move(folio, op) \
+ __folio_batch_add_and_move( \
+ &cpu_fbatches.op, \
+ folio, \
+ op, \
+ offsetof(struct cpu_fbatches, op) >= \
+ offsetof(struct cpu_fbatches, lock_irq) \
)
static void lru_move_tail(struct lruvec *lruvec, struct folio *folio)
@@ -231,10 +231,10 @@ static void lru_move_tail(struct lruvec
void folio_rotate_reclaimable(struct folio *folio)
{
if (folio_test_locked(folio) || folio_test_dirty(folio) ||
- folio_test_unevictable(folio))
+ folio_test_unevictable(folio) || !folio_test_lru(folio))
return;
- folio_batch_add_and_move(folio, lru_move_tail, true);
+ folio_batch_add_and_move(folio, lru_move_tail);
}
void lru_note_cost_unlock_irq(struct lruvec *lruvec, bool file,
@@ -328,10 +328,11 @@ static void folio_activate_drain(int cpu
void folio_activate(struct folio *folio)
{
- if (folio_test_active(folio) || folio_test_unevictable(folio))
+ if (folio_test_active(folio) || folio_test_unevictable(folio) ||
+ !folio_test_lru(folio))
return;
- folio_batch_add_and_move(folio, lru_activate, true);
+ folio_batch_add_and_move(folio, lru_activate);
}
#else
@@ -507,7 +508,7 @@ void folio_add_lru(struct folio *folio)
lru_gen_in_fault() && !(current->flags & PF_MEMALLOC))
folio_set_active(folio);
- folio_batch_add_and_move(folio, lru_add, false);
+ folio_batch_add_and_move(folio, lru_add);
}
EXPORT_SYMBOL(folio_add_lru);
@@ -685,13 +686,13 @@ void lru_add_drain_cpu(int cpu)
void deactivate_file_folio(struct folio *folio)
{
/* Deactivating an unevictable folio will not accelerate reclaim */
- if (folio_test_unevictable(folio))
+ if (folio_test_unevictable(folio) || !folio_test_lru(folio))
return;
if (lru_gen_enabled() && lru_gen_clear_refs(folio))
return;
- folio_batch_add_and_move(folio, lru_deactivate_file, true);
+ folio_batch_add_and_move(folio, lru_deactivate_file);
}
/*
@@ -704,13 +705,13 @@ void deactivate_file_folio(struct folio
*/
void folio_deactivate(struct folio *folio)
{
- if (folio_test_unevictable(folio))
+ if (folio_test_unevictable(folio) || !folio_test_lru(folio))
return;
if (lru_gen_enabled() ? lru_gen_clear_refs(folio) : !folio_test_active(folio))
return;
- folio_batch_add_and_move(folio, lru_deactivate, true);
+ folio_batch_add_and_move(folio, lru_deactivate);
}
/**
@@ -723,10 +724,11 @@ void folio_deactivate(struct folio *foli
void folio_mark_lazyfree(struct folio *folio)
{
if (!folio_test_anon(folio) || !folio_test_swapbacked(folio) ||
+ !folio_test_lru(folio) ||
folio_test_swapcache(folio) || folio_test_unevictable(folio))
return;
- folio_batch_add_and_move(folio, lru_lazyfree, true);
+ folio_batch_add_and_move(folio, lru_lazyfree);
}
void lru_add_drain(void)
_
Patches currently in -mm which might be from hughd(a)google.com are
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
mm-folio_may_be_cached-unless-folio_test_large.patch
mm-lru_add_drain_all-do-local-lru_add_drain-first.patch
The patch titled
Subject: mm/gup: local lru_add_drain() to avoid lru_add_drain_all()
has been added to the -mm mm-new branch. Its filename is
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/gup: local lru_add_drain() to avoid lru_add_drain_all()
Date: Sun, 31 Aug 2025 02:08:26 -0700 (PDT)
In many cases, if collect_longterm_unpinnable_folios() does need to drain
the LRU cache to release a reference, the cache in question is on this
same CPU, and much more efficiently drained by a preliminary local
lru_add_drain(), than the later cross-CPU lru_add_drain_all().
Marked for stable, to counter the increase in lru_add_drain_all()s from
"mm/gup: check ref_count instead of lru before migration". Note for clean
backports: can take 6.16 commit a03db236aebf ("gup: optimize longterm
pin_user_pages() for large folio") first.
Link: https://lkml.kernel.org/r/165ccfdb-16b5-ac11-0d66-2984293590c8@google.com
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/gup.c~mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all
+++ a/mm/gup.c
@@ -2291,6 +2291,8 @@ static unsigned long collect_longterm_un
struct folio *folio;
long i = 0;
+ lru_add_drain();
+
for (folio = pofs_get_folio(pofs, i); folio;
folio = pofs_next_folio(folio, pofs, &i)) {
_
Patches currently in -mm which might be from hughd(a)google.com are
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
mm-folio_may_be_cached-unless-folio_test_large.patch
mm-lru_add_drain_all-do-local-lru_add_drain-first.patch
The patch titled
Subject: mm/gup: check ref_count instead of lru before migration
has been added to the -mm mm-new branch. Its filename is
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/gup: check ref_count instead of lru before migration
Date: Sun, 31 Aug 2025 02:05:56 -0700 (PDT)
Will Deacon reports:-
When taking a longterm GUP pin via pin_user_pages(),
__gup_longterm_locked() tries to migrate target folios that should not be
longterm pinned, for example because they reside in a CMA region or
movable zone. This is done by first pinning all of the target folios
anyway, collecting all of the longterm-unpinnable target folios into a
list, dropping the pins that were just taken and finally handing the list
off to migrate_pages() for the actual migration.
It is critically important that no unexpected references are held on the
folios being migrated, otherwise the migration will fail and
pin_user_pages() will return -ENOMEM to its caller. Unfortunately, it is
relatively easy to observe migration failures when running pKVM (which
uses pin_user_pages() on crosvm's virtual address space to resolve stage-2
page faults from the guest) on a 6.15-based Pixel 6 device and this
results in the VM terminating prematurely.
In the failure case, 'crosvm' has called mlock(MLOCK_ONFAULT) on its
mapping of guest memory prior to the pinning. Subsequently, when
pin_user_pages() walks the page-table, the relevant 'pte' is not present
and so the faulting logic allocates a new folio, mlocks it with
mlock_folio() and maps it in the page-table.
Since commit 2fbb0c10d1e8 ("mm/munlock: mlock_page() munlock_page() batch
by pagevec"), mlock/munlock operations on a folio (formerly page), are
deferred. For example, mlock_folio() takes an additional reference on the
target folio before placing it into a per-cpu 'folio_batch' for later
processing by mlock_folio_batch(), which drops the refcount once the
operation is complete. Processing of the batches is coupled with the LRU
batch logic and can be forcefully drained with lru_add_drain_all() but as
long as a folio remains unprocessed on the batch, its refcount will be
elevated.
This deferred batching therefore interacts poorly with the pKVM pinning
scenario as we can find ourselves in a situation where the migration code
fails to migrate a folio due to the elevated refcount from the pending
mlock operation.
Hugh Dickins adds:-
!folio_test_lru() has never been a very reliable way to tell if an
lru_add_drain_all() is worth calling, to remove LRU cache references to
make the folio migratable: the LRU flag may be set even while the folio is
held with an extra reference in a per-CPU LRU cache.
5.18 commit 2fbb0c10d1e8 may have made it more unreliable. Then 6.11
commit 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding
to LRU batch") tried to make it reliable, by moving LRU flag clearing; but
missed the mlock/munlock batches, so still unreliable as reported.
And it turns out to be difficult to extend 33dfe9204f29's LRU flag
clearing to the mlock/munlock batches: if they do benefit from batching,
mlock/munlock cannot be so effective when easily suppressed while !LRU.
Instead, switch to an expected ref_count check, which was more reliable
all along: some more false positives (unhelpful drains) than before, and
never a guarantee that the folio will prove migratable, but better.
Note for stable backports: requires 6.16 commit 86ebd50224c0 ("mm: add
folio_expected_ref_count() for reference count calculation") and 6.17
commit ("mm: fix folio_expected_ref_count() when PG_private_2").
Link: https://lkml.kernel.org/r/47c51c9a-140f-1ea1-b692-c4bae5d1fa58@google.com
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Will Deacon <will(a)kernel.org>
Link: https://lore.kernel.org/linux-mm/20250815101858.24352-1-will@kernel.org/
Cc: <stable(a)vger.kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Keir Fraser <keirf(a)google.com>
Cc: Konstantin Khlebnikov <koct9i(a)gmail.com>
Cc: Li Zhe <lizhe.67(a)bytedance.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Shivank Garg <shivankg(a)amd.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Xu <weixugc(a)google.com>
Cc: yangge <yangge1116(a)126.com>
Cc: Yuanchu Xie <yuanchu(a)google.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/gup.c~mm-gup-check-ref_count-instead-of-lru-before-migration
+++ a/mm/gup.c
@@ -2307,7 +2307,8 @@ static unsigned long collect_longterm_un
continue;
}
- if (!folio_test_lru(folio) && drain_allow) {
+ if (drain_allow && folio_ref_count(folio) !=
+ folio_expected_ref_count(folio) + 1) {
lru_add_drain_all();
drain_allow = false;
}
_
Patches currently in -mm which might be from hughd(a)google.com are
mm-fix-folio_expected_ref_count-when-pg_private_2.patch
mm-gup-check-ref_count-instead-of-lru-before-migration.patch
mm-gup-local-lru_add_drain-to-avoid-lru_add_drain_all.patch
mm-revert-mm-gup-clear-the-lru-flag-of-a-page-before-adding-to-lru-batch.patch
mm-revert-mm-vmscanc-fix-oom-on-swap-stress-test.patch
mm-folio_may_be_cached-unless-folio_test_large.patch
mm-lru_add_drain_all-do-local-lru_add_drain-first.patch
The patch titled
Subject: mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Uladzislau Rezki (Sony)" <urezki(a)gmail.com>
Subject: mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc()
Date: Sun, 31 Aug 2025 14:10:58 +0200
kasan_populate_vmalloc() and its helpers ignore the caller's gfp_mask and
always allocate memory using the hardcoded GFP_KERNEL flag. This makes
them inconsistent with vmalloc(), which was recently extended to support
GFP_NOFS and GFP_NOIO allocations.
Page table allocations performed during shadow population also ignore the
external gfp_mask. To preserve the intended semantics of GFP_NOFS and
GFP_NOIO, wrap the apply_to_page_range() calls into the appropriate
memalloc scope.
This patch:
- Extends kasan_populate_vmalloc() and helpers to take gfp_mask;
- Passes gfp_mask down to alloc_pages_bulk() and __get_free_page();
- Enforces GFP_NOFS/NOIO semantics with memalloc_*_save()/restore()
around apply_to_page_range();
- Updates vmalloc.c and percpu allocator call sites accordingly.
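A minimal sketch of the memalloc scope pattern the patch applies around
apply_to_page_range() (the helper name and callback here are illustrative,
not the patched functions themselves):
#include <linux/gfp.h>
#include <linux/sched/mm.h>
#include <linux/types.h>

/* Enforce GFP_NOFS / GFP_NOIO semantics around a callee that allocates
 * with GFP_KERNEL internally (e.g. page-table allocations). */
static int example_populate_scoped(gfp_t gfp_mask,
				   int (*populate)(void *), void *arg)
{
	unsigned int flags = 0;
	bool nofs = (gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO;
	bool noio = (gfp_mask & (__GFP_FS | __GFP_IO)) == 0;
	int ret;

	if (nofs)
		flags = memalloc_nofs_save();	/* forbid __GFP_FS in this scope */
	else if (noio)
		flags = memalloc_noio_save();	/* forbid __GFP_FS and __GFP_IO */

	ret = populate(arg);

	if (nofs)
		memalloc_nofs_restore(flags);
	else if (noio)
		memalloc_noio_restore(flags);

	return ret;
}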
Link: https://lkml.kernel.org/r/20250831121058.92971-1-urezki@gmail.com
Fixes: 451769ebb7e7 ("mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc")
Signed-off-by: Uladzislau Rezki (Sony) <urezki(a)gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/kasan.h | 6 +++---
mm/kasan/shadow.c | 31 ++++++++++++++++++++++++-------
mm/vmalloc.c | 8 ++++----
3 files changed, 31 insertions(+), 14 deletions(-)
--- a/include/linux/kasan.h~mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc
+++ a/include/linux/kasan.h
@@ -562,7 +562,7 @@ static inline void kasan_init_hw_tags(vo
#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
void kasan_populate_early_vm_area_shadow(void *start, unsigned long size);
-int kasan_populate_vmalloc(unsigned long addr, unsigned long size);
+int kasan_populate_vmalloc(unsigned long addr, unsigned long size, gfp_t gfp_mask);
void kasan_release_vmalloc(unsigned long start, unsigned long end,
unsigned long free_region_start,
unsigned long free_region_end,
@@ -574,7 +574,7 @@ static inline void kasan_populate_early_
unsigned long size)
{ }
static inline int kasan_populate_vmalloc(unsigned long start,
- unsigned long size)
+ unsigned long size, gfp_t gfp_mask)
{
return 0;
}
@@ -610,7 +610,7 @@ static __always_inline void kasan_poison
static inline void kasan_populate_early_vm_area_shadow(void *start,
unsigned long size) { }
static inline int kasan_populate_vmalloc(unsigned long start,
- unsigned long size)
+ unsigned long size, gfp_t gfp_mask)
{
return 0;
}
--- a/mm/kasan/shadow.c~mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc
+++ a/mm/kasan/shadow.c
@@ -336,13 +336,13 @@ static void ___free_pages_bulk(struct pa
}
}
-static int ___alloc_pages_bulk(struct page **pages, int nr_pages)
+static int ___alloc_pages_bulk(struct page **pages, int nr_pages, gfp_t gfp_mask)
{
unsigned long nr_populated, nr_total = nr_pages;
struct page **page_array = pages;
while (nr_pages) {
- nr_populated = alloc_pages_bulk(GFP_KERNEL, nr_pages, pages);
+ nr_populated = alloc_pages_bulk(gfp_mask, nr_pages, pages);
if (!nr_populated) {
___free_pages_bulk(page_array, nr_total - nr_pages);
return -ENOMEM;
@@ -354,25 +354,42 @@ static int ___alloc_pages_bulk(struct pa
return 0;
}
-static int __kasan_populate_vmalloc(unsigned long start, unsigned long end)
+static int __kasan_populate_vmalloc(unsigned long start, unsigned long end, gfp_t gfp_mask)
{
unsigned long nr_pages, nr_total = PFN_UP(end - start);
struct vmalloc_populate_data data;
+ unsigned int flags;
int ret = 0;
- data.pages = (struct page **)__get_free_page(GFP_KERNEL | __GFP_ZERO);
+ data.pages = (struct page **)__get_free_page(gfp_mask | __GFP_ZERO);
if (!data.pages)
return -ENOMEM;
while (nr_total) {
nr_pages = min(nr_total, PAGE_SIZE / sizeof(data.pages[0]));
- ret = ___alloc_pages_bulk(data.pages, nr_pages);
+ ret = ___alloc_pages_bulk(data.pages, nr_pages, gfp_mask);
if (ret)
break;
data.start = start;
+
+ /*
+ * page tables allocations ignore external gfp mask, enforce it
+ * by the scope API
+ */
+ if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
+ flags = memalloc_nofs_save();
+ else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
+ flags = memalloc_noio_save();
+
ret = apply_to_page_range(&init_mm, start, nr_pages * PAGE_SIZE,
kasan_populate_vmalloc_pte, &data);
+
+ if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
+ memalloc_nofs_restore(flags);
+ else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
+ memalloc_noio_restore(flags);
+
___free_pages_bulk(data.pages, nr_pages);
if (ret)
break;
@@ -386,7 +403,7 @@ static int __kasan_populate_vmalloc(unsi
return ret;
}
-int kasan_populate_vmalloc(unsigned long addr, unsigned long size)
+int kasan_populate_vmalloc(unsigned long addr, unsigned long size, gfp_t gfp_mask)
{
unsigned long shadow_start, shadow_end;
int ret;
@@ -415,7 +432,7 @@ int kasan_populate_vmalloc(unsigned long
shadow_start = PAGE_ALIGN_DOWN(shadow_start);
shadow_end = PAGE_ALIGN(shadow_end);
- ret = __kasan_populate_vmalloc(shadow_start, shadow_end);
+ ret = __kasan_populate_vmalloc(shadow_start, shadow_end, gfp_mask);
if (ret)
return ret;
--- a/mm/vmalloc.c~mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc
+++ a/mm/vmalloc.c
@@ -2026,6 +2026,8 @@ static struct vmap_area *alloc_vmap_area
if (unlikely(!vmap_initialized))
return ERR_PTR(-EBUSY);
+ /* Only reclaim behaviour flags are relevant. */
+ gfp_mask = gfp_mask & GFP_RECLAIM_MASK;
might_sleep();
/*
@@ -2038,8 +2040,6 @@ static struct vmap_area *alloc_vmap_area
*/
va = node_alloc(size, align, vstart, vend, &addr, &vn_id);
if (!va) {
- gfp_mask = gfp_mask & GFP_RECLAIM_MASK;
-
va = kmem_cache_alloc_node(vmap_area_cachep, gfp_mask, node);
if (unlikely(!va))
return ERR_PTR(-ENOMEM);
@@ -2089,7 +2089,7 @@ retry:
BUG_ON(va->va_start < vstart);
BUG_ON(va->va_end > vend);
- ret = kasan_populate_vmalloc(addr, size);
+ ret = kasan_populate_vmalloc(addr, size, gfp_mask);
if (ret) {
free_vmap_area(va);
return ERR_PTR(ret);
@@ -4826,7 +4826,7 @@ retry:
/* populate the kasan shadow space */
for (area = 0; area < nr_vms; area++) {
- if (kasan_populate_vmalloc(vas[area]->va_start, sizes[area]))
+ if (kasan_populate_vmalloc(vas[area]->va_start, sizes[area], GFP_KERNEL))
goto err_free_shadow;
}
_
Patches currently in -mm which might be from urezki(a)gmail.com are
mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc.patch
The patch titled
Subject: ocfs2: fix recursive semaphore deadlock in fiemap call
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
ocfs2-fix-recursive-semaphore-deadlock-in-fiemap-call.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Mark Tinguely <mark.tinguely(a)oracle.com>
Subject: ocfs2: fix recursive semaphore deadlock in fiemap call
Date: Fri, 29 Aug 2025 10:18:15 -0500
syzbot detected an OCFS2 hang due to a recursive semaphore on a
FS_IOC_FIEMAP of the extent list on a specially crafted mmap file.
context_switch kernel/sched/core.c:5357 [inline]
__schedule+0x1798/0x4cc0 kernel/sched/core.c:6961
__schedule_loop kernel/sched/core.c:7043 [inline]
schedule+0x165/0x360 kernel/sched/core.c:7058
schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:7115
rwsem_down_write_slowpath+0x872/0xfe0 kernel/locking/rwsem.c:1185
__down_write_common kernel/locking/rwsem.c:1317 [inline]
__down_write kernel/locking/rwsem.c:1326 [inline]
down_write+0x1ab/0x1f0 kernel/locking/rwsem.c:1591
ocfs2_page_mkwrite+0x2ff/0xc40 fs/ocfs2/mmap.c:142
do_page_mkwrite+0x14d/0x310 mm/memory.c:3361
wp_page_shared mm/memory.c:3762 [inline]
do_wp_page+0x268d/0x5800 mm/memory.c:3981
handle_pte_fault mm/memory.c:6068 [inline]
__handle_mm_fault+0x1033/0x5440 mm/memory.c:6195
handle_mm_fault+0x40a/0x8e0 mm/memory.c:6364
do_user_addr_fault+0x764/0x1390 arch/x86/mm/fault.c:1387
handle_page_fault arch/x86/mm/fault.c:1476 [inline]
exc_page_fault+0x76/0xf0 arch/x86/mm/fault.c:1532
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
RIP: 0010:copy_user_generic arch/x86/include/asm/uaccess_64.h:126 [inline]
RIP: 0010:raw_copy_to_user arch/x86/include/asm/uaccess_64.h:147 [inline]
RIP: 0010:_inline_copy_to_user include/linux/uaccess.h:197 [inline]
RIP: 0010:_copy_to_user+0x85/0xb0 lib/usercopy.c:26
Code: e8 00 bc f7 fc 4d 39 fc 72 3d 4d 39 ec 77 38 e8 91 b9 f7 fc 4c 89
f7 89 de e8 47 25 5b fd 0f 01 cb 4c 89 ff 48 89 d9 4c 89 f6 <f3> a4 0f
1f 00 48 89 cb 0f 01 ca 48 89 d8 5b 41 5c 41 5d 41 5e 41
RSP: 0018:ffffc9000403f950 EFLAGS: 00050256
RAX: ffffffff84c7f101 RBX: 0000000000000038 RCX: 0000000000000038
RDX: 0000000000000000 RSI: ffffc9000403f9e0 RDI: 0000200000000060
RBP: ffffc9000403fa90 R08: ffffc9000403fa17 R09: 1ffff92000807f42
R10: dffffc0000000000 R11: fffff52000807f43 R12: 0000200000000098
R13: 00007ffffffff000 R14: ffffc9000403f9e0 R15: 0000200000000060
copy_to_user include/linux/uaccess.h:225 [inline]
fiemap_fill_next_extent+0x1c0/0x390 fs/ioctl.c:145
ocfs2_fiemap+0x888/0xc90 fs/ocfs2/extent_map.c:806
ioctl_fiemap fs/ioctl.c:220 [inline]
do_vfs_ioctl+0x1173/0x1430 fs/ioctl.c:532
__do_sys_ioctl fs/ioctl.c:596 [inline]
__se_sys_ioctl+0x82/0x170 fs/ioctl.c:584
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5f13850fd9
RSP: 002b:00007ffe3b3518b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000200000000000 RCX: 00007f5f13850fd9
RDX: 0000200000000040 RSI: 00000000c020660b RDI: 0000000000000004
RBP: 6165627472616568 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe3b3518f0
R13: 00007ffe3b351b18 R14: 431bde82d7b634db R15: 00007f5f1389a03b
ocfs2_fiemap() takes a read lock of the ip_alloc_sem semaphore (since
v2.6.22-527-g7307de80510a) and calls fiemap_fill_next_extent() to read the
extent list of this running mmapped executable. The user-supplied buffer that
holds the fiemap information page faults, calling ocfs2_page_mkwrite(), which
takes a write lock (since v2.6.27-38-g00dc417fa3e7) of the same
semaphore. This recursive semaphore acquisition holds filesystem locks and
causes a hang of the filesystem.
The ip_alloc_sem protects the inode extent list and size. Release the
read semaphore before calling fiemap_fill_next_extent() in ocfs2_fiemap()
and ocfs2_fiemap_inline(). This does an unnecessary semaphore lock/unlock
on the last extent but simplifies the error path.
Link: https://lkml.kernel.org/r/61d1a62b-2631-4f12-81e2-cd689914360b@oracle.com
Fixes: 00dc417fa3e7 ("ocfs2: fiemap support")
Signed-off-by: Mark Tinguely <mark.tinguely(a)oracle.com>
Reported-by: syzbot+541dcc6ee768f77103e7(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=541dcc6ee768f77103e7
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/extent_map.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
--- a/fs/ocfs2/extent_map.c~ocfs2-fix-recursive-semaphore-deadlock-in-fiemap-call
+++ a/fs/ocfs2/extent_map.c
@@ -706,6 +706,8 @@ out:
* it not only handles the fiemap for inlined files, but also deals
* with the fast symlink, cause they have no difference for extent
* mapping per se.
+ *
+ * Must be called with ip_alloc_sem semaphore held.
*/
static int ocfs2_fiemap_inline(struct inode *inode, struct buffer_head *di_bh,
struct fiemap_extent_info *fieinfo,
@@ -717,6 +719,7 @@ static int ocfs2_fiemap_inline(struct in
u64 phys;
u32 flags = FIEMAP_EXTENT_DATA_INLINE|FIEMAP_EXTENT_LAST;
struct ocfs2_inode_info *oi = OCFS2_I(inode);
+ lockdep_assert_held_read(&oi->ip_alloc_sem);
di = (struct ocfs2_dinode *)di_bh->b_data;
if (ocfs2_inode_is_fast_symlink(inode))
@@ -732,8 +735,11 @@ static int ocfs2_fiemap_inline(struct in
phys += offsetof(struct ocfs2_dinode,
id2.i_data.id_data);
+ /* Release the ip_alloc_sem to prevent deadlock on page fault */
+ up_read(&OCFS2_I(inode)->ip_alloc_sem);
ret = fiemap_fill_next_extent(fieinfo, 0, phys, id_count,
flags);
+ down_read(&OCFS2_I(inode)->ip_alloc_sem);
if (ret < 0)
return ret;
}
@@ -802,9 +808,11 @@ int ocfs2_fiemap(struct inode *inode, st
len_bytes = (u64)le16_to_cpu(rec.e_leaf_clusters) << osb->s_clustersize_bits;
phys_bytes = le64_to_cpu(rec.e_blkno) << osb->sb->s_blocksize_bits;
virt_bytes = (u64)le32_to_cpu(rec.e_cpos) << osb->s_clustersize_bits;
-
+ /* Release the ip_alloc_sem to prevent deadlock on page fault */
+ up_read(&OCFS2_I(inode)->ip_alloc_sem);
ret = fiemap_fill_next_extent(fieinfo, virt_bytes, phys_bytes,
len_bytes, fe_flags);
+ down_read(&OCFS2_I(inode)->ip_alloc_sem);
if (ret)
break;
_
Patches currently in -mm which might be from mark.tinguely(a)oracle.com are
ocfs2-fix-recursive-semaphore-deadlock-in-fiemap-call.patch
Remove the logic to set the interrupt mask by default in the uio_hv_generic
driver, as the interrupt mask value is supposed to be controlled
completely by user space. If the mask bit gets changed
by the driver, concurrently with user mode operating on the ring,
the mask bit may be set when it is supposed to be clear, and the
user-mode driver will miss an interrupt which will cause a hang.
For example, when the driver sets the inbound ring buffer interrupt mask to 1,
the host does not interrupt the guest on the UIO VMBus channel.
However, setting the mask does not prevent the host from putting a
message in the inbound ring buffer. So assume that happens: the host
puts a message into the ring buffer but does not interrupt.
Subsequently, the user space code in the guest sets the inbound ring
buffer interrupt mask to 0, saying “Hey, I’m ready for interrupts”.
User space code then calls pread() to wait for an interrupt.
Then one of two things happens:
* The host never sends another message. So the pread() waits forever.
* The host does send another message. But because there’s already a
message in the ring buffer, it doesn’t generate an interrupt.
This is the correct behavior, because the host should only send an
interrupt when the inbound ring buffer transitions from empty to
not-empty. Adding an additional message to a ring buffer that is not
empty is not supposed to generate an interrupt on the guest.
Since the guest is waiting in pread() and not removing messages from
the ring buffer, the pread() waits forever.
This could be easily reproduced in hv_fcopy_uio_daemon if we delay
setting interrupt mask to 0.
Similarly if hv_uio_channel_cb() sets the interrupt_mask to 1,
there is a race condition. After user space empties the inbound ring
buffer, but before it sets interrupt_mask to 0, the host could
put another message in the ring buffer without interrupting.
Then the next pread() would hang.
Fix these by removing all instances where interrupt_mask is changed,
while keeping the one in set_event() unchanged, so that userspace can
control the interrupt mask by writing 0/1 to /dev/uioX.
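For illustration only (not part of this patch), a minimal userspace consumer
following this model would drain the inbound ring first and only then re-arm
interrupt delivery through the standard UIO irqcontrol write before blocking
in read()/pread(); the device path and the polarity below are assumptions
based on the usual UIO conventions:
/*
 * Hypothetical sketch: a 4-byte write is assumed to reach the driver's
 * irqcontrol (set_event) and a blocking read() to return the interrupt
 * count, per the generic UIO interface. /dev/uio0 is an example path.
 */
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/dev/uio0", O_RDWR);
	uint32_t enable = 1, nevents;

	if (fd < 0)
		return 1;
	for (;;) {
		/* Drain any messages already sitting in the inbound ring. */
		if (write(fd, &enable, sizeof(enable)) != sizeof(enable))
			break;	/* re-arm interrupt delivery */
		if (read(fd, &nevents, sizeof(nevents)) != sizeof(nevents))
			break;	/* block until the host signals the channel */
	}
	close(fd);
	return 0;
}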
Fixes: 95096f2fbd10 ("uio-hv-generic: new userspace i/o driver for VMBus")
Suggested-by: John Starks <jostarks(a)microsoft.com>
Signed-off-by: Naman Jain <namjain(a)linux.microsoft.com>
Cc: <stable(a)vger.kernel.org>
---
Changes since v1:
https://lore.kernel.org/all/20250818064846.271294-1-namjain@linux.microsoft…
* Added Fixes and Cc stable tags.
---
drivers/uio/uio_hv_generic.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index f19efad4d6f8..3f8e2e27697f 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -111,7 +111,6 @@ static void hv_uio_channel_cb(void *context)
struct hv_device *hv_dev;
struct hv_uio_private_data *pdata;
- chan->inbound.ring_buffer->interrupt_mask = 1;
virt_mb();
/*
@@ -183,8 +182,6 @@ hv_uio_new_channel(struct vmbus_channel *new_sc)
return;
}
- /* Disable interrupts on sub channel */
- new_sc->inbound.ring_buffer->interrupt_mask = 1;
set_channel_read_mode(new_sc, HV_CALL_ISR);
ret = hv_create_ring_sysfs(new_sc, hv_uio_ring_mmap);
if (ret) {
@@ -227,9 +224,7 @@ hv_uio_open(struct uio_info *info, struct inode *inode)
ret = vmbus_connect_ring(dev->channel,
hv_uio_channel_cb, dev->channel);
- if (ret == 0)
- dev->channel->inbound.ring_buffer->interrupt_mask = 1;
- else
+ if (ret)
atomic_dec(&pdata->refcnt);
return ret;
--
2.34.1
batadv_nc_skb_decode_packet() trusts coded_len and checks only against
skb->len. XOR starts at sizeof(struct batadv_unicast_packet), reducing
payload headroom, and the source skb length is not verified, allowing an
out-of-bounds read and a small out-of-bounds write.
Validate that coded_len fits within the payload area of both destination
and source sk_buffs before XORing.
Fixes: 2df5278b0267 ("batman-adv: network coding - receive coded packets and decode them")
Cc: stable(a)vger.kernel.org
Reported-by: Stanislav Fort <disclosure(a)aisle.com>
Signed-off-by: Stanislav Fort <disclosure(a)aisle.com>
---
net/batman-adv/network-coding.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/batman-adv/network-coding.c b/net/batman-adv/network-coding.c
index 9f56308779cc..af97d077369f 100644
--- a/net/batman-adv/network-coding.c
+++ b/net/batman-adv/network-coding.c
@@ -1687,7 +1687,12 @@ batadv_nc_skb_decode_packet(struct batadv_priv *bat_priv, struct sk_buff *skb,
coding_len = ntohs(coded_packet_tmp.coded_len);
- if (coding_len > skb->len)
+ /* ensure dst buffer is large enough (payload only) */
+ if (coding_len + h_size > skb->len)
+ return NULL;
+
+ /* ensure src buffer is large enough (payload only) */
+ if (coding_len + h_size > nc_packet->skb->len)
return NULL;
/* Here the magic is reversed:
--
2.39.3 (Apple Git-146)
Add an explicit check for the xfer length to 'rtl9300_i2c_config_xfer'
to ensure the data length is within the supported range. In
particular, a data length of 0 is not supported by the hardware and
causes unintended or destructive behaviour.
This limitation becomes obvious when looking at the register
documentation [1]. 4 bits are reserved for DATA_WIDTH and the value
of these 4 bits is used as N + 1, allowing a data length range of
1 <= len <= 16.
Affected by this is the SMBus Quick Operation which works with a data
length of 0. Passing 0 as the length causes an underflow of the value
due to:
(len - 1) & 0xf
which effectively specifies a transfer length of 16 via the registers.
This causes a 16-byte write operation instead of a Quick Write. For
example, on SFP modules without write-protected EEPROM this soft-bricks
them by overwriting some initial bytes.
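To make the underflow concrete, here is an illustrative snippet (the values
follow the N + 1 field encoding described above; this is not driver code):
	/* Illustration only: a zero length wraps into a 16-byte transfer. */
	unsigned int len = 0;
	unsigned int field = (len - 1) & 0xf;	/* wraps to 0xf */
	unsigned int bytes = field + 1;		/* 16 bytes instead of 0 */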
For completeness, also add a quirk for the zero length.
[1] https://svanheule.net/realtek/longan/register/i2c_mst1_ctrl2
Fixes: c366be720235 ("i2c: Add driver for the RTL9300 I2C controller")
Cc: <stable(a)vger.kernel.org> # v6.13+
Signed-off-by: Jonas Jelonek <jelonek.jonas(a)gmail.com>
Tested-by: Sven Eckelmann <sven(a)narfation.org>
Reviewed-by: Chris Packham <chris.packham(a)alliedtelesis.co.nz>
Tested-by: Chris Packham <chris.packham(a)alliedtelesis.co.nz> # On RTL9302C based board
Tested-by: Markus Stockhausen <markus.stockhausen(a)gmx.de>
---
drivers/i2c/busses/i2c-rtl9300.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-rtl9300.c b/drivers/i2c/busses/i2c-rtl9300.c
index 19c367703eaf..ebd4a85e1bde 100644
--- a/drivers/i2c/busses/i2c-rtl9300.c
+++ b/drivers/i2c/busses/i2c-rtl9300.c
@@ -99,6 +99,9 @@ static int rtl9300_i2c_config_xfer(struct rtl9300_i2c *i2c, struct rtl9300_i2c_c
{
u32 val, mask;
+ if (len < 1 || len > 16)
+ return -EINVAL;
+
val = chan->bus_freq << RTL9300_I2C_MST_CTRL2_SCL_FREQ_OFS;
mask = RTL9300_I2C_MST_CTRL2_SCL_FREQ_MASK;
@@ -352,7 +355,7 @@ static const struct i2c_algorithm rtl9300_i2c_algo = {
};
static struct i2c_adapter_quirks rtl9300_i2c_quirks = {
- .flags = I2C_AQ_NO_CLK_STRETCH,
+ .flags = I2C_AQ_NO_CLK_STRETCH | I2C_AQ_NO_ZERO_LEN,
.max_read_len = 16,
.max_write_len = 16,
};
--
2.48.1
Fix the current check for number of channels (child nodes in the device
tree). Before, this was:
if (device_get_child_node_count(dev) >= RTL9300_I2C_MUX_NCHAN)
RTL9300_I2C_MUX_NCHAN gives the maximum number of channels so checking
with '>=' isn't correct because it doesn't allow the last channel
number. Thus, fix it to:
if (device_get_child_node_count(dev) > RTL9300_I2C_MUX_NCHAN)
The issue occurred on a TP-Link TL-ST1008F v2.0 device (8 SFP+ ports) and the
fix was tested there.
Fixes: c366be720235 ("i2c: Add driver for the RTL9300 I2C controller")
Cc: <stable(a)vger.kernel.org> # v6.13+
Signed-off-by: Jonas Jelonek <jelonek.jonas(a)gmail.com>
Tested-by: Sven Eckelmann <sven(a)narfation.org>
Reviewed-by: Chris Packham <chris.packham(a)alliedtelesis.co.nz>
Tested-by: Chris Packham <chris.packham(a)alliedtelesis.co.nz> # On RTL9302C based board
Tested-by: Markus Stockhausen <markus.stockhausen(a)gmx.de>
---
drivers/i2c/busses/i2c-rtl9300.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-rtl9300.c b/drivers/i2c/busses/i2c-rtl9300.c
index 4b215f9a24e6..19c367703eaf 100644
--- a/drivers/i2c/busses/i2c-rtl9300.c
+++ b/drivers/i2c/busses/i2c-rtl9300.c
@@ -382,7 +382,7 @@ static int rtl9300_i2c_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, i2c);
- if (device_get_child_node_count(dev) >= RTL9300_I2C_MUX_NCHAN)
+ if (device_get_child_node_count(dev) > RTL9300_I2C_MUX_NCHAN)
return dev_err_probe(dev, -EINVAL, "Too many channels\n");
device_for_each_child_node(dev, child) {
--
2.48.1
One of our users reported a bug where their CPU stays at the minimum frequency.
Bisection points to commit ada8d7fa0ad49a2a078f97f7f6e02d24d3c357a3
("sched/cpufreq: Rework schedutil governor performance estimation"),
and testing confirms that the fix e37617c8e53a1f7fcba6d5e1041f4fd8a2425c27
("sched/fair: Fix frequency selection for non-invariant case")
resolves it.
Also backport the "stable-deps-of" series:
"consolidate and cleanup CPU capacity"
Link: https://lore.kernel.org/all/20231211104855.558096-1-vincent.guittot@linaro.…
PS:
commit 50b813b147e9eb6546a1fc49d4e703e6d23691f2 in the series
("cpufreq/cppc: Move and rename cppc_cpufreq_{perf_to_khz|khz_to_perf}()")
was already merged in v6.6.59 as 33e89c16cea0882bb05c585fa13236b730dd0efa
Vincent Guittot (8):
sched/topology: Add a new arch_scale_freq_ref() method
cpufreq: Use the fixed and coherent frequency for scaling capacity
cpufreq/schedutil: Use a fixed reference frequency
energy_model: Use a fixed reference frequency
cpufreq/cppc: Set the frequency used for computing the capacity
arm64/amu: Use capacity_ref_freq() to set AMU ratio
topology: Set capacity_freq_ref in all cases
sched/fair: Fix frequency selection for non-invariant case
arch/arm/include/asm/topology.h | 1 +
arch/arm64/include/asm/topology.h | 1 +
arch/arm64/kernel/topology.c | 26 ++++++------
arch/riscv/include/asm/topology.h | 1 +
drivers/base/arch_topology.c | 69 ++++++++++++++++++++-----------
drivers/cpufreq/cpufreq.c | 4 +-
include/linux/arch_topology.h | 8 ++++
include/linux/cpufreq.h | 1 +
include/linux/energy_model.h | 6 +--
include/linux/sched/topology.h | 8 ++++
kernel/sched/cpufreq_schedutil.c | 30 +++++++++++++-
11 files changed, 111 insertions(+), 44 deletions(-)
--
2.20.1
Fix a use-after-free window by correcting the buffer release sequence in
the deferred receive path. The code freed the RQ buffer first and only
then cleared the context pointer under the lock. Concurrent paths
(e.g., ABTS and the repost path) also inspect and release the same
pointer under the lock, so the old order could lead to double-free/UAF.
Note that the repost path already uses the correct pattern: detach the
pointer under the lock, then free it after dropping the lock. The deferred
path should do the same.
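Sketched with placeholder names (illustrative only, not the actual lpfc
structures), the pattern both paths should follow is:
	/* Illustration: 'ctx', 'buf' and free_buffer() are placeholders. */
	spin_lock_irqsave(&ctx->lock, flags);
	buf = ctx->buffer;
	ctx->buffer = NULL;			/* detach under the lock */
	spin_unlock_irqrestore(&ctx->lock, flags);

	if (buf)
		free_buffer(buf);		/* free after dropping the lock */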
Fixes: 472e146d1cf3 ("scsi: lpfc: Correct upcalling nvmet_fc transport during io done downcall")
Cc: stable(a)vger.kernel.org
Signed-off-by: John Evans <evans1210144(a)gmail.com>
---
v2:
* Rework locking to read and clear the buffer pointer atomically, as
suggested by Justin Tee.
---
drivers/scsi/lpfc/lpfc_nvmet.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c
index fba2e62027b7..4cfc928bcf2d 100644
--- a/drivers/scsi/lpfc/lpfc_nvmet.c
+++ b/drivers/scsi/lpfc/lpfc_nvmet.c
@@ -1243,7 +1243,7 @@ lpfc_nvmet_defer_rcv(struct nvmet_fc_target_port *tgtport,
struct lpfc_nvmet_tgtport *tgtp;
struct lpfc_async_xchg_ctx *ctxp =
container_of(rsp, struct lpfc_async_xchg_ctx, hdlrctx.fcp_req);
- struct rqb_dmabuf *nvmebuf = ctxp->rqb_buffer;
+ struct rqb_dmabuf *nvmebuf;
struct lpfc_hba *phba = ctxp->phba;
unsigned long iflag;
@@ -1251,13 +1251,18 @@ lpfc_nvmet_defer_rcv(struct nvmet_fc_target_port *tgtport,
lpfc_nvmeio_data(phba, "NVMET DEFERRCV: xri x%x sz %d CPU %02x\n",
ctxp->oxid, ctxp->size, raw_smp_processor_id());
+ spin_lock_irqsave(&ctxp->ctxlock, iflag);
+ nvmebuf = ctxp->rqb_buffer;
if (!nvmebuf) {
+ spin_unlock_irqrestore(&ctxp->ctxlock, iflag);
lpfc_printf_log(phba, KERN_INFO, LOG_NVME_IOERR,
"6425 Defer rcv: no buffer oxid x%x: "
"flg %x ste %x\n",
ctxp->oxid, ctxp->flag, ctxp->state);
return;
}
+ ctxp->rqb_buffer = NULL;
+ spin_unlock_irqrestore(&ctxp->ctxlock, iflag);
tgtp = phba->targetport->private;
if (tgtp)
@@ -1265,9 +1270,6 @@ lpfc_nvmet_defer_rcv(struct nvmet_fc_target_port *tgtport,
/* Free the nvmebuf since a new buffer already replaced it */
nvmebuf->hrq->rqbp->rqb_free_buffer(phba, nvmebuf);
- spin_lock_irqsave(&ctxp->ctxlock, iflag);
- ctxp->rqb_buffer = NULL;
- spin_unlock_irqrestore(&ctxp->ctxlock, iflag);
}
/**
--
2.43.0
Hi,
Charles Bordet reported the following issue (full context in
https://bugs.debian.org/1108860)
> Dear Maintainer,
>
> What led up to the situation?
> We run a production environment using Debian 12 VMs, with a network
> topology involving VXLAN tunnels encapsulated inside Wireguard
> interfaces. This setup has worked reliably for over a year, with MTU set
> to 1500 on all interfaces except the Wireguard interface (set to 1420).
> Wireguard kernel fragmentation allowed this configuration to function
> without issues, even though the effective path MTU is lower than 1500.
>
> What exactly did you do (or not do) that was effective (or ineffective)?
> We performed a routine system upgrade, updating all packages include the
> kernel. After the upgrade, we observed severe network issues (timeouts,
> very slow HTTP/HTTPS, and apt update failures) on all VMs behind the
> router. SSH and small-packet traffic continued to work.
>
> To diagnose, we:
>
> * Restored a backup (with the previous kernel): the problem disappeared.
> * Repeated the upgrade, confirming the issue reappeared.
> * Systematically tested each kernel version from 6.1.124-1 up to
> 6.1.140-1. The problem first appears with kernel 6.1.135-1; all earlier
> versions work as expected.
> * Kernel version from the backports (6.12.32-1) did not resolve the
> problem.
>
> What was the outcome of this action?
>
> * With kernel 6.1.135-1 or later, network timeouts occur for
> large-packet protocols (HTTP, apt, etc.), while SSH and small-packet
> protocols work.
> * With kernel 6.1.133-1 or earlier, everything works as expected.
>
> What outcome did you expect instead?
> We expected the network to function as before, with Wireguard handling
> fragmentation transparently and no application-level timeouts,
> regardless of the kernel version.
While triaging the issue we found that commit 8930424777e4
("tunnels: Accept PACKET_HOST in skb_tunnel_check_pmtu().") introduces
the issue, and Charles confirmed that the issue was present as well in
6.12.35 and 6.15.4 (later versions could potentially still be
affected, but we wanted to check that it is not a 6.1.y-specific
regression).
Reverting the commit fixes Charles' issue.
Does that ring a bell?
Regards,
Salvatore
Fix multiple fwnode reference leaks:
1. The function calls fwnode_get_named_child_node() to get the "leds" node,
but never calls fwnode_handle_put(leds) to release this reference.
2. Within the fwnode_for_each_child_node() loop, the early return
paths don't release the "led" fwnode reference.
This fix follows the same pattern as commit d029edefed39
("net dsa: qca8k: fix usages of device_get_named_child_node()")
Fixes: 94a2a84f5e9e ("net: dsa: mv88e6xxx: Support LED control")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/net/dsa/mv88e6xxx/leds.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/net/dsa/mv88e6xxx/leds.c b/drivers/net/dsa/mv88e6xxx/leds.c
index 1c88bfaea46b..dcc765066f9c 100644
--- a/drivers/net/dsa/mv88e6xxx/leds.c
+++ b/drivers/net/dsa/mv88e6xxx/leds.c
@@ -779,6 +779,8 @@ int mv88e6xxx_port_setup_leds(struct mv88e6xxx_chip *chip, int port)
continue;
if (led_num > 1) {
dev_err(dev, "invalid LED specified port %d\n", port);
+ fwnode_handle_put(led);
+ fwnode_handle_put(leds);
return -EINVAL;
}
@@ -823,17 +825,23 @@ int mv88e6xxx_port_setup_leds(struct mv88e6xxx_chip *chip, int port)
init_data.devname_mandatory = true;
init_data.devicename = kasprintf(GFP_KERNEL, "%s:0%d:0%d", chip->info->name,
port, led_num);
- if (!init_data.devicename)
+ if (!init_data.devicename) {
+ fwnode_handle_put(led);
+ fwnode_handle_put(leds);
return -ENOMEM;
+ }
ret = devm_led_classdev_register_ext(dev, l, &init_data);
kfree(init_data.devicename);
if (ret) {
dev_err(dev, "Failed to init LED %d for port %d", led_num, port);
+ fwnode_handle_put(led);
+ fwnode_handle_put(leds);
return ret;
}
}
+ fwnode_handle_put(leds);
return 0;
}
--
2.35.1
Before conversion to unify the calibration data management, the
tas2563_apply_calib() function performed the big endian conversion and
wrote the calibration data to the device. The writing is now done by the
common tasdev_load_calibrated_data() function, but without conversion.
Put the values into the calibration data buffer with the expected
endianness.
Fixes: 4fe238513407 ("ALSA: hda/tas2781: Move and unified the calibrated-data getting function for SPI and I2C into the tas2781_hda lib")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Gergo Koteles <soyer(a)irl.hu>
---
sound/hda/codecs/side-codecs/tas2781_hda_i2c.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c b/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c
index e34b17f0c9b9..1eac751ab2a6 100644
--- a/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c
+++ b/sound/hda/codecs/side-codecs/tas2781_hda_i2c.c
@@ -310,6 +310,7 @@ static int tas2563_save_calibration(struct tas2781_hda *h)
struct cali_reg *r = &cd->cali_reg_array;
unsigned int offset = 0;
unsigned char *data;
+ __be32 bedata;
efi_status_t status;
unsigned int attr;
int ret, i, j, k;
@@ -351,6 +352,8 @@ static int tas2563_save_calibration(struct tas2781_hda *h)
i, j, status);
return -EINVAL;
}
+ bedata = cpu_to_be32(*(uint32_t *)&data[offset]);
+ memcpy(&data[offset], &bedata, sizeof(bedata));
offset += TAS2563_CAL_DATA_SIZE;
}
}
--
2.51.0
From: Donet Tom <donettom(a)linux.ibm.com>
On systems using the hash MMU, there is a software SLB preload cache that
mirrors the entries loaded into the hardware SLB buffer. This preload
cache is subject to periodic eviction, typically after every 256 context
switches, to remove old entries.
To optimize performance, the kernel skips switch_mmu_context() in
switch_mm_irqs_off() when the prev and next mm_struct are the same.
However, on hash MMU systems, this can lead to inconsistencies between
the hardware SLB and the software preload cache.
If an SLB entry for a process is evicted from the software cache on one
CPU, and the same process later runs on another CPU without executing
switch_mmu_context(), the hardware SLB may retain stale entries. If the
kernel then attempts to reload that entry, it can trigger an SLB
multi-hit error.
The following timeline shows how stale SLB entries are created and can
cause a multi-hit error when a process moves between CPUs without a
MMU context switch.
CPU 0 CPU 1
----- -----
Process P
exec swapper/1
load_elf_binary
begin_new_exc
activate_mm
switch_mm_irqs_off
switch_mmu_context
switch_slb
/*
* This invalidates all
* the entries in the HW
* and setup the new HW
* SLB entries as per the
* preload cache.
*/
context_switch
sched_migrate_task migrates process P to cpu-1
Process swapper/0 context switch (to process P)
(uses mm_struct of Process P) switch_mm_irqs_off()
switch_slb
load_slb++
/*
* load_slb becomes 0 here
* and we evict an entry from
* the preload cache with
* preload_age(). We still
* keep HW SLB and preload
* cache in sync, that is
* because all HW SLB entries
* anyways gets evicted in
* switch_slb during SLBIA.
* We then only add those
* entries back in HW SLB,
* which are currently
* present in preload_cache
* (after eviction).
*/
load_elf_binary continues...
setup_new_exec()
slb_setup_new_exec()
sched_switch event
sched_migrate_task migrates
process P to cpu-0
context_switch from swapper/0 to Process P
switch_mm_irqs_off()
/*
* Since both prev and next mm struct are same we don't call
* switch_mmu_context(). This will cause the HW SLB and SW preload
* cache to go out of sync in preload_new_slb_context. Because there
* was an SLB entry which was evicted from both HW and preload cache
* on cpu-1. Now later in preload_new_slb_context(), when we will try
* to add the same preload entry again, we will add this to the SW
* preload cache and then will add it to the HW SLB. Since on cpu-0
* this entry was never invalidated, hence adding this entry to the HW
* SLB will cause a SLB multi-hit error.
*/
load_elf_binary continues...
START_THREAD
start_thread
preload_new_slb_context
/*
* This tries to add a new EA to preload cache which was earlier
* evicted from both cpu-1 HW SLB and preload cache. This caused the
* HW SLB of cpu-0 to go out of sync with the SW preload cache. The
* reason for this was, that when we context switched back on CPU-0,
* we should have ideally called switch_mmu_context() which will
* bring the HW SLB entries on CPU-0 in sync with SW preload cache
* entries by setting up the mmu context properly. But we didn't do
* that since the prev mm_struct running on cpu-0 was same as the
* next mm_struct (which is true for swapper / kernel threads). So
* now when we try to add this new entry into the HW SLB of cpu-0,
* we hit a SLB multi-hit error.
*/
WARNING: CPU: 0 PID: 1810970 at arch/powerpc/mm/book3s64/slb.c:62
assert_slb_presence+0x2c/0x50
Modules linked in:
CPU: 0 UID: 0 PID: 1810970 Comm: dd Not tainted 6.16.0-rc3-dirty #12
VOLUNTARY
Hardware name: IBM pSeries (emulated by qemu) POWER8 (architected)
0x4d0200 0xf000004 of:SLOF,HEAD hv:linux,kvm pSeries
NIP: c00000000015426c LR: c0000000001543b4 CTR: 0000000000000000
REGS: c0000000497c77e0 TRAP: 0700 Not tainted (6.16.0-rc3-dirty)
MSR: 8000000002823033 <SF,VEC,VSX,FP,ME,IR,DR,RI,LE> CR: 28888482 XER: 00000000
CFAR: c0000000001543b0 IRQMASK: 3
<...>
NIP [c00000000015426c] assert_slb_presence+0x2c/0x50
LR [c0000000001543b4] slb_insert_entry+0x124/0x390
Call Trace:
0x7fffceb5ffff (unreliable)
preload_new_slb_context+0x100/0x1a0
start_thread+0x26c/0x420
load_elf_binary+0x1b04/0x1c40
bprm_execve+0x358/0x680
do_execveat_common+0x1f8/0x240
sys_execve+0x58/0x70
system_call_exception+0x114/0x300
system_call_common+0x160/0x2c4
From the above analysis, during early exec the hardware SLB is cleared,
and entries from the software preload cache are reloaded into hardware
by switch_slb. However, preload_new_slb_context and slb_setup_new_exec
also attempt to load some of the same entries, which can trigger a
multi-hit. In most cases, these additional preloads simply hit existing
entries and add nothing new. Removing these functions avoids redundant
preloads and eliminates the multi-hit issue. This patch removes these
two functions.
We tested process switching performance using the context_switch
benchmark on POWER9/hash, and observed no regression.
Without this patch: 129041 ops/sec
With this patch: 129341 ops/sec
We also measured SLB faults during boot, and the counts are essentially
the same with and without this patch.
SLB faults without this patch: 19727
SLB faults with this patch: 19786
Cc: Madhavan Srinivasan <maddy(a)linux.ibm.com>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Nicholas Piggin <npiggin(a)gmail.com>
Cc: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Cc: Paul Mackerras <paulus(a)ozlabs.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Donet Tom <donettom(a)linux.ibm.com>
Cc: linuxppc-dev(a)lists.ozlabs.org
Fixes: 5434ae74629a ("powerpc/64s/hash: Add a SLB preload cache")
cc: <stable(a)vger.kernel.org>
Suggested-by: Nicholas Piggin <npiggin(a)gmail.com>
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
---
arch/powerpc/include/asm/book3s/64/mmu-hash.h | 1 -
arch/powerpc/kernel/process.c | 5 --
arch/powerpc/mm/book3s64/internal.h | 2 -
arch/powerpc/mm/book3s64/mmu_context.c | 2 -
arch/powerpc/mm/book3s64/slb.c | 88 -------------------
5 files changed, 98 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 1c4eebbc69c9..e1f77e2eead4 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -524,7 +524,6 @@ void slb_save_contents(struct slb_entry *slb_ptr);
void slb_dump_contents(struct slb_entry *slb_ptr);
extern void slb_vmalloc_update(void);
-void preload_new_slb_context(unsigned long start, unsigned long sp);
#ifdef CONFIG_PPC_64S_HASH_MMU
void slb_set_size(u16 size);
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 855e09886503..2b9799157eb4 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1897,8 +1897,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
return 0;
}
-void preload_new_slb_context(unsigned long start, unsigned long sp);
-
/*
* Set up a thread for executing a new program
*/
@@ -1906,9 +1904,6 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
{
#ifdef CONFIG_PPC64
unsigned long load_addr = regs->gpr[2]; /* saved by ELF_PLAT_INIT */
-
- if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && !radix_enabled())
- preload_new_slb_context(start, sp);
#endif
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
diff --git a/arch/powerpc/mm/book3s64/internal.h b/arch/powerpc/mm/book3s64/internal.h
index a57a25f06a21..c26a6f0c90fc 100644
--- a/arch/powerpc/mm/book3s64/internal.h
+++ b/arch/powerpc/mm/book3s64/internal.h
@@ -24,8 +24,6 @@ static inline bool stress_hpt(void)
void hpt_do_stress(unsigned long ea, unsigned long hpte_group);
-void slb_setup_new_exec(void);
-
void exit_lazy_flush_tlb(struct mm_struct *mm, bool always_flush);
#endif /* ARCH_POWERPC_MM_BOOK3S64_INTERNAL_H */
diff --git a/arch/powerpc/mm/book3s64/mmu_context.c b/arch/powerpc/mm/book3s64/mmu_context.c
index 4e1e45420bd4..fb9dcf9ca599 100644
--- a/arch/powerpc/mm/book3s64/mmu_context.c
+++ b/arch/powerpc/mm/book3s64/mmu_context.c
@@ -150,8 +150,6 @@ static int hash__init_new_context(struct mm_struct *mm)
void hash__setup_new_exec(void)
{
slice_setup_new_exec();
-
- slb_setup_new_exec();
}
#else
static inline int hash__init_new_context(struct mm_struct *mm)
diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
index 6b783552403c..7e053c561a09 100644
--- a/arch/powerpc/mm/book3s64/slb.c
+++ b/arch/powerpc/mm/book3s64/slb.c
@@ -328,94 +328,6 @@ static void preload_age(struct thread_info *ti)
ti->slb_preload_tail = (ti->slb_preload_tail + 1) % SLB_PRELOAD_NR;
}
-void slb_setup_new_exec(void)
-{
- struct thread_info *ti = current_thread_info();
- struct mm_struct *mm = current->mm;
- unsigned long exec = 0x10000000;
-
- WARN_ON(irqs_disabled());
-
- /*
- * preload cache can only be used to determine whether a SLB
- * entry exists if it does not start to overflow.
- */
- if (ti->slb_preload_nr + 2 > SLB_PRELOAD_NR)
- return;
-
- hard_irq_disable();
-
- /*
- * We have no good place to clear the slb preload cache on exec,
- * flush_thread is about the earliest arch hook but that happens
- * after we switch to the mm and have already preloaded the SLBEs.
- *
- * For the most part that's probably okay to use entries from the
- * previous exec, they will age out if unused. It may turn out to
- * be an advantage to clear the cache before switching to it,
- * however.
- */
-
- /*
- * preload some userspace segments into the SLB.
- * Almost all 32 and 64bit PowerPC executables are linked at
- * 0x10000000 so it makes sense to preload this segment.
- */
- if (!is_kernel_addr(exec)) {
- if (preload_add(ti, exec))
- slb_allocate_user(mm, exec);
- }
-
- /* Libraries and mmaps. */
- if (!is_kernel_addr(mm->mmap_base)) {
- if (preload_add(ti, mm->mmap_base))
- slb_allocate_user(mm, mm->mmap_base);
- }
-
- /* see switch_slb */
- asm volatile("isync" : : : "memory");
-
- local_irq_enable();
-}
-
-void preload_new_slb_context(unsigned long start, unsigned long sp)
-{
- struct thread_info *ti = current_thread_info();
- struct mm_struct *mm = current->mm;
- unsigned long heap = mm->start_brk;
-
- WARN_ON(irqs_disabled());
-
- /* see above */
- if (ti->slb_preload_nr + 3 > SLB_PRELOAD_NR)
- return;
-
- hard_irq_disable();
-
- /* Userspace entry address. */
- if (!is_kernel_addr(start)) {
- if (preload_add(ti, start))
- slb_allocate_user(mm, start);
- }
-
- /* Top of stack, grows down. */
- if (!is_kernel_addr(sp)) {
- if (preload_add(ti, sp))
- slb_allocate_user(mm, sp);
- }
-
- /* Bottom of heap, grows up. */
- if (heap && !is_kernel_addr(heap)) {
- if (preload_add(ti, heap))
- slb_allocate_user(mm, heap);
- }
-
- /* see switch_slb */
- asm volatile("isync" : : : "memory");
-
- local_irq_enable();
-}
-
static void slb_cache_slbie_kernel(unsigned int index)
{
unsigned long slbie_data = get_paca()->slb_cache[index];
--
2.50.1
On 8/29/25 13:29, yangshiguang wrote:
> At 2025-08-27 16:40:21, "Vlastimil Babka" <vbabka(a)suse.cz> wrote:
>
>>>
>>>>
>>>
>>> How about this?
>>>
>>> /* Preemption is disabled in ___slab_alloc() */
>>> - gfp_flags &= ~(__GFP_DIRECT_RECLAIM);
>>> + gfp_flags = (gfp_flags & ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL)) |
>>> + __GFP_NOWARN;
>>
>>I'd suggest using gfp_nested_flags() and & ~(__GFP_DIRECT_RECLAIM);
>>
>
> However, gfp has been processed by gfp_nested_mask() in
> stack_depot_save_flags().
Aha, didn't notice. Good to know!
> Still need to call here?
No then we can indeed just mask out __GFP_DIRECT_RECLAIM.
Maybe the comment could say something like:
/*
* Preemption is disabled in ___slab_alloc() so we need to disallow
* blocking. The flags are further adjusted by gfp_nested_mask() in
* stack_depot itself.
*/
> set_track_prepare()
> ->stack_depot_save_flags()
>
>>> >--
>>>>Cheers,
>>>>Harry / Hyeonggon
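Putting the two replies together, the hunk under discussion would presumably
end up as (a sketch, not the posted patch):
	/*
	 * Preemption is disabled in ___slab_alloc() so we need to disallow
	 * blocking. The flags are further adjusted by gfp_nested_mask() in
	 * stack_depot itself.
	 */
	gfp_flags &= ~__GFP_DIRECT_RECLAIM;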
Its actions are opportunistic anyway and will be completed
on device suspend.
Also restrict the scope of the pm runtime reference to
the notifier callback itself to make it easier to
follow.
Marking as a fix to simplify backporting of the fix
that follows in the series.
Fixes: c6a4d46ec1d7 ("drm/xe: evict user memory in PM notifier")
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.16+
Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
---
drivers/gpu/drm/xe/xe_pm.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index a2e85030b7f4..b57b46ad9f7c 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -308,28 +308,22 @@ static int xe_pm_notifier_callback(struct notifier_block *nb,
case PM_SUSPEND_PREPARE:
xe_pm_runtime_get(xe);
err = xe_bo_evict_all_user(xe);
- if (err) {
+ if (err)
drm_dbg(&xe->drm, "Notifier evict user failed (%d)\n", err);
- xe_pm_runtime_put(xe);
- break;
- }
err = xe_bo_notifier_prepare_all_pinned(xe);
- if (err) {
+ if (err)
drm_dbg(&xe->drm, "Notifier prepare pin failed (%d)\n", err);
- xe_pm_runtime_put(xe);
- }
+ xe_pm_runtime_put(xe);
break;
case PM_POST_HIBERNATION:
case PM_POST_SUSPEND:
+ xe_pm_runtime_get(xe);
xe_bo_notifier_unprepare_all_pinned(xe);
xe_pm_runtime_put(xe);
break;
}
- if (err)
- return NOTIFY_BAD;
-
return NOTIFY_DONE;
}
--
2.50.1
The current implementation of CXL memory hotplug notifier gets called
before the HMAT memory hotplug notifier. The CXL driver calculates the
access coordinates (bandwidth and latency values) for the CXL end to
end path (i.e. CPU to endpoint). When the CXL region is onlined, the CXL
memory hotplug notifier writes the access coordinates to the HMAT target
structs. Then the HMAT memory hotplug notifier is called and it creates
the access coordinates for the node sysfs attributes.
During testing on an Intel platform, it was found that although the
newly calculated coordinates were pushed to sysfs, the sysfs attributes for
the access coordinates showed up with the wrong initiator. The system has
4 nodes (0, 1, 2, 3) where node 0 and 1 are CPU nodes and node 2 and 3 are
CXL nodes. The expectation is that node 2 would show up as a target to node
0:
/sys/devices/system/node/node2/access0/initiators/node0
However it was observed that node 2 showed up as a target under node 1:
/sys/devices/system/node/node2/access0/initiators/node1
The original intent of the 'ext_updated' flag in HMAT handling code was to
stop HMAT memory hotplug callback from clobbering the access coordinates
after CXL has injected its calculated coordinates and replaced the generic
target access coordinates provided by the HMAT table in the HMAT target
structs. However the flag is hacky at best and blocks the updates from
other CXL regions that are onlined in the same node later on. Remove the
'ext_updated' flag usage and just update the access coordinates for the
nodes directly without touching HMAT target data.
The hotplug memory callback ordering is changed. Instead of changing CXL,
move HMAT back so there's room for the levels rather than have CXL share
the same level as SLAB_CALLBACK_PRI. This results in the CXL
callback being executed after the HMAT callback.
With the change, the CXL hotplug memory notifier runs after the HMAT
callback. The HMAT callback will create the node sysfs attributes for
access coordinates. The CXL callback will write the access coordinates to
the now created node sysfs attributes directly and will not pollute the
HMAT target values.
Fixes: 067353a46d8c ("cxl/region: Add memory hotplug notifier for cxl region")
Cc: stable(a)vger.kernel.org
Tested-by: Marc Herbert <marc.herbert(a)linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Dave Jiang <dave.jiang(a)intel.com>
---
v2:
- Add description to what was observed for the issue. (Dan)
- Use the correct Fixes tag. (Dan)
- Add Cc to stable. (Dan)
- Add support to only update on first region appearance. (Jonathan)
---
drivers/acpi/numa/hmat.c | 6 ------
drivers/cxl/core/cdat.c | 5 -----
drivers/cxl/core/core.h | 1 -
drivers/cxl/core/region.c | 21 +++++++++++++--------
include/linux/memory.h | 2 +-
5 files changed, 14 insertions(+), 21 deletions(-)
diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 4958301f5417..5d32490dc4ab 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -74,7 +74,6 @@ struct memory_target {
struct node_cache_attrs cache_attrs;
u8 gen_port_device_handle[ACPI_SRAT_DEVICE_HANDLE_SIZE];
bool registered;
- bool ext_updated; /* externally updated */
};
struct memory_initiator {
@@ -391,7 +390,6 @@ int hmat_update_target_coordinates(int nid, struct access_coordinate *coord,
coord->read_bandwidth, access);
hmat_update_target_access(target, ACPI_HMAT_WRITE_BANDWIDTH,
coord->write_bandwidth, access);
- target->ext_updated = true;
return 0;
}
@@ -773,10 +771,6 @@ static void hmat_update_target_attrs(struct memory_target *target,
u32 best = 0;
int i;
- /* Don't update if an external agent has changed the data. */
- if (target->ext_updated)
- return;
-
/* Don't update for generic port if there's no device handle */
if ((access == NODE_ACCESS_CLASS_GENPORT_SINK_LOCAL ||
access == NODE_ACCESS_CLASS_GENPORT_SINK_CPU) &&
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index c0af645425f4..c891fd618cfd 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -1081,8 +1081,3 @@ int cxl_update_hmat_access_coordinates(int nid, struct cxl_region *cxlr,
{
return hmat_update_target_coordinates(nid, &cxlr->coord[access], access);
}
-
-bool cxl_need_node_perf_attrs_update(int nid)
-{
- return !acpi_node_backed_by_real_pxm(nid);
-}
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 2669f251d677..a253d308f3c9 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -139,7 +139,6 @@ long cxl_pci_get_latency(struct pci_dev *pdev);
int cxl_pci_get_bandwidth(struct pci_dev *pdev, struct access_coordinate *c);
int cxl_update_hmat_access_coordinates(int nid, struct cxl_region *cxlr,
enum access_coordinate_class access);
-bool cxl_need_node_perf_attrs_update(int nid);
int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
struct access_coordinate *c);
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 71cc42d05248..371873fc43eb 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -9,6 +9,7 @@
#include <linux/uuid.h>
#include <linux/sort.h>
#include <linux/idr.h>
+#include <linux/xarray.h>
#include <linux/memory-tiers.h>
#include <cxlmem.h>
#include <cxl.h>
@@ -30,6 +31,9 @@
* 3. Decoder targets
*/
+/* xarray that stores the reference count per node for regions */
+static DEFINE_XARRAY(node_regions_xa);
+
static struct cxl_region *to_cxl_region(struct device *dev);
#define __ACCESS_ATTR_RO(_level, _name) { \
@@ -2442,14 +2446,8 @@ static bool cxl_region_update_coordinates(struct cxl_region *cxlr, int nid)
for (int i = 0; i < ACCESS_COORDINATE_MAX; i++) {
if (cxlr->coord[i].read_bandwidth) {
- rc = 0;
- if (cxl_need_node_perf_attrs_update(nid))
- node_set_perf_attrs(nid, &cxlr->coord[i], i);
- else
- rc = cxl_update_hmat_access_coordinates(nid, cxlr, i);
-
- if (rc == 0)
- cset++;
+ node_update_perf_attrs(nid, &cxlr->coord[i], i);
+ cset++;
}
}
@@ -2475,6 +2473,7 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
struct node_notify *nn = arg;
int nid = nn->nid;
int region_nid;
+ int rc;
if (action != NODE_ADDED_FIRST_MEMORY)
return NOTIFY_DONE;
@@ -2487,6 +2486,11 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
if (nid != region_nid)
return NOTIFY_DONE;
+ /* No action needed if there's existing entry */
+ rc = xa_insert(&node_regions_xa, nid, NULL, GFP_KERNEL);
+ if (rc < 0)
+ return NOTIFY_DONE;
+
if (!cxl_region_update_coordinates(cxlr, nid))
return NOTIFY_DONE;
@@ -3638,6 +3642,7 @@ int cxl_region_init(void)
void cxl_region_exit(void)
{
+ xa_destroy(&node_regions_xa);
cxl_driver_unregister(&cxl_region_driver);
}
diff --git a/include/linux/memory.h b/include/linux/memory.h
index de5c0d8e8925..684fd641f2f0 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -120,8 +120,8 @@ struct mem_section;
*/
#define DEFAULT_CALLBACK_PRI 0
#define SLAB_CALLBACK_PRI 1
-#define HMAT_CALLBACK_PRI 2
#define CXL_CALLBACK_PRI 5
+#define HMAT_CALLBACK_PRI 6
#define MM_COMPUTE_BATCH_PRI 10
#define CPUSET_CALLBACK_PRI 10
#define MEMTIER_HOTPLUG_PRI 100
--
2.50.1
Hi Greg,
Could you pick up this commit for 6.12 and 6.6:
00a973e093e9 ("interconnect: qcom: icc-rpm: Set the count member before accessing the flex array")
It just silences a UBSan warning so it doesn't affect regular users, but
it helps in testing to silence those warnings. It is a clean cherry-pick.
regards,
dan carpenter
On Wed, Dec 04, 2024 at 12:33:34AM +0200, djakov(a)kernel.org wrote:
> From: Georgi Djakov <djakov(a)kernel.org>
>
> The following UBSAN error is reported during boot on the db410c board on
> a clang-19 build:
>
> Internal error: UBSAN: array index out of bounds: 00000000f2005512 [#1] PREEMPT SMP
> ...
> pc : qnoc_probe+0x5f8/0x5fc
> ...
>
> The cause of the error is that the counter member was not set before
> accessing the annotated flexible array member, but after that. Fix this
> by initializing it earlier.
>
> Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
> Closes: https://lore.kernel.org/r/CA+G9fYs+2mBz1y2dAzxkj9-oiBJ2Acm1Sf1h2YQ3VmBqj_VX…
> Fixes: dd4904f3b924 ("interconnect: qcom: Annotate struct icc_onecell_data with __counted_by")
> Signed-off-by: Georgi Djakov <djakov(a)kernel.org>
> ---
> drivers/interconnect/qcom/icc-rpm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/interconnect/qcom/icc-rpm.c b/drivers/interconnect/qcom/icc-rpm.c
> index a8ed435f696c..ea1042d38128 100644
> --- a/drivers/interconnect/qcom/icc-rpm.c
> +++ b/drivers/interconnect/qcom/icc-rpm.c
> @@ -503,6 +503,7 @@ int qnoc_probe(struct platform_device *pdev)
> GFP_KERNEL);
> if (!data)
> return -ENOMEM;
> + data->num_nodes = num_nodes;
>
> qp->num_intf_clks = cd_num;
> for (i = 0; i < cd_num; i++)
> @@ -597,7 +598,6 @@ int qnoc_probe(struct platform_device *pdev)
>
> data->nodes[i] = node;
> }
> - data->num_nodes = num_nodes;
>
> clk_bulk_disable_unprepare(qp->num_intf_clks, qp->intf_clks);
>
Commit ID: 9cd51eefae3c871440b93c03716c5398f41bdf78
Why it should be applied: This is a small addition to a quirk list for which I
forgot to set cc: stable when I originally submitted it. Bringing it into stable
will result in several downstream distributions automatically adopting the
patch, helping the affected device.
What kernel versions I wish it to be applied to: It should apply cleanly to
6.1.y and newer longterm releases.
I hope I correctly followed
https://docs.kernel.org/process/stable-kernel-rules.html#option-2 to bring this
into stable.
Note:
The issue looks like it comes from the tty layer directly, but I don't see who the maintainer is, so I am emailing the closest match I can find.
Description:
On the command line (console), sticky keys reset only when typing ASCII and ISO-8859-1 characters.
Tested with the QWERTY Lafayette layout: https://codeberg.org/Alerymin/kbd-qwerty-lafayette
Observed Behaviour:
When the layout is loaded as ISO-8859-15, most characters typed don't reset the sticky key, unless they are basic ASCII or Unicode characters
When the layout is loaded in ISO-8859-1, the sticky key works fine.
Expected behaviour:
Sticky key working in ISO-8859-1 and ISO-8859-15
System used:
Arch Linux, kernel 6.16.3-arch1-1
When the xe pm_notifier evicts for suspend / hibernate, there might be
racing tasks trying to re-validate again. This can lead to suspend taking
excessive time or get stuck in a live-lock. This behaviour becomes
much worse with the fix that actually makes re-validation bring back
bos to VRAM rather than letting them remain in TT.
Prevent that by having exec and the rebind worker wait for a completion
that is set to block by the pm_notifier before suspend and is signaled
by the pm_notifier after resume / wakeup.
It's probably still possible to craft malicious applications that block
suspending. More work is pending to fix that.
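In outline, with generic names standing in for the xe members (illustrative
only; see the diff below for the real code), the gating works like this:
	/* Sketch only: 'pm_gate' stands in for the per-device completion. */
	init_completion(&pm_gate);
	complete_all(&pm_gate);			/* open while running */

	/* suspend / hibernate prepare: */
	reinit_completion(&pm_gate);		/* block new validations */

	/* post resume / post hibernation: */
	complete_all(&pm_gate);			/* release waiters */

	/* exec IOCTL / rebind worker, before validating: */
	if (wait_for_completion_interruptible(&pm_gate))
		return -ERESTARTSYS;		/* rerun after task thaw */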
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4288
Fixes: c6a4d46ec1d7 ("drm/xe: evict user memory in PM notifier")
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.16+
Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
---
drivers/gpu/drm/xe/xe_device_types.h | 2 ++
drivers/gpu/drm/xe/xe_exec.c | 9 +++++++++
drivers/gpu/drm/xe/xe_pm.c | 4 ++++
drivers/gpu/drm/xe/xe_vm.c | 14 ++++++++++++++
4 files changed, 29 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 092004d14db2..6602bd678cbc 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -507,6 +507,8 @@ struct xe_device {
/** @pm_notifier: Our PM notifier to perform actions in response to various PM events. */
struct notifier_block pm_notifier;
+ /** @pm_block: Completion to block validating tasks on suspend / hibernate prepare */
+ struct completion pm_block;
/** @pmt: Support the PMT driver callback interface */
struct {
diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
index 44364c042ad7..374c831e691b 100644
--- a/drivers/gpu/drm/xe/xe_exec.c
+++ b/drivers/gpu/drm/xe/xe_exec.c
@@ -237,6 +237,15 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
goto err_unlock_list;
}
+ /*
+ * It's OK to block interruptible here with the vm lock held, since
+ * on task freezing during suspend / hibernate, the call will
+ * return -ERESTARTSYS and the IOCTL will be rerun.
+ */
+ err = wait_for_completion_interruptible(&xe->pm_block);
+ if (err)
+ goto err_unlock_list;
+
vm_exec.vm = &vm->gpuvm;
vm_exec.flags = DRM_EXEC_INTERRUPTIBLE_WAIT;
if (xe_vm_in_lr_mode(vm)) {
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index b57b46ad9f7c..2d7b05d8a78b 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -306,6 +306,7 @@ static int xe_pm_notifier_callback(struct notifier_block *nb,
switch (action) {
case PM_HIBERNATION_PREPARE:
case PM_SUSPEND_PREPARE:
+ reinit_completion(&xe->pm_block);
xe_pm_runtime_get(xe);
err = xe_bo_evict_all_user(xe);
if (err)
@@ -318,6 +319,7 @@ static int xe_pm_notifier_callback(struct notifier_block *nb,
break;
case PM_POST_HIBERNATION:
case PM_POST_SUSPEND:
+ complete_all(&xe->pm_block);
xe_pm_runtime_get(xe);
xe_bo_notifier_unprepare_all_pinned(xe);
xe_pm_runtime_put(xe);
@@ -345,6 +347,8 @@ int xe_pm_init(struct xe_device *xe)
if (err)
return err;
+ init_completion(&xe->pm_block);
+ complete_all(&xe->pm_block);
/* For now suspend/resume is only allowed with GuC */
if (!xe_device_uc_enabled(xe))
return 0;
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 3ff3c67aa79d..edcdb0528f2a 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -394,6 +394,9 @@ static int xe_gpuvm_validate(struct drm_gpuvm_bo *vm_bo, struct drm_exec *exec)
list_move_tail(&gpuva_to_vma(gpuva)->combined_links.rebind,
&vm->rebind_list);
+ if (!try_wait_for_completion(&vm->xe->pm_block))
+ return -EAGAIN;
+
ret = xe_bo_validate(gem_to_xe_bo(vm_bo->obj), vm, false);
if (ret)
return ret;
@@ -494,6 +497,12 @@ static void preempt_rebind_work_func(struct work_struct *w)
xe_assert(vm->xe, xe_vm_in_preempt_fence_mode(vm));
trace_xe_vm_rebind_worker_enter(vm);
+ /*
+ * This blocks the wq during suspend / hibernate.
+ * Don't hold any locks.
+ */
+retry_pm:
+ wait_for_completion(&vm->xe->pm_block);
down_write(&vm->lock);
if (xe_vm_is_closed_or_banned(vm)) {
@@ -503,6 +512,11 @@ static void preempt_rebind_work_func(struct work_struct *w)
}
retry:
+ if (!try_wait_for_completion(&vm->xe->pm_block)) {
+ up_write(&vm->lock);
+ goto retry_pm;
+ }
+
if (xe_vm_userptr_check_repin(vm)) {
err = xe_vm_userptr_pin(vm);
if (err)
--
2.50.1
VRAM+TT bos that are evicted from VRAM to TT may remain in
TT also after a revalidation following eviction or suspend.
This manifests itself as applications becoming sluggish
after buffer objects get evicted or after a resume from
suspend or hibernation.
If the bo supports placement in both VRAM and TT, and
we are on DGFX, mark the TT placement as fallback. This means
that it is tried only after VRAM + eviction.
This flaw has probably been present since the xe module was
upstreamed but use a Fixes: commit below where backporting is
likely to be simple. For earlier versions we need to open-
code the fallback algorithm in the driver.
v2:
- Remove check for dgfx. (Matthew Auld)
- Update the xe_dma_buf kunit test for the new strategy (CI)
- Allow dma-buf to pin in current placement (CI)
- Make xe_bo_validate() for pinned bos a NOP.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5995
Fixes: a78a8da51b36 ("drm/ttm: replace busy placement with flags v6")
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.9+
Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld(a)intel.com>
---
drivers/gpu/drm/xe/tests/xe_bo.c | 2 +-
drivers/gpu/drm/xe/tests/xe_dma_buf.c | 10 +---------
drivers/gpu/drm/xe/xe_bo.c | 16 ++++++++++++----
drivers/gpu/drm/xe/xe_bo.h | 2 +-
drivers/gpu/drm/xe/xe_dma_buf.c | 2 +-
5 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c
index bb469096d072..7b40cc8be1c9 100644
--- a/drivers/gpu/drm/xe/tests/xe_bo.c
+++ b/drivers/gpu/drm/xe/tests/xe_bo.c
@@ -236,7 +236,7 @@ static int evict_test_run_tile(struct xe_device *xe, struct xe_tile *tile, struc
}
xe_bo_lock(external, false);
- err = xe_bo_pin_external(external);
+ err = xe_bo_pin_external(external, false);
xe_bo_unlock(external);
if (err) {
KUNIT_FAIL(test, "external bo pin err=%pe\n",
diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
index cde9530bef8c..3f30d2cef917 100644
--- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
@@ -85,15 +85,7 @@ static void check_residency(struct kunit *test, struct xe_bo *exported,
return;
}
- /*
- * If on different devices, the exporter is kept in system if
- * possible, saving a migration step as the transfer is just
- * likely as fast from system memory.
- */
- if (params->mem_mask & XE_BO_FLAG_SYSTEM)
- KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, XE_PL_TT));
- else
- KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, mem_type));
+ KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, mem_type));
if (params->force_different_devices)
KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(imported, XE_PL_TT));
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 4faf15d5fa6d..870f43347281 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -188,6 +188,8 @@ static void try_add_system(struct xe_device *xe, struct xe_bo *bo,
bo->placements[*c] = (struct ttm_place) {
.mem_type = XE_PL_TT,
+ .flags = (bo_flags & XE_BO_FLAG_VRAM_MASK) ?
+ TTM_PL_FLAG_FALLBACK : 0,
};
*c += 1;
}
@@ -2322,6 +2324,7 @@ uint64_t vram_region_gpu_offset(struct ttm_resource *res)
/**
* xe_bo_pin_external - pin an external BO
* @bo: buffer object to be pinned
+ * @in_place: Pin in current placement, don't attempt to migrate.
*
* Pin an external (not tied to a VM, can be exported via dma-buf / prime FD)
* BO. Unique call compared to xe_bo_pin as this function has it own set of
@@ -2329,7 +2332,7 @@ uint64_t vram_region_gpu_offset(struct ttm_resource *res)
*
* Returns 0 for success, negative error code otherwise.
*/
-int xe_bo_pin_external(struct xe_bo *bo)
+int xe_bo_pin_external(struct xe_bo *bo, bool in_place)
{
struct xe_device *xe = xe_bo_device(bo);
int err;
@@ -2338,9 +2341,11 @@ int xe_bo_pin_external(struct xe_bo *bo)
xe_assert(xe, xe_bo_is_user(bo));
if (!xe_bo_is_pinned(bo)) {
- err = xe_bo_validate(bo, NULL, false);
- if (err)
- return err;
+ if (!in_place) {
+ err = xe_bo_validate(bo, NULL, false);
+ if (err)
+ return err;
+ }
spin_lock(&xe->pinned.lock);
list_add_tail(&bo->pinned_link, &xe->pinned.late.external);
@@ -2493,6 +2498,9 @@ int xe_bo_validate(struct xe_bo *bo, struct xe_vm *vm, bool allow_res_evict)
};
int ret;
+ if (xe_bo_is_pinned(bo))
+ return 0;
+
if (vm) {
lockdep_assert_held(&vm->lock);
xe_vm_assert_held(vm);
diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
index 8cce413b5235..cfb1ec266a6d 100644
--- a/drivers/gpu/drm/xe/xe_bo.h
+++ b/drivers/gpu/drm/xe/xe_bo.h
@@ -200,7 +200,7 @@ static inline void xe_bo_unlock_vm_held(struct xe_bo *bo)
}
}
-int xe_bo_pin_external(struct xe_bo *bo);
+int xe_bo_pin_external(struct xe_bo *bo, bool in_place);
int xe_bo_pin(struct xe_bo *bo);
void xe_bo_unpin_external(struct xe_bo *bo);
void xe_bo_unpin(struct xe_bo *bo);
diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index 346f857f3837..af64baf872ef 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -72,7 +72,7 @@ static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
return ret;
}
- ret = xe_bo_pin_external(bo);
+ ret = xe_bo_pin_external(bo, true);
xe_assert(xe, !ret);
return 0;
--
2.50.1
VRAM+TT bos that are evicted from VRAM to TT may remain in
TT even after a revalidation following eviction or suspend.
This manifests itself as applications becoming sluggish
after buffer objects get evicted or after a resume from
suspend or hibernation.
If the bo supports placement in both VRAM and TT, and
we are on DGFX, mark the TT placement as fallback. This means
that it is tried only after VRAM + eviction.
This flaw has probably been present since the xe module was
upstreamed, but use a Fixes: commit below where backporting is
likely to be simple. For earlier versions we need to open-code
the fallback algorithm in the driver.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5995
Fixes: a78a8da51b36 ("drm/ttm: replace busy placement with flags v6")
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.9+
Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
---
drivers/gpu/drm/xe/xe_bo.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 4faf15d5fa6d..64dea4e478bd 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -188,6 +188,8 @@ static void try_add_system(struct xe_device *xe, struct xe_bo *bo,
bo->placements[*c] = (struct ttm_place) {
.mem_type = XE_PL_TT,
+ .flags = (IS_DGFX(xe) && (bo_flags & XE_BO_FLAG_VRAM_MASK)) ?
+ TTM_PL_FLAG_FALLBACK : 0,
};
*c += 1;
}
--
2.50.1
Make sure to drop the references to the sibling platform devices and
their child drm devices taken by of_find_device_by_node() and
device_find_child() when initialising the driver data during bind().
Fixes: 1ef7ed48356c ("drm/mediatek: Modify mediatek-drm for mt8195 multi mmsys support")
Cc: stable(a)vger.kernel.org # 6.4
Cc: Nancy.Lin <nancy.lin(a)mediatek.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/gpu/drm/mediatek/mtk_drm_drv.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index 7c0c12dde488..33b83576af7e 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -395,10 +395,12 @@ static bool mtk_drm_get_all_drm_priv(struct device *dev)
continue;
drm_dev = device_find_child(&pdev->dev, NULL, mtk_drm_match);
+ put_device(&pdev->dev);
if (!drm_dev)
continue;
temp_drm_priv = dev_get_drvdata(drm_dev);
+ put_device(drm_dev);
if (!temp_drm_priv)
continue;
--
2.49.1
A bug is observed when rtw89_core_tx_kick_off_and_wait() tries to
access the already freed skb_data:
BUG: KFENCE: use-after-free write in rtw89_core_tx_kick_off_and_wait drivers/net/wireless/realtek/rtw89/core.c:1110
CPU: 6 UID: 0 PID: 41377 Comm: kworker/u64:24 Not tainted 6.17.0-rc1+ #1 PREEMPT(lazy)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS edk2-20250523-14.fc42 05/23/2025
Workqueue: events_unbound cfg80211_wiphy_work [cfg80211]
Use-after-free write at 0x0000000020309d9d (in kfence-#251):
rtw89_core_tx_kick_off_and_wait drivers/net/wireless/realtek/rtw89/core.c:1110
rtw89_core_scan_complete drivers/net/wireless/realtek/rtw89/core.c:5338
rtw89_hw_scan_complete_cb drivers/net/wireless/realtek/rtw89/fw.c:7979
rtw89_chanctx_proceed_cb drivers/net/wireless/realtek/rtw89/chan.c:3165
rtw89_chanctx_proceed drivers/net/wireless/realtek/rtw89/chan.h:141
rtw89_hw_scan_complete drivers/net/wireless/realtek/rtw89/fw.c:8012
rtw89_mac_c2h_scanofld_rsp drivers/net/wireless/realtek/rtw89/mac.c:5059
rtw89_fw_c2h_work drivers/net/wireless/realtek/rtw89/fw.c:6758
process_one_work kernel/workqueue.c:3241
worker_thread kernel/workqueue.c:3400
kthread kernel/kthread.c:463
ret_from_fork arch/x86/kernel/process.c:154
ret_from_fork_asm arch/x86/entry/entry_64.S:258
kfence-#251: 0x0000000056e2393d-0x000000009943cb62, size=232, cache=skbuff_head_cache
allocated by task 41377 on cpu 6 at 77869.159548s (0.009551s ago):
__alloc_skb net/core/skbuff.c:659
__netdev_alloc_skb net/core/skbuff.c:734
ieee80211_nullfunc_get net/mac80211/tx.c:5844
rtw89_core_send_nullfunc drivers/net/wireless/realtek/rtw89/core.c:3431
rtw89_core_scan_complete drivers/net/wireless/realtek/rtw89/core.c:5338
rtw89_hw_scan_complete_cb drivers/net/wireless/realtek/rtw89/fw.c:7979
rtw89_chanctx_proceed_cb drivers/net/wireless/realtek/rtw89/chan.c:3165
rtw89_chanctx_proceed drivers/net/wireless/realtek/rtw89/chan.c:3194
rtw89_hw_scan_complete drivers/net/wireless/realtek/rtw89/fw.c:8012
rtw89_mac_c2h_scanofld_rsp drivers/net/wireless/realtek/rtw89/mac.c:5059
rtw89_fw_c2h_work drivers/net/wireless/realtek/rtw89/fw.c:6758
process_one_work kernel/workqueue.c:3241
worker_thread kernel/workqueue.c:3400
kthread kernel/kthread.c:463
ret_from_fork arch/x86/kernel/process.c:154
ret_from_fork_asm arch/x86/entry/entry_64.S:258
freed by task 1045 on cpu 9 at 77869.168393s (0.001557s ago):
ieee80211_tx_status_skb net/mac80211/status.c:1117
rtw89_pci_release_txwd_skb drivers/net/wireless/realtek/rtw89/pci.c:564
rtw89_pci_release_tx_skbs.isra.0 drivers/net/wireless/realtek/rtw89/pci.c:651
rtw89_pci_release_tx drivers/net/wireless/realtek/rtw89/pci.c:676
rtw89_pci_napi_poll drivers/net/wireless/realtek/rtw89/pci.c:4238
__napi_poll net/core/dev.c:7495
net_rx_action net/core/dev.c:7557 net/core/dev.c:7684
handle_softirqs kernel/softirq.c:580
do_softirq.part.0 kernel/softirq.c:480
__local_bh_enable_ip kernel/softirq.c:407
rtw89_pci_interrupt_threadfn drivers/net/wireless/realtek/rtw89/pci.c:927
irq_thread_fn kernel/irq/manage.c:1133
irq_thread kernel/irq/manage.c:1257
kthread kernel/kthread.c:463
ret_from_fork arch/x86/kernel/process.c:154
ret_from_fork_asm arch/x86/entry/entry_64.S:258
It is a consequence of a race between the waiting and the signaling side
of the completion:
Waiting thread                          Completing thread

rtw89_core_tx_kick_off_and_wait()
  rcu_assign_pointer(skb_data->wait, wait)
  /* start waiting */
  wait_for_completion_timeout()
                                        rtw89_pci_tx_status()
                                          rtw89_core_tx_wait_complete()
                                            rcu_read_lock()
                                            /* signals completion and
                                             * proceeds further
                                             */
                                            complete(&wait->completion)
                                            rcu_read_unlock()
                                        ...
                                        /* frees skb_data */
                                        ieee80211_tx_status_ni()
  /* returns (exit status doesn't matter) */
  wait_for_completion_timeout()
  ...
  /* accesses the already freed skb_data */
  rcu_assign_pointer(skb_data->wait, NULL)
The completing side might proceed and free the underlying skb even before
the waiting side is fully woken up and resumes execution. The race actually
happens regardless of the wait_for_completion_timeout() exit status, e.g.
the waiting side may hit a timeout and the concurrent completing side is
still able to free the skb.
Skbs which are sent by rtw89_core_tx_kick_off_and_wait() are owned by the
driver. They don't come from the core ieee80211 stack, so there is no need
to pass them to ieee80211_tx_status_ni() on the completing side.
Introduce a work function which will act as a garbage collector for
rtw89_tx_wait_info objects and the associated skbs. Thus no potentially
heavy locks are required on the completing side.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 1ae5ca615285 ("wifi: rtw89: add function to wait for completion of TX skbs")
Cc: stable(a)vger.kernel.org
Suggested-by: Zong-Zhe Yang <kevin_yang(a)realtek.com>
Signed-off-by: Fedor Pchelkin <pchelkin(a)ispras.ru>
---
v2: use a work function to manage release of tx_waits and associated skbs (Zong-Zhe)
The interesting part is how rtw89_tx_wait_work() should wait for
completion - with a timeout, without one, or by just checking the status
with a call to completion_done().
Simply waiting with wait_for_completion() may become a bottleneck if for
any reason the completion is delayed significantly, and we are holding the
wiphy lock here. I _suspect_ rtw89_pci_tx_status() should be called
either by the napi polling handler or in other cases e.g. by
rtw89_hci_reset(), but it's hard to prove for every possible scenario that
it will be called within a reasonable time.
Anyway, the current and the next patch try to make sure that when
rtw89_core_tx_wait_complete() is called, skb_data->wait is properly
initialized, so that there should be no buggy situations where a tx_wait
skb is not recognized and is invalidly passed to the ieee80211 stack
without signaling a completion.
If rtw89_core_tx_wait_complete() is not called at all, this should
indicate a bug elsewhere.
drivers/net/wireless/realtek/rtw89/core.c | 42 +++++++++++++++++++----
drivers/net/wireless/realtek/rtw89/core.h | 22 +++++++-----
drivers/net/wireless/realtek/rtw89/pci.c | 9 ++---
3 files changed, 54 insertions(+), 19 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw89/core.c b/drivers/net/wireless/realtek/rtw89/core.c
index 57590f5577a3..48aa02d6abd4 100644
--- a/drivers/net/wireless/realtek/rtw89/core.c
+++ b/drivers/net/wireless/realtek/rtw89/core.c
@@ -1073,6 +1073,26 @@ rtw89_core_tx_update_desc_info(struct rtw89_dev *rtwdev,
}
}
+static void rtw89_tx_wait_work(struct wiphy *wiphy, struct wiphy_work *work)
+{
+ struct rtw89_dev *rtwdev = container_of(work, struct rtw89_dev,
+ tx_wait_work);
+ struct rtw89_tx_wait_info *wait, *tmp;
+
+ lockdep_assert_wiphy(wiphy);
+
+ list_for_each_entry_safe(wait, tmp, &rtwdev->tx_waits, list) {
+ if (!wait->finished) {
+ unsigned long tmo = msecs_to_jiffies(wait->timeout);
+ if (!wait_for_completion_timeout(&wait->completion, tmo))
+ continue;
+ }
+ list_del(&wait->list);
+ dev_kfree_skb_any(wait->skb);
+ kfree(wait);
+ }
+}
+
void rtw89_core_tx_kick_off(struct rtw89_dev *rtwdev, u8 qsel)
{
u8 ch_dma;
@@ -1090,6 +1110,8 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct sk_buff *sk
unsigned long time_left;
int ret = 0;
+ lockdep_assert_wiphy(rtwdev->hw->wiphy);
+
wait = kzalloc(sizeof(*wait), GFP_KERNEL);
if (!wait) {
rtw89_core_tx_kick_off(rtwdev, qsel);
@@ -1097,18 +1119,23 @@ int rtw89_core_tx_kick_off_and_wait(struct rtw89_dev *rtwdev, struct sk_buff *sk
}
init_completion(&wait->completion);
- rcu_assign_pointer(skb_data->wait, wait);
+ skb_data->wait = wait;
rtw89_core_tx_kick_off(rtwdev, qsel);
time_left = wait_for_completion_timeout(&wait->completion,
msecs_to_jiffies(timeout));
- if (time_left == 0)
+ if (time_left == 0) {
ret = -ETIMEDOUT;
- else if (!wait->tx_done)
- ret = -EAGAIN;
+ } else {
+ wait->finished = true;
+ if (!wait->tx_done)
+ ret = -EAGAIN;
+ }
- rcu_assign_pointer(skb_data->wait, NULL);
- kfree_rcu(wait, rcu_head);
+ wait->skb = skb;
+ wait->timeout = timeout;
+ list_add_tail(&wait->list, &rtwdev->tx_waits);
+ wiphy_work_queue(rtwdev->hw->wiphy, &rtwdev->tx_wait_work);
return ret;
}
@@ -4972,6 +4999,7 @@ void rtw89_core_stop(struct rtw89_dev *rtwdev)
clear_bit(RTW89_FLAG_RUNNING, rtwdev->flags);
wiphy_work_cancel(wiphy, &rtwdev->c2h_work);
+ wiphy_work_cancel(wiphy, &rtwdev->tx_wait_work);
wiphy_work_cancel(wiphy, &rtwdev->cancel_6ghz_probe_work);
wiphy_work_cancel(wiphy, &btc->eapol_notify_work);
wiphy_work_cancel(wiphy, &btc->arp_notify_work);
@@ -5203,6 +5231,7 @@ int rtw89_core_init(struct rtw89_dev *rtwdev)
INIT_LIST_HEAD(&rtwdev->scan_info.pkt_list[band]);
}
INIT_LIST_HEAD(&rtwdev->scan_info.chan_list);
+ INIT_LIST_HEAD(&rtwdev->tx_waits);
INIT_WORK(&rtwdev->ba_work, rtw89_core_ba_work);
INIT_WORK(&rtwdev->txq_work, rtw89_core_txq_work);
INIT_DELAYED_WORK(&rtwdev->txq_reinvoke_work, rtw89_core_txq_reinvoke_work);
@@ -5233,6 +5262,7 @@ int rtw89_core_init(struct rtw89_dev *rtwdev)
wiphy_work_init(&rtwdev->c2h_work, rtw89_fw_c2h_work);
wiphy_work_init(&rtwdev->ips_work, rtw89_ips_work);
wiphy_work_init(&rtwdev->cancel_6ghz_probe_work, rtw89_cancel_6ghz_probe_work);
+ wiphy_work_init(&rtwdev->tx_wait_work, rtw89_tx_wait_work);
INIT_WORK(&rtwdev->load_firmware_work, rtw89_load_firmware_work);
skb_queue_head_init(&rtwdev->c2h_queue);
diff --git a/drivers/net/wireless/realtek/rtw89/core.h b/drivers/net/wireless/realtek/rtw89/core.h
index 43e10278e14d..06f7d82a8d18 100644
--- a/drivers/net/wireless/realtek/rtw89/core.h
+++ b/drivers/net/wireless/realtek/rtw89/core.h
@@ -3508,12 +3508,16 @@ struct rtw89_phy_rate_pattern {
struct rtw89_tx_wait_info {
struct rcu_head rcu_head;
+ struct list_head list;
struct completion completion;
+ struct sk_buff *skb;
+ unsigned int timeout;
bool tx_done;
+ bool finished;
};
struct rtw89_tx_skb_data {
- struct rtw89_tx_wait_info __rcu *wait;
+ struct rtw89_tx_wait_info *wait;
u8 hci_priv[];
};
@@ -5925,6 +5929,9 @@ struct rtw89_dev {
/* used to protect rpwm */
spinlock_t rpwm_lock;
+ struct list_head tx_waits;
+ struct wiphy_work tx_wait_work;
+
struct rtw89_cam_info cam_info;
struct sk_buff_head c2h_queue;
@@ -7258,23 +7265,20 @@ static inline struct sk_buff *rtw89_alloc_skb_for_rx(struct rtw89_dev *rtwdev,
return dev_alloc_skb(length);
}
-static inline void rtw89_core_tx_wait_complete(struct rtw89_dev *rtwdev,
+static inline bool rtw89_core_tx_wait_complete(struct rtw89_dev *rtwdev,
struct rtw89_tx_skb_data *skb_data,
bool tx_done)
{
struct rtw89_tx_wait_info *wait;
- rcu_read_lock();
-
- wait = rcu_dereference(skb_data->wait);
+ wait = skb_data->wait;
if (!wait)
- goto out;
+ return false;
wait->tx_done = tx_done;
- complete(&wait->completion);
+ complete_all(&wait->completion);
-out:
- rcu_read_unlock();
+ return true;
}
static inline bool rtw89_is_mlo_1_1(struct rtw89_dev *rtwdev)
diff --git a/drivers/net/wireless/realtek/rtw89/pci.c b/drivers/net/wireless/realtek/rtw89/pci.c
index a669f2f843aa..6356c2c224c5 100644
--- a/drivers/net/wireless/realtek/rtw89/pci.c
+++ b/drivers/net/wireless/realtek/rtw89/pci.c
@@ -464,10 +464,7 @@ static void rtw89_pci_tx_status(struct rtw89_dev *rtwdev,
struct rtw89_tx_skb_data *skb_data = RTW89_TX_SKB_CB(skb);
struct ieee80211_tx_info *info;
- rtw89_core_tx_wait_complete(rtwdev, skb_data, tx_status == RTW89_TX_DONE);
-
info = IEEE80211_SKB_CB(skb);
- ieee80211_tx_info_clear_status(info);
if (info->flags & IEEE80211_TX_CTL_NO_ACK)
info->flags |= IEEE80211_TX_STAT_NOACK_TRANSMITTED;
@@ -494,6 +491,10 @@ static void rtw89_pci_tx_status(struct rtw89_dev *rtwdev,
}
}
+ if (rtw89_core_tx_wait_complete(rtwdev, skb_data, tx_status == RTW89_TX_DONE))
+ return;
+
+ ieee80211_tx_info_clear_status(info);
ieee80211_tx_status_ni(rtwdev->hw, skb);
}
@@ -1387,7 +1388,7 @@ static int rtw89_pci_txwd_submit(struct rtw89_dev *rtwdev,
}
tx_data->dma = dma;
- rcu_assign_pointer(skb_data->wait, NULL);
+ skb_data->wait = NULL;
txwp_len = sizeof(*txwp_info);
txwd_len = chip->txwd_body_size;
--
2.50.1
Fix the regression introduced in 9e30ecf23b1b whereby IPv4 broadcast
packets were having their ethernet destination field mangled. This
broke WOL magic packets and likely other IPv4 broadcast traffic.
The regression can be observed by sending an IPv4 WOL packet using
the wakeonlan program to any ethernet address:
wakeonlan 46:3b:ad:61:e0:5d
and capturing the packet with tcpdump:
tcpdump -i eth0 -w /tmp/bad.cap dst port 9
The ethernet destination MUST be ff:ff:ff:ff:ff:ff for broadcast, but is
mangled with 9e30ecf23b1b applied.
Revert the change made in 9e30ecf23b1b and ensure the MTU value for
broadcast routes is retained by calling ip_dst_init_metrics() directly,
avoiding the need to enter the main code block in rt_set_nexthop().
Simplify the code path taken for broadcast packets back to the original
before the regression, adding only the call to ip_dst_init_metrics().
The broadcast_pmtu.sh selftest provided with the original patch still
passes with this patch applied.
Fixes: 9e30ecf23b1b ("net: ipv4: fix incorrect MTU in broadcast routes")
Signed-off-by: Brett A C Sheffield <bacs(a)librecast.net>
---
net/ipv4/route.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index f639a2ae881a..eaf78e128aca 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2588,6 +2588,7 @@ static struct rtable *__mkroute_output(const struct fib_result *res,
do_cache = true;
if (type == RTN_BROADCAST) {
flags |= RTCF_BROADCAST | RTCF_LOCAL;
+ fi = NULL;
} else if (type == RTN_MULTICAST) {
flags |= RTCF_MULTICAST | RTCF_LOCAL;
if (!ip_check_mc_rcu(in_dev, fl4->daddr, fl4->saddr,
@@ -2657,8 +2658,12 @@ static struct rtable *__mkroute_output(const struct fib_result *res,
rth->dst.output = ip_mc_output;
RT_CACHE_STAT_INC(out_slow_mc);
}
+ if (type == RTN_BROADCAST) {
+ /* ensure MTU value for broadcast routes is retained */
+ ip_dst_init_metrics(&rth->dst, res->fi->fib_metrics);
+ }
#ifdef CONFIG_IP_MROUTE
- if (type == RTN_MULTICAST) {
+ else if (type == RTN_MULTICAST) {
if (IN_DEV_MFORWARD(in_dev) &&
!ipv4_is_local_multicast(fl4->daddr)) {
rth->dst.input = ip_mr_input;
base-commit: 01b9128c5db1b470575d07b05b67ffa3cb02ebf1
--
2.49.1
The comparison function cmp_loc_by_count() used for sorting stack trace
locations in debugfs currently returns -1 if a->count > b->count and 1
otherwise. This breaks the antisymmetry property required by sort(),
because when two counts are equal, both cmp(a, b) and cmp(b, a) return
1.
This can lead to undefined or incorrect ordering results. Fix it by
explicitly returning 0 when the counts are equal, ensuring that the
comparison function follows the expected mathematical properties.
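For illustration only, here is a minimal userspace sketch (an assumption
on my part, using plain C with libc qsort() rather than the kernel's
sort(); the function and variable names are hypothetical) of a descending
three-way comparator that preserves the antisymmetry
cmp(a, b) == -cmp(b, a) by returning 0 for equal keys:

  #include <stdio.h>
  #include <stdlib.h>

  /* Descending order by count; returning 0 on equal keys keeps antisymmetry. */
  static int cmp_desc(const void *a, const void *b)
  {
          unsigned long ca = *(const unsigned long *)a;
          unsigned long cb = *(const unsigned long *)b;

          if (ca > cb)
                  return -1;
          if (ca < cb)
                  return 1;
          return 0;       /* equal counts: both call orders agree */
  }

  int main(void)
  {
          unsigned long counts[] = { 3, 7, 7, 1 };
          size_t i, n = sizeof(counts) / sizeof(counts[0]);

          qsort(counts, n, sizeof(counts[0]), cmp_desc);
          for (i = 0; i < n; i++)
                  printf("%lu\n", counts[i]);   /* prints 7 7 3 1 */
          return 0;
  }

With the buggy pattern (returning 1 whenever the first key is not
greater), equal elements compare as "greater" in both directions, which
a sort implementation is allowed to mishandle.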
Fixes: 553c0369b3e1 ("mm/slub: sort debugfs output by frequency of stack traces")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuan-Wei Chiu <visitorckw(a)gmail.com>
---
mm/slub.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/slub.c b/mm/slub.c
index 30003763d224..c91b3744adbc 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -7718,8 +7718,9 @@ static int cmp_loc_by_count(const void *a, const void *b, const void *data)
if (loc1->count > loc2->count)
return -1;
- else
+ if (loc1->count < loc2->count)
return 1;
+ return 0;
}
static void *slab_debugfs_start(struct seq_file *seq, loff_t *ppos)
--
2.34.1
The patch titled
Subject: mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-memory-failure-fix-vm_bug_on_pagepagepoisonedpage-when-unpoison-memory.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Miaohe Lin <linmiaohe(a)huawei.com>
Subject: mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory
Date: Thu, 28 Aug 2025 10:46:18 +0800
When I did memory failure tests, below panic occurs:
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 #40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS: 00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
<TASK>
unpoison_memory+0x2f3/0x590
simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
debugfs_attr_write+0x42/0x60
full_proxy_write+0x5b/0x80
vfs_write+0xd5/0x540
ksys_write+0x64/0xe0
do_syscall_64+0xb9/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
</TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS: 00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---
The root cause is that unpoison_memory() tries to check the PG_HWPoison
flag of an uninitialized page, so VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered. This can be reproduced by the steps below:
1. Offline a memory block:
   echo offline > /sys/devices/system/memory/memory12/state
2. Get an offlined memory pfn:
   page-types -b n -rlN
3. Write the pfn to unpoison-pfn:
   echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn
This scenario can be identified by pfn_to_online_page() returning NULL.
ZONE_DEVICE pages are never expected here, so we can simply fail when
pfn_to_online_page() returns NULL to fix the bug.
Link: https://lkml.kernel.org/r/20250828024618.1744895-1-linmiaohe@huawei.com
Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
--- a/mm/memory-failure.c~mm-memory-failure-fix-vm_bug_on_pagepagepoisonedpage-when-unpoison-memory
+++ a/mm/memory-failure.c
@@ -2568,10 +2568,9 @@ int unpoison_memory(unsigned long pfn)
static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
DEFAULT_RATELIMIT_BURST);
- if (!pfn_valid(pfn))
- return -ENXIO;
-
- p = pfn_to_page(pfn);
+ p = pfn_to_online_page(pfn);
+ if (!p)
+ return -EIO;
folio = page_folio(p);
mutex_lock(&mf_mutex);
_
Patches currently in -mm which might be from linmiaohe(a)huawei.com are
mm-memory-failure-fix-vm_bug_on_pagepagepoisonedpage-when-unpoison-memory.patch
revert-hugetlb-make-hugetlb-depends-on-sysfs-or-sysctl.patch
The patch titled
Subject: mm/memory-failure: fix redundant updates for already poisoned pages
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-memory-failure-fix-redundant-updates-for-already-poisoned-pages.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Kyle Meyer <kyle.meyer(a)hpe.com>
Subject: mm/memory-failure: fix redundant updates for already poisoned pages
Date: Thu, 28 Aug 2025 13:38:20 -0500
Duplicate memory errors can be reported by multiple sources.
Passing an already poisoned page to action_result() causes issues:
* The amount of hardware corrupted memory is incorrectly updated.
* Per NUMA node MF stats are incorrectly updated.
* Redundant "already poisoned" messages are printed.
Avoid those issues by:
* Skipping hardware corrupted memory updates for already poisoned pages.
* Skipping per NUMA node MF stats updates for already poisoned pages.
* Dropping redundant "already poisoned" messages.
Make MF_MSG_ALREADY_POISONED consistent with other action_page_types and
make calls to action_result() consistent for already poisoned normal pages
and huge pages.
Link: https://lkml.kernel.org/r/aLCiHMy12Ck3ouwC@hpe.com
Fixes: b8b9488d50b7 ("mm/memory-failure: improve memory failure action_result messages")
Signed-off-by: Kyle Meyer <kyle.meyer(a)hpe.com>
Reviewed-by: Jiaqi Yan <jiaqiyan(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Borislav Betkov <bp(a)alien8.de>
Cc: Jane Chu <jane.chu(a)oracle.com>
Cc: Kyle Meyer <kyle.meyer(a)hpe.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: "Luck, Tony" <tony.luck(a)intel.com>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Russ Anderson <russ.anderson(a)hpe.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
--- a/mm/memory-failure.c~mm-memory-failure-fix-redundant-updates-for-already-poisoned-pages
+++ a/mm/memory-failure.c
@@ -956,7 +956,7 @@ static const char * const action_page_ty
[MF_MSG_BUDDY] = "free buddy page",
[MF_MSG_DAX] = "dax page",
[MF_MSG_UNSPLIT_THP] = "unsplit thp",
- [MF_MSG_ALREADY_POISONED] = "already poisoned",
+ [MF_MSG_ALREADY_POISONED] = "already poisoned page",
[MF_MSG_UNKNOWN] = "unknown page",
};
@@ -1349,9 +1349,10 @@ static int action_result(unsigned long p
{
trace_memory_failure_event(pfn, type, result);
- num_poisoned_pages_inc(pfn);
-
- update_per_node_mf_stats(pfn, result);
+ if (type != MF_MSG_ALREADY_POISONED) {
+ num_poisoned_pages_inc(pfn);
+ update_per_node_mf_stats(pfn, result);
+ }
pr_err("%#lx: recovery action for %s: %s\n",
pfn, action_page_types[type], action_name[result]);
@@ -2094,12 +2095,11 @@ retry:
*hugetlb = 0;
return 0;
} else if (res == -EHWPOISON) {
- pr_err("%#lx: already hardware poisoned\n", pfn);
if (flags & MF_ACTION_REQUIRED) {
folio = page_folio(p);
res = kill_accessing_process(current, folio_pfn(folio), flags);
- action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
}
+ action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
return res;
} else if (res == -EBUSY) {
if (!(flags & MF_NO_RETRY)) {
@@ -2285,7 +2285,6 @@ try_again:
goto unlock_mutex;
if (TestSetPageHWPoison(p)) {
- pr_err("%#lx: already hardware poisoned\n", pfn);
res = -EHWPOISON;
if (flags & MF_ACTION_REQUIRED)
res = kill_accessing_process(current, pfn, flags);
_
Patches currently in -mm which might be from kyle.meyer(a)hpe.com are
mm-memory-failure-fix-redundant-updates-for-already-poisoned-pages.patch
In the IOMMU Shared Virtual Addressing (SVA) context, the IOMMU hardware
shares and walks the CPU's page tables. The Linux x86 architecture maps
the kernel address space into the upper portion of every process’s page
table. Consequently, in an SVA context, the IOMMU hardware can walk and
cache kernel space mappings. However, the Linux kernel currently lacks
a notification mechanism for kernel space mapping changes. This means
the IOMMU driver is not aware of such changes, leading to a break in
IOMMU cache coherence.
Modern IOMMUs often cache page table entries of the intermediate-level
page table as long as the entry is valid, no matter the permissions, to
optimize walk performance. Currently the iommu driver is notified only
for changes of user VA mappings, so the IOMMU's internal caches may
retain stale entries for kernel VA. When kernel page table mappings are
changed (e.g., by vfree()) while the IOMMU's internal caches still hold
stale entries, a use-after-free (UAF) condition arises.
If these freed page table pages are reallocated for a different purpose,
potentially by an attacker, the IOMMU could misinterpret the new data as
valid page table entries. This allows the IOMMU to walk into attacker-
controlled memory, leading to arbitrary physical memory DMA access or
privilege escalation.
To mitigate this, introduce a new iommu interface to flush IOMMU caches.
This interface should be invoked from architecture-specific code that
manages combined user and kernel page tables, whenever a kernel page table
update is done and the CPU TLB needs to be flushed.
Fixes: 26b25a2b98e4 ("iommu: Bind process address spaces to devices")
Cc: stable(a)vger.kernel.org
Suggested-by: Jann Horn <jannh(a)google.com>
Co-developed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu(a)linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde(a)amd.com>
Reviewed-by: Kevin Tian <kevin.tian(a)intel.com>
Tested-by: Yi Lai <yi1.lai(a)intel.com>
---
arch/x86/mm/tlb.c | 4 +++
drivers/iommu/iommu-sva.c | 60 ++++++++++++++++++++++++++++++++++++++-
include/linux/iommu.h | 4 +++
3 files changed, 67 insertions(+), 1 deletion(-)
Change log:
v3:
- iommu_sva_mms is an unbound list; iterating it in an atomic context
could introduce significant latency issues. Schedule it in a kernel
thread and replace the spinlock with a mutex.
- Replace the static key with a normal bool; it can be brought back if
data shows the benefit.
- Invalidate KVA range in the flush_tlb_all() paths.
- All previous reviewed-bys are preserved. Please let me know if there
are any objections.
v2:
- https://lore.kernel.org/linux-iommu/20250709062800.651521-1-baolu.lu@linux.…
- Remove EXPORT_SYMBOL_GPL(iommu_sva_invalidate_kva_range);
- Replace the mutex with a spinlock to make the interface usable in the
critical regions.
v1: https://lore.kernel.org/linux-iommu/20250704133056.4023816-1-baolu.lu@linux…
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 39f80111e6f1..3b85e7d3ba44 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -12,6 +12,7 @@
#include <linux/task_work.h>
#include <linux/mmu_notifier.h>
#include <linux/mmu_context.h>
+#include <linux/iommu.h>
#include <asm/tlbflush.h>
#include <asm/mmu_context.h>
@@ -1478,6 +1479,8 @@ void flush_tlb_all(void)
else
/* Fall back to the IPI-based invalidation. */
on_each_cpu(do_flush_tlb_all, NULL, 1);
+
+ iommu_sva_invalidate_kva_range(0, TLB_FLUSH_ALL);
}
/* Flush an arbitrarily large range of memory with INVLPGB. */
@@ -1540,6 +1543,7 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end)
kernel_tlb_flush_range(info);
put_flush_tlb_info();
+ iommu_sva_invalidate_kva_range(start, end);
}
/*
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 1a51cfd82808..d0da2b3fd64b 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -10,6 +10,8 @@
#include "iommu-priv.h"
static DEFINE_MUTEX(iommu_sva_lock);
+static bool iommu_sva_present;
+static LIST_HEAD(iommu_sva_mms);
static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
struct mm_struct *mm);
@@ -42,6 +44,7 @@ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct de
return ERR_PTR(-ENOSPC);
}
iommu_mm->pasid = pasid;
+ iommu_mm->mm = mm;
INIT_LIST_HEAD(&iommu_mm->sva_domains);
/*
* Make sure the write to mm->iommu_mm is not reordered in front of
@@ -132,8 +135,13 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
if (ret)
goto out_free_domain;
domain->users = 1;
- list_add(&domain->next, &mm->iommu_mm->sva_domains);
+ if (list_empty(&iommu_mm->sva_domains)) {
+ if (list_empty(&iommu_sva_mms))
+ WRITE_ONCE(iommu_sva_present, true);
+ list_add(&iommu_mm->mm_list_elm, &iommu_sva_mms);
+ }
+ list_add(&domain->next, &iommu_mm->sva_domains);
out:
refcount_set(&handle->users, 1);
mutex_unlock(&iommu_sva_lock);
@@ -175,6 +183,13 @@ void iommu_sva_unbind_device(struct iommu_sva *handle)
list_del(&domain->next);
iommu_domain_free(domain);
}
+
+ if (list_empty(&iommu_mm->sva_domains)) {
+ list_del(&iommu_mm->mm_list_elm);
+ if (list_empty(&iommu_sva_mms))
+ WRITE_ONCE(iommu_sva_present, false);
+ }
+
mutex_unlock(&iommu_sva_lock);
kfree(handle);
}
@@ -312,3 +327,46 @@ static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,
return domain;
}
+
+struct kva_invalidation_work_data {
+ struct work_struct work;
+ unsigned long start;
+ unsigned long end;
+};
+
+static void invalidate_kva_func(struct work_struct *work)
+{
+ struct kva_invalidation_work_data *data =
+ container_of(work, struct kva_invalidation_work_data, work);
+ struct iommu_mm_data *iommu_mm;
+
+ guard(mutex)(&iommu_sva_lock);
+ list_for_each_entry(iommu_mm, &iommu_sva_mms, mm_list_elm)
+ mmu_notifier_arch_invalidate_secondary_tlbs(iommu_mm->mm,
+ data->start, data->end);
+
+ kfree(data);
+}
+
+void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end)
+{
+ struct kva_invalidation_work_data *data;
+
+ if (likely(!READ_ONCE(iommu_sva_present)))
+ return;
+
+ /* will be freed in the task function */
+ data = kzalloc(sizeof(*data), GFP_ATOMIC);
+ if (!data)
+ return;
+
+ data->start = start;
+ data->end = end;
+ INIT_WORK(&data->work, invalidate_kva_func);
+
+ /*
+ * Since iommu_sva_mms is an unbound list, iterating it in an atomic
+ * context could introduce significant latency issues.
+ */
+ schedule_work(&data->work);
+}
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c30d12e16473..66e4abb2df0d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -1134,7 +1134,9 @@ struct iommu_sva {
struct iommu_mm_data {
u32 pasid;
+ struct mm_struct *mm;
struct list_head sva_domains;
+ struct list_head mm_list_elm;
};
int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode);
@@ -1615,6 +1617,7 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
struct mm_struct *mm);
void iommu_sva_unbind_device(struct iommu_sva *handle);
u32 iommu_sva_get_pasid(struct iommu_sva *handle);
+void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end);
#else
static inline struct iommu_sva *
iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
@@ -1639,6 +1642,7 @@ static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm)
}
static inline void mm_pasid_drop(struct mm_struct *mm) {}
+static inline void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end) {}
#endif /* CONFIG_IOMMU_SVA */
#ifdef CONFIG_IOMMU_IOPF
--
2.43.0
This reverts commit 2402adce8da4e7396b63b5ffa71e1fa16e5fe5c4.
The upstream commit a40c5d727b8111b5db424a1e43e14a1dcce1e77f ("drm/dp:
Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS"), which
the reverted commit backported, causes a regression: on at least one eDP
panel it results in display flickering, described in detail at the Link:
below. The issue fixed by the upstream commit will need a different
solution; revert the backport for now.
Cc: intel-gfx(a)lists.freedesktop.org
Cc: dri-devel(a)lists.freedesktop.org
Cc: Sasha Levin <sashal(a)kernel.org>
Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14558
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
---
drivers/gpu/drm/drm_dp_helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index 4eabef5b86d0..ffc68d305afe 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -280,7 +280,7 @@ ssize_t drm_dp_dpcd_read(struct drm_dp_aux *aux, unsigned int offset,
* We just have to do it before any DPCD access and hope that the
* monitor doesn't power down exactly after the throw away read.
*/
- ret = drm_dp_dpcd_access(aux, DP_AUX_NATIVE_READ, DP_LANE0_1_STATUS, buffer,
+ ret = drm_dp_dpcd_access(aux, DP_AUX_NATIVE_READ, DP_DPCD_REV, buffer,
1);
if (ret != 1)
goto out;
--
2.49.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 76d2e3890fb169168c73f2e4f8375c7cc24a765e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025082225-attribute-embark-823b@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76d2e3890fb169168c73f2e4f8375c7cc24a765e Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Sat, 16 Aug 2025 07:25:20 -0700
Subject: [PATCH] NFS: Fix a race when updating an existing write
After nfs_lock_and_join_requests() tests for whether the request is
still attached to the mapping, nothing prevents a call to
nfs_inode_remove_request() from succeeding until we actually lock the
page group.
The reason is that whoever called nfs_inode_remove_request() doesn't
necessarily have a lock on the page group head.
So in order to avoid races, let's take the page group lock earlier in
nfs_lock_and_join_requests(), and hold it across the removal of the
request in nfs_inode_remove_request().
Reported-by: Jeff Layton <jlayton(a)kernel.org>
Tested-by: Joe Quanaim <jdq(a)meta.com>
Tested-by: Andrew Steffen <aksteffen(a)meta.com>
Reviewed-by: Jeff Layton <jlayton(a)kernel.org>
Fixes: bd37d6fce184 ("NFSv4: Convert nfs_lock_and_join_requests() to use nfs_page_find_head_request()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index 11968dcb7243..6e69ce43a13f 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -253,13 +253,14 @@ nfs_page_group_unlock(struct nfs_page *req)
nfs_page_clear_headlock(req);
}
-/*
- * nfs_page_group_sync_on_bit_locked
+/**
+ * nfs_page_group_sync_on_bit_locked - Test if all requests have @bit set
+ * @req: request in page group
+ * @bit: PG_* bit that is used to sync page group
*
* must be called with page group lock held
*/
-static bool
-nfs_page_group_sync_on_bit_locked(struct nfs_page *req, unsigned int bit)
+bool nfs_page_group_sync_on_bit_locked(struct nfs_page *req, unsigned int bit)
{
struct nfs_page *head = req->wb_head;
struct nfs_page *tmp;
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index fa5c41d0989a..8b7c04737967 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -153,20 +153,10 @@ nfs_page_set_inode_ref(struct nfs_page *req, struct inode *inode)
}
}
-static int
-nfs_cancel_remove_inode(struct nfs_page *req, struct inode *inode)
+static void nfs_cancel_remove_inode(struct nfs_page *req, struct inode *inode)
{
- int ret;
-
- if (!test_bit(PG_REMOVE, &req->wb_flags))
- return 0;
- ret = nfs_page_group_lock(req);
- if (ret)
- return ret;
if (test_and_clear_bit(PG_REMOVE, &req->wb_flags))
nfs_page_set_inode_ref(req, inode);
- nfs_page_group_unlock(req);
- return 0;
}
/**
@@ -585,19 +575,18 @@ retry:
}
}
+ ret = nfs_page_group_lock(head);
+ if (ret < 0)
+ goto out_unlock;
+
/* Ensure that nobody removed the request before we locked it */
if (head != folio->private) {
+ nfs_page_group_unlock(head);
nfs_unlock_and_release_request(head);
goto retry;
}
- ret = nfs_cancel_remove_inode(head, inode);
- if (ret < 0)
- goto out_unlock;
-
- ret = nfs_page_group_lock(head);
- if (ret < 0)
- goto out_unlock;
+ nfs_cancel_remove_inode(head, inode);
/* lock each request in the page group */
for (subreq = head->wb_this_page;
@@ -786,7 +775,8 @@ static void nfs_inode_remove_request(struct nfs_page *req)
{
struct nfs_inode *nfsi = NFS_I(nfs_page_to_inode(req));
- if (nfs_page_group_sync_on_bit(req, PG_REMOVE)) {
+ nfs_page_group_lock(req);
+ if (nfs_page_group_sync_on_bit_locked(req, PG_REMOVE)) {
struct folio *folio = nfs_page_to_folio(req->wb_head);
struct address_space *mapping = folio->mapping;
@@ -798,6 +788,7 @@ static void nfs_inode_remove_request(struct nfs_page *req)
}
spin_unlock(&mapping->i_private_lock);
}
+ nfs_page_group_unlock(req);
if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags)) {
atomic_long_dec(&nfsi->nrequests);
diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
index 169b4ae30ff4..9aed39abc94b 100644
--- a/include/linux/nfs_page.h
+++ b/include/linux/nfs_page.h
@@ -160,6 +160,7 @@ extern void nfs_join_page_group(struct nfs_page *head,
extern int nfs_page_group_lock(struct nfs_page *);
extern void nfs_page_group_unlock(struct nfs_page *);
extern bool nfs_page_group_sync_on_bit(struct nfs_page *, unsigned int);
+extern bool nfs_page_group_sync_on_bit_locked(struct nfs_page *, unsigned int);
extern int nfs_page_set_headlock(struct nfs_page *req);
extern void nfs_page_clear_headlock(struct nfs_page *req);
extern bool nfs_async_iocounter_wait(struct rpc_task *, struct nfs_lock_context *);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 76d2e3890fb169168c73f2e4f8375c7cc24a765e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025082225-cling-drainer-d884@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76d2e3890fb169168c73f2e4f8375c7cc24a765e Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust(a)hammerspace.com>
Date: Sat, 16 Aug 2025 07:25:20 -0700
Subject: [PATCH] NFS: Fix a race when updating an existing write
After nfs_lock_and_join_requests() tests for whether the request is
still attached to the mapping, nothing prevents a call to
nfs_inode_remove_request() from succeeding until we actually lock the
page group.
The reason is that whoever called nfs_inode_remove_request() doesn't
necessarily have a lock on the page group head.
So in order to avoid races, let's take the page group lock earlier in
nfs_lock_and_join_requests(), and hold it across the removal of the
request in nfs_inode_remove_request().
Reported-by: Jeff Layton <jlayton(a)kernel.org>
Tested-by: Joe Quanaim <jdq(a)meta.com>
Tested-by: Andrew Steffen <aksteffen(a)meta.com>
Reviewed-by: Jeff Layton <jlayton(a)kernel.org>
Fixes: bd37d6fce184 ("NFSv4: Convert nfs_lock_and_join_requests() to use nfs_page_find_head_request()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com>
diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index 11968dcb7243..6e69ce43a13f 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -253,13 +253,14 @@ nfs_page_group_unlock(struct nfs_page *req)
nfs_page_clear_headlock(req);
}
-/*
- * nfs_page_group_sync_on_bit_locked
+/**
+ * nfs_page_group_sync_on_bit_locked - Test if all requests have @bit set
+ * @req: request in page group
+ * @bit: PG_* bit that is used to sync page group
*
* must be called with page group lock held
*/
-static bool
-nfs_page_group_sync_on_bit_locked(struct nfs_page *req, unsigned int bit)
+bool nfs_page_group_sync_on_bit_locked(struct nfs_page *req, unsigned int bit)
{
struct nfs_page *head = req->wb_head;
struct nfs_page *tmp;
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index fa5c41d0989a..8b7c04737967 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -153,20 +153,10 @@ nfs_page_set_inode_ref(struct nfs_page *req, struct inode *inode)
}
}
-static int
-nfs_cancel_remove_inode(struct nfs_page *req, struct inode *inode)
+static void nfs_cancel_remove_inode(struct nfs_page *req, struct inode *inode)
{
- int ret;
-
- if (!test_bit(PG_REMOVE, &req->wb_flags))
- return 0;
- ret = nfs_page_group_lock(req);
- if (ret)
- return ret;
if (test_and_clear_bit(PG_REMOVE, &req->wb_flags))
nfs_page_set_inode_ref(req, inode);
- nfs_page_group_unlock(req);
- return 0;
}
/**
@@ -585,19 +575,18 @@ retry:
}
}
+ ret = nfs_page_group_lock(head);
+ if (ret < 0)
+ goto out_unlock;
+
/* Ensure that nobody removed the request before we locked it */
if (head != folio->private) {
+ nfs_page_group_unlock(head);
nfs_unlock_and_release_request(head);
goto retry;
}
- ret = nfs_cancel_remove_inode(head, inode);
- if (ret < 0)
- goto out_unlock;
-
- ret = nfs_page_group_lock(head);
- if (ret < 0)
- goto out_unlock;
+ nfs_cancel_remove_inode(head, inode);
/* lock each request in the page group */
for (subreq = head->wb_this_page;
@@ -786,7 +775,8 @@ static void nfs_inode_remove_request(struct nfs_page *req)
{
struct nfs_inode *nfsi = NFS_I(nfs_page_to_inode(req));
- if (nfs_page_group_sync_on_bit(req, PG_REMOVE)) {
+ nfs_page_group_lock(req);
+ if (nfs_page_group_sync_on_bit_locked(req, PG_REMOVE)) {
struct folio *folio = nfs_page_to_folio(req->wb_head);
struct address_space *mapping = folio->mapping;
@@ -798,6 +788,7 @@ static void nfs_inode_remove_request(struct nfs_page *req)
}
spin_unlock(&mapping->i_private_lock);
}
+ nfs_page_group_unlock(req);
if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags)) {
atomic_long_dec(&nfsi->nrequests);
diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
index 169b4ae30ff4..9aed39abc94b 100644
--- a/include/linux/nfs_page.h
+++ b/include/linux/nfs_page.h
@@ -160,6 +160,7 @@ extern void nfs_join_page_group(struct nfs_page *head,
extern int nfs_page_group_lock(struct nfs_page *);
extern void nfs_page_group_unlock(struct nfs_page *);
extern bool nfs_page_group_sync_on_bit(struct nfs_page *, unsigned int);
+extern bool nfs_page_group_sync_on_bit_locked(struct nfs_page *, unsigned int);
extern int nfs_page_set_headlock(struct nfs_page *req);
extern void nfs_page_clear_headlock(struct nfs_page *req);
extern bool nfs_async_iocounter_wait(struct rpc_task *, struct nfs_lock_context *);
If an object is backed up to shmem, it is incorrectly identified
as not having valid data by the move code. This means moving
to VRAM skips the -EMULTIHOP step and the bo is cleared. This
causes all sorts of weird behaviour on DGFX if an already evicted
object is targeted by the shrinker.
Fix this by using ttm_tt_is_swapped() to identify backed-up
objects.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5996
Fixes: 00c8efc3180f ("drm/xe: Add a shrinker for xe bos")
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.15+
Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
---
drivers/gpu/drm/xe/xe_bo.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 7d1ff642b02a..4faf15d5fa6d 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -823,8 +823,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
return ret;
}
- tt_has_data = ttm && (ttm_tt_is_populated(ttm) ||
- (ttm->page_flags & TTM_TT_FLAG_SWAPPED));
+ tt_has_data = ttm && (ttm_tt_is_populated(ttm) || ttm_tt_is_swapped(ttm));
move_lacks_source = !old_mem || (handle_system_ccs ? (!bo->ccs_cleared) :
(!mem_type_is_vram(old_mem_type) && !tt_has_data));
--
2.50.1
Commit 3215eaceca87 ("mm/mremap: refactor initial parameter sanity
checks") moved the sanity check for vrm->new_addr from mremap_to() to
check_mremap_params().
However, this caused a regression, as vrm->new_addr is now checked even
when the MREMAP_FIXED and MREMAP_DONTUNMAP flags are not specified. In
this case, vrm->new_addr can contain garbage and cause unexpected failures.
Fix this by moving the new_addr check after the vrm_implies_new_addr()
guard. This ensures that the new_addr is only checked when the user has
specified one explicitly.
Cc: stable(a)vger.kernel.org
Fixes: 3215eaceca87 ("mm/mremap: refactor initial parameter sanity checks")
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
---
v2:
- split out vrm->new_len into individual checks
- cc stable, collect tags
v1:
https://lore.kernel.org/all/20250828032653.521314-1-cmllamas@google.com/
mm/mremap.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/mm/mremap.c b/mm/mremap.c
index e618a706aff5..35de0a7b910e 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -1774,15 +1774,18 @@ static unsigned long check_mremap_params(struct vma_remap_struct *vrm)
if (!vrm->new_len)
return -EINVAL;
- /* Is the new length or address silly? */
- if (vrm->new_len > TASK_SIZE ||
- vrm->new_addr > TASK_SIZE - vrm->new_len)
+ /* Is the new length silly? */
+ if (vrm->new_len > TASK_SIZE)
return -EINVAL;
/* Remainder of checks are for cases with specific new_addr. */
if (!vrm_implies_new_addr(vrm))
return 0;
+ /* Is the new address silly? */
+ if (vrm->new_addr > TASK_SIZE - vrm->new_len)
+ return -EINVAL;
+
/* The new address must be page-aligned. */
if (offset_in_page(vrm->new_addr))
return -EINVAL;
--
2.51.0.268.g9569e192d0-goog
This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
Virtio drivers and PCI devices have never fully supported true
surprise (aka hot unplug) removal. Drivers historically continued
processing and waiting for pending I/O and even continued synchronous
device reset during surprise removal. Devices have also continued
completing I/Os, doing DMA and allowing device reset after surprise
removal to support such drivers.
Supporting it correctly would require a new device capability and
driver negotiation in the virtio specification to safely stop
I/O and free queue memory. Without that, surprise removal either breaks
all the existing drivers with the call trace listed in the commit, or
crashes the host when the DMA continues. Hence, until such a
specification and such devices exist, restore the previous behavior of
treating surprise removal as graceful removal, avoiding regressions and
maintaining the same system stability as before
commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
As explained above, the previous analysis at [1] and [2] of solving this
in the driver alone was incomplete and unreliable; hence reverting commit
43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device")
is still the best option to resolve the failures of virtio net and
block devices.
[1] https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB100BC6C638DCB…
[2] https://lore.kernel.org/virtualization/20250602024358.57114-1-parav@nvidia.…
Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device")
Cc: stable(a)vger.kernel.org
Reported-by: lirongqing(a)baidu.com
Closes: https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb73ca9b4741@bai…
Signed-off-by: Parav Pandit <parav(a)nvidia.com>
---
drivers/virtio/virtio_pci_common.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index d6d79af44569..dba5eb2eaff9 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -747,13 +747,6 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
struct device *dev = get_device(&vp_dev->vdev.dev);
- /*
- * Device is marked broken on surprise removal so that virtio upper
- * layers can abort any ongoing operation.
- */
- if (!pci_device_is_present(pci_dev))
- virtio_break_device(&vp_dev->vdev);
-
pci_disable_sriov(pci_dev);
unregister_virtio_device(&vp_dev->vdev);
--
2.26.2