From: Andrey Vatoropin <a.vatoropin(a)crpt.ru>
If fbcon_open() fails when called from con2fb_acquire_newinfo(), the
info->fbcon_par pointer remains NULL and is later dereferenced.
Add a check for the return value of con2fb_acquire_newinfo() to avoid
this.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: d1baa4ffa677 ("fbcon: set_con2fb_map fixes")
Cc: stable(a)vger.kernel.org
Signed-off-by: Andrey Vatoropin <a.vatoropin(a)crpt.ru>
---
drivers/video/fbdev/core/fbcon.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c
index 3dd03e02bf97..d9b2b54f00db 100644
--- a/drivers/video/fbdev/core/fbcon.c
+++ b/drivers/video/fbdev/core/fbcon.c
@@ -1057,7 +1057,8 @@ static void fbcon_init(struct vc_data *vc, int init)
return;
if (!info->fbcon_par)
- con2fb_acquire_newinfo(vc, info, vc->vc_num, -1);
+ if (con2fb_acquire_newinfo(vc, info, vc->vc_num, -1))
+ return;
/* If we are not the first console on this
fb, copy the font from that console */
--
2.43.0
Memory allocated for struct vscsibk_info in scsiback_probe() is not
freed in scsiback_remove(), leading to potential memory leaks on remove,
as well as in the scsiback_probe() error paths. Fix that by freeing it
in scsiback_remove().
Cc: stable(a)vger.kernel.org
Fixes: d9d660f6e562 ("xen-scsiback: Add Xen PV SCSI backend driver")
Signed-off-by: Abdun Nihaal <nihaal(a)cse.iitm.ac.in>
---
Compile tested only. Issue found using static analysis.
drivers/xen/xen-scsiback.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c
index 0c51edfd13dc..7d5117e5efe0 100644
--- a/drivers/xen/xen-scsiback.c
+++ b/drivers/xen/xen-scsiback.c
@@ -1262,6 +1262,7 @@ static void scsiback_remove(struct xenbus_device *dev)
gnttab_page_cache_shrink(&info->free_pages, 0);
dev_set_drvdata(&dev->dev, NULL);
+ kfree(info);
}
static int scsiback_probe(struct xenbus_device *dev,
--
2.43.0
In one of the error paths in tw9906_probe(), the memory allocated in
v4l2_ctrl_handler_init() and v4l2_ctrl_new_std() is not freed. Fix that
by calling v4l2_ctrl_handler_free() on the handler in that error path.
Cc: stable(a)vger.kernel.org
Fixes: a000e9a02b58 ("[media] tw9906: add Techwell tw9906 video decoder")
Signed-off-by: Abdun Nihaal <nihaal(a)cse.iitm.ac.in>
---
Compile tested only. Issue found using static analysis.
drivers/media/i2c/tw9906.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/media/i2c/tw9906.c b/drivers/media/i2c/tw9906.c
index 6220f4fddbab..0ab43fe42d7f 100644
--- a/drivers/media/i2c/tw9906.c
+++ b/drivers/media/i2c/tw9906.c
@@ -196,6 +196,7 @@ static int tw9906_probe(struct i2c_client *client)
if (write_regs(sd, initial_registers) < 0) {
v4l2_err(client, "error initializing TW9906\n");
+ v4l2_ctrl_handler_free(hdl);
return -EINVAL;
}
--
2.43.0
In one of the error paths in tw9903_probe(), the memory allocated in
v4l2_ctrl_handler_init() and v4l2_ctrl_new_std() is not freed. Fix that
by calling v4l2_ctrl_handler_free() on the handler in that error path.
Cc: stable(a)vger.kernel.org
Fixes: 0890ec19c65d ("[media] tw9903: add new tw9903 video decoder")
Signed-off-by: Abdun Nihaal <nihaal(a)cse.iitm.ac.in>
---
Compile tested only. Issue found using static analysis.
drivers/media/i2c/tw9903.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/media/i2c/tw9903.c b/drivers/media/i2c/tw9903.c
index b996a05e56f2..c3eafd5d5dc8 100644
--- a/drivers/media/i2c/tw9903.c
+++ b/drivers/media/i2c/tw9903.c
@@ -228,6 +228,7 @@ static int tw9903_probe(struct i2c_client *client)
if (write_regs(sd, initial_registers) < 0) {
v4l2_err(client, "error initializing TW9903\n");
+ v4l2_ctrl_handler_free(hdl);
return -EINVAL;
}
--
2.43.0
The patch titled
Subject: mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-memory-failure-teach-kill_accessing_process-to-accept-hugetlb-tail-page-pfn.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Jane Chu <jane.chu(a)oracle.com>
Subject: mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
Date: Mon, 22 Dec 2025 18:21:12 -0700
When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
passes the head pfn to kill_accessing_process(), which is not right. The
precise pfn of the poisoned page should be used in order to determine the
precise vaddr as the SIGBUS payload.
This issue has already been taken care of in the normal path, that is,
hwpoison_user_mappings(), see [1][2]. Furthermore, for [3] to work
correctly in the hugetlb repoisoning case, it's essential to inform the VM
of the precise poisoned page, not the head page.
Link: https://lkml.kernel.org/r/20251223012113.370674-2-jane.chu@oracle.com
Link: https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org [1]
Link: https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com [2]
Link: https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/ [3]
Signed-off-by: Jane Chu <jane.chu(a)oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: "David Hildenbrand (Red Hat)" <david(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Jiaqi Yan <jiaqiyan(a)google.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: William Roche <william.roche(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
--- a/mm/memory-failure.c~mm-memory-failure-teach-kill_accessing_process-to-accept-hugetlb-tail-page-pfn
+++ a/mm/memory-failure.c
@@ -692,6 +692,8 @@ static int check_hwpoisoned_entry(pte_t
unsigned long poisoned_pfn, struct to_kill *tk)
{
unsigned long pfn = 0;
+ unsigned long hwpoison_vaddr;
+ unsigned long mask;
if (pte_present(pte)) {
pfn = pte_pfn(pte);
@@ -702,10 +704,12 @@ static int check_hwpoisoned_entry(pte_t
pfn = softleaf_to_pfn(entry);
}
- if (!pfn || pfn != poisoned_pfn)
+ mask = ~((1UL << (shift - PAGE_SHIFT)) - 1);
+ if (!pfn || ((pfn & mask) != (poisoned_pfn & mask)))
return 0;
- set_to_kill(tk, addr, shift);
+ hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);
+ set_to_kill(tk, hwpoison_vaddr, shift);
return 1;
}
@@ -2038,10 +2042,8 @@ retry:
return 0;
case MF_HUGETLB_ALREADY_POISONED:
case MF_HUGETLB_ACC_EXISTING_POISON:
- if (flags & MF_ACTION_REQUIRED) {
- folio = page_folio(p);
- res = kill_accessing_process(current, folio_pfn(folio), flags);
- }
+ if (flags & MF_ACTION_REQUIRED)
+ res = kill_accessing_process(current, pfn, flags);
if (res == MF_HUGETLB_ALREADY_POISONED)
action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
else
_
Patches currently in -mm which might be from jane.chu(a)oracle.com are
mm-memory-failure-fix-missing-mf_stats-count-in-hugetlb-poison.patch
mm-memory-failure-teach-kill_accessing_process-to-accept-hugetlb-tail-page-pfn.patch
The patch titled
Subject: mm/memory-failure: fix missing ->mf_stats count in hugetlb poison
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-memory-failure-fix-missing-mf_stats-count-in-hugetlb-poison.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Jane Chu <jane.chu(a)oracle.com>
Subject: mm/memory-failure: fix missing ->mf_stats count in hugetlb poison
Date: Mon, 22 Dec 2025 18:21:11 -0700
When a newly poisoned subpage ends up in an already poisoned hugetlb
folio, 'num_poisoned_pages' is incremented, but the per-node ->mf_stats is
not. Fix the inconsistency by designating action_result() to update them
both.
While at it, define the __get_huge_page_for_hwpoison() return values in
terms of symbol names for better readability. Also rename
folio_set_hugetlb_hwpoison() to hugetlb_update_hwpoison() since the
function does more than the conventional bit setting, and three possible
return values are expected.
Link: https://lkml.kernel.org/r/20251223012113.370674-1-jane.chu@oracle.com
Fixes: 18f41fa616ee ("mm: memory-failure: bump memory failure stats to pglist_data")
Signed-off-by: Jane Chu <jane.chu(a)oracle.com>
Cc: "David Hildenbrand (Red Hat)" <david(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Jiaqi Yan <jiaqiyan(a)google.com>
Cc: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: William Roche <william.roche(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 56 ++++++++++++++++++++++++------------------
1 file changed, 33 insertions(+), 23 deletions(-)
--- a/mm/memory-failure.c~mm-memory-failure-fix-missing-mf_stats-count-in-hugetlb-poison
+++ a/mm/memory-failure.c
@@ -1883,12 +1883,18 @@ static unsigned long __folio_free_raw_hw
return count;
}
-static int folio_set_hugetlb_hwpoison(struct folio *folio, struct page *page)
+#define MF_HUGETLB_ALREADY_POISONED 3 /* already poisoned */
+#define MF_HUGETLB_ACC_EXISTING_POISON 4 /* accessed existing poisoned page */
+/*
+ * Set hugetlb folio as hwpoisoned, update folio private raw hwpoison list
+ * to keep track of the poisoned pages.
+ */
+static int hugetlb_update_hwpoison(struct folio *folio, struct page *page)
{
struct llist_head *head;
struct raw_hwp_page *raw_hwp;
struct raw_hwp_page *p;
- int ret = folio_test_set_hwpoison(folio) ? -EHWPOISON : 0;
+ int ret = folio_test_set_hwpoison(folio) ? MF_HUGETLB_ALREADY_POISONED : 0;
/*
* Once the hwpoison hugepage has lost reliable raw error info,
@@ -1896,20 +1902,18 @@ static int folio_set_hugetlb_hwpoison(st
* so skip to add additional raw error info.
*/
if (folio_test_hugetlb_raw_hwp_unreliable(folio))
- return -EHWPOISON;
+ return MF_HUGETLB_ALREADY_POISONED;
+
head = raw_hwp_list_head(folio);
llist_for_each_entry(p, head->first, node) {
if (p->page == page)
- return -EHWPOISON;
+ return MF_HUGETLB_ACC_EXISTING_POISON;
}
raw_hwp = kmalloc(sizeof(struct raw_hwp_page), GFP_ATOMIC);
if (raw_hwp) {
raw_hwp->page = page;
llist_add(&raw_hwp->node, head);
- /* the first error event will be counted in action_result(). */
- if (ret)
- num_poisoned_pages_inc(page_to_pfn(page));
} else {
/*
* Failed to save raw error info. We no longer trace all
@@ -1955,32 +1959,30 @@ void folio_clear_hugetlb_hwpoison(struct
folio_free_raw_hwp(folio, true);
}
+#define MF_HUGETLB_FREED 0 /* freed hugepage */
+#define MF_HUGETLB_IN_USED 1 /* in-use hugepage */
+#define MF_NOT_HUGETLB 2 /* not a hugepage */
+
/*
* Called from hugetlb code with hugetlb_lock held.
- *
- * Return values:
- * 0 - free hugepage
- * 1 - in-use hugepage
- * 2 - not a hugepage
- * -EBUSY - the hugepage is busy (try to retry)
- * -EHWPOISON - the hugepage is already hwpoisoned
*/
int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
bool *migratable_cleared)
{
struct page *page = pfn_to_page(pfn);
struct folio *folio = page_folio(page);
- int ret = 2; /* fallback to normal page handling */
+ int ret = MF_NOT_HUGETLB;
bool count_increased = false;
+ int rc;
if (!folio_test_hugetlb(folio))
goto out;
if (flags & MF_COUNT_INCREASED) {
- ret = 1;
+ ret = MF_HUGETLB_IN_USED;
count_increased = true;
} else if (folio_test_hugetlb_freed(folio)) {
- ret = 0;
+ ret = MF_HUGETLB_FREED;
} else if (folio_test_hugetlb_migratable(folio)) {
ret = folio_try_get(folio);
if (ret)
@@ -1991,8 +1993,9 @@ int __get_huge_page_for_hwpoison(unsigne
goto out;
}
- if (folio_set_hugetlb_hwpoison(folio, page)) {
- ret = -EHWPOISON;
+ rc = hugetlb_update_hwpoison(folio, page);
+ if (rc >= MF_HUGETLB_ALREADY_POISONED) {
+ ret = rc;
goto out;
}
@@ -2029,22 +2032,29 @@ static int try_memory_failure_hugetlb(un
*hugetlb = 1;
retry:
res = get_huge_page_for_hwpoison(pfn, flags, &migratable_cleared);
- if (res == 2) { /* fallback to normal page handling */
+ switch (res) {
+ case MF_NOT_HUGETLB: /* fallback to normal page handling */
*hugetlb = 0;
return 0;
- } else if (res == -EHWPOISON) {
+ case MF_HUGETLB_ALREADY_POISONED:
+ case MF_HUGETLB_ACC_EXISTING_POISON:
if (flags & MF_ACTION_REQUIRED) {
folio = page_folio(p);
res = kill_accessing_process(current, folio_pfn(folio), flags);
}
- action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
+ if (res == MF_HUGETLB_ALREADY_POISONED)
+ action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
+ else
+ action_result(pfn, MF_MSG_HUGE, MF_FAILED);
return res;
- } else if (res == -EBUSY) {
+ case -EBUSY:
if (!(flags & MF_NO_RETRY)) {
flags |= MF_NO_RETRY;
goto retry;
}
return action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
+ default:
+ break;
}
folio = page_folio(p);
_
Patches currently in -mm which might be from jane.chu(a)oracle.com are
mm-memory-failure-fix-missing-mf_stats-count-in-hugetlb-poison.patch
mm-memory-failure-teach-kill_accessing_process-to-accept-hugetlb-tail-page-pfn.patch
From: Ionut Nechita <ionut.nechita(a)windriver.com>
This series addresses two critical issues in the block layer multiqueue
(blk-mq) subsystem when running on PREEMPT_RT kernels.
The first patch fixes a severe performance regression where queue_lock
contention in the I/O hot path causes IRQ threads to sleep on RT kernels.
Testing on a MegaRAID 12GSAS controller showed a 76% performance drop
(640 MB/s -> 153 MB/s). The fix replaces the spinlock with memory barriers
to maintain ordering without sleeping.
The second patch fixes a WARN_ON that triggers during SCSI device scanning
when blk_freeze_queue_start() calls blk_mq_run_hw_queues() synchronously
from interrupt context. The warning "WARN_ON_ONCE(!async && in_interrupt())"
is resolved by switching to asynchronous execution.
Changes in v2:
- Removed the blk_mq_cpuhp_lock patch (needs more investigation)
- Added fix for WARN_ON in interrupt context during queue freezing
- Updated commit messages for clarity
Ionut Nechita (2):
block/blk-mq: fix RT kernel regression with queue_lock in hot path
block: Fix WARN_ON in blk_mq_run_hw_queue when called from interrupt
context
block/blk-mq.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
--
2.52.0
The patch titled
Subject: mm: take into account mm_cid size for mm_struct static definitions
has been added to the -mm mm-new branch. Its filename is
mm-take-into-account-mm_cid-size-for-mm_struct-static-definitions.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
The mm-new branch of mm.git is not included in linux-next
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Subject: mm: take into account mm_cid size for mm_struct static definitions
Date: Sun, 21 Dec 2025 18:29:24 -0500
Both init_mm and efi_mm static definitions need to make room for the
2 mm_cid cpumasks.
This fixes possible out-of-bounds accesses to init_mm and efi_mm.
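For illustration only, here is a small userspace sketch of the static-sizing
idea; the struct and macro names are made up, and the kernel's real
counterpart is the MM_STRUCT_FLEXIBLE_ARRAY_INIT initializer touched in the
diff below.

/* build with: gcc -Wall fam_static.c */
#include <stdio.h>

#define BASE_SIZE		16	/* stands in for sizeof(cpumask_t) */
#define EXTRA_STATIC_SIZE	64	/* stands in for MM_CID_STATIC_SIZE */

struct obj {
	int id;
	unsigned char data[];		/* flexible array member */
};

/*
 * GCC extension: statically initializing a flexible array member reserves
 * that much trailing storage in the object.  If the initializer only covers
 * BASE_SIZE bytes, accesses to data[BASE_SIZE..] are out of bounds -- the
 * class of bug this patch fixes for init_mm and efi_mm.
 */
static struct obj static_obj = {
	.id = 1,
	.data = { [0 ... BASE_SIZE + EXTRA_STATIC_SIZE - 1] = 0 },
};

int main(void)
{
	printf("reserved %d trailing bytes in static_obj\n",
	       BASE_SIZE + EXTRA_STATIC_SIZE);
	return 0;
}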
Link: https://lkml.kernel.org/r/20251221232926.450602-4-mathieu.desnoyers@efficio…
Fixes: af7f588d8f73 ("sched: Introduce per-memory-map concurrency ID")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mm_types.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/include/linux/mm_types.h~mm-take-into-account-mm_cid-size-for-mm_struct-static-definitions
+++ a/include/linux/mm_types.h
@@ -1368,7 +1368,7 @@ extern struct mm_struct init_mm;
#define MM_STRUCT_FLEXIBLE_ARRAY_INIT \
{ \
- [0 ... sizeof(cpumask_t)-1] = 0 \
+ [0 ... sizeof(cpumask_t) + MM_CID_STATIC_SIZE - 1] = 0 \
}
/* Pointer magic because the dynamic array size confuses some compilers. */
@@ -1500,7 +1500,7 @@ static inline int mm_alloc_cid_noprof(st
mm_init_cid(mm, p);
return 0;
}
-#define mm_alloc_cid(...) alloc_hooks(mm_alloc_cid_noprof(__VA_ARGS__))
+# define mm_alloc_cid(...) alloc_hooks(mm_alloc_cid_noprof(__VA_ARGS__))
static inline void mm_destroy_cid(struct mm_struct *mm)
{
@@ -1514,6 +1514,8 @@ static inline unsigned int mm_cid_size(v
return cpumask_size() + bitmap_size(num_possible_cpus());
}
+/* Use NR_CPUS as worst case for static allocation. */
+# define MM_CID_STATIC_SIZE (2 * sizeof(cpumask_t))
#else /* CONFIG_SCHED_MM_CID */
static inline void mm_init_cid(struct mm_struct *mm, struct task_struct *p) { }
static inline int mm_alloc_cid(struct mm_struct *mm, struct task_struct *p) { return 0; }
@@ -1522,6 +1524,7 @@ static inline unsigned int mm_cid_size(v
{
return 0;
}
+# define MM_CID_STATIC_SIZE 0
#endif /* CONFIG_SCHED_MM_CID */
struct mmu_gather;
_
Patches currently in -mm which might be from mathieu.desnoyers(a)efficios.com are
lib-introduce-hierarchical-per-cpu-counters.patch
mm-fix-oom-killer-inaccuracy-on-large-many-core-systems.patch
mm-implement-precise-oom-killer-task-selection.patch
mm-add-missing-static-initializer-for-init_mm-mm_cidlock.patch
mm-rename-cpu_bitmap-field-to-flexible_array.patch
mm-take-into-account-mm_cid-size-for-mm_struct-static-definitions.patch
mm-take-into-account-hierarchical-percpu-tree-items-for-static-mm_struct-definitions.patch
tsacct-skip-all-kernel-threads.patch
The patch titled
Subject: mm: add missing static initializer for init_mm::mm_cid.lock
has been added to the -mm mm-new branch. Its filename is
mm-add-missing-static-initializer-for-init_mm-mm_cidlock.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
The mm-new branch of mm.git is not included in linux-next
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Subject: mm: add missing static initializer for init_mm::mm_cid.lock
Date: Sun, 21 Dec 2025 18:29:22 -0500
Patch series "MM_CID and HPCC mm_struct static init fixes".
Mark Brown reported a regression [1] on linux-next due to the hierarchical
percpu counters (HPCC). You mentioned they were only in mm-new (and
therefore not pulled into -next) [2], but it looks like they got more
exposure than we expected. :)
This bug hunting got me to fix static initialization issues in both MM_CID
(for upstream) and HPCC (mm-new). Mark tested my series and confirmed
that it fixes his issues.
This patch (of 5):
Initialize the mm_cid.lock struct member of init_mm.
Link: https://lkml.kernel.org/r/20251221232926.450602-2-mathieu.desnoyers@efficio…
Fixes: 8cea569ca785 ("sched/mmcid: Use proper data structures")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/init-mm.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/init-mm.c~mm-add-missing-static-initializer-for-init_mm-mm_cidlock
+++ a/mm/init-mm.c
@@ -44,6 +44,9 @@ struct mm_struct init_mm = {
.mm_lock_seq = SEQCNT_ZERO(init_mm.mm_lock_seq),
#endif
.user_ns = &init_user_ns,
+#ifdef CONFIG_SCHED_MM_CID
+ .mm_cid.lock = __RAW_SPIN_LOCK_UNLOCKED(init_mm.mm_cid.lock),
+#endif
.cpu_bitmap = CPU_BITS_NONE,
INIT_MM_CONTEXT(init_mm)
};
_
Patches currently in -mm which might be from mathieu.desnoyers(a)efficios.com are
lib-introduce-hierarchical-per-cpu-counters.patch
mm-fix-oom-killer-inaccuracy-on-large-many-core-systems.patch
mm-implement-precise-oom-killer-task-selection.patch
mm-add-missing-static-initializer-for-init_mm-mm_cidlock.patch
mm-rename-cpu_bitmap-field-to-flexible_array.patch
mm-take-into-account-mm_cid-size-for-mm_struct-static-definitions.patch
mm-take-into-account-hierarchical-percpu-tree-items-for-static-mm_struct-definitions.patch
tsacct-skip-all-kernel-threads.patch
The patch titled
Subject: lib/buildid: use __kernel_read() for sleepable context
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
lib-buildid-use-__kernel_read-for-sleepable-context.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Shakeel Butt <shakeel.butt(a)linux.dev>
Subject: lib/buildid: use __kernel_read() for sleepable context
Date: Mon, 22 Dec 2025 12:58:59 -0800
Prevent a "BUG: unable to handle kernel NULL pointer dereference in
filemap_read_folio".
For the sleepable context, convert freader to use __kernel_read() instead
of direct page cache access via read_cache_folio(). This simplifies the
faultable code path by using the standard kernel file reading interface
which handles all the complexity of reading file data.
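As a loose userspace analogue (not the kernel code), the same "full read or
error" handling around __kernel_read() can be sketched with pread(2); the
names below are illustrative only.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Read exactly sz bytes at file_off, mirroring the patch's
 * "ret != sz ? error : success" handling of __kernel_read().
 * Returns 0 on success, a negative errno-style value otherwise.
 */
static int read_exact(int fd, void *buf, size_t sz, off_t file_off)
{
	ssize_t ret = pread(fd, buf, sz, file_off);

	if (ret != (ssize_t)sz)
		return (ret < 0) ? -errno : -EIO;	/* short read => -EIO */
	return 0;
}

int main(void)
{
	char ident[4];
	int fd = open("/bin/sh", O_RDONLY);

	if (fd < 0)
		return 1;
	if (!read_exact(fd, ident, sizeof(ident), 0))
		printf("magic: %.3s\n", ident + 1);	/* typically "ELF" */
	close(fd);
	return 0;
}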
At the moment we are not changing the code for the non-sleepable context,
which uses filemap_get_folio() and only succeeds if the target folios are
already in memory and up to date. The reason is to keep the patch simple
and easier to backport to stable kernels.
The syzbot repro no longer crashes the kernel and the selftests run
successfully.
In a follow-up, we will make __kernel_read() with IOCB_NOWAIT work for
non-sleepable contexts. In addition, I would like to replace the
secretmem check with a more generic approach and will add an fstest for
the buildid code.
Link: https://lkml.kernel.org/r/20251222205859.3968077-1-shakeel.butt@linux.dev
Fixes: ad41251c290d ("lib/buildid: implement sleepable build_id_parse() API")
Reported-by: syzbot+09b7d050e4806540153d(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=09b7d050e4806540153d
Signed-off-by: Shakeel Butt <shakeel.butt(a)linux.dev>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Andrii Nakryiko <andrii(a)kernel.org>
Cc: Daniel Borkmann <daniel(a)iogearbox.net>
Cc: "Darrick J. Wong" <djwong(a)kernel.org>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/buildid.c | 32 ++++++++++++++++++++------------
1 file changed, 20 insertions(+), 12 deletions(-)
--- a/lib/buildid.c~lib-buildid-use-__kernel_read-for-sleepable-context
+++ a/lib/buildid.c
@@ -5,6 +5,7 @@
#include <linux/elf.h>
#include <linux/kernel.h>
#include <linux/pagemap.h>
+#include <linux/fs.h>
#include <linux/secretmem.h>
#define BUILD_ID 3
@@ -46,20 +47,9 @@ static int freader_get_folio(struct frea
freader_put_folio(r);
- /* reject secretmem folios created with memfd_secret() */
- if (secretmem_mapping(r->file->f_mapping))
- return -EFAULT;
-
+ /* only use page cache lookup - fail if not already cached */
r->folio = filemap_get_folio(r->file->f_mapping, file_off >> PAGE_SHIFT);
- /* if sleeping is allowed, wait for the page, if necessary */
- if (r->may_fault && (IS_ERR(r->folio) || !folio_test_uptodate(r->folio))) {
- filemap_invalidate_lock_shared(r->file->f_mapping);
- r->folio = read_cache_folio(r->file->f_mapping, file_off >> PAGE_SHIFT,
- NULL, r->file);
- filemap_invalidate_unlock_shared(r->file->f_mapping);
- }
-
if (IS_ERR(r->folio) || !folio_test_uptodate(r->folio)) {
if (!IS_ERR(r->folio))
folio_put(r->folio);
@@ -97,6 +87,24 @@ const void *freader_fetch(struct freader
return r->data + file_off;
}
+ /* reject secretmem folios created with memfd_secret() */
+ if (secretmem_mapping(r->file->f_mapping)) {
+ r->err = -EFAULT;
+ return NULL;
+ }
+
+ /* use __kernel_read() for sleepable context */
+ if (r->may_fault) {
+ ssize_t ret;
+
+ ret = __kernel_read(r->file, r->buf, sz, &file_off);
+ if (ret != sz) {
+ r->err = (ret < 0) ? ret : -EIO;
+ return NULL;
+ }
+ return r->buf;
+ }
+
/* fetch or reuse folio for given file offset */
r->err = freader_get_folio(r, file_off);
if (r->err)
_
Patches currently in -mm which might be from shakeel.butt(a)linux.dev are
mm-memcg-fix-unit-conversion-for-k-macro-in-oom-log.patch
lib-buildid-use-__kernel_read-for-sleepable-context.patch
Calling napi_disable() on an already disabled napi can cause a
deadlock. Commit 4bc12818b363 ("virtio-net: disable delayed refill
when pausing rx") avoids this deadlock by disabling and cancelling the
delayed refill work when pausing RX in virtnet_rx_pause[_all]().
However, in virtnet_rx_resume_all(), we enable the delayed refill work
too early, before all the receive queue napis have been enabled.
The deadlock can be reproduced by running
selftests/drivers/net/hw/xsk_reconfig.py with a multiqueue virtio-net
device and inserting a cond_resched() inside the for loop in
virtnet_rx_resume_all() to increase the success rate. Because the worker
processing the delayed refill work runs on the same CPU as
virtnet_rx_resume_all(), a reschedule is needed to cause the deadlock.
In a real scenario, contention on netdev_lock can cause the
reschedule.
Fix the deadlock by ensuring all receive queue napis are enabled before
enabling the delayed refill work in virtnet_rx_resume_all() and
virtnet_open().
Fixes: 4bc12818b363 ("virtio-net: disable delayed refill when pausing rx")
Reported-by: Paolo Abeni <pabeni(a)redhat.com>
Closes: https://netdev-ctrl.bots.linux.dev/logs/vmksft/drv-hw-dbg/results/400961/3-…
Cc: stable(a)vger.kernel.org
Signed-off-by: Bui Quang Minh <minhquangbui99(a)gmail.com>
---
Changes in v2:
- Move try_fill_recv() before rx napi_enable()
- Link to v1: https://lore.kernel.org/netdev/20251208153419.18196-1-minhquangbui99@gmail.…
---
drivers/net/virtio_net.c | 71 +++++++++++++++++++++++++---------------
1 file changed, 45 insertions(+), 26 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 8e04adb57f52..4e08880a9467 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3214,21 +3214,31 @@ static void virtnet_update_settings(struct virtnet_info *vi)
static int virtnet_open(struct net_device *dev)
{
struct virtnet_info *vi = netdev_priv(dev);
+ bool schedule_refill = false;
int i, err;
- enable_delayed_refill(vi);
-
+ /* - We must call try_fill_recv before enabling napi of the same receive
+ * queue so that it doesn't race with the call in virtnet_receive.
+ * - We must enable and schedule delayed refill work only when we have
+ * enabled all the receive queue's napi. Otherwise, in refill_work, we
+ * have a deadlock when calling napi_disable on an already disabled
+ * napi.
+ */
for (i = 0; i < vi->max_queue_pairs; i++) {
if (i < vi->curr_queue_pairs)
/* Make sure we have some buffers: if oom use wq. */
if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
- schedule_delayed_work(&vi->refill, 0);
+ schedule_refill = true;
err = virtnet_enable_queue_pair(vi, i);
if (err < 0)
goto err_enable_qp;
}
+ enable_delayed_refill(vi);
+ if (schedule_refill)
+ schedule_delayed_work(&vi->refill, 0);
+
if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) {
if (vi->status & VIRTIO_NET_S_LINK_UP)
netif_carrier_on(vi->dev);
@@ -3463,39 +3473,48 @@ static void virtnet_rx_pause(struct virtnet_info *vi, struct receive_queue *rq)
__virtnet_rx_pause(vi, rq);
}
-static void __virtnet_rx_resume(struct virtnet_info *vi,
- struct receive_queue *rq,
- bool refill)
+static void virtnet_rx_resume_all(struct virtnet_info *vi)
{
- bool running = netif_running(vi->dev);
bool schedule_refill = false;
+ int i;
- if (refill && !try_fill_recv(vi, rq, GFP_KERNEL))
- schedule_refill = true;
- if (running)
- virtnet_napi_enable(rq);
-
- if (schedule_refill)
- schedule_delayed_work(&vi->refill, 0);
-}
+ if (netif_running(vi->dev)) {
+ /* See the comment in virtnet_open for the ordering rule
+ * of try_fill_recv, receive queue napi_enable and delayed
+ * refill enable/schedule.
+ */
+ for (i = 0; i < vi->max_queue_pairs; i++) {
+ if (i < vi->curr_queue_pairs)
+ if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
+ schedule_refill = true;
-static void virtnet_rx_resume_all(struct virtnet_info *vi)
-{
- int i;
+ virtnet_napi_enable(&vi->rq[i]);
+ }
- enable_delayed_refill(vi);
- for (i = 0; i < vi->max_queue_pairs; i++) {
- if (i < vi->curr_queue_pairs)
- __virtnet_rx_resume(vi, &vi->rq[i], true);
- else
- __virtnet_rx_resume(vi, &vi->rq[i], false);
+ enable_delayed_refill(vi);
+ if (schedule_refill)
+ schedule_delayed_work(&vi->refill, 0);
}
}
static void virtnet_rx_resume(struct virtnet_info *vi, struct receive_queue *rq)
{
- enable_delayed_refill(vi);
- __virtnet_rx_resume(vi, rq, true);
+ bool schedule_refill = false;
+
+ if (netif_running(vi->dev)) {
+ /* See the comment in virtnet_open for the ordering rule
+ * of try_fill_recv, receive queue napi_enable and delayed
+ * refill enable/schedule.
+ */
+ if (!try_fill_recv(vi, rq, GFP_KERNEL))
+ schedule_refill = true;
+
+ virtnet_napi_enable(rq);
+
+ enable_delayed_refill(vi);
+ if (schedule_refill)
+ schedule_delayed_work(&vi->refill, 0);
+ }
}
static int virtnet_rx_resize(struct virtnet_info *vi,
--
2.43.0
This patch series addresses two issues in demote_folio_list()
and can_demote() in reclaim/demotion.
Commit 7d709f49babc ("vmscan,cgroup: apply mems_effective to reclaim")
introduces the cpuset.mems_effective check and applies it to
can_demote(). However:
1. It does not apply this check in demote_folio_list(), which leads
to situations where pages are demoted to nodes that are
explicitly excluded from the task's cpuset.mems.
2. It checks only the nodes in the immediate next demotion hierarchy
and does not check all allowed demotion targets in can_demote().
This can cause pages to never be demoted if the nodes in the next
demotion hierarchy are not set in mems_effective.
To address these bugs, implement a new function
mem_cgroup_filter_mems_allowed() to filter out nodes that are not
set in mems_effective, and update demote_folio_list() and can_demote()
accordingly.
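A minimal sketch of the filtering idea follows; the real
mem_cgroup_filter_mems_allowed() operates on nodemask_t and cgroup state,
so the types and names below are simplified stand-ins.

#include <stdint.h>
#include <stdio.h>

/* Model node masks as plain bitmaps: bit n set == node n allowed. */
typedef uint64_t nodemask;

/*
 * Drop every candidate demotion target that is not in mems_effective.
 * If nothing is left, the caller should treat demotion as impossible.
 */
static nodemask filter_mems_allowed(nodemask targets, nodemask mems_effective)
{
	return targets & mems_effective;
}

int main(void)
{
	nodemask targets = (1 << 2) | (1 << 3);		     /* next-tier nodes 2-3 */
	nodemask effective = (1 << 0) | (1 << 1) | (1 << 2); /* cpuset.mems=0-2 */

	/* Only node 2 remains as a legal demotion target. */
	printf("allowed demotion targets: %#llx\n",
	       (unsigned long long)filter_mems_allowed(targets, effective));
	return 0;
}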
Reproducing Bug 1:
Assume a system with 4 nodes, where nodes 0-1 are top-tier and
nodes 2-3 are far-tier memory. All nodes have equal capacity.
Test script to reproduce:
echo 1 > /sys/kernel/mm/numa/demotion_enabled
mkdir /sys/fs/cgroup/test
echo +cpuset > /sys/fs/cgroup/cgroup.subtree_control
echo "0-2" > /sys/fs/cgroup/test/cpuset.mems
echo $$ > /sys/fs/cgroup/test/cgroup.procs
swapoff -a
# Expectation: Should respect node 0-2 limit.
# Observation: Node 3 shows significant allocation (MemFree drops)
stress-ng --oomable --vm 1 --vm-bytes 150% --mbind 0,1
Reproducing Bug 2:
Assume a system with 6 nodes, where nodes 0-2 are top-tier,
node 3 is far-tier node, and nodes 4-5 are the farthest-tier nodes.
All nodes have equal capacity.
Test script:
echo 1 > /sys/kernel/mm/numa/demotion_enabled
mkdir /sys/fs/cgroup/test
echo +cpuset > /sys/fs/cgroup/cgroup.subtree_control
echo "0-2,4-5" > /sys/fs/cgroup/test/cpuset.mems
echo $$ > /sys/fs/cgroup/test/cgroup.procs
swapoff -a
# Expectation: Pages are demoted to Nodes 4-5
# Observation: No pages are demoted before oom.
stress-ng --oomable --vm 1 --vm-bytes 150% --mbind 0,1,2
Bing Jiao (2):
mm/vmscan: respect mems_effective in demote_folio_list()
mm/vmscan: check all allowed targets in can_demote()
include/linux/cpuset.h | 6 +++---
include/linux/memcontrol.h | 6 +++---
kernel/cgroup/cpuset.c | 12 +++++-------
mm/memcontrol.c | 5 +++--
mm/vmscan.c | 27 ++++++++++++++++++---------
5 files changed, 32 insertions(+), 24 deletions(-)
--
2.52.0.351.gbe84eed79e-goog
A process issuing blocking writes to a virtio console may get stuck
indefinitely if another thread polls the device. Here is how to trigger
the bug:
- Thread A writes to the port until the virtqueue is full.
- Thread A calls wait_port_writable() and goes to sleep, waiting on
port->waitqueue.
- The host processes some of the write, marks buffers as used and raises
an interrupt.
- Before the interrupt is serviced, thread B executes port_fops_poll().
This calls reclaim_consumed_buffers() via will_write_block() and
consumes all used buffers.
- The interrupt is serviced. vring_interrupt() finds no used buffers
via more_used() and returns without waking port->waitqueue.
- Thread A is still in wait_event(port->waitqueue), waiting for a
wakeup that never arrives.
The crux is that invoking reclaim_consumed_buffers() may cause
vring_interrupt() to omit wakeups.
Fix this by calling reclaim_consumed_buffers() in out_intr() before
waking. This is similar to the call to discard_port_data() in
in_intr(), which also frees buffers from a non-sleepable context.
This in turn guarantees that port->outvq_full is up to date when
handling polls. Since in_intr() already populates port->inbuf, we
use that to avoid changing the reader state.
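As a loose userspace analogue of the ordering the fix enforces (a condition
variable standing in for port->waitqueue; all names below are illustrative,
build with gcc -pthread):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Toy model: whichever path reclaims consumed buffers (and thus clears
 * the "output queue full" condition) does so under the lock and then
 * wakes the waiters itself, instead of relying on a later event that
 * may never fire.
 */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t waitqueue = PTHREAD_COND_INITIALIZER;
static bool outvq_full = true;

static void reclaim_consumed_buffers(void)
{
	/* ...free used buffers here... */
	outvq_full = false;
}

static void *interrupt_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	reclaim_consumed_buffers();	/* reclaim *before* waking */
	pthread_cond_broadcast(&waitqueue);
	pthread_mutex_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, interrupt_path, NULL);

	/* Blocking writer: sleeps until the queue is no longer full. */
	pthread_mutex_lock(&lock);
	while (outvq_full)
		pthread_cond_wait(&waitqueue, &lock);
	pthread_mutex_unlock(&lock);

	pthread_join(t, NULL);
	puts("writer woken, output queue no longer full");
	return 0;
}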
Cc: stable(a)vger.kernel.org
Signed-off-by: Lorenz Bauer <lmb(a)isovalent.com>
---
As far as I can tell all currently maintained stable series kernels need
this commit. Applies and builds cleanly on 5.10.247, verified to fix
the issue.
---
Changes in v2:
- Call reclaim_consumed_buffers() in out_intr instead of
issuing another wake.
- Link to v1: https://lore.kernel.org/r/20251215-virtio-console-lost-wakeup-v1-1-79a5c578…
---
drivers/char/virtio_console.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 088182e54debd6029ea2c2a5542d7a28500e67b8..351e445da35e25910671615a8ecc79165dfc66b7 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -971,10 +971,17 @@ static __poll_t port_fops_poll(struct file *filp, poll_table *wait)
return EPOLLHUP;
}
ret = 0;
- if (!will_read_block(port))
+
+ spin_lock(&port->inbuf_lock);
+ if (port->inbuf)
ret |= EPOLLIN | EPOLLRDNORM;
- if (!will_write_block(port))
+ spin_unlock(&port->inbuf_lock);
+
+ spin_lock(&port->outvq_lock);
+ if (!port->outvq_full)
ret |= EPOLLOUT;
+ spin_unlock(&port->outvq_lock);
+
if (!port->host_connected)
ret |= EPOLLHUP;
@@ -1698,6 +1705,7 @@ static void flush_bufs(struct virtqueue *vq, bool can_sleep)
static void out_intr(struct virtqueue *vq)
{
struct port *port;
+ unsigned long flags;
port = find_port_by_vq(vq->vdev->priv, vq);
if (!port) {
@@ -1705,6 +1713,10 @@ static void out_intr(struct virtqueue *vq)
return;
}
+ spin_lock_irqsave(&port->outvq_lock, flags);
+ reclaim_consumed_buffers(port);
+ spin_unlock_irqrestore(&port->outvq_lock, flags);
+
wake_up_interruptible(&port->waitqueue);
}
---
base-commit: d358e5254674b70f34c847715ca509e46eb81e6f
change-id: 20251215-virtio-console-lost-wakeup-0f566c5cd35f
Best regards,
--
Lorenz Bauer <lmb(a)isovalent.com>
When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
passes the head pfn to kill_accessing_process(), which is not right.
The precise pfn of the poisoned page should be used in order to
determine the precise vaddr as the SIGBUS payload.
This issue has already been taken care of in the normal path, that is,
hwpoison_user_mappings(), see [1][2]. Furthermore, for [3] to work
correctly in the hugetlb repoisoning case, it's essential to inform
the VM of the precise poisoned page, not the head page.
[1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
[2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
[3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
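For illustration only, a small sketch of the address arithmetic the patch
relies on (the values are made up and PAGE_SHIFT is assumed to be 12):

#include <stdio.h>

#define PAGE_SHIFT 12UL

/*
 * Map the poisoned pfn inside a huge mapping back to the exact user
 * virtual address, given the vaddr and pfn of the mapping's first page.
 */
static unsigned long precise_hwpoison_vaddr(unsigned long map_vaddr,
					    unsigned long map_pfn,
					    unsigned long poisoned_pfn)
{
	return map_vaddr + ((poisoned_pfn - map_pfn) << PAGE_SHIFT);
}

int main(void)
{
	/*
	 * 2 MiB hugetlb mapping at 0x400000000 backed by pfn 0x100000;
	 * the poisoned page is the 5th tail page (pfn 0x100005).
	 */
	unsigned long vaddr = precise_hwpoison_vaddr(0x400000000UL,
						     0x100000UL, 0x100005UL);

	/* Prints 0x400005000: the head vaddr plus 5 pages. */
	printf("SIGBUS vaddr: %#lx\n", vaddr);
	return 0;
}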
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Jane Chu <jane.chu(a)oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
---
v1 -> v2:
pickup R-B, add stable to cc list.
---
mm/memory-failure.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 3edebb0cda30..c9d87811b1ea 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
}
static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
- unsigned long poisoned_pfn, struct to_kill *tk)
+ unsigned long poisoned_pfn, struct to_kill *tk,
+ int pte_nr)
{
unsigned long pfn = 0;
+ unsigned long hwpoison_vaddr;
if (pte_present(pte)) {
pfn = pte_pfn(pte);
@@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
pfn = swp_offset_pfn(swp);
}
- if (!pfn || pfn != poisoned_pfn)
+ if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
return 0;
- set_to_kill(tk, addr, shift);
+ hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);
+ set_to_kill(tk, hwpoison_vaddr, shift);
return 1;
}
@@ -749,7 +752,7 @@ static int hwpoison_pte_range(pmd_t *pmdp, unsigned long addr,
for (; addr != end; ptep++, addr += PAGE_SIZE) {
ret = check_hwpoisoned_entry(ptep_get(ptep), addr, PAGE_SHIFT,
- hwp->pfn, &hwp->tk);
+ hwp->pfn, &hwp->tk, 1);
if (ret == 1)
break;
}
@@ -772,8 +775,8 @@ static int hwpoison_hugetlb_range(pte_t *ptep, unsigned long hmask,
ptl = huge_pte_lock(h, walk->mm, ptep);
pte = huge_ptep_get(walk->mm, addr, ptep);
- ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h),
- hwp->pfn, &hwp->tk);
+ ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h), hwp->pfn,
+ &hwp->tk, pages_per_huge_page(h));
spin_unlock(ptl);
return ret;
}
@@ -2023,10 +2026,8 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
*hugetlb = 0;
return 0;
} else if (res == -EHWPOISON) {
- if (flags & MF_ACTION_REQUIRED) {
- folio = page_folio(p);
- res = kill_accessing_process(current, folio_pfn(folio), flags);
- }
+ if (flags & MF_ACTION_REQUIRED)
+ res = kill_accessing_process(current, pfn, flags);
action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
return res;
} else if (res == -EBUSY) {
@@ -2037,6 +2038,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
return action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
}
+
folio = page_folio(p);
folio_lock(folio);
--
2.43.5
The hw clock gating register sequence consists of register value pairs
that are written to the GPU during initialisation.
The a690 hwcg sequence has two GMU registers in it that used to amount
to random writes in the GPU mapping, but since commit 188db3d7fe66
("drm/msm/a6xx: Rebase GMU register offsets") they trigger a fault as
the updated offsets now lie outside the mapping. This in turn breaks
boot of machines like the Lenovo ThinkPad X13s.
Note that the updates of these GMU registers are already taken care of
properly since commit 40c297eb245b ("drm/msm/a6xx: Set GMU CGC
properties on a6xx too"), but for some reason these two entries were
left in the table.
Fixes: 5e7665b5e484 ("drm/msm/adreno: Add Adreno A690 support")
Cc: stable(a)vger.kernel.org # 6.5
Cc: Bjorn Andersson <andersson(a)kernel.org>
Cc: Konrad Dybcio <konradybcio(a)kernel.org>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index 29107b362346..4c2f739ee9b7 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -501,8 +501,6 @@ static const struct adreno_reglist a690_hwcg[] = {
{REG_A6XX_RBBM_CLOCK_CNTL_GMU_GX, 0x00000222},
{REG_A6XX_RBBM_CLOCK_DELAY_GMU_GX, 0x00000111},
{REG_A6XX_RBBM_CLOCK_HYST_GMU_GX, 0x00000555},
- {REG_A6XX_GPU_GMU_AO_GMU_CGC_DELAY_CNTL, 0x10111},
- {REG_A6XX_GPU_GMU_AO_GMU_CGC_HYST_CNTL, 0x5555},
{}
};
--
2.51.2
The driver trusts the 'num' and 'entry_size' fields read from BAR2 and
uses them directly to compute the length for memcpy_fromio() without
any bounds checking. If these fields get corrupted or otherwise contain
invalid values, num * entry_size can exceed the size of
proc_mon_info.entries and lead to a potential out-of-bounds write.
Add validation for 'entry_size' by ensuring it is non-zero and that
num * entry_size does not exceed the size of proc_mon_info.entries.
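As a minimal userspace sketch of the same overflow-safe validation (the
sizes and the BCM_VK_PROC_MON_MAX-style limit below are stand-ins, not the
driver's real values):

#include <stdint.h>
#include <stdio.h>

#define MAX_ENTRIES	8
#define ENTRY_BYTES	32

struct proc_mon_info {
	uint32_t num;
	uint32_t entry_size;
	uint8_t entries[MAX_ENTRIES * ENTRY_BYTES];
};

/*
 * Returns 0 if the device-provided num/entry_size pair is safe to use as a
 * copy length, -1 otherwise.  Dividing the capacity by entry_size avoids
 * the num * entry_size multiplication overflowing before the comparison.
 */
static int validate_copy_len(uint32_t num, uint32_t entry_size,
			     const struct proc_mon_info *mon)
{
	size_t max_bytes = sizeof(mon->entries);

	if (num > MAX_ENTRIES)
		return -1;
	if (!entry_size || (size_t)num > max_bytes / entry_size)
		return -1;
	return 0;
}

int main(void)
{
	struct proc_mon_info mon = { 0 };

	printf("%d\n", validate_copy_len(4, ENTRY_BYTES, &mon));	/* ok */
	printf("%d\n", validate_copy_len(4, 0, &mon));			/* rejected */
	printf("%d\n", validate_copy_len(8, UINT32_MAX, &mon));		/* rejected */
	return 0;
}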
Fixes: ff428d052b3b ("misc: bcm-vk: add get_card_info, peerlog_info, and proc_mon_info")
Cc: stable(a)vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com>
---
drivers/misc/bcm-vk/bcm_vk_dev.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/misc/bcm-vk/bcm_vk_dev.c b/drivers/misc/bcm-vk/bcm_vk_dev.c
index a16b99bdaa13..a4a74c10f02b 100644
--- a/drivers/misc/bcm-vk/bcm_vk_dev.c
+++ b/drivers/misc/bcm-vk/bcm_vk_dev.c
@@ -439,6 +439,7 @@ static void bcm_vk_get_proc_mon_info(struct bcm_vk *vk)
struct device *dev = &vk->pdev->dev;
struct bcm_vk_proc_mon_info *mon = &vk->proc_mon_info;
u32 num, entry_size, offset, buf_size;
+ size_t max_bytes = sizeof(mon->entries);
u8 *dst;
/* calculate offset which is based on peerlog offset */
@@ -458,6 +459,9 @@ static void bcm_vk_get_proc_mon_info(struct bcm_vk *vk)
num, BCM_VK_PROC_MON_MAX);
return;
}
+ if (!entry_size || (size_t)num > max_bytes / entry_size) {
+ return;
+ }
mon->num = num;
mon->entry_size = entry_size;
--
2.43.0
A process issuing blocking writes to a virtio console may get stuck
indefinitely if another thread polls the device. Here is how to trigger
the bug:
- Thread A writes to the port until the virtqueue is full.
- Thread A calls wait_port_writable() and goes to sleep, waiting on
port->waitqueue.
- The host processes some of the write, marks buffers as used and raises
an interrupt.
- Before the interrupt is serviced, thread B executes port_fops_poll().
This calls reclaim_consumed_buffers() via will_write_block() and
consumes all used buffers.
- The interrupt is serviced. vring_interrupt() finds no used buffers
via more_used() and returns without waking port->waitqueue.
- Thread A is still in wait_event(port->waitqueue), waiting for a
wakeup that never arrives.
The crux is that invoking reclaim_consumed_buffers() may cause
vring_interrupt() to omit wakeups.
Fix this by making reclaim_consumed_buffers() issue an additional wake
up if it consumed any buffers.
Signed-off-by: Lorenz Bauer <lmb(a)isovalent.com>
---
As far as I can tell, all currently maintained stable series kernels need
this commit. I've tested that it applies cleanly to 5.10.247; however, I
wasn't able to build that kernel due to an unrelated link error. Instead I
applied it to 5.15.197, which compiled, and verified the fix there.
---
drivers/char/virtio_console.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 088182e54debd6029ea2c2a5542d7a28500e67b8..7cd3ad9da9b53a7a570410f12501acc7fd7e3b9b 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -581,6 +581,7 @@ static ssize_t send_control_msg(struct port *port, unsigned int event,
static void reclaim_consumed_buffers(struct port *port)
{
struct port_buffer *buf;
+ bool freed = false;
unsigned int len;
if (!port->portdev) {
@@ -589,7 +590,15 @@ static void reclaim_consumed_buffers(struct port *port)
}
while ((buf = virtqueue_get_buf(port->out_vq, &len))) {
free_buf(buf, false);
+ freed = true;
+ }
+ if (freed) {
+ /* We freed all used buffers. Issue a wake up so that other pending
+ * tasks do not get stuck. This is necessary because vring_interrupt()
+ * will drop wakeups from the host if there are no used buffers.
+ */
port->outvq_full = false;
+ wake_up_interruptible(&port->waitqueue);
}
}
---
base-commit: d358e5254674b70f34c847715ca509e46eb81e6f
change-id: 20251215-virtio-console-lost-wakeup-0f566c5cd35f
Best regards,
--
Lorenz Bauer <lmb(a)isovalent.com>
Add a missing NULL check for the cci parameter before dereferencing it in
ucsi_sync_control_common(). The function can be called with cci == NULL
from ucsi_acknowledge(), which leads to a NULL pointer dereference
when accessing *cci in the condition check.
The crash occurs because the code checks if cci is not null before
calling ucsi->ops->read_cci(ucsi, cci), but then immediately
dereferences cci without a null check in the following condition:
(*cci & UCSI_CCI_COMMAND_COMPLETE).
KASAN trace:
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
RIP: 0010:ucsi_sync_control_common+0x2ae/0x4e0 [typec_ucsi]
Cc: stable(a)vger.kernel.org
Fixes: 667ecac55861 ("usb: typec: ucsi: return CCI and message from sync_control callback")
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Mario Limonciello (AMD) <superm1(a)kernel.org>
---
v2:
* Add stable tag
* Add Heikki's tag
---
drivers/usb/typec/ucsi/ucsi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index 9b3df776137a1..7129973f19e7e 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -97,7 +97,7 @@ int ucsi_sync_control_common(struct ucsi *ucsi, u64 command, u32 *cci)
if (!ret && cci)
ret = ucsi->ops->read_cci(ucsi, cci);
- if (!ret && ucsi->message_in_size > 0 &&
+ if (!ret && cci && ucsi->message_in_size > 0 &&
(*cci & UCSI_CCI_COMMAND_COMPLETE))
ret = ucsi->ops->read_message_in(ucsi, ucsi->message_in,
ucsi->message_in_size);
--
2.43.0
From: Quentin Schulz <quentin.schulz(a)cherry.de>
The Devicetree Specification states[1] that
"""
Each node in the devicetree is named according to the following
convention:
node-name@unit-address
[...]
The unit-address must match the first address specified in the reg
property of the node.
"""
The first address in the reg property is fdaXa000 and not fdaX9000. This
is likely a copy-paste error as the IOMMU for core0 has two entries in
the reg property, the first one being fdab9000 and the second fdaba000.
Let's fix this oversight to match what the spec is expecting.
[1] https://github.com/devicetree-org/devicetree-specification/releases/downloa… 2.2.1 Node Names
Fixes: a31dfc060a74 ("arm64: dts: rockchip: Add nodes for NPU and its MMU to rk3588-base")
Cc: stable(a)vger.kernel.org
Signed-off-by: Quentin Schulz <quentin.schulz(a)cherry.de>
---
arch/arm64/boot/dts/rockchip/rk3588-base.dtsi | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
index 2a79217930206..7ab12d1054a73 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
@@ -1200,7 +1200,7 @@ rknn_core_1: npu@fdac0000 {
status = "disabled";
};
- rknn_mmu_1: iommu@fdac9000 {
+ rknn_mmu_1: iommu@fdaca000 {
compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
reg = <0x0 0xfdaca000 0x0 0x100>;
interrupts = <GIC_SPI 111 IRQ_TYPE_LEVEL_HIGH 0>;
@@ -1230,7 +1230,7 @@ rknn_core_2: npu@fdad0000 {
status = "disabled";
};
- rknn_mmu_2: iommu@fdad9000 {
+ rknn_mmu_2: iommu@fdada000 {
compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
reg = <0x0 0xfdada000 0x0 0x100>;
interrupts = <GIC_SPI 112 IRQ_TYPE_LEVEL_HIGH 0>;
---
base-commit: a619746d25c8adafe294777cc98c47a09759b3ed
change-id: 20251215-npu-dt-node-address-7bfeab42dae9
Best regards,
--
Quentin Schulz <quentin.schulz(a)cherry.de>
This reverts commit ba4a2e2317b9faeca9193ed6d3193ddc3cf2aba3.
While this fake hotplugging was a nice idea, it has turned out that this
feature does not handle PCIe switches correctly:
pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43
pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them
pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44
pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them
pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45
pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them
pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46
pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them
pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46
pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41])
pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them
pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46
During the initial scan, the PCI core doesn't see the switch and, since the
Root Port is not hotplug capable, the secondary bus number gets assigned as
the subordinate bus number. This means the PCI core assumes that only one
bus will appear behind the Root Port, because the Root Port is not hotplug
capable.
This works perfectly fine for PCIe endpoints connected to the Root Port,
since they don't extend the bus. However, if a PCIe switch is connected,
then there is a problem when the downstream buses start showing up, because
the PCI core doesn't extend the subordinate bus number after the initial
scan during boot.
The long-term plan is to migrate this driver to the pwrctrl framework,
once it adds proper support for powering up and enumerating PCIe switches.
Cc: stable(a)vger.kernel.org
Suggested-by: Manivannan Sadhasivam <mani(a)kernel.org>
Acked-by: Shawn Lin <shawn.lin(a)rock-chips.com>
Tested-by: Shawn Lin <shawn.lin(a)rock-chips.com>
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
---
drivers/pci/controller/dwc/pcie-qcom.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index e87ec6779d44..c5fcb87972e9 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -136,7 +136,6 @@
/* PARF_INT_ALL_{STATUS/CLEAR/MASK} register fields */
#define PARF_INT_ALL_LINK_UP BIT(13)
-#define PARF_INT_MSI_DEV_0_7 GENMASK(30, 23)
/* PARF_NO_SNOOP_OVERRIDE register fields */
#define WR_NO_SNOOP_OVERRIDE_EN BIT(1)
@@ -1982,8 +1981,7 @@ static int qcom_pcie_probe(struct platform_device *pdev)
goto err_host_deinit;
}
- writel_relaxed(PARF_INT_ALL_LINK_UP | PARF_INT_MSI_DEV_0_7,
- pcie->parf + PARF_INT_ALL_MASK);
+ writel_relaxed(PARF_INT_ALL_LINK_UP, pcie->parf + PARF_INT_ALL_MASK);
}
qcom_pcie_icc_opp_update(pcie);
--
2.52.0