If the CM ID is destroyed while the CM event for multicast creation is
still queued, cancel_work_sync() will prevent the work from running,
which also prevents destroying the ah_attr. This leaks a refcount and
triggers a WARN:
GID entry ref leak for dev syz1 index 2 ref=573
WARNING: CPU: 1 PID: 655 at drivers/infiniband/core/cache.c:809 release_gid_table drivers/infiniband/core/cache.c:806 [inline]
WARNING: CPU: 1 PID: 655 at drivers/infiniband/core/cache.c:809 gid_table_release_one+0x284/0x3cc drivers/infiniband/core/cache.c:886
Destroy the ah_attr after canceling the work; it is safe to call this
twice.
Cc: stable(a)vger.kernel.org
Fixes: fe454dc31e84 ("RDMA/ucma: Fix use-after-free bug in ucma_create_uevent")
Reported-by: syzbot+b0da83a6c0e2e2bddbd4(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68232e7b.050a0220.f2294.09f6.GAE@google.com
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
---
drivers/infiniband/core/cma.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 95e89f5c147c2c..4f5fd47086ab90 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2031,6 +2031,8 @@ static void destroy_mc(struct rdma_id_private *id_priv,
dev_put(ndev);
cancel_work_sync(&mc->iboe_join.work);
+ if (event->event == RDMA_CM_EVENT_MULTICAST_JOIN)
+ rdma_destroy_ah_attr(&event->param.ud.ah_attr);
}
kfree(mc);
}
base-commit: 3fbaef0942719187f3396bfd0c780d55d35e0980
--
2.43.0
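
The reasoning behind the RDMA/cma fix above can be modelled in plain
userspace C. This is a minimal sketch, not the kernel code: struct
gid_entry, struct ah_attr_model, init_ah_attr_model(),
destroy_ah_attr_model() and destroy_mc_model() are hypothetical stand-ins
for the GID table entry, struct rdma_ah_attr, rdma_destroy_ah_attr() and
destroy_mc(). It assumes the destroy helper drops the reference and clears
the pointer, which is what would make a second call harmless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted GID table entry. */
struct gid_entry {
	int refcount;
};

/* Hypothetical stand-in for the ah_attr carried in a queued CM event. */
struct ah_attr_model {
	struct gid_entry *sgid; /* NULL once the reference has been dropped */
};

/* Take a reference when the event is built. */
static void init_ah_attr_model(struct ah_attr_model *attr,
			       struct gid_entry *gid)
{
	gid->refcount++;
	attr->sgid = gid;
}

/* Drop the reference and clear the pointer, so a second call is a no-op. */
static void destroy_ah_attr_model(struct ah_attr_model *attr)
{
	if (attr->sgid) {
		attr->sgid->refcount--;
		attr->sgid = NULL;
	}
}

/*
 * Model of the teardown path. work_ran says whether the queued handler got
 * to run (and clean up) before the cancel. The unconditional destroy after
 * the cancel covers the case where it did not.
 */
static void destroy_mc_model(struct ah_attr_model *attr, bool work_ran)
{
	if (work_ran)
		destroy_ah_attr_model(attr); /* handler's own cleanup */
	/* cancel_work_sync() would sit here in the real code */
	destroy_ah_attr_model(attr);         /* the cleanup the patch adds */
}
```

Whether or not the handler ran, the reference count returns to its
starting value, because the second destroy finds a NULL pointer and does
nothing.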
Initialize the eb.vma array with values of 0 when the eb structure is
first set up. In particular, this sets the eb->vma[i].vma pointers to
NULL, simplifying cleanup and getting rid of the bug described below.
During the execution of eb_lookup_vmas(), the eb->vma array is
successively filled up with struct eb_vma objects. This process includes
calling eb_add_vma(), which might fail; however, even in the event of
failure, eb->vma[i].vma is set for the currently processed buffer.
If eb_add_vma() fails, eb_lookup_vmas() returns with an error, which
prompts a call to eb_release_vmas() to clean up the mess. Since
eb_lookup_vmas() might fail while processing any buffer (not necessarily
the first), eb_release_vmas() checks whether a buffer's vma is NULL to
know at what point the lookup function failed.
In eb_lookup_vmas(), eb->vma[i].vma is set to NULL if either the helper
function eb_lookup_vma() or eb_validate_vma() fails. eb->vma[i+1].vma is
set to NULL in case i915_gem_object_userptr_submit_init() fails; the
current one needs to be cleaned up by eb_release_vmas() at this point,
so the next one is set. If eb_add_vma() fails, neither the current nor
the next vma is set to NULL, which is a source of a NULL deref bug
described in the issue linked in the Closes tag.
When entering eb_lookup_vmas(), the vma pointers are set to the slab
poison value instead of NULL. This doesn't matter for the actual
lookup, since the value gets overwritten anyway; however, the
eb_release_vmas() function only recognizes NULL as the stopping value,
hence the pointers are set to NULL one by one in case of intermediate
failure. This patch changes the approach to filling them all with NULL
at the start instead, rather than handling that manually during failure.
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15062
Fixes: 544460c33821 ("drm/i915: Multi-BB execbuf")
Reported-by: Gangmin Kim <km.kim1503(a)gmail.com>
Cc: <stable(a)vger.kernel.org> # 5.16.x
Signed-off-by: Krzysztof Niemiec <krzysztof.niemiec(a)intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik(a)linux.intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas(a)intel.com>
Reviewed-by: Andi Shyti <andi.shyti(a)linux.intel.com>
---
I messed up the continuity in previous revisions; the original patch
was sent as [1], and the first revision (which I didn't mark as v2 due
to the title change) was sent as [2].
This is the full current changelog:
v5:
- improve style and fix nits in commit log (Andi)
- fix typos and style in the code and comments (Andi)
- set args->buffer_count + 1 values to 0 instead of just
args->buffer_count (Andi)
v4:
- delete an empty line (Janusz), reword the comment a bit (Krzysztof,
Janusz)
v3:
- use memset() to fill the entire eb.vma array with zeros instead of
looping through the elements (Janusz)
- add a comment clarifying the mechanism of the initial allocation (Janusz)
- change the commit log again, including title
- rearrange the tags to keep checkpatch happy
v2:
- set the eb->vma[i].vma pointers to NULL during setup instead of
ad-hoc at failure (Janusz)
- romanize the reporter's name (Andi, offline)
- change the commit log, including title
[1] https://patchwork.freedesktop.org/series/156832/
[2] https://patchwork.freedesktop.org/series/158036/
.../gpu/drm/i915/gem/i915_gem_execbuffer.c | 37 +++++++++----------
1 file changed, 17 insertions(+), 20 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index b057c2fa03a4..d49e96f9be51 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -951,13 +951,13 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
vma = eb_lookup_vma(eb, eb->exec[i].handle);
if (IS_ERR(vma)) {
err = PTR_ERR(vma);
- goto err;
+ return err;
}
err = eb_validate_vma(eb, &eb->exec[i], vma);
if (unlikely(err)) {
i915_vma_put(vma);
- goto err;
+ return err;
}
err = eb_add_vma(eb, &current_batch, i, vma);
@@ -966,19 +966,8 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
if (i915_gem_object_is_userptr(vma->obj)) {
err = i915_gem_object_userptr_submit_init(vma->obj);
- if (err) {
- if (i + 1 < eb->buffer_count) {
- /*
- * Execbuffer code expects last vma entry to be NULL,
- * since we already initialized this entry,
- * set the next value to NULL or we mess up
- * cleanup handling.
- */
- eb->vma[i + 1].vma = NULL;
- }
-
+ if (err)
return err;
- }
eb->vma[i].flags |= __EXEC_OBJECT_USERPTR_INIT;
eb->args->flags |= __EXEC_USERPTR_USED;
@@ -986,10 +975,6 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
}
return 0;
-
-err:
- eb->vma[i].vma = NULL;
- return err;
}
static int eb_lock_vmas(struct i915_execbuffer *eb)
@@ -3375,7 +3360,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
eb.exec = exec;
eb.vma = (struct eb_vma *)(exec + args->buffer_count + 1);
- eb.vma[0].vma = NULL;
+ memset(eb.vma, 0, (args->buffer_count + 1) * sizeof(struct eb_vma));
+
eb.batch_pool = NULL;
eb.invalid_flags = __EXEC_OBJECT_UNKNOWN_FLAGS;
@@ -3584,7 +3570,18 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
if (err)
return err;
- /* Allocate extra slots for use by the command parser */
+ /*
+ * Allocate extra slots for use by the command parser.
+ *
+ * Note that this allocation handles two different arrays (the
+ * exec2_list array, and the eventual eb.vma array introduced in
+ * i915_gem_do_execbuffer()), that reside in virtually contiguous
+ * memory. Also note that the allocation intentionally doesn't fill the
+ * area with zeros, because the exec2_list part doesn't need to be, as
+ * it's immediately overwritten by user data a few lines below.
+ * However, the eb.vma part is explicitly zeroed later in
+ * i915_gem_do_execbuffer().
+ */
exec2_list = kvmalloc_array(count + 2, eb_element_size(),
__GFP_NOWARN | GFP_KERNEL);
if (exec2_list == NULL) {
--
2.45.2
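
The NULL-terminator scheme described in the i915 commit message above can
be sketched in userspace C. This is a simplified model under stated
assumptions, not the driver code: struct vma_model, struct eb_vma_model,
setup_slots(), lookup_slots() and release_slots() are hypothetical names,
malloc() stands in for the non-zeroing kvmalloc_array() call, and a failure
is simulated after the current slot's pointer has been stored, as the log
describes for eb_add_vma().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object and slot types standing in for the vma array. */
struct vma_model {
	int released;
};

struct eb_vma_model {
	struct vma_model *vma; /* NULL marks the end of the valid entries */
};

/*
 * Allocate count + 1 slots without zeroing, then zero them explicitly so
 * that every slot starts out as a terminator.
 */
static struct eb_vma_model *setup_slots(size_t count)
{
	struct eb_vma_model *slots = malloc((count + 1) * sizeof(*slots));

	if (!slots)
		return NULL;
	memset(slots, 0, (count + 1) * sizeof(*slots));
	return slots;
}

/*
 * Fill the slots in order. A failure is simulated at index fail_at after
 * that slot's pointer was stored. No manual NULLing is done on error.
 * Pass fail_at == count to simulate success.
 */
static int lookup_slots(struct eb_vma_model *slots, struct vma_model *objs,
			size_t count, size_t fail_at)
{
	for (size_t i = 0; i < count; i++) {
		slots[i].vma = &objs[i];
		if (i == fail_at)
			return -1;
	}
	return 0;
}

/* Release entries until the first NULL; return how many were released. */
static size_t release_slots(struct eb_vma_model *slots, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (!slots[i].vma)
			break;
		slots[i].vma->released = 1;
		slots[i].vma = NULL;
	}
	return i;
}
```

Because every slot is zeroed up front, the release loop stops at the right
place regardless of which step failed, and the error paths need no special
handling of the current or next slot.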
The patch titled
Subject: mm/memory-failure: fix missing ->mf_stats count in hugetlb poison
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-memory-failure-fix-missing-mf_stats-count-in-hugetlb-poison.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days
------------------------------------------------------
From: Jane Chu <jane.chu(a)oracle.com>
Subject: mm/memory-failure: fix missing ->mf_stats count in hugetlb poison
Date: Tue, 16 Dec 2025 14:56:21 -0700
When a newly poisoned subpage ends up in an already poisoned hugetlb
folio, 'num_poisoned_pages' is incremented, but the per node ->mf_stats is
not. Fix the inconsistency by designating action_result() to update them
both.
Link: https://lkml.kernel.org/r/20251216215621.920093-1-jane.chu@oracle.com
Fixes: 18f41fa616ee4 ("mm: memory-failure: bump memory failure stats to pglist_data")
Signed-off-by: Jane Chu <jane.chu(a)oracle.com>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Jiaqi Yan <jiaqiyan(a)google.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: William Roche <william.roche(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/hugetlb.h | 4 ++--
include/linux/mm.h | 4 ++--
mm/hugetlb.c | 4 ++--
mm/memory-failure.c | 22 +++++++++++++---------
4 files changed, 19 insertions(+), 15 deletions(-)
--- a/include/linux/hugetlb.h~mm-memory-failure-fix-missing-mf_stats-count-in-hugetlb-poison
+++ a/include/linux/hugetlb.h
@@ -156,7 +156,7 @@ long hugetlb_unreserve_pages(struct inod
bool folio_isolate_hugetlb(struct folio *folio, struct list_head *list);
int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb, bool unpoison);
int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
- bool *migratable_cleared);
+ bool *migratable_cleared, bool *samepg);
void folio_putback_hugetlb(struct folio *folio);
void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason);
void hugetlb_fix_reserve_counts(struct inode *inode);
@@ -418,7 +418,7 @@ static inline int get_hwpoison_hugetlb_f
}
static inline int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
- bool *migratable_cleared)
+ bool *migratable_cleared, bool *samepg)
{
return 0;
}
--- a/include/linux/mm.h~mm-memory-failure-fix-missing-mf_stats-count-in-hugetlb-poison
+++ a/include/linux/mm.h
@@ -4351,7 +4351,7 @@ extern int soft_offline_page(unsigned lo
extern const struct attribute_group memory_failure_attr_group;
extern void memory_failure_queue(unsigned long pfn, int flags);
extern int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
- bool *migratable_cleared);
+ bool *migratable_cleared, bool *samepg);
void num_poisoned_pages_inc(unsigned long pfn);
void num_poisoned_pages_sub(unsigned long pfn, long i);
#else
@@ -4360,7 +4360,7 @@ static inline void memory_failure_queue(
}
static inline int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
- bool *migratable_cleared)
+ bool *migratable_cleared, bool *samepg)
{
return 0;
}
--- a/mm/hugetlb.c~mm-memory-failure-fix-missing-mf_stats-count-in-hugetlb-poison
+++ a/mm/hugetlb.c
@@ -7132,12 +7132,12 @@ int get_hwpoison_hugetlb_folio(struct fo
}
int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
- bool *migratable_cleared)
+ bool *migratable_cleared, bool *samepg)
{
int ret;
spin_lock_irq(&hugetlb_lock);
- ret = __get_huge_page_for_hwpoison(pfn, flags, migratable_cleared);
+ ret = __get_huge_page_for_hwpoison(pfn, flags, migratable_cleared, samepg);
spin_unlock_irq(&hugetlb_lock);
return ret;
}
--- a/mm/memory-failure.c~mm-memory-failure-fix-missing-mf_stats-count-in-hugetlb-poison
+++ a/mm/memory-failure.c
@@ -1883,7 +1883,8 @@ static unsigned long __folio_free_raw_hw
return count;
}
-static int folio_set_hugetlb_hwpoison(struct folio *folio, struct page *page)
+static int folio_set_hugetlb_hwpoison(struct folio *folio, struct page *page,
+ bool *samepg)
{
struct llist_head *head;
struct raw_hwp_page *raw_hwp;
@@ -1899,17 +1900,16 @@ static int folio_set_hugetlb_hwpoison(st
return -EHWPOISON;
head = raw_hwp_list_head(folio);
llist_for_each_entry(p, head->first, node) {
- if (p->page == page)
+ if (p->page == page) {
+ *samepg = true;
return -EHWPOISON;
+ }
}
raw_hwp = kmalloc(sizeof(struct raw_hwp_page), GFP_ATOMIC);
if (raw_hwp) {
raw_hwp->page = page;
llist_add(&raw_hwp->node, head);
- /* the first error event will be counted in action_result(). */
- if (ret)
- num_poisoned_pages_inc(page_to_pfn(page));
} else {
/*
* Failed to save raw error info. We no longer trace all
@@ -1966,7 +1966,7 @@ void folio_clear_hugetlb_hwpoison(struct
* -EHWPOISON - the hugepage is already hwpoisoned
*/
int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
- bool *migratable_cleared)
+ bool *migratable_cleared, bool *samepg)
{
struct page *page = pfn_to_page(pfn);
struct folio *folio = page_folio(page);
@@ -1991,7 +1991,7 @@ int __get_huge_page_for_hwpoison(unsigne
goto out;
}
- if (folio_set_hugetlb_hwpoison(folio, page)) {
+ if (folio_set_hugetlb_hwpoison(folio, page, samepg)) {
ret = -EHWPOISON;
goto out;
}
@@ -2024,11 +2024,12 @@ static int try_memory_failure_hugetlb(un
struct page *p = pfn_to_page(pfn);
struct folio *folio;
unsigned long page_flags;
+ bool samepg = false;
bool migratable_cleared = false;
*hugetlb = 1;
retry:
- res = get_huge_page_for_hwpoison(pfn, flags, &migratable_cleared);
+ res = get_huge_page_for_hwpoison(pfn, flags, &migratable_cleared, &samepg);
if (res == 2) { /* fallback to normal page handling */
*hugetlb = 0;
return 0;
@@ -2037,7 +2038,10 @@ retry:
folio = page_folio(p);
res = kill_accessing_process(current, folio_pfn(folio), flags);
}
- action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
+ if (samepg)
+ action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
+ else
+ action_result(pfn, MF_MSG_HUGE, MF_FAILED);
return res;
} else if (res == -EBUSY) {
if (!(flags & MF_NO_RETRY)) {
_
Patches currently in -mm which might be from jane.chu(a)oracle.com are
mm-memory-failure-fix-missing-mf_stats-count-in-hugetlb-poison.patch
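
The accounting rule in the memory-failure patch above can be illustrated
with a small userspace model. This is a sketch under assumptions, not the
kernel implementation: struct stats_model, struct folio_model,
record_subpage(), action_result_model() and handle_poison() are
hypothetical names, the two counters stand in for 'num_poisoned_pages' and
the per node ->mf_stats, and the model assumes that the reporting function
skips both counters only for the "already poisoned" message type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum msg_model { MSG_ALREADY_POISONED, MSG_HUGE };

/* Hypothetical counters: global poisoned pages and per-node failure stats. */
struct stats_model {
	long num_poisoned_pages;
	long node_mf_total;
};

#define MAX_RAW 8

/* Hypothetical folio that remembers which subpages were reported. */
struct folio_model {
	unsigned long raw_pfn[MAX_RAW];
	size_t nr_raw;
};

/* Record a poisoned subpage; set *samepg if it was already recorded. */
static void record_subpage(struct folio_model *f, unsigned long pfn,
			   bool *samepg)
{
	for (size_t i = 0; i < f->nr_raw; i++) {
		if (f->raw_pfn[i] == pfn) {
			*samepg = true;
			return;
		}
	}
	if (f->nr_raw < MAX_RAW)
		f->raw_pfn[f->nr_raw++] = pfn;
}

/* Single place that bumps both counters, so they cannot drift apart. */
static void action_result_model(struct stats_model *s, enum msg_model type)
{
	if (type != MSG_ALREADY_POISONED) {
		s->num_poisoned_pages++;
		s->node_mf_total++;
	}
}

/* A repeated report of the same subpage is not counted; a new one is. */
static void handle_poison(struct folio_model *f, struct stats_model *s,
			  unsigned long pfn)
{
	bool samepg = false;

	record_subpage(f, pfn, &samepg);
	action_result_model(s, samepg ? MSG_ALREADY_POISONED : MSG_HUGE);
}
```

Reporting two distinct subpages and then repeating the second one leaves
both counters at two in this model, which is the consistency the patch
aims for.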
This patch reverts fuse back to its original behavior of sync being a no-op.
This fixes the userspace regression reported by Athul and J. upstream in
[1][2], in which a bug in a fuse server that causes the server to never
complete writeback makes wait_sb_inodes() wait forever.
Thanks,
Joanne
[1] https://lore.kernel.org/regressions/CAJnrk1ZjQ8W8NzojsvJPRXiv9TuYPNdj8Ye7=C…
[2] https://lore.kernel.org/linux-fsdevel/aT7JRqhUvZvfUQlV@eldamar.lan/
Changelog:
v1: https://lore.kernel.org/linux-mm/20251120184211.2379439-1-joannelkoong@gmai…
* Change AS_WRITEBACK_MAY_HANG to AS_NO_DATA_INTEGRITY and keep
AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM as is.
Joanne Koong (1):
fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()
fs/fs-writeback.c | 3 ++-
fs/fuse/file.c | 4 +++-
include/linux/pagemap.h | 11 +++++++++++
3 files changed, 16 insertions(+), 2 deletions(-)
--
2.47.3
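
The idea named in the patch title, skipping flagged mappings while waiting
for writeback, can be sketched as follows. This is only an illustration
under assumptions: the patch body is not shown here, so
MAPPING_NO_DATA_INTEGRITY, struct mapping_model and wait_sb_inodes_model()
are hypothetical names, and clearing a boolean stands in for actually
waiting on writeback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit modelling AS_NO_DATA_INTEGRITY. */
#define MAPPING_NO_DATA_INTEGRITY (1UL << 0)

/* Hypothetical stand-in for an inode's address space. */
struct mapping_model {
	unsigned long flags;
	bool writeback_pending;
};

/*
 * Walk all mappings and "wait" on those with pending writeback, except the
 * ones that make no data-integrity promise. Returns how many were waited
 * on. A mapping whose writeback never completes cannot stall the walk if
 * it carries the flag.
 */
static size_t wait_sb_inodes_model(struct mapping_model *maps, size_t n)
{
	size_t waited = 0;

	for (size_t i = 0; i < n; i++) {
		if (maps[i].flags & MAPPING_NO_DATA_INTEGRITY)
			continue; /* skipped: sync is a no-op here */
		if (!maps[i].writeback_pending)
			continue;
		maps[i].writeback_pending = false; /* stands in for waiting */
		waited++;
	}
	return waited;
}
```

In this model a flagged mapping keeps its pending state untouched, so the
walk always terminates even if that writeback would never finish.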