The patch titled
Subject: idr: fix idr_alloc() returning an ID out of range
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
idr-fix-idr_alloc-returning-an-id-out-of-range.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Subject: idr: fix idr_alloc() returning an ID out of range
Date: Fri, 28 Nov 2025 16:18:32 +0000
If you use an IDR with a non-zero base, and specify a range that lies
entirely below the base, 'max - base' becomes very large and
idr_get_free() can return an ID that lies outside of the requested range.
Link: https://lkml.kernel.org/r/20251128161853.3200058-1-willy@infradead.org
Fixes: 6ce711f27500 (idr: Make 1-based IDRs more efficient)
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Reported-by: Jan Sokolowski <jan.sokolowski(a)intel.com>
Reported-by: Koen Koning koen.koning(a)intel.com
Reported-by: Peter Senna Tschudin peter.senna(a)linux.intel.com
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6449
Reviewed-by: Christian K��nig <christian.koenig(a)amd.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/idr.c | 2 ++
tools/testing/radix-tree/idr-test.c | 21 +++++++++++++++++++++
2 files changed, 23 insertions(+)
--- a/lib/idr.c~idr-fix-idr_alloc-returning-an-id-out-of-range
+++ a/lib/idr.c
@@ -40,6 +40,8 @@ int idr_alloc_u32(struct idr *idr, void
if (WARN_ON_ONCE(!(idr->idr_rt.xa_flags & ROOT_IS_IDR)))
idr->idr_rt.xa_flags |= IDR_RT_MARKER;
+ if (max < base)
+ return -ENOSPC;
id = (id < base) ? 0 : id - base;
radix_tree_iter_init(&iter, id);
--- a/tools/testing/radix-tree/idr-test.c~idr-fix-idr_alloc-returning-an-id-out-of-range
+++ a/tools/testing/radix-tree/idr-test.c
@@ -57,6 +57,26 @@ void idr_alloc_test(void)
idr_destroy(&idr);
}
+void idr_alloc2_test(void)
+{
+ int id;
+ struct idr idr = IDR_INIT_BASE(idr, 1);
+
+ id = idr_alloc(&idr, idr_alloc2_test, 0, 1, GFP_KERNEL);
+ assert(id == -ENOSPC);
+
+ id = idr_alloc(&idr, idr_alloc2_test, 1, 2, GFP_KERNEL);
+ assert(id == 1);
+
+ id = idr_alloc(&idr, idr_alloc2_test, 0, 1, GFP_KERNEL);
+ assert(id == -ENOSPC);
+
+ id = idr_alloc(&idr, idr_alloc2_test, 0, 2, GFP_KERNEL);
+ assert(id == -ENOSPC);
+
+ idr_destroy(&idr);
+}
+
void idr_replace_test(void)
{
DEFINE_IDR(idr);
@@ -409,6 +429,7 @@ void idr_checks(void)
idr_replace_test();
idr_alloc_test();
+ idr_alloc2_test();
idr_null_test();
idr_nowait_test();
idr_get_next_test(0);
_
Patches currently in -mm which might be from willy(a)infradead.org are
idr-fix-idr_alloc-returning-an-id-out-of-range.patch
mm-fix-vma_start_write_killable-signal-handling.patch
The patch titled
Subject: mm/kasan: fix incorrect unpoisoning in vrealloc for KASAN
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-kasan-fix-incorrect-unpoisoning-in-vrealloc-for-kasan.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Jiayuan Chen <jiayuan.chen(a)linux.dev>
Subject: mm/kasan: fix incorrect unpoisoning in vrealloc for KASAN
Date: Fri, 28 Nov 2025 19:15:14 +0800
Syzkaller reported a memory out-of-bounds bug [1]. This patch fixes two
issues:
1. In vrealloc, we were missing the KASAN_VMALLOC_VM_ALLOC flag when
unpoisoning the extended region. This flag is required to correctly
associate the allocation with KASAN's vmalloc tracking.
Note: In contrast, vzalloc (via __vmalloc_node_range_noprof) explicitly
sets KASAN_VMALLOC_VM_ALLOC and calls kasan_unpoison_vmalloc() with it.
vrealloc must behave consistently ��� especially when reusing existing
vmalloc regions ��� to ensure KASAN can track allocations correctly.
2. When vrealloc reuses an existing vmalloc region (without allocating new
pages), KASAN previously generated a new tag, which broke tag-based
memory access tracking. We now add a 'reuse_tag' parameter to
__kasan_unpoison_vmalloc() to preserve the original tag in such cases.
A new helper kasan_unpoison_vralloc() is introduced to handle this reuse
scenario, ensuring consistent tag behavior during reallocation.
Link: https://lkml.kernel.org/r/20251128111516.244497-1-jiayuan.chen@linux.dev
Link: https://syzkaller.appspot.com/bug?extid=997752115a851cb0cf36 [1]
Fixes: a0309faf1cb0 ("mm: vmalloc: support more granular vrealloc() sizing")
Signed-off-by: Jiayuan Chen <jiayuan.chen(a)linux.dev>
Reported-by: syzbot+997752115a851cb0cf36(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68e243a2.050a0220.1696c6.007d.GAE@google.com/T/
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Danilo Krummrich <dakr(a)kernel.org>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Kees Cook <kees(a)kernel.org>
Cc: "Uladzislau Rezki (Sony)" <urezki(a)gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/kasan.h | 21 +++++++++++++++++++--
mm/kasan/hw_tags.c | 4 ++--
mm/kasan/shadow.c | 6 ++++--
mm/vmalloc.c | 4 ++--
4 files changed, 27 insertions(+), 8 deletions(-)
--- a/include/linux/kasan.h~mm-kasan-fix-incorrect-unpoisoning-in-vrealloc-for-kasan
+++ a/include/linux/kasan.h
@@ -596,13 +596,23 @@ static inline void kasan_release_vmalloc
#endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
void *__kasan_unpoison_vmalloc(const void *start, unsigned long size,
- kasan_vmalloc_flags_t flags);
+ kasan_vmalloc_flags_t flags, bool reuse_tag);
+
+static __always_inline void *kasan_unpoison_vrealloc(const void *start,
+ unsigned long size,
+ kasan_vmalloc_flags_t flags)
+{
+ if (kasan_enabled())
+ return __kasan_unpoison_vmalloc(start, size, flags, true);
+ return (void *)start;
+}
+
static __always_inline void *kasan_unpoison_vmalloc(const void *start,
unsigned long size,
kasan_vmalloc_flags_t flags)
{
if (kasan_enabled())
- return __kasan_unpoison_vmalloc(start, size, flags);
+ return __kasan_unpoison_vmalloc(start, size, flags, false);
return (void *)start;
}
@@ -629,6 +639,13 @@ static inline void kasan_release_vmalloc
unsigned long free_region_end,
unsigned long flags) { }
+static inline void *kasan_unpoison_vrealloc(const void *start,
+ unsigned long size,
+ kasan_vmalloc_flags_t flags)
+{
+ return (void *)start;
+}
+
static inline void *kasan_unpoison_vmalloc(const void *start,
unsigned long size,
kasan_vmalloc_flags_t flags)
--- a/mm/kasan/hw_tags.c~mm-kasan-fix-incorrect-unpoisoning-in-vrealloc-for-kasan
+++ a/mm/kasan/hw_tags.c
@@ -317,7 +317,7 @@ static void init_vmalloc_pages(const voi
}
void *__kasan_unpoison_vmalloc(const void *start, unsigned long size,
- kasan_vmalloc_flags_t flags)
+ kasan_vmalloc_flags_t flags, bool reuse_tag)
{
u8 tag;
unsigned long redzone_start, redzone_size;
@@ -361,7 +361,7 @@ void *__kasan_unpoison_vmalloc(const voi
return (void *)start;
}
- tag = kasan_random_tag();
+ tag = reuse_tag ? get_tag(start) : kasan_random_tag();
start = set_tag(start, tag);
/* Unpoison and initialize memory up to size. */
--- a/mm/kasan/shadow.c~mm-kasan-fix-incorrect-unpoisoning-in-vrealloc-for-kasan
+++ a/mm/kasan/shadow.c
@@ -625,7 +625,7 @@ void kasan_release_vmalloc(unsigned long
}
void *__kasan_unpoison_vmalloc(const void *start, unsigned long size,
- kasan_vmalloc_flags_t flags)
+ kasan_vmalloc_flags_t flags, bool reuse_tag)
{
/*
* Software KASAN modes unpoison both VM_ALLOC and non-VM_ALLOC
@@ -648,7 +648,9 @@ void *__kasan_unpoison_vmalloc(const voi
!(flags & KASAN_VMALLOC_PROT_NORMAL))
return (void *)start;
- start = set_tag(start, kasan_random_tag());
+ if (!reuse_tag)
+ start = set_tag(start, kasan_random_tag());
+
kasan_unpoison(start, size, false);
return (void *)start;
}
--- a/mm/vmalloc.c~mm-kasan-fix-incorrect-unpoisoning-in-vrealloc-for-kasan
+++ a/mm/vmalloc.c
@@ -4175,8 +4175,8 @@ void *vrealloc_node_align_noprof(const v
* We already have the bytes available in the allocation; use them.
*/
if (size <= alloced_size) {
- kasan_unpoison_vmalloc(p + old_size, size - old_size,
- KASAN_VMALLOC_PROT_NORMAL);
+ kasan_unpoison_vrealloc(p, size,
+ KASAN_VMALLOC_PROT_NORMAL | KASAN_VMALLOC_VM_ALLOC);
/*
* No need to zero memory here, as unused memory will have
* already been zeroed at initial allocation time or during
_
Patches currently in -mm which might be from jiayuan.chen(a)linux.dev are
mm-kasan-fix-incorrect-unpoisoning-in-vrealloc-for-kasan.patch
mm-vmscan-skip-increasing-kswapd_failures-when-reclaim-was-boosted.patch
In max16065_current_show, data->curr_sense is read twice: once for the
error check and again for the calculation. Since
i2c_smbus_read_byte_data returns negative error codes on failure, if the
data changes to an error code between the check and the use, ADC_TO_CURR
results in an incorrect calculation.
Read data->curr_sense into a local variable to ensure consistency. Note
that data->curr_gain is constant and safe to access directly.
This aligns max16065_current_show with max16065_input_show, which
already uses a local variable for the same reason.
Link: https://lore.kernel.org/all/CALbr=LYJ_ehtp53HXEVkSpYoub+XYSTU8Rg=o1xxMJ8=5z…
Fixes: f5bae2642e3d ("hwmon: Driver for MAX16065 System Manager and compatibles")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gui-Dong Han <hanguidong02(a)gmail.com>
---
Based on the discussion in the link, I will submit a series of patches to
address TOCTOU issues in the hwmon subsystem by converting macros to
functions or adjusting locking where appropriate.
---
drivers/hwmon/max16065.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/hwmon/max16065.c b/drivers/hwmon/max16065.c
index 0ccb5eb596fc..4c9e7892a73c 100644
--- a/drivers/hwmon/max16065.c
+++ b/drivers/hwmon/max16065.c
@@ -216,12 +216,13 @@ static ssize_t max16065_current_show(struct device *dev,
struct device_attribute *da, char *buf)
{
struct max16065_data *data = max16065_update_device(dev);
+ int curr_sense = data->curr_sense;
- if (unlikely(data->curr_sense < 0))
- return data->curr_sense;
+ if (unlikely(curr_sense < 0))
+ return curr_sense;
return sysfs_emit(buf, "%d\n",
- ADC_TO_CURR(data->curr_sense, data->curr_gain));
+ ADC_TO_CURR(curr_sense, data->curr_gain));
}
static ssize_t max16065_limit_store(struct device *dev,
--
2.43.0
The "timers: Provide timer_shutdown[_sync]()" patch series implemented a
useful feature that addresses various bugs caused by attempts to rearm
shutdown timers.
https://lore.kernel.org/all/20221123201306.823305113@linutronix.de/
However, this patch series was not fully backported to versions prior to
6.2, requiring separate patches for older kernels if these bugs were
encountered.
The biggest problem with this is that even if these bugs were discovered
and patched in the upstream kernel, if the maintainer or author didn't
create a separate backport patch for versions prior to 6.2, the bugs would
remain untouched in older kernels.
Therefore, to reduce the hassle of having to write a separate patch, we
should backport the remaining unbackported commits from the
"timers: Provide timer_shutdown[_sync]()" patch series to versions prior
to 6.2.
---
Documentation/RCU/Design/Requirements/Requirements.rst | 2 +-
Documentation/core-api/local_ops.rst | 2 +-
Documentation/kernel-hacking/locking.rst | 17 +++++++-----
Documentation/timers/hrtimers.rst | 2 +-
Documentation/translations/it_IT/kernel-hacking/locking.rst | 14 +++++-----
Documentation/translations/zh_CN/core-api/local_ops.rst | 2 +-
arch/arm/mach-spear/time.c | 8 +++---
drivers/bluetooth/hci_qca.c | 10 +++++--
drivers/char/tpm/tpm-dev-common.c | 4 +--
drivers/clocksource/arm_arch_timer.c | 12 ++++-----
drivers/clocksource/timer-sp804.c | 6 ++---
drivers/staging/wlan-ng/hfa384x_usb.c | 4 +--
drivers/staging/wlan-ng/prism2usb.c | 6 ++---
include/linux/timer.h | 17 ++++++++++--
kernel/time/timer.c | 315 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------
net/sunrpc/xprt.c | 2 +-
16 files changed, 322 insertions(+), 101 deletions(-)
From: Florian Westphal <fw(a)strlen.de>
[ Upstream commit 791a615b7ad2258c560f91852be54b0480837c93 ]
The initial buffer has to be inited to all-ones, but it must restrict
it to the size of the first field, not the total field size.
After each round in the map search step, the result and the fill map
are swapped, so if we have a set where f->bsize of the first element
is smaller than m->bsize_max, those one-bits are leaked into future
rounds result map.
This makes pipapo find an incorrect matching results for sets where
first field size is not the largest.
Followup patch adds a test case to nft_concat_range.sh selftest script.
Thanks to Stefano Brivio for pointing out that we need to zero out
the remainder explicitly, only correcting memset() argument isn't enough.
Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Reported-by: Yi Chen <yiche(a)redhat.com>
Cc: Stefano Brivio <sbrivio(a)redhat.com>
Signed-off-by: Florian Westphal <fw(a)strlen.de>
Reviewed-by: Stefano Brivio <sbrivio(a)redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Signed-off-by: Nazar Kalashnikov <sivartiwe(a)gmail.com>
---
Backport fix for CVE-2024-57947
net/netfilter/nft_set_pipapo.c | 4 ++--
net/netfilter/nft_set_pipapo.h | 21 +++++++++++++++++++++
net/netfilter/nft_set_pipapo_avx2.c | 10 ++++++----
3 files changed, 29 insertions(+), 6 deletions(-)
diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index ce617f6a215f..6813ff660b72 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -432,7 +432,7 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
res_map = scratch->map + (map_index ? m->bsize_max : 0);
fill_map = scratch->map + (map_index ? 0 : m->bsize_max);
- memset(res_map, 0xff, m->bsize_max * sizeof(*res_map));
+ pipapo_resmap_init(m, res_map);
nft_pipapo_for_each_field(f, i, m) {
bool last = i == m->field_count - 1;
@@ -536,7 +536,7 @@ static struct nft_pipapo_elem *pipapo_get(const struct net *net,
goto out;
}
- memset(res_map, 0xff, m->bsize_max * sizeof(*res_map));
+ pipapo_resmap_init(m, res_map);
nft_pipapo_for_each_field(f, i, m) {
bool last = i == m->field_count - 1;
diff --git a/net/netfilter/nft_set_pipapo.h b/net/netfilter/nft_set_pipapo.h
index 2e709ae01924..8f8f58af4e34 100644
--- a/net/netfilter/nft_set_pipapo.h
+++ b/net/netfilter/nft_set_pipapo.h
@@ -287,4 +287,25 @@ static u64 pipapo_estimate_size(const struct nft_set_desc *desc)
return size;
}
+/**
+ * pipapo_resmap_init() - Initialise result map before first use
+ * @m: Matching data, including mapping table
+ * @res_map: Result map
+ *
+ * Initialize all bits covered by the first field to one, so that after
+ * the first step, only the matching bits of the first bit group remain.
+ *
+ * If other fields have a large bitmap, set remainder of res_map to 0.
+ */
+static inline void pipapo_resmap_init(const struct nft_pipapo_match *m, unsigned long *res_map)
+{
+ const struct nft_pipapo_field *f = m->f;
+ int i;
+
+ for (i = 0; i < f->bsize; i++)
+ res_map[i] = ULONG_MAX;
+
+ for (i = f->bsize; i < m->bsize_max; i++)
+ res_map[i] = 0ul;
+}
#endif /* _NFT_SET_PIPAPO_H */
diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
index 0a23d297084d..81e6d12ab4cd 100644
--- a/net/netfilter/nft_set_pipapo_avx2.c
+++ b/net/netfilter/nft_set_pipapo_avx2.c
@@ -1028,6 +1028,7 @@ static int nft_pipapo_avx2_lookup_8b_16(unsigned long *map, unsigned long *fill,
/**
* nft_pipapo_avx2_lookup_slow() - Fallback function for uncommon field sizes
+ * @mdata: Matching data, including mapping table
* @map: Previous match result, used as initial bitmap
* @fill: Destination bitmap to be filled with current match result
* @f: Field, containing lookup and mapping tables
@@ -1043,7 +1044,8 @@ static int nft_pipapo_avx2_lookup_8b_16(unsigned long *map, unsigned long *fill,
* Return: -1 on no match, rule index of match if @last, otherwise first long
* word index to be checked next (i.e. first filled word).
*/
-static int nft_pipapo_avx2_lookup_slow(unsigned long *map, unsigned long *fill,
+static int nft_pipapo_avx2_lookup_slow(const struct nft_pipapo_match *mdata,
+ unsigned long *map, unsigned long *fill,
struct nft_pipapo_field *f, int offset,
const u8 *pkt, bool first, bool last)
{
@@ -1053,7 +1055,7 @@ static int nft_pipapo_avx2_lookup_slow(unsigned long *map, unsigned long *fill,
lt += offset * NFT_PIPAPO_LONGS_PER_M256;
if (first)
- memset(map, 0xff, bsize * sizeof(*map));
+ pipapo_resmap_init(mdata, map);
for (i = offset; i < bsize; i++) {
if (f->bb == 8)
@@ -1181,7 +1183,7 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
} else if (f->groups == 16) {
NFT_SET_PIPAPO_AVX2_LOOKUP(8, 16);
} else {
- ret = nft_pipapo_avx2_lookup_slow(res, fill, f,
+ ret = nft_pipapo_avx2_lookup_slow(m, res, fill, f,
ret, rp,
first, last);
}
@@ -1197,7 +1199,7 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
} else if (f->groups == 32) {
NFT_SET_PIPAPO_AVX2_LOOKUP(4, 32);
} else {
- ret = nft_pipapo_avx2_lookup_slow(res, fill, f,
+ ret = nft_pipapo_avx2_lookup_slow(m, res, fill, f,
ret, rp,
first, last);
}
--
2.43.0
From: Nikolay Aleksandrov <razor(a)blackwall.org>
[ Upstream commit 9ec7eb60dcbcb6c41076defbc5df7bbd95ceaba5 ]
Add bond_ether_setup helper which is used to fix ether_setup() calls in the
bonding driver. It takes care of both IFF_MASTER and IFF_SLAVE flags, the
former is always restored and the latter only if it was set.
If the bond enslaves non-ARPHRD_ETHER device (changes its type), then
releases it and enslaves ARPHRD_ETHER device (changes back) then we
use ether_setup() to restore the bond device type but it also resets its
flags and removes IFF_MASTER and IFF_SLAVE[1]. Use the bond_ether_setup
helper to restore both after such transition.
[1] reproduce (nlmon is non-ARPHRD_ETHER):
$ ip l add nlmon0 type nlmon
$ ip l add bond2 type bond mode active-backup
$ ip l set nlmon0 master bond2
$ ip l set nlmon0 nomaster
$ ip l add bond1 type bond
(we use bond1 as ARPHRD_ETHER device to restore bond2's mode)
$ ip l set bond1 master bond2
$ ip l sh dev bond2
37: bond2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether be:d7:c5:40:5b:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 1500
(notice bond2's IFF_MASTER is missing)
Fixes: e36b9d16c6a6 ("bonding: clean muticast addresses when device changes type")
Signed-off-by: Nikolay Aleksandrov <razor(a)blackwall.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Alexey Panov <apanov(a)astralinux.ru>
---
Backport fix for CVE-2023-53103
Tested with the syzkaller reproducer for
https://syzkaller.appspot.com/bug?extid=9dfc3f3348729cc82277:
the issue triggers on vanilla v5.10.y and no longer reproduces with these
patches applied.
drivers/net/bonding/bond_main.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 08bc930afc4c..127242101c8e 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1691,6 +1691,19 @@ void bond_lower_state_changed(struct slave *slave)
netdev_lower_state_changed(slave->dev, &info);
}
+/* The bonding driver uses ether_setup() to convert a master bond device
+ * to ARPHRD_ETHER, that resets the target netdevice's flags so we always
+ * have to restore the IFF_MASTER flag, and only restore IFF_SLAVE if it was set
+ */
+static void bond_ether_setup(struct net_device *bond_dev)
+{
+ unsigned int slave_flag = bond_dev->flags & IFF_SLAVE;
+
+ ether_setup(bond_dev);
+ bond_dev->flags |= IFF_MASTER | slave_flag;
+ bond_dev->priv_flags &= ~IFF_TX_SKB_SHARING;
+}
+
/* enslave device <slave> to bond device <master> */
int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
struct netlink_ext_ack *extack)
@@ -1777,10 +1790,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
if (slave_dev->type != ARPHRD_ETHER)
bond_setup_by_slave(bond_dev, slave_dev);
- else {
- ether_setup(bond_dev);
- bond_dev->priv_flags &= ~IFF_TX_SKB_SHARING;
- }
+ else
+ bond_ether_setup(bond_dev);
call_netdevice_notifiers(NETDEV_POST_TYPE_CHANGE,
bond_dev);
--
2.30.2
Changes in v3:
- Fixed patch numbering (previously sent as 2/3 instead of 3/3)
Changes in v2:
- Added a new patch fixing bonding regression, based on the Fixes tag in
c484fcc058ba ("bonding: Fix memory leak when changing bond type to Ethernet")
- Added a cover letter
- No changes in patches 1 and 3
- Retested the reproducer [1]
Tested with the syzkaller reproducer [1].
The issue triggers on vanilla v5.10.y and no longer reproduces with these
patches applied.
Additionally, c484fcc058ba ("bonding: Fix memory leak when changing bond type
to Ethernet") has a Fixes tag pointing to
9ec7eb60dcbc ("bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether
type change"), so it should be ported as well.
[1]: https://syzkaller.appspot.com/bug?extid=9dfc3f3348729cc82277
Ido Schimmel (1):
bonding: Fix memory leak when changing bond type to Ethernet
Nikolay Aleksandrov (2):
bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type
change
bonding: restore bond's IFF_SLAVE flag if a non-eth dev enslave fails
drivers/net/bonding/bond_main.c | 24 +++++++++++++++++-------
1 file changed, 17 insertions(+), 7 deletions(-)
--
2.39.5
Changes in v2:
- Added a new patch fixing bonding regression, based on the Fixes tag in
c484fcc058ba ("bonding: Fix memory leak when changing bond type to Ethernet")
- Added a cover letter
- No changes in patches 1 and 3
- Retested the reproducer [1]
Tested with the syzkaller reproducer [1].
The issue triggers on vanilla v5.10.y and no longer reproduces with these
patches applied.
Additionally, c484fcc058ba ("bonding: Fix memory leak when changing bond type
to Ethernet") has a Fixes tag pointing to
9ec7eb60dcbc ("bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether
type change"), so it should be ported as well.
[1]: https://syzkaller.appspot.com/bug?extid=9dfc3f3348729cc82277
Ido Schimmel (1):
bonding: Fix memory leak when changing bond type to Ethernet
Nikolay Aleksandrov (2):
bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type
change
bonding: restore bond's IFF_SLAVE flag if a non-eth dev enslave fails
drivers/net/bonding/bond_main.c | 24 +++++++++++++++++-------
1 file changed, 17 insertions(+), 7 deletions(-)
--
2.39.5
The calculation of bridge window head alignment is done by
calculate_mem_align() [*]. With the default bridge window alignment, it
is used for both head and tail alignment.
The selected head alignment does not always result in tight-fitting
resources (gap at d4f00000-d4ffffff):
d4800000-dbffffff : PCI Bus 0000:06
d4800000-d48fffff : PCI Bus 0000:07
d4800000-d4803fff : 0000:07:00.0
d4800000-d4803fff : nvme
d4900000-d49fffff : PCI Bus 0000:0a
d4900000-d490ffff : 0000:0a:00.0
d4900000-d490ffff : r8169
d4910000-d4913fff : 0000:0a:00.0
d4a00000-d4cfffff : PCI Bus 0000:0b
d4a00000-d4bfffff : 0000:0b:00.0
d4a00000-d4bfffff : 0000:0b:00.0
d4c00000-d4c07fff : 0000:0b:00.0
d4d00000-d4dfffff : PCI Bus 0000:15
d4d00000-d4d07fff : 0000:15:00.0
d4d00000-d4d07fff : xhci-hcd
d4e00000-d4efffff : PCI Bus 0000:16
d4e00000-d4e7ffff : 0000:16:00.0
d4e80000-d4e803ff : 0000:16:00.0
d4e80000-d4e803ff : ahci
d5000000-dbffffff : PCI Bus 0000:0c
This has not been caused problems (for years) with the default bridge
window tail alignment that grossly over-estimates the required tail
alignment leaving more tail room than necessary. With the introduction
of relaxed tail alignment that leaves no extra tail room whatsoever,
any gaps will immediately turn into assignment failures.
Introduce head alignment calculation that ensures no gaps are left and
apply the new approach when using relaxed alignment.
([*] I don't understand the algorithm in calculate_mem_align().)
Fixes: 5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220775
Reported-by: Malte Schröder <malte+lkml(a)tnxip.de>
Tested-by: Malte Schröder <malte+lkml(a)tnxip.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
---
Little annoyingly, there's difference in what aligns array contains
between the legacy alignment approach (which I dare not to touch as I
really don't understand what the algorithm tries to do) and this new
head aligment algorithm, both consuming stack space. After making the
new approach the only available approach in the follow-up patch, only
one array remains (however, that follow-up change is also somewhat
riskier when it comes to regressions).
That being said, the new head alignment could work with the same aligns
array as the legacy approach, it just won't necessarily produce an
optimal (the smallest possible) head alignment when if (r_size <=
align) condition is used. Just let me know if that approach is
preferred (to save some stack space).
---
drivers/pci/setup-bus.c | 53 ++++++++++++++++++++++++++++++++++-------
1 file changed, 44 insertions(+), 9 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 70d021ffb486..93f6b0750174 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1224,6 +1224,45 @@ static inline resource_size_t calculate_mem_align(resource_size_t *aligns,
return min_align;
}
+/*
+ * Calculate bridge window head alignment that leaves no gaps in between
+ * resources.
+ */
+static resource_size_t calculate_head_align(resource_size_t *aligns,
+ int max_order)
+{
+ resource_size_t head_align = 1;
+ resource_size_t remainder = 0;
+ int order;
+
+ /* Take the largest alignment as the starting point. */
+ head_align <<= max_order + __ffs(SZ_1M);
+
+ for (order = max_order - 1; order >= 0; order--) {
+ resource_size_t align1 = 1;
+
+ align1 <<= order + __ffs(SZ_1M);
+
+ /*
+ * Account smaller resources with alignment < max_order that
+ * could be used to fill head room if alignment less than
+ * max_order is used.
+ */
+ remainder += aligns[order];
+
+ /*
+ * Test if head fill is enough to satisfy the alignment of
+ * the larger resources after reducing the alignment.
+ */
+ while ((head_align > align1) && (remainder >= head_align / 2)) {
+ head_align /= 2;
+ remainder -= head_align;
+ }
+ }
+
+ return head_align;
+}
+
/**
* pbus_upstream_space_available - Check no upstream resource limits allocation
* @bus: The bus
@@ -1311,13 +1350,13 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
{
struct pci_dev *dev;
resource_size_t min_align, win_align, align, size, size0, size1 = 0;
- resource_size_t aligns[28]; /* Alignments from 1MB to 128TB */
+ resource_size_t aligns[28] = {}; /* Alignments from 1MB to 128TB */
+ resource_size_t aligns2[28] = {};/* Alignments from 1MB to 128TB */
int order, max_order;
struct resource *b_res = pbus_select_window_for_type(bus, type);
resource_size_t children_add_size = 0;
resource_size_t children_add_align = 0;
resource_size_t add_align = 0;
- resource_size_t relaxed_align;
resource_size_t old_size;
if (!b_res)
@@ -1327,7 +1366,6 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
if (b_res->parent)
return;
- memset(aligns, 0, sizeof(aligns));
max_order = 0;
size = 0;
@@ -1378,6 +1416,7 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
*/
if (r_size <= align)
aligns[order] += align;
+ aligns2[order] += align;
if (order > max_order)
max_order = order;
@@ -1402,9 +1441,7 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
if (bus->self && size0 &&
!pbus_upstream_space_available(bus, b_res, size0, min_align)) {
- relaxed_align = 1ULL << (max_order + __ffs(SZ_1M));
- relaxed_align = max(relaxed_align, win_align);
- min_align = min(min_align, relaxed_align);
+ min_align = calculate_head_align(aligns2, max_order);
size0 = calculate_memsize(size, min_size, 0, 0, old_size, win_align);
resource_set_range(b_res, min_align, size0);
pci_info(bus->self, "bridge window %pR to %pR requires relaxed alignment rules\n",
@@ -1418,9 +1455,7 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
if (bus->self && size1 &&
!pbus_upstream_space_available(bus, b_res, size1, add_align)) {
- relaxed_align = 1ULL << (max_order + __ffs(SZ_1M));
- relaxed_align = max(relaxed_align, win_align);
- min_align = min(min_align, relaxed_align);
+ min_align = calculate_head_align(aligns2, max_order);
size1 = calculate_memsize(size, min_size, add_size, children_add_size,
old_size, win_align);
pci_info(bus->self,
--
2.39.5
pbus_size_mem() has two alignments, one for required resources in
min_align and another in add_align that takes account optional
resources.
The add_align is applied to the bridge window through the realloc_head
list. It can happen, however, that add_align is larger than min_align
but calculated size1 and size0 are equal due to extra tailroom (e.g.,
hotplug reservation, tail alignment), and therefore no entry is created
to the realloc_head list. Without the bridge appearing in the realloc
head, add_align is lost when pbus_size_mem() returns.
The problem is visible in this log for 0000:05:00.0 which lacks
add_size ... add_align ... line that would indicate it was added into
the realloc_head list:
pci 0000:05:00.0: PCI bridge to [bus 06-16]
...
pci 0000:06:00.0: bridge window [mem 0x00100000-0x001fffff] to [bus 07] requires relaxed alignment rules
pci 0000:06:06.0: bridge window [mem 0x00100000-0x001fffff] to [bus 0a] requires relaxed alignment rules
pci 0000:06:07.0: bridge window [mem 0x00100000-0x003fffff] to [bus 0b] requires relaxed alignment rules
pci 0000:06:08.0: bridge window [mem 0x00800000-0x00ffffff 64bit pref] to [bus 0c-14] requires relaxed alignment rules
pci 0000:06:08.0: bridge window [mem 0x01000000-0x057fffff] to [bus 0c-14] requires relaxed alignment rules
pci 0000:06:08.0: bridge window [mem 0x01000000-0x057fffff] to [bus 0c-14] requires relaxed alignment rules
pci 0000:06:08.0: bridge window [mem 0x01000000-0x057fffff] to [bus 0c-14] add_size 100000 add_align 1000000
pci 0000:06:0c.0: bridge window [mem 0x00100000-0x001fffff] to [bus 15] requires relaxed alignment rules
pci 0000:06:0d.0: bridge window [mem 0x00100000-0x001fffff] to [bus 16] requires relaxed alignment rules
pci 0000:06:0d.0: bridge window [mem 0x00100000-0x001fffff] to [bus 16] requires relaxed alignment rules
pci 0000:05:00.0: bridge window [mem 0xd4800000-0xd97fffff]: assigned
pci 0000:05:00.0: bridge window [mem 0x1060000000-0x10607fffff 64bit pref]: assigned
pci 0000:06:08.0: bridge window [mem size 0x04900000]: can't assign; no space
pci 0000:06:08.0: bridge window [mem size 0x04900000]: failed to assign
While this bug itself seems old, it has likely become more visible
after the relaxed tail alignment that does not grossly overestimate the
size needed for the bridge window.
Make sure add_align > min_align too results in adding an entry into the
realloc head list. In addition, add handling to the cases where
add_size is zero while only alignment differs.
Fixes: d74b9027a4da ("PCI: Consider additional PF's IOV BAR alignment in sizing and assigning")
Reported-by: Malte Schröder <malte+lkml(a)tnxip.de>
Tested-by: Malte Schröder <malte+lkml(a)tnxip.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
---
drivers/pci/setup-bus.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 4a8735b275e4..70d021ffb486 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -14,6 +14,7 @@
* tighter packing. Prefetchable range support.
*/
+#include <linux/align.h>
#include <linux/bitops.h>
#include <linux/init.h>
#include <linux/kernel.h>
@@ -452,7 +453,7 @@ static void reassign_resources_sorted(struct list_head *realloc_head,
"%s %pR: ignoring failure in optional allocation\n",
res_name, res);
}
- } else if (add_size > 0) {
+ } else if (add_size > 0 || !IS_ALIGNED(res->start, align)) {
res->flags |= add_res->flags &
(IORESOURCE_STARTALIGN|IORESOURCE_SIZEALIGN);
if (pci_reassign_resource(dev, idx, add_size, align))
@@ -1438,12 +1439,13 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
resource_set_range(b_res, min_align, size0);
b_res->flags |= IORESOURCE_STARTALIGN;
- if (bus->self && size1 > size0 && realloc_head) {
+ if (bus->self && realloc_head && (size1 > size0 || add_align > min_align)) {
b_res->flags &= ~IORESOURCE_DISABLED;
- add_to_list(realloc_head, bus->self, b_res, size1-size0, add_align);
+ add_size = size1 > size0 ? size1 - size0 : 0;
+ add_to_list(realloc_head, bus->self, b_res, add_size, add_align);
pci_info(bus->self, "bridge window %pR to %pR add_size %llx add_align %llx\n",
b_res, &bus->busn_res,
- (unsigned long long) (size1 - size0),
+ (unsigned long long) add_size,
(unsigned long long) add_align);
}
}
--
2.39.5
Let's properly adjust BALLOON_MIGRATE like the other drivers.
Note that the INFLATE/DEFLATE events are triggered from the core when
enqueueing/dequeueing pages.
Not completely sure whether really is stable material, but the fix is
trivial so let's just CC stable.
This was found by code inspection.
Fixes: fe030c9b85e6 ("powerpc/pseries/cmm: Implement balloon compaction")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
---
arch/powerpc/platforms/pseries/cmm.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index 688f5fa1c7245..310dab4bc8679 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -532,6 +532,7 @@ static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
spin_lock_irqsave(&b_dev_info->pages_lock, flags);
balloon_page_insert(b_dev_info, newpage);
+ __count_vm_event(BALLOON_MIGRATE);
b_dev_info->isolated_pages--;
spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
--
2.51.0
Backport commit: 094ee6017ea0 ("bonding: check xdp prog when set bond
mode") to 6.6.y to fix a bond issue.
It depends on commit: 22ccb684c1ca ("bonding: return detailed
error when loading native XDP fails)
In order to make a clean backport on stable kernel, backport 2 commits.
Hangbin Liu (1):
bonding: return detailed error when loading native XDP fails
Wang Liang (1):
bonding: check xdp prog when set bond mode
drivers/net/bonding/bond_main.c | 11 +++++++----
drivers/net/bonding/bond_options.c | 3 +++
include/net/bonding.h | 1 +
3 files changed, 11 insertions(+), 4 deletions(-)
--
2.17.1
We always have to initialize the balloon_dev_info, even when compaction
is not configured in: otherwise the containing list and the lock are
left uninitialized.
Likely not many such configs exist in practice, but let's CC stable to
be sure.
This was found by code inspection.
Fixes: fe030c9b85e6 ("powerpc/pseries/cmm: Implement balloon compaction")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
---
arch/powerpc/platforms/pseries/cmm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index 0823fa2da1516..688f5fa1c7245 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -550,7 +550,6 @@ static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
static void cmm_balloon_compaction_init(void)
{
- balloon_devinfo_init(&b_dev_info);
b_dev_info.migratepage = cmm_migratepage;
}
#else /* CONFIG_BALLOON_COMPACTION */
@@ -572,6 +571,7 @@ static int cmm_init(void)
if (!firmware_has_feature(FW_FEATURE_CMO) && !simulate)
return -EOPNOTSUPP;
+ balloon_devinfo_init(&b_dev_info);
cmm_balloon_compaction_init();
rc = register_oom_notifier(&cmm_oom_nb);
--
2.51.0
From: luoguangfei <15388634752(a)163.com>
[ Upstream commit 01b9128c5db1b470575d07b05b67ffa3cb02ebf1 ]
When removing a macb device, the driver calls phy_exit() before
unregister_netdev(). This leads to a WARN from kernfs:
------------[ cut here ]------------
kernfs: can not remove 'attached_dev', no directory
WARNING: CPU: 1 PID: 27146 at fs/kernfs/dir.c:1683
Call trace:
kernfs_remove_by_name_ns+0xd8/0xf0
sysfs_remove_link+0x24/0x58
phy_detach+0x5c/0x168
phy_disconnect+0x4c/0x70
phylink_disconnect_phy+0x6c/0xc0 [phylink]
macb_close+0x6c/0x170 [macb]
...
macb_remove+0x60/0x168 [macb]
platform_remove+0x5c/0x80
...
The warning happens because the PHY is being exited while the netdev
is still registered. The correct order is to unregister the netdev
before shutting down the PHY and cleaning up the MDIO bus.
Fix this by moving unregister_netdev() ahead of phy_exit() in
macb_remove().
Fixes: 8b73fa3ae02b ("net: macb: Added ZynqMP-specific initialization")
Signed-off-by: luoguangfei <15388634752(a)163.com>
Link: https://patch.msgid.link/20250818232527.1316-1-15388634752@163.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
[ Minor context change fixed. ]
Signed-off-by: Alva Lan <alvalan9(a)foxmail.com>
---
drivers/net/ethernet/cadence/macb_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 7593255e6e53..bf7b5d1d5616 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -5182,11 +5182,11 @@ static int macb_remove(struct platform_device *pdev)
if (dev) {
bp = netdev_priv(dev);
+ unregister_netdev(dev);
phy_exit(bp->sgmii_phy);
mdiobus_unregister(bp->mii_bus);
mdiobus_free(bp->mii_bus);
- unregister_netdev(dev);
tasklet_kill(&bp->hresp_err_tasklet);
pm_runtime_disable(&pdev->dev);
pm_runtime_dont_use_autosuspend(&pdev->dev);
--
2.43.0
Good day,
I am reaching out to invite your company to provide a quotation for the products detailed in the attached request. We recognise that some of these items may not align with your usual supplies, but we expect your expertise in sourcing and supplying these products.
Please note that this is a one-time tender, and we require the product and its components delivered on or before the date specified in the attached document. We anticipate your prompt response to enable us to proceed to the next step.
Thank you and looking forward to reviewing your proposal soonest.
Thomas Pierre
Procurement Manager
Phone: +1-713-564-2377
fax: +1-713-969-7350
Email: totalenergiesbids(a)contractor.net
Note: This message, including any attachments, is intended solely for the use of the individual or entity to whom it is addressed and may contain confidential, proprietary, or legally privileged information. Any unauthorized review, use, disclosure, distribution, reproduction, or any form of dissemination of this communication is strictly prohibited.If you are not the intended recipient, please notify the sender immediately, delete this message from your system, and do not retain, copy, or distribute it.Please note that any views or opinions expressed in this communication are those of the sender and do not necessarily reflect the official views or policies of the company.
"Please consider the environment before printing this email."