The patch titled
Subject: mm/hugetlb: fix hugetlb_pmd_shared()
has been added to the -mm mm-unstable branch. Its filename is
mm-hugetlb-fix-hugetlb_pmd_shared.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "David Hildenbrand (Red Hat)" <david(a)kernel.org>
Subject: mm/hugetlb: fix hugetlb_pmd_shared()
Date: Fri, 5 Dec 2025 22:35:55 +0100
Patch series "mm/hugetlb: fixes for PMD table sharing (incl. using
mmu_gather)".
One functional fix, one performance regression fix, and two related
comment fixes.
I cleaned up my prototype I recently shared [1] for the performance fix,
deferring most of the cleanups I had in the prototype to a later point.
While doing that I identified the other things.
The goal of this patch set is to be backported to stable trees "fairly"
easily. At least patch #1 and #4.
Patch #1 fixes hugetlb_pmd_shared() not detecting any sharing
Patch #2 + #3 are simple comment fixes that patch #4 interacts with.
Patch #4 is a fix for the reported performance regression due to excessive
IPI broadcasts during fork()+exit().
The last patch is all about TLB flushes, IPIs and mmu_gather. Read:
complicated
I added as much comments + description that I possibly could, and I am
hoping for review from Jann.
There are plenty of cleanups in the future to be had + one reasonable
optimization on x86. But that's all out of scope for this series.
This patch (of 4):
We switched from (wrongly) using the page count to an independent shared
count. Now, shared page tables have a refcount of 1 (excluding
speculative references) and instead use ptdesc->pt_share_count to identify
sharing.
We didn't convert hugetlb_pmd_shared(), so right now, we would never
detect a shared PMD table as such, because sharing/unsharing no longer
touches the refcount of a PMD table.
Page migration, like mbind() or migrate_pages() would allow for migrating
folios mapped into such shared PMD tables, even though the folios are not
exclusive. In smaps we would account them as "private" although they are
"shared", and we would be wrongly setting the PM_MMAP_EXCLUSIVE in the
pagemap interface.
Fix it by properly using ptdesc_pmd_is_shared() in hugetlb_pmd_shared().
Link: https://lkml.kernel.org/r/20251205213558.2980480-1-david@kernel.org
Link: https://lkml.kernel.org/r/20251205213558.2980480-2-david@kernel.org
Link: https://lore.kernel.org/all/8cab934d-4a56-44aa-b641-bfd7e23bd673@kernel.org/ [1]
Fixes: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count")
Signed-off-by: David Hildenbrand (Red Hat) <david(a)kernel.org>
Tested-by: Laurence Oberman <loberman(a)redhat.com>
Reviewed-by: Rik van Riel <riel(a)surriel.com>
Reviewed-by: Lance Yang <lance.yang(a)linux.dev>
Cc: Liu Shixin <liushixin2(a)huawei.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)kernel.org>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Jann Horn <jannh(a)google.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Nicholas Piggin <npiggin(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Prakash Sangappa <prakash.sangappa(a)oracle.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Will Deacon <will(a)kernel.org>
Cc: Uschakow, Stanislav" <suschako(a)amazon.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/hugetlb.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/hugetlb.h~mm-hugetlb-fix-hugetlb_pmd_shared
+++ a/include/linux/hugetlb.h
@@ -1326,7 +1326,7 @@ static inline __init void hugetlb_cma_re
#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING
static inline bool hugetlb_pmd_shared(pte_t *pte)
{
- return page_count(virt_to_page(pte)) > 1;
+ return ptdesc_pmd_is_shared(virt_to_ptdesc(pte));
}
#else
static inline bool hugetlb_pmd_shared(pte_t *pte)
_
Patches currently in -mm which might be from david(a)kernel.org are
mm-hugetlb-fix-hugetlb_pmd_shared.patch
mm-hugetlb-fix-two-comments-related-to-huge_pmd_unshare.patch
mm-rmap-fix-two-comments-related-to-huge_pmd_unshare.patch
mm-hugetlb-fix-excessive-ipi-broadcasts-when-unsharing-pmd-tables-using-mmu_gather.patch
Hi Ilpo,
I managed to get my hands on acpidumps for these models so this is
verified against those.
Thanks for all your latest reviews!
Signed-off-by: Kurt Borja <kuurtb(a)gmail.com>
---
Kurt Borja (3):
platform/x86: alienware-wmi-wmax: Add support for new Area-51 laptops
platform/x86: alienware-wmi-wmax: Add AWCC support for Alienware x16
platform/x86: alienware-wmi-wmax: Add support for Alienware 16X Aurora
drivers/platform/x86/dell/alienware-wmi-wmax.c | 32 ++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
---
base-commit: 9b9c0adbc3f8a524d291baccc9d0c04097fb4869
change-id: 20251111-area-51-7e6c2501e4eb
--
~ Kurt
When the filesystem is being mounted, the kernel panics while the data
regarding slot map allocation to the local node, is being written to the
disk. This occurs because the value of slot map buffer head block
number, which should have been greater than or equal to
`OCFS2_SUPER_BLOCK_BLKNO` (evaluating to 2) is less than it, indicative
of disk metadata corruption. This triggers
BUG_ON(bh->b_blocknr < OCFS2_SUPER_BLOCK_BLKNO) in ocfs2_write_block(),
causing the kernel to panic.
This is fixed by introducing an if condition block in
ocfs2_update_disk_slot(), right before calling ocfs2_write_block(), which
checks if `bh->b_blocknr` is lesser than `OCFS2_SUPER_BLOCK_BLKNO`; if
yes, then ocfs2_error is called, which prints the error log, for
debugging purposes, and the return value of ocfs2_error() is returned
back to caller of ocfs2_update_disk_slot() i.e. ocfs2_find_slot(). If
the return value is zero. then error code EIO is returned.
Reported-by: syzbot+c818e5c4559444f88aa0(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=c818e5c4559444f88aa0
Tested-by: syzbot+c818e5c4559444f88aa0(a)syzkaller.appspotmail.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Prithvi Tambewagh <activprithvi(a)gmail.com>
---
fs/ocfs2/slot_map.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/fs/ocfs2/slot_map.c b/fs/ocfs2/slot_map.c
index e544c704b583..788924fc3663 100644
--- a/fs/ocfs2/slot_map.c
+++ b/fs/ocfs2/slot_map.c
@@ -193,6 +193,16 @@ static int ocfs2_update_disk_slot(struct ocfs2_super *osb,
else
ocfs2_update_disk_slot_old(si, slot_num, &bh);
spin_unlock(&osb->osb_lock);
+ if (bh->b_blocknr < OCFS2_SUPER_BLOCK_BLKNO) {
+ status = ocfs2_error(osb->sb,
+ "Invalid Slot Map Buffer Head "
+ "Block Number : %llu, Should be >= %d",
+ le16_to_cpu(bh->b_blocknr),
+ le16_to_cpu((int)OCFS2_SUPER_BLOCK_BLKNO));
+ if (!status)
+ return -EIO;
+ return status;
+ }
status = ocfs2_write_block(osb, bh, INODE_CACHE(si->si_inode));
if (status < 0)
base-commit: 24172e0d79900908cf5ebf366600616d29c9b417
--
2.43.0
Hi,
We now have the verified International IAA Mobility 2025 Database with 501,479 attendees and 750 exhibitors.
Each contact includes: names, job titles, company details, locations, phone numbers, and verified email addresses — all carefully sourced and checked for accuracy.
delivered within 48 hours. Season-End Offer: 30% Discount on All Databases.
Reply “Send me the cost” to get pricing and more details.
Best regards,
Grace Taylor
Sr. Marketing Manager
P.S. Reply “Unfollow” to opt out.
From: Steven Rostedt <rostedt(a)goodmis.org>
The commit 4d38328eb442d ("tracing: Fix synth event printk format for str
fields") replaced "%.*s" with "%s" but missed removing the number size of
the dynamic and static strings. The commit e1a453a57bc7 ("tracing: Do not
add length to print format in synthetic events") fixed the dynamic part
but did not fix the static part. That is, with the commands:
# echo 's:wake_lat char[] wakee; u64 delta;' >> /sys/kernel/tracing/dynamic_events
# echo 'hist:keys=pid:ts=common_timestamp.usecs if !(common_flags & 0x18)' > /sys/kernel/tracing/events/sched/sched_waking/trigger
# echo 'hist:keys=next_pid:delta=common_timestamp.usecs-$ts:onmatch(sched.sched_waking).trace(wake_lat,next_comm,$delta)' > /sys/kernel/tracing/events/sched/sched_switch/trigger
That caused the output of:
<idle>-0 [001] d..5. 193.428167: wake_lat: wakee=(efault)sshd-sessiondelta=155
sshd-session-879 [001] d..5. 193.811080: wake_lat: wakee=(efault)kworker/u34:5delta=58
<idle>-0 [002] d..5. 193.811198: wake_lat: wakee=(efault)bashdelta=91
The commit e1a453a57bc7 fixed the part where the synthetic event had
"char[] wakee". But if one were to replace that with a static size string:
# echo 's:wake_lat char[16] wakee; u64 delta;' >> /sys/kernel/tracing/dynamic_events
Where "wakee" is defined as "char[16]" and not "char[]" making it a static
size, the code triggered the "(efaul)" again.
Remove the added STR_VAR_LEN_MAX size as the string is still going to be
nul terminated.
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Douglas Raillard <douglas.raillard(a)arm.com>
Link: https://patch.msgid.link/20251204151935.5fa30355@gandalf.local.home
Fixes: e1a453a57bc7 ("tracing: Do not add length to print format in synthetic events")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/trace_events_synth.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
index 2f19bbe73d27..4554c458b78c 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -375,7 +375,6 @@ static enum print_line_t print_synth_event(struct trace_iterator *iter,
n_u64++;
} else {
trace_seq_printf(s, print_fmt, se->fields[i]->name,
- STR_VAR_LEN_MAX,
(char *)&entry->fields[n_u64].as_u64,
i == se->n_fields - 1 ? "" : " ");
n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
--
2.51.0
A new warning in Clang 22 [1] complains that @clidr passed to
get_clidr_el1() is an uninitialized const pointer. get_clidr_el1()
doesn't really care since it casts away the const-ness anyways.
Silence the warning by initializing the struct.
This patch won't apply to anything past v6.1 as this code section was
reworked in Commit 7af0c2534f4c ("KVM: arm64: Normalize cache
configuration"). There is no upstream equivalent so this patch only
needs to be applied (stable only) to 6.1.
Cc: stable(a)vger.kernel.org
Fixes: 7c8c5e6a9101e ("arm64: KVM: system register handling")
Link: https://github.com/llvm/llvm-project/commit/00dacf8c22f065cb52efb14cd091d44… [1]
Signed-off-by: Justin Stitt <justinstitt(a)google.com>
---
Resending this with Nathan's RB tag, an updated commit log and better
recipients from checkpatch.pl.
I've also sent a similar patch resend for 5.15
---
arch/arm64/kvm/sys_regs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index f4a7c5abcbca..d7ebd7387221 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2948,7 +2948,7 @@ int kvm_sys_reg_table_init(void)
{
bool valid = true;
unsigned int i;
- struct sys_reg_desc clidr;
+ struct sys_reg_desc clidr = {0};
/* Make sure tables are unique and in order. */
valid &= check_sysreg_table(sys_reg_descs, ARRAY_SIZE(sys_reg_descs), false);
---
base-commit: 830b3c68c1fb1e9176028d02ef86f3cf76aa2476
change-id: 20250724-b4-clidr-unint-const-ptr-7edb960bc3bd
Best regards,
--
Justin Stitt <justinstitt(a)google.com>
A new warning in Clang 22 [1] complains that @clidr passed to
get_clidr_el1() is an uninitialized const pointer. get_clidr_el1()
doesn't really care since it casts away the const-ness anyways -- it is
a false positive.
This patch isn't needed for anything past 6.1 as this code section was
reworked in Commit 7af0c2534f4c ("KVM: arm64: Normalize cache
configuration") which incidentally removed the aforementioned warning.
Since there is no upstream equivalent, this patch just needs to be
applied to 6.1.
Disable this warning for sys_regs.o instead of backporting the patches
from 6.2+ that modified this code area.
Cc: stable(a)vger.kernel.org
Fixes: 7c8c5e6a9101e ("arm64: KVM: system register handling")
Link: https://github.com/llvm/llvm-project/commit/00dacf8c22f065cb52efb14cd091d44… [1]
Reviewed-by: Nathan Chancellor <nathan(a)kernel.org>
Signed-off-by: Justin Stitt <justinstitt(a)google.com>
---
Changes in v2:
- disable warning for TU instead of initialising the struct
- update commit message
- Link to v1: https://lore.kernel.org/all/20250724-b4-clidr-unint-const-ptr-v1-1-67c4d620…
- Link to v1 resend (sent wrong diff, thanks Nathan): https://lore.kernel.org/all/20251204-b4-clidr-unint-const-ptr-v1-1-95161315…
---
arch/arm64/kvm/Makefile | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 5e33c2d4645a..5fdb5331bfad 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -24,6 +24,9 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
kvm-$(CONFIG_HW_PERF_EVENTS) += pmu-emul.o pmu.o
+# Work around a false positive Clang 22 -Wuninitialized-const-pointer warning
+CFLAGS_sys_regs.o := $(call cc-disable-warning, uninitialized-const-pointer)
+
always-y := hyp_constants.h hyp-constants.s
define rule_gen_hyp_constants
---
base-commit: 830b3c68c1fb1e9176028d02ef86f3cf76aa2476
change-id: 20250728-stable-disable-unit-ptr-warn-281fee82539c
Best regards,
--
Justin Stitt <justinstitt(a)google.com>
When compiling the device tree for the Integrator/AP with IM-PD1, the
following warning is observed regarding the display controller node:
arch/arm/boot/dts/arm/integratorap-im-pd1.dts:251.3-14: Warning
(dma_ranges_format):
/bus@c0000000/bus@c0000000/display@1000000:dma-ranges: empty
"dma-ranges" property but its #address-cells (2) differs from
/bus@c0000000/bus@c0000000 (1)
The display node specifies an empty "dma-ranges" property, intended to
describe a 1:1 identity mapping. However, the node lacks explicit
"#address-cells" and "#size-cells" properties. In this case, the device
tree compiler defaults the address cells to 2 (64-bit), which conflicts
with the parent bus configuration (32-bit, 1 cell).
Fix this by explicitly defining "#address-cells" and "#size-cells" as
1. This matches the 32-bit architecture of the Integrator platform and
ensures the address translation range is correctly parsed by the
compiler.
Fixes: 7bea67a99430 ("ARM: dts: integrator: Fix DMA ranges")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kuan-Wei Chiu <visitorckw(a)gmail.com>
---
arch/arm/boot/dts/arm/integratorap-im-pd1.dts | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm/boot/dts/arm/integratorap-im-pd1.dts b/arch/arm/boot/dts/arm/integratorap-im-pd1.dts
index db13e09f2fab..6d90d9a42dc9 100644
--- a/arch/arm/boot/dts/arm/integratorap-im-pd1.dts
+++ b/arch/arm/boot/dts/arm/integratorap-im-pd1.dts
@@ -248,6 +248,8 @@ display@1000000 {
/* 640x480 16bpp @ 25.175MHz is 36827428 bytes/s */
max-memory-bandwidth = <40000000>;
memory-region = <&impd1_ram>;
+ #address-cells = <1>;
+ #size-cells = <1>;
dma-ranges;
port@0 {
--
2.52.0.177.g9f829587af-goog
Since we recently started warning about uses of this function after the
atomic check phase completes, we've started getting warnings about this in
nouveau. It appears a misplaced drm_atomic_get_crtc_state() call has been
hiding in our .prepare_fb callback for a while.
So, fix this by adding a new nv50_head_atom_get_new() function and use that
in our .prepare_fb callback instead.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Fixes: 1590700d94ac ("drm/nouveau/kms/nv50-: split each resource type into their own source files")
Cc: <stable(a)vger.kernel.org> # v4.18+
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
---
drivers/gpu/drm/nouveau/dispnv50/atom.h | 13 +++++++++++++
drivers/gpu/drm/nouveau/dispnv50/wndw.c | 2 +-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/dispnv50/atom.h b/drivers/gpu/drm/nouveau/dispnv50/atom.h
index 93f8f4f645784..85b7cf70d13c4 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/atom.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/atom.h
@@ -152,8 +152,21 @@ static inline struct nv50_head_atom *
nv50_head_atom_get(struct drm_atomic_state *state, struct drm_crtc *crtc)
{
struct drm_crtc_state *statec = drm_atomic_get_crtc_state(state, crtc);
+
if (IS_ERR(statec))
return (void *)statec;
+
+ return nv50_head_atom(statec);
+}
+
+static inline struct nv50_head_atom *
+nv50_head_atom_get_new(struct drm_atomic_state *state, struct drm_crtc *crtc)
+{
+ struct drm_crtc_state *statec = drm_atomic_get_new_crtc_state(state, crtc);
+
+ if (IS_ERR(statec))
+ return (void*)statec;
+
return nv50_head_atom(statec);
}
diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
index ef9e410babbfb..9a2c20fce0f3e 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
@@ -583,7 +583,7 @@ nv50_wndw_prepare_fb(struct drm_plane *plane, struct drm_plane_state *state)
asyw->image.offset[0] = nvbo->offset;
if (wndw->func->prepare) {
- asyh = nv50_head_atom_get(asyw->state.state, asyw->state.crtc);
+ asyh = nv50_head_atom_get_new(asyw->state.state, asyw->state.crtc);
if (IS_ERR(asyh))
return PTR_ERR(asyh);
--
2.52.0
Pinctrl is responsible for bias settings and possibly other pin config,
so call gpiochip_generic_config to apply such config values. This might
also include settings that pinctrl does not support, but then it can
return ENOTSUPP as appropriate.
This makes sure any bias and other pin config set by userspace (via
gpiod) actually takes effect.
Cc: stable(a)vger.kernel.org
Signed-off-by: Matthijs Kooijman <matthijs(a)stdin.nl>
---
drivers/gpio/gpio-rockchip.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-rockchip.c b/drivers/gpio/gpio-rockchip.c
index 47174eb3ba76f..106f7f734b4ff 100644
--- a/drivers/gpio/gpio-rockchip.c
+++ b/drivers/gpio/gpio-rockchip.c
@@ -303,7 +303,7 @@ static int rockchip_gpio_set_config(struct gpio_chip *gc, unsigned int offset,
*/
return -ENOTSUPP;
default:
- return -ENOTSUPP;
+ return gpiochip_generic_config(gc, offset, config);
}
}
--
2.48.1
The VM_BIND ioctl did not validate args->num_syncs, allowing userspace
to pass excessively large values that could lead to oversized allocations
or other unexpected behavior.
Add a check to ensure num_syncs does not exceed XE_MAX_SYNCS,
returning -EINVAL when the limit is violated.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: <stable(a)vger.kernel.org> # v6.12+
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: Michal Mrozek <michal.mrozek(a)intel.com>
Cc: Carl Zhang <carl.zhang(a)intel.com>
Cc: José Roberto de Souza <jose.souza(a)intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin(a)intel.com>
Cc: Ivan Briano <ivan.briano(a)intel.com>
Cc: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit(a)intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin(a)intel.com>
---
drivers/gpu/drm/xe/xe_vm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index c2012d20faa6..2ada14ed1f3c 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -3341,6 +3341,9 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe, struct xe_vm *vm,
if (XE_IOCTL_DBG(xe, args->extensions))
return -EINVAL;
+ if (XE_IOCTL_DBG(xe, args->num_syncs > XE_MAX_SYNCS))
+ return -EINVAL;
+
if (args->num_binds > 1) {
u64 __user *bind_user =
u64_to_user_ptr(args->vector_of_binds);
--
2.50.1
On a Xen dom0 boot, this feature does not behave, and we end up
calculating:
num_roots = 1
num_nodes = 2
roots_per_node = 0
This causes a divide-by-zero in the modulus inside the loop.
This change adds a couple of guards for invalid states where we might
get a divide-by-zero.
Signed-off-by: Steven Noonan <steven(a)uplinklabs.net>
Signed-off-by: Ariadne Conill <ariadne(a)ariadne.space>
CC: Yazen Ghannam <yazen.ghannam(a)amd.com>
CC: x86(a)vger.kernel.org
CC: stable(a)vger.kernel.org
---
arch/x86/kernel/amd_node.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/x86/kernel/amd_node.c b/arch/x86/kernel/amd_node.c
index 3d0a4768d603c..cdc6ba224d4ad 100644
--- a/arch/x86/kernel/amd_node.c
+++ b/arch/x86/kernel/amd_node.c
@@ -282,6 +282,17 @@ static int __init amd_smn_init(void)
return -ENODEV;
num_nodes = amd_num_nodes();
+
+ if (!num_nodes)
+ return -ENODEV;
+
+ /* Possibly a virtualized environment (e.g. Xen) where we wi
ll get
+ * roots_per_node=0 if the number of roots is fewer than number of
+ * nodes
+ */
+ if (num_roots < num_nodes)
+ return -ENODEV;
+
amd_roots = kcalloc(num_nodes, sizeof(*amd_roots), GFP_KERNEL);
if (!amd_roots)
return -ENOMEM;
--
2.51.2
From: Maciej Wieczor-Retman <maciej.wieczor-retman(a)intel.com>
A KASAN tag mismatch, possibly causing a kernel panic, can be observed
on systems with a tag-based KASAN enabled and with multiple NUMA nodes.
It was reported on arm64 and reproduced on x86. It can be explained in
the following points:
1. There can be more than one virtual memory chunk.
2. Chunk's base address has a tag.
3. The base address points at the first chunk and thus inherits
the tag of the first chunk.
4. The subsequent chunks will be accessed with the tag from the
first chunk.
5. Thus, the subsequent chunks need to have their tag set to
match that of the first chunk.
Use the new vmalloc flag that disables random tag assignment in
__kasan_unpoison_vmalloc() - pass the same random tag to all the
vm_structs by tagging the pointers before they go inside
__kasan_unpoison_vmalloc(). Assigning a common tag resolves the pcpu
chunk address mismatch.
Fixes: 1d96320f8d53 ("kasan, vmalloc: add vmalloc tagging for SW_TAGS")
Cc: stable(a)vger.kernel.org # 6.1+
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman(a)intel.com>
Reviewed-by: Andrey Konovalov <andreyknvl(a)gmail.com>
---
Changelog v4:
- Add WARN_ON_ONCE() if the new flag is already set in the helper.
(Andrey)
- Remove pr_warn() since the comment should be enough. (Andrey)
Changelog v3:
- Redo the patch by using a flag instead of a new argument in
__kasan_unpoison_vmalloc() (Andrey Konovalov)
Changelog v2:
- Revise the whole patch to match the fixed refactorization from the
first patch.
Changelog v1:
- Rewrite the patch message to point at the user impact of the issue.
- Move helper to common.c so it can be compiled in all KASAN modes.
mm/kasan/common.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 1ed6289d471a..589be3d86735 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -591,11 +591,26 @@ void __kasan_unpoison_vmap_areas(struct vm_struct **vms, int nr_vms,
unsigned long size;
void *addr;
int area;
+ u8 tag;
+
+ /*
+ * If KASAN_VMALLOC_KEEP_TAG was set at this point, all vms[] pointers
+ * would be unpoisoned with the KASAN_TAG_KERNEL which would disable
+ * KASAN checks down the line.
+ */
+ if (WARN_ON_ONCE(flags & KASAN_VMALLOC_KEEP_TAG))
+ return;
+
+ size = vms[0]->size;
+ addr = vms[0]->addr;
+ vms[0]->addr = __kasan_unpoison_vmalloc(addr, size, flags);
+ tag = get_tag(vms[0]->addr);
- for (area = 0 ; area < nr_vms ; area++) {
+ for (area = 1 ; area < nr_vms ; area++) {
size = vms[area]->size;
- addr = vms[area]->addr;
- vms[area]->addr = __kasan_unpoison_vmalloc(addr, size, flags);
+ addr = set_tag(vms[area]->addr, tag);
+ vms[area]->addr =
+ __kasan_unpoison_vmalloc(addr, size, flags | KASAN_VMALLOC_KEEP_TAG);
}
}
#endif
--
2.52.0