From: Juergen Gross <jgross(a)suse.com>
[ upstream commit 41925b105e345ebc84cedb64f59d20cb14a62613 ]
xen_remap() is used to establish mappings for frames not under direct
control of the kernel: for Xenstore and console ring pages, and for
grant pages of non-PV guests.
Today xen_remap() is defined to use ioremap() on x86 (doing uncached
mappings), and ioremap_cache() on Arm (doing cached mappings).
Uncached mappings for those use cases are bad for performance, so they
should be avoided if possible. As all use cases of xen_remap() don't
require uncached mappings (the mapped area is always physical RAM),
a mapping using the standard WB cache mode is fine.
As sparse is flagging some of the xen_remap() use cases to be not
appropriate for iomem(), as the result is not annotated with the
__iomem modifier, eliminate xen_remap() completely and replace all
use cases with memremap() specifying the MEMREMAP_WB caching mode.
xen_unmap() can be replaced with memunmap().
Reported-by: kernel test robot <lkp(a)intel.com>
Signed-off-by: Juergen Gross <jgross(a)suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Acked-by: Stefano Stabellini <sstabellini(a)kernel.org>
Link: https://lore.kernel.org/r/20220530082634.6339-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: Teddy Astie <teddy.astie(a)vates.tech> [backport to 5.15.y]
---
arch/x86/include/asm/xen/page.h | 3 ---
drivers/xen/grant-table.c | 6 +++---
drivers/xen/xenbus/xenbus_probe.c | 3 +--
3 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
index 1a162e559753..c183b7f9efef 100644
--- a/arch/x86/include/asm/xen/page.h
+++ b/arch/x86/include/asm/xen/page.h
@@ -355,9 +355,6 @@ unsigned long arbitrary_virt_to_mfn(void *vaddr);
void make_lowmem_page_readonly(void *vaddr);
void make_lowmem_page_readwrite(void *vaddr);
-#define xen_remap(cookie, size) ioremap((cookie), (size))
-#define xen_unmap(cookie) iounmap((cookie))
-
static inline bool xen_arch_need_swiotlb(struct device *dev,
phys_addr_t phys,
dma_addr_t dev_addr)
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 0a2d24d6ac6f..a10e0741bec5 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -743,7 +743,7 @@ int gnttab_setup_auto_xlat_frames(phys_addr_t addr)
if (xen_auto_xlat_grant_frames.count)
return -EINVAL;
- vaddr = xen_remap(addr, XEN_PAGE_SIZE * max_nr_gframes);
+ vaddr = memremap(addr, XEN_PAGE_SIZE * max_nr_gframes, MEMREMAP_WB);
if (vaddr == NULL) {
pr_warn("Failed to ioremap gnttab share frames (addr=%pa)!\n",
&addr);
@@ -751,7 +751,7 @@ int gnttab_setup_auto_xlat_frames(phys_addr_t addr)
}
pfn = kcalloc(max_nr_gframes, sizeof(pfn[0]), GFP_KERNEL);
if (!pfn) {
- xen_unmap(vaddr);
+ memunmap(vaddr);
return -ENOMEM;
}
for (i = 0; i < max_nr_gframes; i++)
@@ -770,7 +770,7 @@ void gnttab_free_auto_xlat_frames(void)
if (!xen_auto_xlat_grant_frames.count)
return;
kfree(xen_auto_xlat_grant_frames.pfn);
- xen_unmap(xen_auto_xlat_grant_frames.vaddr);
+ memunmap(xen_auto_xlat_grant_frames.vaddr);
xen_auto_xlat_grant_frames.pfn = NULL;
xen_auto_xlat_grant_frames.count = 0;
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 2068f83556b7..77ca24611293 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -982,8 +982,7 @@ static int __init xenbus_init(void)
#endif
xen_store_gfn = (unsigned long)v;
xen_store_interface =
- xen_remap(xen_store_gfn << XEN_PAGE_SHIFT,
- XEN_PAGE_SIZE);
+ memremap(xen_store_gfn << XEN_PAGE_SHIFT, XEN_PAGE_SIZE, MEMREMAP_WB);
break;
default:
pr_warn("Xenstore state unknown\n");
--
2.50.0
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
The commit message of commit 6ec1f0239485 ("md/md-bitmap: fix stats
collection for external bitmaps") states:
Remove the external bitmap check as the statistics should be
available regardless of bitmap storage location.
Return -EINVAL only for invalid bitmap with no storage (neither in
superblock nor in external file).
But, the code does not adhere to the above, as it does only check for
a valid super-block for "internal" bitmaps. Hence, we observe:
Oops: GPF, probably for non-canonical address 0x1cd66f1f40000028
RIP: 0010:bitmap_get_stats+0x45/0xd0
Call Trace:
seq_read_iter+0x2b9/0x46a
seq_read+0x12f/0x180
proc_reg_read+0x57/0xb0
vfs_read+0xf6/0x380
ksys_read+0x6d/0xf0
do_syscall_64+0x8c/0x1b0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
We fix this by checking the existence of a super-block for both the
internal and external case.
Fixes: 6ec1f0239485 ("md/md-bitmap: fix stats collection for external bitmaps")
Cc: stable(a)vger.kernel.org
Reported-by: Gerald Gibson <gerald.gibson(a)oracle.com>
Signed-off-by: Håkon Bugge <haakon.bugge(a)oracle.com>
---
drivers/md/md-bitmap.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index bd694910b01b0..7f524a26cebca 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -2366,8 +2366,7 @@ static int bitmap_get_stats(void *data, struct md_bitmap_stats *stats)
if (!bitmap)
return -ENOENT;
- if (!bitmap->mddev->bitmap_info.external &&
- !bitmap->storage.sb_page)
+ if (!bitmap->storage.sb_page)
return -EINVAL;
sb = kmap_local_page(bitmap->storage.sb_page);
stats->sync_size = le64_to_cpu(sb->sync_size);
--
2.43.5
Hi all,
First patch is a prerequisite in order to avoid NULL pointer
dereferences in error paths. Then two fixes follow.
Signed-off-by: Kurt Borja <kuurtb(a)gmail.com>
---
Changes in v3:
- Add Cc stable tags to all patches
- Rebase on top of the 'fixes' branch
- Link to v2: https://lore.kernel.org/r/20250628-lmi-fix-v2-0-c530e1c959d7@gmail.com
Changes in v2:
[PATCH 02]
- Remove kobject_del() and commit message remark. It turns out it's
optional to call this (my bad)
- Leave only one fixes tag. The other two are not necessary.
- Link to v1: https://lore.kernel.org/r/20250628-lmi-fix-v1-0-c6eec9aa3ca7@gmail.com
---
Kurt Borja (3):
platform/x86: think-lmi: Create ksets consecutively
platform/x86: think-lmi: Fix kobject cleanup
platform/x86: think-lmi: Fix sysfs group cleanup
drivers/platform/x86/think-lmi.c | 90 ++++++++++++++--------------------------
1 file changed, 30 insertions(+), 60 deletions(-)
---
base-commit: 173bbec6693f3f3f00dac144f3aa0cd62fb60d33
change-id: 20250628-lmi-fix-98143b10d9fd
--
~ Kurt
#regzbot introduced: 129dab6e1286
Hello everyone,
We've identified a performance regression that starts with linux
kernel 6.10 and persists through 6.16(tested at commit e540341508ce).
Bisection pointed to commit:
129dab6e1286 ("iommu/vt-d: Use cache_tag_flush_range_np() in iotlb_sync_map").
The issue occurs when running fio against two NVMe devices located
under the same PCIe bridge (dual-port NVMe configuration). Performance
drops compared to configurations where the devices are on different
bridges.
Observed Performance:
- Before the commit: ~6150 MiB/s, regardless of NVMe device placement.
- After the commit:
-- Same PCIe bridge: ~4985 MiB/s
-- Different PCIe bridges: ~6150 MiB/s
Currently we can only reproduce the issue on a Z3 metal instance on
gcp. I suspect the issue can be reproducible if you have a dual port
nvme on any machine.
At [1] there's a more detailed description of the issue and details
on the reproducer.
Could you please advise on the appropriate path forward to mitigate or
address this regression?
Thanks,
Jo
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2115738
Hi,
Details in the patch attached, but an unrelated vfs change broke io_uring
for anon inode reading/writing. Please queue this up asap for 6.15.5 so we
don't have have any further 6.15-stable kernels with this regression.
You can also just cherry pick it, picks cleanly. Sha is:
6f11adcc6f36ffd8f33dbdf5f5ce073368975bc3
Thanks,
--
Jens Axboe
Only select ARCH_WANT_HUGE_PMD_SHARE on 64-bit x86.
Page table sharing requires at least three levels because it involves
shared references to PMD tables; 32-bit x86 has either two-level paging
(without PAE) or three-level paging (with PAE), but even with
three-level paging, having a dedicated PGD entry for hugetlb is only
barely possible (because the PGD only has four entries), and it seems
unlikely anyone's actually using PMD sharing on 32-bit.
Having ARCH_WANT_HUGE_PMD_SHARE enabled on non-PAE 32-bit X86 (which
has 2-level paging) became particularly problematic after commit
59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count"),
since that changes `struct ptdesc` such that the `pt_mm` (for PGDs) and
the `pt_share_count` (for PMDs) share the same union storage - and with
2-level paging, PMDs are PGDs.
(For comparison, arm64 also gates ARCH_WANT_HUGE_PMD_SHARE on the
configuration of page tables such that it is never enabled with 2-level
paging.)
Reported-by: Vitaly Chikunov <vt(a)altlinux.org>
Closes: https://lore.kernel.org/r/srhpjxlqfna67blvma5frmy3aa@altlinux.org
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Tested-by: Vitaly Chikunov <vt(a)altlinux.org>
Fixes: cfe28c5d63d8 ("x86: mm: Remove x86 version of huge_pmd_share.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jann Horn <jannh(a)google.com>
---
I'm carrying over Vitaly Chikunov's "Tested-by" from v1.
Changes in v2:
- disable it for 32-bit entirely (Dave Hansen)
- Link to v1: https://lore.kernel.org/r/20250630-x86-2level-hugetlb-v1-1-077cd53d8255@goo…
---
arch/x86/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 71019b3b54ea..4e0fe688cc83 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,7 +147,7 @@ config X86
select ARCH_WANTS_DYNAMIC_TASK_STRUCT
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_GENERAL_HUGETLB
- select ARCH_WANT_HUGE_PMD_SHARE
+ select ARCH_WANT_HUGE_PMD_SHARE if X86_64
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
---
base-commit: d0b3b7b22dfa1f4b515fd3a295b3fd958f9e81af
change-id: 20250630-x86-2level-hugetlb-b1d8feb255ce
--
Jann Horn <jannh(a)google.com>
Hi,
The commit has been in v6.1.y for 3+ years (but not in the 5.x stable branches):
23e118a48acf ("PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time")
The commit can be cherry-picked cleanly to the latest 5.x stable branches, i.e. 5.15.185, 5.10.238 and 5.4.294.
Some Linux distro kernels are still based on the 5.x stable branches, and they have been carrying the commit explicitly. It would be great to have the commit in the upstream 5.x branches, so they don't have to carry the commit explicitly.
There is a Debian 10.3 user who wants the commit, but the commit is absent from Debian 10.3's kernel, whose full version string is:
Linux version 5.10.0-0.deb10.30-amd64 (debian-kernel(a)lists.debian.org) (gcc-8 (Debian 8.3.0-6) 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #1 SMP Debian 5.10.218-1~deb10u1 (2024-06-12)
If the commit is cherry-picked into the upstream 5.10.y, the next Debian 10.3 kernel update would have the commit automatically.
Thanks!
-- Dexuan
Hi stable maintainers,
Please apply commit d1e420772cd1 ("x86/pkeys: Simplify PKRU update in
signal frame") to the stable branches for 6.12 and later.
This fixes a regression introduced in 6.13 by commit ae6012d72fa6
("x86/pkeys: Ensure updated PKRU value is XRSTOR'd"), which was also
backported in 6.12.5.
Ben.
--
Ben Hutchings
73.46% of all statistics are made up.