The quilt patch titled
Subject: mm/cma.c: delete kmemleak objects when freeing CMA areas to buddy at boot
has been removed from the -mm tree. Its filename was
mm-cmac-delete-kmemleak-objects-when-freeing-cma-areas-to-buddy-at-boot.patch
This patch was dropped because it is obsolete
------------------------------------------------------
From: "Isaac J. Manjarres" <isaacmanjarres(a)google.com>
Subject: mm/cma.c: delete kmemleak objects when freeing CMA areas to buddy at boot
Date: Mon, 9 Jan 2023 14:16:23 -0800
Every CMA region is now tracked by kmemleak by the time
cma_activate_area() is invoked, and cma_activate_area() is called once
for each CMA region. Therefore, invoke kmemleak_free_part_phys() from
cma_activate_area() to inform kmemleak that the region is about to be
freed. Doing so also removes the need to invoke kmemleak_ignore_phys()
when the global CMA region is being created, as the kmemleak object for
it will be deleted.
This helps resolve a crash when kmemleak and CONFIG_DEBUG_PAGEALLOC are
both enabled, since CONFIG_DEBUG_PAGEALLOC causes the CMA regions to be
unmapped from the kernel's address space when their pages are freed to
the buddy allocator. Without this patch, kmemleak attempts to scan the
CMA regions even though they are unmapped, which leads to a page fault.
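As a minimal sketch (not part of the patch itself) of the pairing this
relies on, using the real mm/cma.c helpers but with the surrounding
context simplified:

	/* at reservation time (cma_declare_contiguous_nid()): create
	 * the kmemleak object covering the region */
	kmemleak_alloc_phys(base, size, 0);

	/* at activation time (cma_activate_area()), before the pages
	 * are handed to the buddy allocator and, with
	 * CONFIG_DEBUG_PAGEALLOC, unmapped: delete the object so
	 * kmemleak never scans the range */
	kmemleak_free_part_phys(cma_get_base(cma), cma_get_size(cma));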
Link: https://lkml.kernel.org/r/20230109221624.592315-3-isaacmanjarres@google.com
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Cc: Isaac J. Manjarres <isaacmanjarres@google.com>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
--- a/mm/cma.c~mm-cmac-delete-kmemleak-objects-when-freeing-cma-areas-to-buddy-at-boot
+++ a/mm/cma.c
@@ -103,6 +103,13 @@ static void __init cma_activate_area(str
goto out_error;
/*
+ * The CMA region was marked as allocated by kmemleak when it was either
+ * dynamically allocated or statically reserved. In any case,
+ * inform kmemleak that the region is about to be freed to the page allocator.
+ */
+ kmemleak_free_part_phys(cma_get_base(cma), cma_get_size(cma));
+
+ /*
* alloc_contig_range() requires the pfn range specified to be in the
* same zone. Simplify by forcing the entire CMA resv range to be in the
* same zone.
@@ -361,11 +368,6 @@ int __init cma_declare_contiguous_nid(ph
}
}
- /*
- * kmemleak scans/reads tracked objects for pointers to other
- * objects but this address isn't mapped and accessible
- */
- kmemleak_ignore_phys(addr);
base = addr;
}
_
Patches currently in -mm which might be from isaacmanjarres@google.com are
revert-mm-kmemleak-alloc-gray-object-for-reserved-region-with-direct-map.patch
The quilt patch titled
Subject: mm/cma.c: make kmemleak aware of all CMA regions
has been removed from the -mm tree. Its filename was
mm-cmac-make-kmemleak-aware-of-all-cma-regions.patch
This patch was dropped because it is obsolete
------------------------------------------------------
From: "Isaac J. Manjarres" <isaacmanjarres(a)google.com>
Subject: mm/cma.c: make kmemleak aware of all CMA regions
Date: Mon, 9 Jan 2023 14:16:22 -0800
Patch series "Fixes for kmemleak tracking with CMA regions".
When trying to boot a device with an ARM64 kernel with the following
config options enabled:
CONFIG_DEBUG_PAGEALLOC=y
CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y
CONFIG_DEBUG_KMEMLEAK=y
a page fault is encountered when kmemleak starts to scan the list of
gray or allocated objects that it maintains. Upon closer inspection, it
was observed that these page faults always occur when kmemleak attempts
to scan a CMA region.
At the moment, kmemleak is made aware of CMA regions that are specified
through the devicetree, whether they are placed at a specific memory
address or dynamically allocated within a range of addresses. However,
if the CMA region is constrained to a certain range of addresses through
the command line, the region is reserved through memblock_reserve(), but
kmemleak_alloc_phys() is never invoked. Furthermore, kmemleak is never
informed about CMA regions being freed to the buddy allocator at boot.
This is problematic when CONFIG_DEBUG_PAGEALLOC is enabled: all CMA
regions are then unmapped from the kernel's address space, so a page
fault occurs when kmemleak attempts to scan any of them.
This series makes kmemleak aware of every CMA region before it is freed
to the buddy allocator, so that kmemleak can be informed at that point
that the region is about to be freed and should not be scanned.
This patch (of 2):
Currently, kmemleak tracks CMA regions that are specified through the
devicetree. However, if the global CMA region is specified through the
command line, kmemleak will be unaware of the CMA region because
kmemleak_alloc_phys() is not invoked after memblock_reserve(). Add the
missing call to kmemleak_alloc_phys() so that all CMA regions are tracked
by kmemleak before they are freed to the page allocator in
cma_activate_area().
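A condensed sketch of the two reservation paths described above (control
flow and variable names as in cma_declare_contiguous_nid(); error
handling and allocation limits are elided):

	if (fixed) {
		/* base fixed on the command line: reserve it directly.
		 * Before this patch, no kmemleak object was created
		 * here. */
		if (memblock_is_region_reserved(base, size) ||
		    memblock_reserve(base, size) < 0) {
			ret = -EBUSY;
			goto err;
		}
		kmemleak_alloc_phys(base, size, 0);	/* this patch */
	} else {
		/* dynamic placement: memblock_alloc_range_nid()
		 * already registers the allocated range with kmemleak */
		addr = memblock_alloc_range_nid(size, alignment,
						highmem_start, limit,
						nid, true);
	}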
Link: https://lkml.kernel.org/r/20230109221624.592315-1-isaacmanjarres@google.com
Link: https://lkml.kernel.org/r/20230109221624.592315-2-isaacmanjarres@google.com
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Cc: <stable@vger.kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
--- a/mm/cma.c~mm-cmac-make-kmemleak-aware-of-all-cma-regions
+++ a/mm/cma.c
@@ -318,6 +318,8 @@ int __init cma_declare_contiguous_nid(ph
ret = -EBUSY;
goto err;
}
+
+ kmemleak_alloc_phys(base, size, 0);
} else {
phys_addr_t addr = 0;
_
Patches currently in -mm which might be from isaacmanjarres@google.com are
revert-mm-kmemleak-alloc-gray-object-for-reserved-region-with-direct-map.patch
mm-cmac-delete-kmemleak-objects-when-freeing-cma-areas-to-buddy-at-boot.patch
From: Kan Liang <kan.liang@linux.intel.com>
The error below can be observed in a Linux guest when the hypervisor
runs on a hybrid machine.
[ 0.118214] unchecked MSR access error: WRMSR to 0x38f (tried to write 0x00011000f0000003f) at rIP: 0xffffffff83082124 (native_write_msr+0x4/0x30)
[ 0.118949] Call Trace:
[ 0.119092] <TASK>
[ 0.119215] ? __intel_pmu_enable_all.constprop.0+0x88/0xe0
[ 0.119533] intel_pmu_enable_all+0x15/0x20
[ 0.119778] x86_pmu_enable+0x17c/0x320
The current perf code wrongly assumes that the perf metrics feature is
always enabled on p-cores. It unconditionally enables the feature to
work around the unreliable enumeration of the PERF_CAPABILITIES MSR.
That assumption is safe on bare metal. However, KVM doesn't support the
perf metrics feature yet, so setting the corresponding bit triggers an
MSR access error in a guest.
Only unconditionally enable the core-specific PMU features on bare
metal for ADL and RPL; this covers the perf metrics on p-cores and
PEBS-via-PT on e-cores.
Future platforms will not need to hardcode the PMU features: the
per-core PMU features can be enumerated via the enhanced
PERF_CAPABILITIES MSR and CPUID leaf 0x23, so this issue does not arise
there.
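As a rough, hypothetical sketch of what such enumeration could look
like (not code from this patch; the sub-leaf layout of CPUID leaf 0x23
is an assumption and left unspecified here):

	unsigned int eax, ebx, ecx, edx;

	/* only consult leaf 0x23 if the CPU reports it at all */
	if (boot_cpu_data.cpuid_level >= 0x23) {
		cpuid_count(0x23, 0, &eax, &ebx, &ecx, &edx);
		/* derive the per-core PMU capabilities from the
		 * enumerated bits instead of hardcoding them per model */
	}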
Fixes: f83d2f91d259 ("perf/x86/intel: Add Alder Lake Hybrid support")
Link: https://lore.kernel.org/lkml/e161b7c0-f0be-23c8-9a25-002260c2a085@linux.int…
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
---
Based on my limited knowledge of KVM and guests, I use the HYPERVISOR
bit to tell whether we are running in a guest, but I'm not sure whether
that is reliable. Please let me know if there is a better way. Thanks.
arch/x86/events/intel/core.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index bbb7846d3c1e..8d08929a7250 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -6459,8 +6459,17 @@ __init int intel_pmu_init(void)
__EVENT_CONSTRAINT(0, (1ULL << pmu->num_counters) - 1,
0, pmu->num_counters, 0, 0);
pmu->intel_cap.capabilities = x86_pmu.intel_cap.capabilities;
- pmu->intel_cap.perf_metrics = 1;
- pmu->intel_cap.pebs_output_pt_available = 0;
+ /*
+ * The capability bits are not reliable on ADL and RPL.
+ * For bare metal, it's safe to assume that some features
+ * are always enabled, e.g., the perf metrics on p-core,
+ * but we cannot make the same assumption for a hypervisor.
+ * Only update the core specific PMU feature for bare metal.
+ */
+ if (!boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
+ pmu->intel_cap.perf_metrics = 1;
+ pmu->intel_cap.pebs_output_pt_available = 0;
+ }
memcpy(pmu->hw_cache_event_ids, spr_hw_cache_event_ids, sizeof(pmu->hw_cache_event_ids));
memcpy(pmu->hw_cache_extra_regs, spr_hw_cache_extra_regs, sizeof(pmu->hw_cache_extra_regs));
@@ -6480,8 +6489,10 @@ __init int intel_pmu_init(void)
__EVENT_CONSTRAINT(0, (1ULL << pmu->num_counters) - 1,
0, pmu->num_counters, 0, 0);
pmu->intel_cap.capabilities = x86_pmu.intel_cap.capabilities;
- pmu->intel_cap.perf_metrics = 0;
- pmu->intel_cap.pebs_output_pt_available = 1;
+ if (!boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
+ pmu->intel_cap.perf_metrics = 0;
+ pmu->intel_cap.pebs_output_pt_available = 1;
+ }
memcpy(pmu->hw_cache_event_ids, glp_hw_cache_event_ids, sizeof(pmu->hw_cache_event_ids));
memcpy(pmu->hw_cache_extra_regs, tnt_hw_cache_extra_regs, sizeof(pmu->hw_cache_extra_regs));
--
2.35.1