Hello John,
On Thu, 6 Apr 2023 at 06:05, John Hubbard jhubbard@nvidia.com wrote:
Although CONFIG_DEVICE_PRIVATE, hmm_range_fault(), and related functionality were first developed on x86, they also work on arm64. However, trying this out on an arm64 system reveals a massive slowdown during the setup and teardown phases.
This slowdown is due to a large number of WARN_ON() calls that check for pages outside the CPU's physical range. However, that is a design feature of device private pages: they are specifically chosen to be outside of the range of the CPU's true physical pages.
Currently, the vmemmap region is dimensioned to only cover the PFN range that backs the linear map. So the WARN() seems appropriate here: you are mapping struct page[] ranges outside of the allocated window, and afaict, you might actually wrap around and corrupt the linear map at the start of the kernel VA space like this.
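To make the concern concrete, here is a small user-space sketch of the arithmetic involved. It is only a model: the window base, the struct page size, and the number of linear-map PFNs are hypothetical values chosen for illustration, not the real arm64 definitions, and the `model_*` names do not exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model parameters; NOT the real arm64 values. */
#define MODEL_STRUCT_PAGE_SZ	64ull			/* assumed sizeof(struct page) */
#define MODEL_VMEMMAP_START	0xfffffc0000000000ull	/* assumed window base */
#define MODEL_LINEAR_PFNS	(1ull << 24)		/* PFNs backed by the linear map */
#define MODEL_VMEMMAP_END \
	(MODEL_VMEMMAP_START + MODEL_LINEAR_PFNS * MODEL_STRUCT_PAGE_SZ)

/* Address of the struct page entry for a PFN (unsigned math wraps silently). */
static uint64_t model_pfn_to_page_addr(uint64_t pfn)
{
	return MODEL_VMEMMAP_START + pfn * MODEL_STRUCT_PAGE_SZ;
}

/* Same condition as the arm64 WARN_ON(), applied to the model window. */
static bool model_range_outside_vmemmap(uint64_t start, uint64_t end)
{
	return (start < MODEL_VMEMMAP_START) || (end > MODEL_VMEMMAP_END);
}

/* Would mapping struct pages for [first_pfn, first_pfn + nr_pfns) warn? */
static bool model_pfn_range_warns(uint64_t first_pfn, uint64_t nr_pfns)
{
	uint64_t start = model_pfn_to_page_addr(first_pfn);
	uint64_t end = model_pfn_to_page_addr(first_pfn + nr_pfns);

	return model_range_outside_vmemmap(start, end);
}
```

In this model, any PFN at or above MODEL_LINEAR_PFNS lands past the end of the window, and a sufficiently large PFN (1 << 36 with these numbers) makes the computed address wrap around to 0, which is the kind of wrap-around I am worried about.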
x86 doesn't have this warning. It only checks that pages are properly aligned. I've shown a comparison below between x86 (which works well) and arm64 (which has these warnings).
memunmap_pages()
  pageunmap_range()
    if (pgmap->type == MEMORY_DEVICE_PRIVATE)
      __remove_pages()
        __remove_section()
          sparse_remove_section()
            section_deactivate()
              depopulate_section_memmap()
                /* arch/arm64/mm/mmu.c */
                vmemmap_free()
                {
                  WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
                  ...
                }

                /* arch/x86/mm/init_64.c */
                vmemmap_free()
                {
                  VM_BUG_ON(!PAGE_ALIGNED(start));
                  VM_BUG_ON(!PAGE_ALIGNED(end));
                  ...
                }
So, the warning is a false positive for this case. Therefore, skip the warning if CONFIG_DEVICE_PRIVATE is set.
I don't think this is a false positive. We'll need to adjust VMEMMAP_SIZE to account for this.
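As a rough illustration of what "adjusting VMEMMAP_SIZE" means in terms of arithmetic, the sketch below computes how many bytes of struct page array are needed to describe a given physical span. The page shift, the struct page size, and the `sketch_*` helpers are assumptions made for this example; this is not the kernel's actual definition of VMEMMAP_SIZE, and it assumes the device private range sits directly above the linear-map PFN range.

```c
#include <stdint.h>

/* Hypothetical values for illustration only. */
#define SKETCH_PAGE_SHIFT	12u	/* assumed 4 KiB pages */
#define SKETCH_STRUCT_PAGE_SIZE	64ull	/* assumed sizeof(struct page) */

/* Bytes of struct page array needed to describe span_bytes of address space. */
static uint64_t sketch_vmemmap_bytes(uint64_t span_bytes)
{
	return (span_bytes >> SKETCH_PAGE_SHIFT) * SKETCH_STRUCT_PAGE_SIZE;
}

/*
 * Window size that covers the linear map plus an extra device private
 * range placed directly above it (an assumption of this sketch).
 */
static uint64_t sketch_vmemmap_size(uint64_t linear_map_bytes,
				    uint64_t device_private_bytes)
{
	return sketch_vmemmap_bytes(linear_map_bytes) +
	       sketch_vmemmap_bytes(device_private_bytes);
}
```

With these numbers, every 1 GiB of additional PFN space costs 16 MiB of vmemmap virtual address space, so the window would have to grow by that ratio for whatever range device private pages are allowed to occupy.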
Signed-off-by: John Hubbard jhubbard@nvidia.com
cc: stable@vger.kernel.org
 arch/arm64/mm/mmu.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 6f9d8898a025..d5c9b611a8d1 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1157,8 +1157,10 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node,
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 		struct vmem_altmap *altmap)
 {
+/* Device private pages are outside of the CPU's physical page range. */
+#ifndef CONFIG_DEVICE_PRIVATE
 	WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
+#endif
 	if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES))
 		return vmemmap_populate_basepages(start, end, node, altmap);
 	else
@@ -1169,8 +1171,10 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 void vmemmap_free(unsigned long start, unsigned long end,
 		struct vmem_altmap *altmap)
 {
+/* Device private pages are outside of the CPU's physical page range. */
+#ifndef CONFIG_DEVICE_PRIVATE
 	WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
+#endif
 	unmap_hotplug_range(start, end, true, altmap);
 	free_empty_tables(start, end, VMEMMAP_START, VMEMMAP_END);
 }
--
2.40.0