On Wed, Feb 19, 2025 at 05:18:24PM +0000, Catalin Marinas wrote:
On Wed, Feb 19, 2025 at 02:00:27PM +0000, Catalin Marinas wrote:
On Sat, 8 Feb 2025 at 16:54, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
Regression on qemu-arm64 and FVP: the kernel warning below was noticed while running the selftests: arm64: check_hugetlb_options test case on 6.6.76-rc1 and 6.6.76-rc2.
Test regression: WARNING-arch-arm64-mm-copypage-copy_highpage
------------[ cut here ]------------
[ 96.920028] WARNING: CPU: 1 PID: 3611 at arch/arm64/mm/copypage.c:29 copy_highpage (arch/arm64/include/asm/mte.h:87)
[ 96.922100] Modules linked in: crct10dif_ce sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 fuse drm backlight ip_tables x_tables
[ 96.925603] CPU: 1 PID: 3611 Comm: check_hugetlb_o Not tainted 6.6.76-rc2 #1
[ 96.926956] Hardware name: linux,dummy-virt (DT)
[ 96.927695] pstate: 43402009 (nZcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[ 96.928687] pc : copy_highpage (arch/arm64/include/asm/mte.h:87)
[ 96.929037] lr : copy_highpage (arch/arm64/include/asm/alternative-macros.h:232 arch/arm64/include/asm/cpufeature.h:443 arch/arm64/include/asm/cpufeature.h:504 arch/arm64/include/asm/cpufeature.h:814 arch/arm64/mm/copypage.c:27)
[ 96.929399] sp : ffff800088aa3ab0
[ 96.930232] x29: ffff800088aa3ab0 x28: 00000000000001ff x27: 0000000000000000
[ 96.930784] x26: 0000000000000000 x25: 0000ffff9b800000 x24: 0000ffff9b9ff000
[ 96.931402] x23: fffffc0003257fc0 x22: ffff0000c95ff000 x21: ffff0000c93ff000
[ 96.932054] x20: fffffc0003257fc0 x19: fffffc000324ffc0 x18: 0000ffff9b800000
[ 96.933357] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 96.934091] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 96.935095] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
[ 96.935982] x8 : 0bfffc0001800000 x7 : 0000000000000000 x6 : 0000000000000000
[ 96.936536] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 96.937089] x2 : 0000000000000000 x1 : ffff0000c9600000 x0 : ffff0000c9400080
[ 96.939431] Call trace:
[ 96.939920] copy_highpage (arch/arm64/include/asm/mte.h:87)
[ 96.940443] copy_user_highpage (arch/arm64/mm/copypage.c:40)
[ 96.940963] copy_user_large_folio (mm/memory.c:5977 mm/memory.c:6109)
[ 96.941535] hugetlb_wp (mm/hugetlb.c:5701)
[ 96.941948] hugetlb_fault (mm/hugetlb.c:6237)
[ 96.942344] handle_mm_fault (mm/memory.c:5330)
[ 96.942794] do_page_fault (arch/arm64/mm/fault.c:513 arch/arm64/mm/fault.c:626)
[ 96.943341] do_mem_abort (arch/arm64/mm/fault.c:846)
[ 96.943797] el0_da (arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:144 arch/arm64/kernel/entry-common.c:547)
[ 96.944229] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:0)
[ 96.944765] el0t_64_sync (arch/arm64/kernel/entry.S:599)
[ 96.945383] ---[ end trace 0000000000000000 ]---
Prior to commit 25c17c4b55de ("hugetlb: arm64: add mte support"), there was no hugetlb support with MTE, so the above code path should not happen - it seems to get a PROT_MTE hugetlb page which should have been prevented by arch_validate_flags(). Or something else corrupts the page flags and we end up with some random PG_mte_tagged set.
The problem is in the arm64 arch_calc_vm_flag_bits() as it returns VM_MTE_ALLOWED for any MAP_ANONYMOUS ignoring MAP_HUGETLB (it's been doing this since day 1 of MTE). The implementation does handle the hugetlb file mmap() correctly but not the MAP_ANONYMOUS case.
The fix would be something like below:
-----------------8<--------------------------
diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
index 5966ee4a6154..8ff5d88c9f12 100644
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@ -28,7 +28,8 @@ static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
 	 * backed by tags-capable memory. The vm_flags may be overridden by a
 	 * filesystem supporting MTE (RAM-based).
 	 */
-	if (system_supports_mte() && (flags & MAP_ANONYMOUS))
+	if (system_supports_mte() &&
+	    ((flags & MAP_ANONYMOUS) && !(flags & MAP_HUGETLB)))
 		return VM_MTE_ALLOWED;
 
 	return 0;
-------------------8<-----------------------
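
For readers following along, here is a rough userspace model of the check before and after the diff above. It is only a sketch: the MODEL_* constants and the has_mte parameter are made-up stand-ins for the kernel's MAP_ANONYMOUS, MAP_HUGETLB, VM_MTE_ALLOWED and system_supports_mte(), and nothing here is the actual kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins only, NOT the kernel's real flag values. */
#define MODEL_MAP_ANONYMOUS	0x1UL
#define MODEL_MAP_HUGETLB	0x2UL
#define MODEL_VM_MTE_ALLOWED	0x4UL

/* Old check: every anonymous mapping is treated as MTE-capable. */
static unsigned long calc_flag_bits_old(bool has_mte, unsigned long flags)
{
	if (has_mte && (flags & MODEL_MAP_ANONYMOUS))
		return MODEL_VM_MTE_ALLOWED;

	return 0;
}

/* Proposed check: anonymous mappings that also ask for hugetlb are excluded. */
static unsigned long calc_flag_bits_new(bool has_mte, unsigned long flags)
{
	if (has_mte &&
	    ((flags & MODEL_MAP_ANONYMOUS) && !(flags & MODEL_MAP_HUGETLB)))
		return MODEL_VM_MTE_ALLOWED;

	return 0;
}

/* Walk through the flag combinations discussed in the thread. */
static void check_model(void)
{
	unsigned long anon = MODEL_MAP_ANONYMOUS;
	unsigned long anon_huge = MODEL_MAP_ANONYMOUS | MODEL_MAP_HUGETLB;

	/* Plain anonymous memory: both versions allow MTE. */
	assert(calc_flag_bits_old(true, anon) == MODEL_VM_MTE_ALLOWED);
	assert(calc_flag_bits_new(true, anon) == MODEL_VM_MTE_ALLOWED);

	/* Anonymous hugetlb: only the old version allows MTE. */
	assert(calc_flag_bits_old(true, anon_huge) == MODEL_VM_MTE_ALLOWED);
	assert(calc_flag_bits_new(true, anon_huge) == 0);

	/* Without MTE support neither version sets the bit. */
	assert(calc_flag_bits_new(false, anon) == 0);
}
```

The only combination where the two versions differ is MAP_ANONYMOUS together with MAP_HUGETLB, which is the case the selftest appears to exercise.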
This fix won't make sense for mainline since it supports MAP_HUGETLB already.
Greg, are you ok with a stable-only fix as above, or would you rather see the full 25c17c4b55de ("hugetlb: arm64: add mte support") backported?
A stable-only fix for this is fine, thanks! Can you send it with a changelog and I'll queue it up. Does it also need to go to older kernels as well?
thanks,
greg k-h