BUG selftests/mm] - Linux-kselftest-mirror

9 Mar 2024


      Hi,
Routine run of the test in net-next gave also this mm unit error.
root@defiant:tools/testing/selftests/mm# ./uffd-unit-tests
Testing UFFDIO_API (with syscall)... done
Testing UFFDIO_API (with /dev/userfaultfd)... done
Testing register-ioctls on anon... done
Testing register-ioctls on shmem... done
Testing register-ioctls on shmem-private... done
Testing register-ioctls on hugetlb... skipped [reason: memory allocation failed]
Testing register-ioctls on hugetlb-private... skipped [reason: memory allocation failed]
Testing zeropage on anon... done
Testing zeropage on shmem... done
Testing zeropage on shmem-private... done
Testing zeropage on hugetlb... skipped [reason: memory allocation failed]
Testing zeropage on hugetlb-private... skipped [reason: memory allocation failed]
Testing move on anon... done
Testing move-pmd on anon... done
Testing move-pmd-split on anon... done
Testing wp-fork on anon... done
Testing wp-fork on shmem... done
Testing wp-fork on shmem-private... done
Testing wp-fork on hugetlb... skipped [reason: memory allocation failed]
Testing wp-fork on hugetlb-private... skipped [reason: memory allocation failed]
Testing wp-fork-with-event on anon... done
Testing wp-fork-with-event on shmem... done
Testing wp-fork-with-event on shmem-private... done
Testing wp-fork-with-event on hugetlb... skipped [reason: memory allocation failed]
Testing wp-fork-with-event on hugetlb-private... skipped [reason: memory allocation failed]
Testing wp-fork-pin on anon... done
Testing wp-fork-pin on shmem... done
Testing wp-fork-pin on shmem-private... done
Testing wp-fork-pin on hugetlb... skipped [reason: memory allocation failed]
Testing wp-fork-pin on hugetlb-private... skipped [reason: memory allocation failed]
Testing wp-fork-pin-with-event on anon... done
Testing wp-fork-pin-with-event on shmem... done
Testing wp-fork-pin-with-event on shmem-private... done
Testing wp-fork-pin-with-event on hugetlb... skipped [reason: memory allocation failed]
Testing wp-fork-pin-with-event on hugetlb-private... skipped [reason: memory allocation failed]
Testing wp-unpopulated on anon... done
Testing minor on shmem... done
Testing minor on hugetlb... skipped [reason: memory allocation failed]
Testing minor-wp on shmem... done
Testing minor-wp on hugetlb... skipped [reason: memory allocation failed]
Testing minor-collapse on shmem... done
Testing sigbus on anon... done
Testing sigbus on shmem... done
Testing sigbus on shmem-private... done
Testing sigbus on hugetlb... skipped [reason: memory allocation failed]
Testing sigbus on hugetlb-private... skipped [reason: memory allocation failed]
Testing sigbus-wp on anon... done
Testing sigbus-wp on shmem... done
Testing sigbus-wp on shmem-private... done
Testing sigbus-wp on hugetlb... skipped [reason: memory allocation failed]
Testing sigbus-wp on hugetlb-private... skipped [reason: memory allocation failed]
Testing events on anon... done
Testing events on shmem... done
Testing events on shmem-private... done
Testing events on hugetlb... skipped [reason: memory allocation failed]
Testing events on hugetlb-private... skipped [reason: memory allocation failed]
Testing events-wp on anon... done
Testing events-wp on shmem... done
Testing events-wp on shmem-private... done
Testing events-wp on hugetlb... skipped [reason: memory allocation failed]
Testing events-wp on hugetlb-private... skipped [reason: memory allocation failed]
Testing poison on anon... done
Testing poison on shmem... done
Testing poison on shmem-private... done
Testing poison on hugetlb... skipped [reason: memory allocation failed]
Testing poison on hugetlb-private... skipped [reason: memory allocation failed]
Userfaults unit tests: pass=42, skip=24, fail=0 (total=66)
root@defiant:tools/testing/selftests/mm# grep -i huge /proc/meminfo
It resulted in alarming errors in the syslog:
Mar  9 19:48:24 defiant kernel: [77187.055103] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 4631e000
Mar  9 19:48:24 defiant kernel: [77187.055132] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 46320000
Mar  9 19:48:24 defiant kernel: [77187.055160] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 46322000
Mar  9 19:48:24 defiant kernel: [77187.055189] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 46324000
Mar  9 19:48:24 defiant kernel: [77187.055218] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 46326000
Mar  9 19:48:24 defiant kernel: [77187.055250] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 46328000
Mar  9 19:48:24 defiant kernel: [77187.055278] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 4632a000
Mar  9 19:48:24 defiant kernel: [77187.055307] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 4632c000
Mar  9 19:48:24 defiant kernel: [77187.055336] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 4632e000
Mar  9 19:48:24 defiant kernel: [77187.055366] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 46330000
Mar  9 19:48:24 defiant kernel: [77187.055395] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 46332000
Mar  9 19:48:24 defiant kernel: [77187.055423] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 46334000
Mar  9 19:48:24 defiant kernel: [77187.055452] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 46336000
Mar  9 19:48:24 defiant kernel: [77187.055480] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 46338000
Mar  9 19:48:24 defiant kernel: [77187.055509] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 4633a000
Mar  9 19:48:24 defiant kernel: [77187.055538] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 4633c000
Mar  9 19:48:24 defiant kernel: [77187.055567] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 4633e000
Mar  9 19:48:24 defiant kernel: [77187.055597] MCE: Killing uffd-unit-tests:1321817 due to hardware memory corruption fault at 46340000
At this point, it can be problem with my box's memory chips, or something with HUGETLB.
However, since the "classic" allocations were successful, the problem might be in huge pages, or
if I understood well, in deliberate poisoning of pages?
Please also find strace of the run.
Best regards,
Mirsad Todorovac