On 2024/6/26 13:08, Jiaqi Yan wrote:
Add regression and new tests for the case where a hugepage has correctable memory errors, covering how userspace chooses to deal with it:
- if enable_soft_offline=1, mapped hugepage is soft offlined
- if enable_soft_offline=0, mapped hugepage is intact
Free hugepages case is not explicitly covered by the tests.
Hugepage having corrected memory errors is emulated with MADV_SOFT_OFFLINE.
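For readers following along, the emulation described above comes down to one madvise() call on a mapped hugepage, with the expected errno depending on the sysctl. A minimal sketch, assuming two helper names (expected_soft_offline_errno, soft_offline_range) that are illustrative and not taken from the patch; the 0 / EOPNOTSUPP mapping mirrors expect_errno in the quoted code:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallback for older libc headers; 101 is the asm-generic value. */
#ifndef MADV_SOFT_OFFLINE
#define MADV_SOFT_OFFLINE 101
#endif

/* Hypothetical helper: errno the test expects for a given sysctl setting. */
static int expected_soft_offline_errno(int enable_soft_offline)
{
	return enable_soft_offline ? 0 : EOPNOTSUPP;
}

/*
 * Hypothetical helper: ask the kernel to soft offline [addr, addr + len).
 * Returns 0 on success, otherwise the errno madvise() failed with.
 * Needs CAP_SYS_ADMIN and a kernel built with CONFIG_MEMORY_FAILURE.
 */
static int soft_offline_range(void *addr, size_t len)
{
	if (madvise(addr, len, MADV_SOFT_OFFLINE) == 0)
		return 0;
	return errno;
}
```

A caller would compare soft_offline_range() on the hugetlb mapping against expected_soft_offline_errno(enable_soft_offline).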
Thanks for the update.
Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
 tools/testing/selftests/mm/.gitignore         |   1 +
 tools/testing/selftests/mm/Makefile           |   1 +
 .../selftests/mm/hugetlb-soft-offline.c       | 228 ++++++++++++++++++
 tools/testing/selftests/mm/run_vmtests.sh     |   4 +
 4 files changed, 234 insertions(+)
 create mode 100644 tools/testing/selftests/mm/hugetlb-soft-offline.c
...
+static void test_soft_offline_common(int enable_soft_offline)
+{
+	int fd;
+	int expect_errno = enable_soft_offline ? 0 : EOPNOTSUPP;
+	struct statfs file_stat;
+	unsigned long hugepagesize_kb = 0;
+	unsigned long nr_hugepages_before = 0;
+	unsigned long nr_hugepages_after = 0;
+	int ret;
+	ksft_print_msg("Test soft-offline when enabled_soft_offline=%d\n",
+		       enable_soft_offline);
+	fd = create_hugetlbfs_file(&file_stat);
+	if (fd < 0)
+		ksft_exit_fail_msg("Failed to create hugetlbfs file\n");
+	hugepagesize_kb = file_stat.f_bsize / 1024;
+	ksft_print_msg("Hugepagesize is %ldkB\n", hugepagesize_kb);
+	if (set_enable_soft_offline(enable_soft_offline)) {
Nit: should this be written as if (set_enable_soft_offline(enable_soft_offline) != 0), to stay consistent with the code below?
+		close(fd);
+		ksft_exit_fail_msg("Failed to set enable_soft_offline\n");
+	}
+	if (read_nr_hugepages(hugepagesize_kb, &nr_hugepages_before) != 0) {
+		close(fd);
+		ksft_exit_fail_msg("Failed to read nr_hugepages\n");
+	}
+	ksft_print_msg("Before MADV_SOFT_OFFLINE nr_hugepages=%ld\n",
+		       nr_hugepages_before);
+	ret = do_soft_offline(fd, 2 * file_stat.f_bsize, expect_errno);
+	if (read_nr_hugepages(hugepagesize_kb, &nr_hugepages_after) != 0) {
+		close(fd);
+		ksft_exit_fail_msg("Failed to read nr_hugepages\n");
+	}
...
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index 3157204b9047..781117fac1ba 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -331,6 +331,10 @@ CATEGORY="hugetlb" run_test ./thuge-gen
 CATEGORY="hugetlb" run_test ./charge_reserved_hugetlb.sh -cgroup-v2
 CATEGORY="hugetlb" run_test ./hugetlb_reparenting_test.sh -cgroup-v2
 if $RUN_DESTRUCTIVE; then
+nr_hugepages_tmp=$(cat /proc/sys/vm/nr_hugepages)
+echo 8 > /proc/sys/vm/nr_hugepages
+CATEGORY="hugetlb" run_test ./hugetlb-soft-offline
+echo "$nr_hugepages_tmp" > /proc/sys/vm/nr_hugepages
Should we save and restore the value of /proc/sys/vm/enable_soft_offline too?
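If the save and restore were done on the C side rather than in run_vmtests.sh, it could look roughly like the sketch below. The helper names (read_sysctl_int, write_sysctl_int) are hypothetical and not from the patch; the path is passed in as a parameter so the helpers can be tried against any file, with /proc/sys/vm/enable_soft_offline being the intended target:

```c
#include <stdio.h>

/* Hypothetical helper: read one integer from a sysctl-style file. */
static int read_sysctl_int(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%d", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Hypothetical helper: write one integer to a sysctl-style file. */
static int write_sysctl_int(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = (fprintf(f, "%d\n", value) > 0);
	/* fclose() flushes the buffer, so a rejected write can surface here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

The test would call read_sysctl_int() once before changing the setting, and write_sysctl_int() with the saved value on every exit path, so the system is left as it was found.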
With the above fixed, this patch looks good to me.

Acked-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks.