When running tests on a CI system (e.g. LAVA) it is useful to output test results in TAP format so that the CI can parse the fine-grained results to show regressions. Many of the mm selftest binaries already output using the TAP format. And the kselftests runner (run_kselftest.sh) also uses the format. CI systems such as LAVA can already handle nested TAP reports. However, with the mm selftests we have 3 levels of nesting (run_kselftest.sh -> run_vmtests.sh -> individual test binaries) and the middle level did not previously support TAP, which breaks the parser.
Let's fix that by teaching run_vmtests.sh to output using the TAP format. Ideally this would be opt-in via a command line argument to avoid the possibility of breaking anyone's existing scripts that might scrape the output. However, it is not possible to pass arguments to tests invoked via run_kselftest.sh. So I've implemented an opt-out option (-n), which will revert to the existing output format.
Future changes to this file should be aware of 2 new conventions:
- output that is part of the TAP reporting is piped through tap_output - general output is piped through tap_prefix
Signed-off-by: Ryan Roberts ryan.roberts@arm.com --- tools/testing/selftests/mm/run_vmtests.sh | 51 +++++++++++++++++------ 1 file changed, 39 insertions(+), 12 deletions(-)
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh index 87f513f5cf91..246d53a5d7f2 100755 --- a/tools/testing/selftests/mm/run_vmtests.sh +++ b/tools/testing/selftests/mm/run_vmtests.sh @@ -5,6 +5,7 @@ # Kselftest framework requirement - SKIP code is 4. ksft_skip=4
+count_total=0 count_pass=0 count_fail=0 count_skip=0 @@ -17,6 +18,7 @@ usage: ${BASH_SOURCE[0]:-$0} [ options ] -a: run all tests, including extra ones -t: specify specific categories to tests to run -h: display this message + -n: disable TAP output
The default behavior is to run required tests only. If -a is specified, will run all tests. @@ -77,12 +79,14 @@ EOF }
RUN_ALL=false +TAP_PREFIX="# "
-while getopts "aht:" OPT; do +while getopts "aht:n" OPT; do case ${OPT} in "a") RUN_ALL=true ;; "h") usage ;; "t") VM_SELFTEST_ITEMS=${OPTARG} ;; + "n") TAP_PREFIX= ;; esac done shift $((OPTIND -1)) @@ -184,30 +188,52 @@ fi VADDR64=0 echo "$ARCH64STR" | grep "$ARCH" &>/dev/null && VADDR64=1
+tap_prefix() { + sed -e "s/^/${TAP_PREFIX}/" +} + +tap_output() { + if [[ ! -z "$TAP_PREFIX" ]]; then + read str + echo $str + fi +} + +pretty_name() { + echo "$*" | sed -e 's/^(bash )?.///' +} + # Usage: run_test [test binary] [arbitrary test arguments...] run_test() { if test_selected ${CATEGORY}; then + local test=$(pretty_name "$*") local title="running $*" local sep=$(echo -n "$title" | tr "[:graph:][:space:]" -) - printf "%s\n%s\n%s\n" "$sep" "$title" "$sep" + printf "%s\n%s\n%s\n" "$sep" "$title" "$sep" | tap_prefix
- "$@" - local ret=$? + ("$@" 2>&1) | tap_prefix + local ret=${PIPESTATUS[0]} + count_total=$(( count_total + 1 )) if [ $ret -eq 0 ]; then count_pass=$(( count_pass + 1 )) - echo "[PASS]" + echo "[PASS]" | tap_prefix + echo "ok ${count_total} ${test}" | tap_output elif [ $ret -eq $ksft_skip ]; then count_skip=$(( count_skip + 1 )) - echo "[SKIP]" + echo "[SKIP]" | tap_prefix + echo "ok ${count_total} ${test} # SKIP" | tap_output exitcode=$ksft_skip else count_fail=$(( count_fail + 1 )) - echo "[FAIL]" + echo "[FAIL]" | tap_prefix + echo "not ok ${count_total} ${test} # exit=$ret" | tap_output exitcode=1 fi fi # test_selected }
+echo "TAP version 13" | tap_output + CATEGORY="hugetlb" run_test ./hugepage-mmap
shmmax=$(cat /proc/sys/kernel/shmmax) @@ -231,9 +257,9 @@ CATEGORY="hugetlb" run_test ./hugetlb_fault_after_madv echo "$nr_hugepages_tmp" > /proc/sys/vm/nr_hugepages
if test_selected "hugetlb"; then - echo "NOTE: These hugetlb tests provide minimal coverage. Use" - echo " https://github.com/libhugetlbfs/libhugetlbfs.git for" - echo " hugetlb regression testing." + echo "NOTE: These hugetlb tests provide minimal coverage. Use" | tap_prefix + echo " https://github.com/libhugetlbfs/libhugetlbfs.git for" | tap_prefix + echo " hugetlb regression testing." | tap_prefix fi
CATEGORY="mmap" run_test ./map_fixed_noreplace @@ -312,7 +338,7 @@ CATEGORY="hmm" run_test bash ./test_hmm.sh smoke # MADV_POPULATE_READ and MADV_POPULATE_WRITE tests CATEGORY="madv_populate" run_test ./madv_populate
-echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope +(echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope 2>&1) | tap_prefix CATEGORY="memfd_secret" run_test ./memfd_secret
# KSM KSM_MERGE_TIME_HUGE_PAGES test with size of 100 @@ -369,6 +395,7 @@ CATEGORY="mkdirty" run_test ./mkdirty
CATEGORY="mdwe" run_test ./mdwe_test
-echo "SUMMARY: PASS=${count_pass} SKIP=${count_skip} FAIL=${count_fail}" +echo "SUMMARY: PASS=${count_pass} SKIP=${count_skip} FAIL=${count_fail}" | tap_prefix +echo "1..${count_total}" | tap_output
exit $exitcode -- 2.25.1
On Thu, Dec 14, 2023 at 04:24:34PM +0000, Ryan Roberts wrote:
When running tests on a CI system (e.g. LAVA) it is useful to output test results in TAP format so that the CI can parse the fine-grained results to show regressions. Many of the mm selftest binaries already output using the TAP format. And the kselftests runner (run_kselftest.sh) also uses the format. CI systems such as LAVA can already handle nested TAP reports. However, with the mm selftests we have 3 levels of nesting (run_kselftest.sh -> run_vmtests.sh -> individual test binaries) and the middle level did not previously support TAP, which breaks the parser.
Reviewed-by: Mark Brown broonie@kernel.org
Let's fix that by teaching run_vmtests.sh to output using the TAP format. Ideally this would be opt-in via a command line argument to avoid the possibility of breaking anyone's existing scripts that might scrape the output. However, it is not possible to pass arguments to tests invoked via run_kselftest.sh. So I've implemented an opt-out option (-n), which will revert to the existing output format.
What I did for ftrace which had a similar situation was make a wrapper script which invokes the test runner, make the test runner a TEST_PROGS_EXTENDED so it's not run by the kselftest infrastructure automatically and make the wrapper a normal TEST_PROGS. Neither option is especially lovely.
On 15/12/2023 13:54, Mark Brown wrote:
On Thu, Dec 14, 2023 at 04:24:34PM +0000, Ryan Roberts wrote:
When running tests on a CI system (e.g. LAVA) it is useful to output test results in TAP format so that the CI can parse the fine-grained results to show regressions. Many of the mm selftest binaries already output using the TAP format. And the kselftests runner (run_kselftest.sh) also uses the format. CI systems such as LAVA can already handle nested TAP reports. However, with the mm selftests we have 3 levels of nesting (run_kselftest.sh -> run_vmtests.sh -> individual test binaries) and the middle level did not previously support TAP, which breaks the parser.
Reviewed-by: Mark Brown broonie@kernel.org
Thanks!
Let's fix that by teaching run_vmtests.sh to output using the TAP format. Ideally this would be opt-in via a command line argument to avoid the possibility of breaking anyone's existing scripts that might scrape the output. However, it is not possible to pass arguments to tests invoked via run_kselftest.sh. So I've implemented an opt-out option (-n), which will revert to the existing output format.
What I did for ftrace which had a similar situation was make a wrapper script which invokes the test runner, make the test runner a TEST_PROGS_EXTENDED so it's not run by the kselftest infrastructure automatically and make the wrapper a normal TEST_PROGS. Neither option is especially lovely.
Yeah that's a good idea... I'll wait and see if anyone shouts that this has broken something. If nothing is broken, I think it is better to just make TAP the default rather than adding yet another wrapper.
On Fri, Dec 15, 2023 at 01:58:45PM +0000, Ryan Roberts wrote:
On 15/12/2023 13:54, Mark Brown wrote:
What I did for ftrace which had a similar situation was make a wrapper script which invokes the test runner, make the test runner a TEST_PROGS_EXTENDED so it's not run by the kselftest infrastructure automatically and make the wrapper a normal TEST_PROGS. Neither option is especially lovely.
Yeah that's a good idea... I'll wait and see if anyone shouts that this has broken something. If nothing is broken, I think it is better to just make TAP the default rather than adding yet another wrapper.
I think it depends a bit how ergonomic the non-TAP output is for interactive use - TAP isn't amazing for humans so if there's something that's nicer it probably makes sense to keep that as the default. For these tests I'm not sure it's particularly an issue.
On 15/12/2023 14:08, Mark Brown wrote:
On Fri, Dec 15, 2023 at 01:58:45PM +0000, Ryan Roberts wrote:
On 15/12/2023 13:54, Mark Brown wrote:
What I did for ftrace which had a similar situation was make a wrapper script which invokes the test runner, make the test runner a TEST_PROGS_EXTENDED so it's not run by the kselftest infrastructure automatically and make the wrapper a normal TEST_PROGS. Neither option is especially lovely.
Yeah that's a good idea... I'll wait and see if anyone shouts that this has broken something. If nothing is broken, I think it is better to just make TAP the default rather than adding yet another wrapper.
I think it depends a bit how ergonomic the non-TAP output is for interactive use - TAP isn't amazing for humans so if there's something that's nicer it probably makes sense to keep that as the default. For these tests I'm not sure it's particularly an issue.
I've kept all the existing "pretty" output and results summary as is, it just gets a hash in front of it when TAP is enabled.
so this:
----------------------- running ./hugepage-mmap ----------------------- Returned address is 0xffff89e00000 First hex is 0 First hex is 3020100 [PASS] SUMMARY: PASS=1 SKIP=0 FAIL=0
becomes this:
TAP version 13 # ----------------------- # running ./hugepage-mmap # ----------------------- # Returned address is 0xffff89e00000 # First hex is 0 # First hex is 3020100 # [PASS] ok 1 hugepage-mmap # SUMMARY: PASS=1 SKIP=0 FAIL=0 1..1
If you think the latter is ofensive, then I can do the wrapping as you suggest.
On Fri, Dec 15, 2023 at 02:28:32PM +0000, Ryan Roberts wrote:
On 15/12/2023 14:08, Mark Brown wrote:
# ----------------------- # running ./hugepage-mmap # ----------------------- # Returned address is 0xffff89e00000 # First hex is 0 # First hex is 3020100 # [PASS] ok 1 hugepage-mmap # SUMMARY: PASS=1 SKIP=0 FAIL=0 1..1
If you think the latter is ofensive, then I can do the wrapping as you suggest.
I think it's fine here - it was more of an issue for ftrace.
On 12/15/23 06:28, Ryan Roberts wrote: ...
I've kept all the existing "pretty" output and results summary as is, it just gets a hash in front of it when TAP is enabled.
so this:
running ./hugepage-mmap
Returned address is 0xffff89e00000 First hex is 0 First hex is 3020100 [PASS] SUMMARY: PASS=1 SKIP=0 FAIL=0
becomes this:
TAP version 13 # ----------------------- # running ./hugepage-mmap # ----------------------- # Returned address is 0xffff89e00000 # First hex is 0 # First hex is 3020100 # [PASS] ok 1 hugepage-mmap # SUMMARY: PASS=1 SKIP=0 FAIL=0 1..1
If you think the latter is ofensive, then I can do the wrapping as you suggest.
I applied this and ran the tests, all while carefully reminding myself to "think like a human". :) And from that perspective, to me, the output is effectively the same: the leading '#' characters do not really change anything, from a readability point of view.
So IMHO you're on perfectly solid ground, if you just switch over directly to this format.
Tested-by: John Hubbard jhubbard@nvidia.com
thanks,
On 12/15/23 18:25, John Hubbard wrote:
On 12/15/23 06:28, Ryan Roberts wrote: ...
I've kept all the existing "pretty" output and results summary as is, it just gets a hash in front of it when TAP is enabled.
so this:
running ./hugepage-mmap
Returned address is 0xffff89e00000 First hex is 0 First hex is 3020100 [PASS] SUMMARY: PASS=1 SKIP=0 FAIL=0
becomes this:
TAP version 13 # ----------------------- # running ./hugepage-mmap # ----------------------- # Returned address is 0xffff89e00000 # First hex is 0 # First hex is 3020100 # [PASS] ok 1 hugepage-mmap # SUMMARY: PASS=1 SKIP=0 FAIL=0 1..1
If you think the latter is ofensive, then I can do the wrapping as you suggest.
I applied this and ran the tests, all while carefully reminding myself to "think like a human". :) And from that perspective, to me, the output is effectively the same: the leading '#' characters do not really change anything, from a readability point of view.
So IMHO you're on perfectly solid ground, if you just switch over directly to this format.
Tested-by: John Hubbard jhubbard@nvidia.com
I should also point out that some of the subtests already attempt a TAP output. So now we end up with TAP-within-TAP output for those programs.
For example: # ----------------------- # running ./madv_populate # ----------------------- # TAP version 13 # 1..21 # # [RUN] test_prot_read # ok 1 MADV_POPULATE_READ with PROT_READ # ok 2 MADV_POPULATE_WRITE with PROT_READ # # [RUN] test_prot_write # ok 3 MADV_POPULATE_READ with PROT_WRITE ...etc...
Note the double level of leading '#' characters.
Again, this is still readable enough for humans. But it should probably be removed in subsequent patches to the subtests.
thanks,
On 16/12/2023 02:40, John Hubbard wrote:
On 12/15/23 18:25, John Hubbard wrote:
On 12/15/23 06:28, Ryan Roberts wrote: ...
I've kept all the existing "pretty" output and results summary as is, it just gets a hash in front of it when TAP is enabled.
so this:
running ./hugepage-mmap
Returned address is 0xffff89e00000 First hex is 0 First hex is 3020100 [PASS] SUMMARY: PASS=1 SKIP=0 FAIL=0
becomes this:
TAP version 13 # ----------------------- # running ./hugepage-mmap # ----------------------- # Returned address is 0xffff89e00000 # First hex is 0 # First hex is 3020100 # [PASS] ok 1 hugepage-mmap # SUMMARY: PASS=1 SKIP=0 FAIL=0 1..1
If you think the latter is ofensive, then I can do the wrapping as you suggest.
I applied this and ran the tests, all while carefully reminding myself to "think like a human". :) And from that perspective, to me, the output is effectively the same: the leading '#' characters do not really change anything, from a readability point of view.
So IMHO you're on perfectly solid ground, if you just switch over directly to this format.
Great thanks for taking a look!
Tested-by: John Hubbard jhubbard@nvidia.com
I should also point out that some of the subtests already attempt a TAP output. So now we end up with TAP-within-TAP output for those programs.
It's actually TAP-in-TAP-in-TAP if you're running from run_kselftest.sh :)
For example: # ----------------------- # running ./madv_populate # ----------------------- # TAP version 13 # 1..21 # # [RUN] test_prot_read # ok 1 MADV_POPULATE_READ with PROT_READ # ok 2 MADV_POPULATE_WRITE with PROT_READ # # [RUN] test_prot_write # ok 3 MADV_POPULATE_READ with PROT_WRITE ...etc...
Note the double level of leading '#' characters.
Again, this is still readable enough for humans. But it should probably be removed in subsequent patches to the subtests.
I personally don't agree with this. It would be difficult to flatten to a single TAP instance because the top level doesn't have a clue how many test cases the child is running. Trying to do this will make things more fragile and less modular. LAVA can certainly deal with nested test cases and correctly parses everything to test case names that contain the test name at each level of nesting. The thing I was trying to solve with this patch was that previously the top level (run_kselftest.sh) and the bottom level (individual mm test binaries) were using TAP, but the middle level (run_vmtests.sh) wasn't, and this was confusing the LAVA parser.
thanks,
On 12/18/23 03:32, Ryan Roberts wrote: ...
I should also point out that some of the subtests already attempt a TAP output. So now we end up with TAP-within-TAP output for those programs.
It's actually TAP-in-TAP-in-TAP if you're running from run_kselftest.sh :)
For example: # ----------------------- # running ./madv_populate # ----------------------- # TAP version 13 # 1..21 # # [RUN] test_prot_read # ok 1 MADV_POPULATE_READ with PROT_READ # ok 2 MADV_POPULATE_WRITE with PROT_READ # # [RUN] test_prot_write # ok 3 MADV_POPULATE_READ with PROT_WRITE ...etc...
Note the double level of leading '#' characters.
Again, this is still readable enough for humans. But it should probably be removed in subsequent patches to the subtests.
I personally don't agree with this. It would be difficult to flatten to a single TAP instance because the top level doesn't have a clue how many test cases the
That's not quite what I had in mind...
child is running. Trying to do this will make things more fragile and less modular. LAVA can certainly deal with nested test cases and correctly parses everything to test case names that contain the test name at each level of nesting. The thing I was trying to solve with this patch was that previously the top level (run_kselftest.sh) and the bottom level (individual mm test binaries) were using TAP, but the middle level (run_vmtests.sh) wasn't, and this was confusing the LAVA parser.
I was thinking more along these lines:
a) For the individual programs (binaries), there is actually neither need nor desire to create TAP output at that level, because frameworks like LAVA only care about running a lot of tests and parsing the output.
b) Therefore, just stop specifying TAP output at the leaf level, and let run_vmtests.sh and run_kselftest.sh do it.
Looking at madv_populate.c, I see that it scatters calls to ksft_*() around. And I was thinking that this is all just redundant, isn't it?
thanks,
On 12/18/23 16:51, John Hubbard wrote:
On 12/18/23 03:32, Ryan Roberts wrote: ...
I should also point out that some of the subtests already attempt a TAP output. So now we end up with TAP-within-TAP output for those programs.
It's actually TAP-in-TAP-in-TAP if you're running from run_kselftest.sh :)
For example: # ----------------------- # running ./madv_populate # ----------------------- # TAP version 13 # 1..21 # # [RUN] test_prot_read # ok 1 MADV_POPULATE_READ with PROT_READ # ok 2 MADV_POPULATE_WRITE with PROT_READ # # [RUN] test_prot_write # ok 3 MADV_POPULATE_READ with PROT_WRITE ...etc...
Note the double level of leading '#' characters.
Again, this is still readable enough for humans. But it should probably be removed in subsequent patches to the subtests.
I personally don't agree with this. It would be difficult to flatten to a single TAP instance because the top level doesn't have a clue how many test cases the
That's not quite what I had in mind...
child is running. Trying to do this will make things more fragile and less modular. LAVA can certainly deal with nested test cases and correctly parses everything to test case names that contain the test name at each level of nesting. The thing I was trying to solve with this patch was that previously the top level (run_kselftest.sh) and the bottom level (individual mm test binaries) were using TAP, but the middle level (run_vmtests.sh) wasn't, and this was confusing the LAVA parser.
I was thinking more along these lines:
a) For the individual programs (binaries), there is actually neither need nor desire to create TAP output at that level, because frameworks like LAVA only care about running a lot of tests and parsing the output.
b) Therefore, just stop specifying TAP output at the leaf level, and let run_vmtests.sh and run_kselftest.sh do it.
Looking at madv_populate.c, I see that it scatters calls to ksft_*() around. And I was thinking that this is all just redundant, isn't it?
Although I suppose that the counter argument is that the subtests in madv_populate.c really *do* want to be specifically printed in TAP format.
arggh, I guess this is just not worth fooling around with after all.
thanks,
On 19/12/2023 00:55, John Hubbard wrote:
On 12/18/23 16:51, John Hubbard wrote:
On 12/18/23 03:32, Ryan Roberts wrote: ...
I should also point out that some of the subtests already attempt a TAP output. So now we end up with TAP-within-TAP output for those programs.
It's actually TAP-in-TAP-in-TAP if you're running from run_kselftest.sh :)
For example: # ----------------------- # running ./madv_populate # ----------------------- # TAP version 13 # 1..21 # # [RUN] test_prot_read # ok 1 MADV_POPULATE_READ with PROT_READ # ok 2 MADV_POPULATE_WRITE with PROT_READ # # [RUN] test_prot_write # ok 3 MADV_POPULATE_READ with PROT_WRITE ...etc...
Note the double level of leading '#' characters.
Again, this is still readable enough for humans. But it should probably be removed in subsequent patches to the subtests.
I personally don't agree with this. It would be difficult to flatten to a single TAP instance because the top level doesn't have a clue how many test cases the
That's not quite what I had in mind...
child is running. Trying to do this will make things more fragile and less modular. LAVA can certainly deal with nested test cases and correctly parses everything to test case names that contain the test name at each level of nesting. The thing I was trying to solve with this patch was that previously the top level (run_kselftest.sh) and the bottom level (individual mm test binaries) were using TAP, but the middle level (run_vmtests.sh) wasn't, and this was confusing the LAVA parser.
I was thinking more along these lines:
a) For the individual programs (binaries), there is actually neither need nor desire to create TAP output at that level, because frameworks like LAVA only care about running a lot of tests and parsing the output.
b) Therefore, just stop specifying TAP output at the leaf level, and let run_vmtests.sh and run_kselftest.sh do it.
Looking at madv_populate.c, I see that it scatters calls to ksft_*() around. And I was thinking that this is all just redundant, isn't it?
Although I suppose that the counter argument is that the subtests in madv_populate.c really *do* want to be specifically printed in TAP format.
arggh, I guess this is just not worth fooling around with after all.
Yes; I wouldn't want to lose the fine granularity we have currently. For example cow.c has ~900 test cases now that I've multiplied everything up for mTHP. 16 of those are known to fail (hugetlb issue) and 1 is skipped. I wouldn't want to reduce that down to a single cow test case that always fails; that's not helpful to understand if I've regressed something.
But sounds like we are both on the same page now.
thanks,
linux-kselftest-mirror@lists.linaro.org