Jakub Kicinski wrote:
On Thu, 05 Sep 2024 21:31:55 -0400 Willem de Bruijn wrote:
Packetdrill scripts are sensitive to timing. On the dbg build, I just observed a flaky test.
The tool takes --tolerance_usecs and --tolerance_percent arguments. I may have to update ksft_runner.sh to increase one if a dbg build is detected.
Let me know if I should respin now. Else I can also follow up.
Need to figure out how best to detect debug builds. It is not in uname, and there is no /proc/config.gz. Existence of /sys/kernel/debug/kmemleak is a proxy for current kernel/configs/debug.config, if a bit crude.
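For illustration only, the kmemleak proxy could be sketched as below. The helper name is_debug_build and its optional path argument are hypothetical and not part of ksft_runner.sh. The check also assumes debugfs is mounted and readable, which usually means running as root.

```shell
#!/bin/bash
# Crude debug-build detection, assuming the kmemleak debugfs file is only
# present when the kernel was built with kernel/configs/debug.config.
# is_debug_build is a hypothetical helper; the optional argument overrides
# the marker path so the check can be exercised without debugfs.
is_debug_build() {
	local marker="${1:-/sys/kernel/debug/kmemleak}"
	[[ -e "${marker}" ]]
}

if is_debug_build; then
	echo "kmemleak present: treating this as a debug build"
else
	echo "kmemleak absent or debugfs unreadable: treating this as a non-debug build"
fi
```

A false negative is possible whenever debugfs is not mounted, which is part of why this proxy is crude.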
Should have kept on reading. Will use KSFT_MACHINE_SLOW:
+declare -a optargs
+if [[ "${KSFT_MACHINE_SLOW}" == "yes" ]]; then
+	optargs+=('--tolerance_usecs=10000')
+fi
+
 ktap_print_header
 ktap_set_plan 2
 
-packetdrill ${ipv4_args[@]} $(basename $script) > /dev/null \
+packetdrill ${ipv4_args[@]} ${optargs[@]} $(basename $script) > /dev/null \
 	&& ktap_test_pass "ipv4" || ktap_test_fail "ipv4"
-packetdrill ${ipv6_args[@]} $(basename $script) > /dev/null \
+packetdrill ${ipv6_args[@]} ${optargs[@]} $(basename $script) > /dev/null \
 	&& ktap_test_pass "ipv6" || ktap_test_fail "ipv6"
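As a standalone sketch of the array idiom used in the patch above: an empty bash array expands to zero words, so the packetdrill command line is unchanged unless KSFT_MACHINE_SLOW is "yes". The function packetdrill_args and the file name script.pkt are hypothetical stand-ins, and the slow flag is passed as a parameter here only to keep the example self-contained.

```shell
#!/bin/bash
# packetdrill_args is a hypothetical helper that prints, one per line, the
# arguments that would be handed to packetdrill. "$1" plays the role of
# ${KSFT_MACHINE_SLOW} in the patch.
packetdrill_args() {
	local slow="${1:-}"
	local -a optargs=()

	if [[ "${slow}" == "yes" ]]; then
		optargs+=('--tolerance_usecs=10000')
	fi

	# An empty array contributes no words at all, not an empty string.
	printf '%s\n' "${optargs[@]}" "script.pkt"
}

echo "slow machine:"
packetdrill_args yes
echo "normal machine:"
packetdrill_args no
```

Building the optional flags in an array avoids an if/else around each packetdrill invocation and leaves room for further options later.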
Another config affecting timing may be CONFIG_HZ. I did not observe issues with these specific scripts with CONFIG_HZ=250. It may have to be tackled eventually. Or CONFIG_HZ=1000 hardcoded in config.
I will just add the CONFIG for now.
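For reference on why CONFIG_HZ can matter: jiffies-based timers have a granularity of one tick, which is 1/HZ seconds. A small sketch of that arithmetic follows. The helper tick_usecs is hypothetical, and whether a given script is actually affected depends on which timers it exercises.

```shell
#!/bin/bash
# tick_usecs is a hypothetical helper: duration of one timer tick in
# microseconds for a given CONFIG_HZ value (1 second / HZ).
tick_usecs() {
	local hz="$1"
	echo $(( 1000000 / hz ))
}

for hz in 100 250 1000; do
	echo "CONFIG_HZ=${hz}: $(tick_usecs "${hz}") usec per tick"
done
```

With CONFIG_HZ=250 a tick is 4000 usec and with CONFIG_HZ=1000 it is 1000 usec, so a lower HZ gives coarser timer granularity relative to any fixed --tolerance_usecs.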
Not sure I follow the HZ idea, lowering the frequency helps stability?
We can see how well v2 does overnight, so far it's green: https://netdev.bots.linux.dev/contest.html?executor=vmksft-packetdrill-dbg (the net-next-2024-09-05--* branches had v1).
Great!
I saw one failure in manual runs and was unable to reproduce it with a few more iterations. Let's see how it goes.
We do adjust these internally, to the same value for KASAN.
FWIW status page lists two sets of packetdrill runners, probably because I 'reused' an old team-driver runner instead of creating a new one. It should straighten itself out by tomorrow.