Hi,
Using the same config for 6.5-rc2 on Ubuntu 22.04 LTS and 22.10, the execution stop at the exact same line on both boxes (os I reckon it is more than an accident):
# selftests: net/forwarding: sch_ets.sh # TEST: ping vlan 10 [ OK ] # TEST: ping vlan 11 [ OK ] # TEST: ping vlan 12 [ OK ] # Running in priomap mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 # TEST: bands 1:2 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.00 Measured ratio .99 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 1.00 Measured ratio 1.00 # Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.42 Measured ratio 1.42 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.60 Measured ratio 1.59 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 2 quanta 5000 2500, streams 0 1 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Running in classifier mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1
I tried to run 'set -x' enabled version standalone, but that one finished correctly (?).
It could be something previous scripts left, but right now I don't have a clue. I can attempt to rerun all tests with sch_ets.sh bash 'set -x' enabled later today.
Best regards, Mirsad Todorovac
Mirsad Todorovac mirsad.todorovac@alu.unizg.hr writes:
Using the same config for 6.5-rc2 on Ubuntu 22.04 LTS and 22.10, the execution stop at the exact same line on both boxes (os I reckon it is more than an accident):
# selftests: net/forwarding: sch_ets.sh # TEST: ping vlan 10 [ OK ] # TEST: ping vlan 11 [ OK ] # TEST: ping vlan 12 [ OK ] # Running in priomap mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 # TEST: bands 1:2 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.00 Measured ratio .99 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 1.00 Measured ratio 1.00 # Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.42 Measured ratio 1.42 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.60 Measured ratio 1.59 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 2 quanta 5000 2500, streams 0 1 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Running in classifier mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1
I tried to run 'set -x' enabled version standalone, but that one finished correctly (?).
It could be something previous scripts left, but right now I don't have a clue. I can attempt to rerun all tests with sch_ets.sh bash 'set -x' enabled later today.
If you run it standalone without set -x, does it finish as well? That would imply that the reproducer needs to include the previous tests as well.
It looks like the test is stuck in ets_test_mixed in classifier_mode. A way to run just this test would be:
TESTS="classifier_mode ets_test_mixed" ./sch_ets.sh
Looking at the code, the only place that I can see that waits on anything is the "{ kill %% && wait %%; } 2>/dev/null" line in stop_traffic() (in lib.sh). Maybe something like this would let us see if that's the case:
modified tools/testing/selftests/net/forwarding/lib.sh @@ -1468,8 +1470,10 @@ start_tcp_traffic()
stop_traffic() { + echo killing MZ # Suppress noise from killing mausezahn. { kill %% && wait %%; } 2>/dev/null + echo killed MZ }
On 7/20/23 11:43, Petr Machata wrote:
Mirsad Todorovac mirsad.todorovac@alu.unizg.hr writes:
Using the same config for 6.5-rc2 on Ubuntu 22.04 LTS and 22.10, the execution stop at the exact same line on both boxes (os I reckon it is more than an accident):
# selftests: net/forwarding: sch_ets.sh # TEST: ping vlan 10 [ OK ] # TEST: ping vlan 11 [ OK ] # TEST: ping vlan 12 [ OK ] # Running in priomap mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 # TEST: bands 1:2 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.00 Measured ratio .99 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 1.00 Measured ratio 1.00 # Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.42 Measured ratio 1.42 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.60 Measured ratio 1.59 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 2 quanta 5000 2500, streams 0 1 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Running in classifier mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1
I tried to run 'set -x' enabled version standalone, but that one finished correctly (?).
It could be something previous scripts left, but right now I don't have a clue. I can attempt to rerun all tests with sch_ets.sh bash 'set -x' enabled later today.
If you run it standalone without set -x, does it finish as well?
Added that. Yes, standlone run finishes correctly, with or without 'set -x':
root@defiant:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/forwarding# ./sch_ets.sh TEST: ping vlan 10 [ OK ] TEST: ping vlan 11 [ OK ] TEST: ping vlan 12 [ OK ] Running in priomap mode Testing ets bands 3 strict 3, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 strict 3, streams 1 2 TEST: band 1 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 2 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 TEST: bands 1:2 [ OK ] INFO: Expected ratio 2.00 Measured ratio 1.99 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.00 Measured ratio 1.00 TEST: bands 0:2 [ OK ] INFO: Expected ratio 1.00 Measured ratio .99 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.42 Measured ratio 1.42 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.60 Measured ratio 1.59 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 2 quanta 5000 2500, streams 0 1 TEST: bands 0:1 [ OK ] INFO: Expected ratio 2.00 Measured ratio 1.99 killing MZ killed MZ killing MZ killed MZ Running in classifier mode Testing ets bands 3 strict 3, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 strict 3, streams 1 2 TEST: band 1 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 2 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 TEST: bands 1:2 [ OK ] INFO: Expected ratio 2.00 Measured ratio 1.99 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.00 Measured ratio .99 TEST: bands 0:2 [ OK ] INFO: Expected ratio 1.00 Measured ratio .99 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.42 Measured ratio 1.42 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.60 Measured ratio 1.60 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 2 quanta 5000 2500, streams 0 1 TEST: bands 0:1 [ OK ] INFO: Expected ratio 2.00 Measured ratio 2.00 killing MZ killed MZ killing MZ killed MZ root@defiant:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/forwarding#
That would imply that the reproducer needs to include the previous tests as well.
This is entirely possible, as timeouts and CTRL+C events do not seem to be caught and the cleanup is not done ...
sch_ets_core.sh: trap cleanup EXIT
Some tests time out even after settings:timeout=240, so IMHO this should be taken into account.
Best regards, Mirsad Todorovac
It looks like the test is stuck in ets_test_mixed in classifier_mode. A way to run just this test would be:
TESTS="classifier_mode ets_test_mixed" ./sch_ets.sh
Looking at the code, the only place that I can see that waits on anything is the "{ kill %% && wait %%; } 2>/dev/null" line in stop_traffic() (in lib.sh). Maybe something like this would let us see if that's the case:
modified tools/testing/selftests/net/forwarding/lib.sh @@ -1468,8 +1470,10 @@ start_tcp_traffic() stop_traffic() {
- echo killing MZ # Suppress noise from killing mausezahn. { kill %% && wait %%; } 2>/dev/null
- echo killed MZ }
On 7/20/23 18:07, Mirsad Todorovac wrote:
On 7/20/23 11:43, Petr Machata wrote:
Mirsad Todorovac mirsad.todorovac@alu.unizg.hr writes:
Using the same config for 6.5-rc2 on Ubuntu 22.04 LTS and 22.10, the execution stop at the exact same line on both boxes (os I reckon it is more than an accident):
# selftests: net/forwarding: sch_ets.sh # TEST: ping vlan 10 [ OK ] # TEST: ping vlan 11 [ OK ] # TEST: ping vlan 12 [ OK ] # Running in priomap mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 # TEST: bands 1:2 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.00 Measured ratio .99 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 1.00 Measured ratio 1.00 # Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.42 Measured ratio 1.42 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.60 Measured ratio 1.59 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 2 quanta 5000 2500, streams 0 1 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Running in classifier mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1
I tried to run 'set -x' enabled version standalone, but that one finished correctly (?).
It could be something previous scripts left, but right now I don't have a clue. I can attempt to rerun all tests with sch_ets.sh bash 'set -x' enabled later today.
If you run it standalone without set -x, does it finish as well?
Added that. Yes, standlone run finishes correctly, with or without 'set -x':
root@defiant:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/forwarding# ./sch_ets.sh TEST: ping vlan 10 [ OK ] TEST: ping vlan 11 [ OK ] TEST: ping vlan 12 [ OK ] Running in priomap mode Testing ets bands 3 strict 3, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 strict 3, streams 1 2 TEST: band 1 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 2 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 TEST: bands 1:2 [ OK ] INFO: Expected ratio 2.00 Measured ratio 1.99 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.00 Measured ratio 1.00 TEST: bands 0:2 [ OK ] INFO: Expected ratio 1.00 Measured ratio .99 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.42 Measured ratio 1.42 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.60 Measured ratio 1.59 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 2 quanta 5000 2500, streams 0 1 TEST: bands 0:1 [ OK ] INFO: Expected ratio 2.00 Measured ratio 1.99 killing MZ killed MZ killing MZ killed MZ Running in classifier mode Testing ets bands 3 strict 3, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 strict 3, streams 1 2 TEST: band 1 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 2 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 TEST: bands 1:2 [ OK ] INFO: Expected ratio 2.00 Measured ratio 1.99 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.00 Measured ratio .99 TEST: bands 0:2 [ OK ] INFO: Expected ratio 1.00 Measured ratio .99 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.42 Measured ratio 1.42 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.60 Measured ratio 1.60 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 2 quanta 5000 2500, streams 0 1 TEST: bands 0:1 [ OK ] INFO: Expected ratio 2.00 Measured ratio 2.00 killing MZ killed MZ killing MZ killed MZ root@defiant:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/forwarding#
That would imply that the reproducer needs to include the previous tests as well.
This is entirely possible, as timeouts and CTRL+C events do not seem to be caught and the cleanup is not done ...
sch_ets_core.sh: trap cleanup EXIT
Some tests time out even after settings:timeout=240, so IMHO this should be taken into account.
Best regards, Mirsad Todorovac
It looks like the test is stuck in ets_test_mixed in classifier_mode. A way to run just this test would be:
TESTS="classifier_mode ets_test_mixed" ./sch_ets.sh
Looking at the code, the only place that I can see that waits on anything is the "{ kill %% && wait %%; } 2>/dev/null" line in stop_traffic() (in lib.sh). Maybe something like this would let us see if that's the case:
modified tools/testing/selftests/net/forwarding/lib.sh @@ -1468,8 +1470,10 @@ start_tcp_traffic() stop_traffic() { + echo killing MZ # Suppress noise from killing mausezahn. { kill %% && wait %%; } 2>/dev/null + echo killed MZ }
FYI, this is the [incomplete] list of the tests that time out even after being assigned long timeout of 240s instead of default.
marvin@defiant:~/linux/kernel/linux_torvalds$ grep TIMEOUT ../kselftest-6.5-rc2-net-forwarding-8.log not ok 13 selftests: net/forwarding: custom_multipath_hash.sh # TIMEOUT 240 seconds not ok 18 selftests: net/forwarding: gre_custom_multipath_hash.sh # TIMEOUT 240 seconds not ok 19 selftests: net/forwarding: gre_inner_v4_multipath.sh # TIMEOUT 240 seconds not ok 21 selftests: net/forwarding: gre_multipath_nh_res.sh # TIMEOUT 240 seconds not ok 22 selftests: net/forwarding: gre_multipath_nh.sh # TIMEOUT 240 seconds not ok 27 selftests: net/forwarding: ip6gre_custom_multipath_hash.sh # TIMEOUT 240 seconds not ok 34 selftests: net/forwarding: ip6gre_inner_v4_multipath.sh # TIMEOUT 240 seconds not ok 58 selftests: net/forwarding: no_forwarding.sh # TIMEOUT 240 seconds not ok 67 selftests: net/forwarding: router_mpath_nh_res.sh # TIMEOUT 240 seconds not ok 68 selftests: net/forwarding: router_mpath_nh.sh # TIMEOUT 240 seconds not ok 70 selftests: net/forwarding: router_multipath.sh # TIMEOUT 240 seconds marvin@defiant:~/linux/kernel/linux_torvalds$
Hope this helps.
Best regards, Mirsad Todorovac
On 7/20/23 18:25, Mirsad Todorovac wrote:
On 7/20/23 18:07, Mirsad Todorovac wrote:
On 7/20/23 11:43, Petr Machata wrote:
Mirsad Todorovac mirsad.todorovac@alu.unizg.hr writes:
Using the same config for 6.5-rc2 on Ubuntu 22.04 LTS and 22.10, the execution stop at the exact same line on both boxes (os I reckon it is more than an accident):
# selftests: net/forwarding: sch_ets.sh # TEST: ping vlan 10 [ OK ] # TEST: ping vlan 11 [ OK ] # TEST: ping vlan 12 [ OK ] # Running in priomap mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 # TEST: bands 1:2 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.00 Measured ratio .99 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 1.00 Measured ratio 1.00 # Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.42 Measured ratio 1.42 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.60 Measured ratio 1.59 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 2 quanta 5000 2500, streams 0 1 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Running in classifier mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1
I tried to run 'set -x' enabled version standalone, but that one finished correctly (?).
It could be something previous scripts left, but right now I don't have a clue. I can attempt to rerun all tests with sch_ets.sh bash 'set -x' enabled later today.
If you run it standalone without set -x, does it finish as well?
Added that. Yes, standlone run finishes correctly, with or without 'set -x':
root@defiant:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/forwarding# ./sch_ets.sh TEST: ping vlan 10 [ OK ] TEST: ping vlan 11 [ OK ] TEST: ping vlan 12 [ OK ] Running in priomap mode Testing ets bands 3 strict 3, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 strict 3, streams 1 2 TEST: band 1 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 2 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 TEST: bands 1:2 [ OK ] INFO: Expected ratio 2.00 Measured ratio 1.99 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.00 Measured ratio 1.00 TEST: bands 0:2 [ OK ] INFO: Expected ratio 1.00 Measured ratio .99 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.42 Measured ratio 1.42 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.60 Measured ratio 1.59 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 2 quanta 5000 2500, streams 0 1 TEST: bands 0:1 [ OK ] INFO: Expected ratio 2.00 Measured ratio 1.99 killing MZ killed MZ killing MZ killed MZ Running in classifier mode Testing ets bands 3 strict 3, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 strict 3, streams 1 2 TEST: band 1 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 2 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 TEST: bands 1:2 [ OK ] INFO: Expected ratio 2.00 Measured ratio 1.99 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.00 Measured ratio .99 TEST: bands 0:2 [ OK ] INFO: Expected ratio 1.00 Measured ratio .99 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.42 Measured ratio 1.42 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.60 Measured ratio 1.60 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 2 quanta 5000 2500, streams 0 1 TEST: bands 0:1 [ OK ] INFO: Expected ratio 2.00 Measured ratio 2.00 killing MZ killed MZ killing MZ killed MZ root@defiant:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/forwarding#
That would imply that the reproducer needs to include the previous tests as well.
This is entirely possible, as timeouts and CTRL+C events do not seem to be caught and the cleanup is not done ...
sch_ets_core.sh: trap cleanup EXIT
Some tests time out even after settings:timeout=240, so IMHO this should be taken into account.
Best regards, Mirsad Todorovac
It looks like the test is stuck in ets_test_mixed in classifier_mode. A way to run just this test would be:
TESTS="classifier_mode ets_test_mixed" ./sch_ets.sh
Looking at the code, the only place that I can see that waits on anything is the "{ kill %% && wait %%; } 2>/dev/null" line in stop_traffic() (in lib.sh). Maybe something like this would let us see if that's the case:
modified tools/testing/selftests/net/forwarding/lib.sh @@ -1468,8 +1470,10 @@ start_tcp_traffic() stop_traffic() { + echo killing MZ # Suppress noise from killing mausezahn. { kill %% && wait %%; } 2>/dev/null + echo killed MZ }
FYI, this is the [incomplete] list of the tests that time out even after being assigned long timeout of 240s instead of default.
marvin@defiant:~/linux/kernel/linux_torvalds$ grep TIMEOUT ../kselftest-6.5-rc2-net-forwarding-8.log not ok 13 selftests: net/forwarding: custom_multipath_hash.sh # TIMEOUT 240 seconds not ok 18 selftests: net/forwarding: gre_custom_multipath_hash.sh # TIMEOUT 240 seconds not ok 19 selftests: net/forwarding: gre_inner_v4_multipath.sh # TIMEOUT 240 seconds not ok 21 selftests: net/forwarding: gre_multipath_nh_res.sh # TIMEOUT 240 seconds not ok 22 selftests: net/forwarding: gre_multipath_nh.sh # TIMEOUT 240 seconds not ok 27 selftests: net/forwarding: ip6gre_custom_multipath_hash.sh # TIMEOUT 240 seconds not ok 34 selftests: net/forwarding: ip6gre_inner_v4_multipath.sh # TIMEOUT 240 seconds not ok 58 selftests: net/forwarding: no_forwarding.sh # TIMEOUT 240 seconds not ok 67 selftests: net/forwarding: router_mpath_nh_res.sh # TIMEOUT 240 seconds not ok 68 selftests: net/forwarding: router_mpath_nh.sh # TIMEOUT 240 seconds not ok 70 selftests: net/forwarding: router_multipath.sh # TIMEOUT 240 seconds marvin@defiant:~/linux/kernel/linux_torvalds$
The test seem to be stuck despite adding SIGTERM cleanup to all tests that timed out:
# selftests: net/forwarding: sch_ets.sh # TEST: ping vlan 10 [ OK ] # TEST: ping vlan 11 [ OK ] # TEST: ping vlan 12 [ OK ] # Running in priomap mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 # TEST: bands 1:2 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 2.00 # Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.00 Measured ratio .99 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 1.00 Measured ratio .99 # Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.42 Measured ratio 1.42 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.60 Measured ratio 1.59 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 2 quanta 5000 2500, streams 0 1 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Running in classifier mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1
What is happening is a stuck mousezahn loop:
root 558266 2732 0 01:17 pts/2 00:00:23 mausezahn veth0.10 -p 8000 -A 192.0.2.1 -B 192.0.2.2 -c 0 -a own -b ba:33:37:81:dc:5 root 558273 2732 0 01:17 pts/2 00:00:23 mausezahn veth0.11 -p 8000 -A 192.0.2.17 -B 192.0.2.18 -c 0 -a own -b ba:33:37:81:dc
$ sudo strace -p 558266 ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device)
I am not sure if this will ever complete, or is it a runaway loop?
Probably it is stuck, for nothing was written to the test log for 71 minutes.
I hope we are coming closer to the cause.
Previously I didn't figure out what hanged.
FWIW, I will post the patches for the scripts that timed out for your review.
Kind regards, Mirsad
On 7/22/23 02:31, Mirsad Todorovac wrote:
On 7/20/23 18:25, Mirsad Todorovac wrote:
On 7/20/23 18:07, Mirsad Todorovac wrote:
On 7/20/23 11:43, Petr Machata wrote:
Mirsad Todorovac mirsad.todorovac@alu.unizg.hr writes:
Using the same config for 6.5-rc2 on Ubuntu 22.04 LTS and 22.10, the execution stop at the exact same line on both boxes (os I reckon it is more than an accident):
# selftests: net/forwarding: sch_ets.sh # TEST: ping vlan 10 [ OK ] # TEST: ping vlan 11 [ OK ] # TEST: ping vlan 12 [ OK ] # Running in priomap mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 # TEST: bands 1:2 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.00 Measured ratio .99 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 1.00 Measured ratio 1.00 # Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.42 Measured ratio 1.42 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.60 Measured ratio 1.59 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 2 quanta 5000 2500, streams 0 1 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Running in classifier mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1
I tried to run 'set -x' enabled version standalone, but that one finished correctly (?).
It could be something previous scripts left, but right now I don't have a clue. I can attempt to rerun all tests with sch_ets.sh bash 'set -x' enabled later today.
If you run it standalone without set -x, does it finish as well?
Added that. Yes, standlone run finishes correctly, with or without 'set -x':
root@defiant:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/forwarding# ./sch_ets.sh TEST: ping vlan 10 [ OK ] TEST: ping vlan 11 [ OK ] TEST: ping vlan 12 [ OK ] Running in priomap mode Testing ets bands 3 strict 3, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 strict 3, streams 1 2 TEST: band 1 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 2 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 TEST: bands 1:2 [ OK ] INFO: Expected ratio 2.00 Measured ratio 1.99 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.00 Measured ratio 1.00 TEST: bands 0:2 [ OK ] INFO: Expected ratio 1.00 Measured ratio .99 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.42 Measured ratio 1.42 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.60 Measured ratio 1.59 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 2 quanta 5000 2500, streams 0 1 TEST: bands 0:1 [ OK ] INFO: Expected ratio 2.00 Measured ratio 1.99 killing MZ killed MZ killing MZ killed MZ Running in classifier mode Testing ets bands 3 strict 3, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 strict 3, streams 1 2 TEST: band 1 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 2 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 TEST: band 0 [ OK ] INFO: Expected ratio >95% Measured ratio 100.00 TEST: band 1 [ OK ] INFO: Expected ratio <5% Measured ratio 0 killing MZ killed MZ killing MZ killed MZ Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 TEST: bands 1:2 [ OK ] INFO: Expected ratio 2.00 Measured ratio 1.99 killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.00 Measured ratio .99 TEST: bands 0:2 [ OK ] INFO: Expected ratio 1.00 Measured ratio .99 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.42 Measured ratio 1.42 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 TEST: bands 0:1 [ OK ] INFO: Expected ratio 1.60 Measured ratio 1.60 TEST: bands 0:2 [ OK ] INFO: Expected ratio 3.33 Measured ratio 3.33 killing MZ killed MZ killing MZ killed MZ killing MZ killed MZ Testing ets bands 2 quanta 5000 2500, streams 0 1 TEST: bands 0:1 [ OK ] INFO: Expected ratio 2.00 Measured ratio 2.00 killing MZ killed MZ killing MZ killed MZ root@defiant:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/forwarding#
That would imply that the reproducer needs to include the previous tests as well.
This is entirely possible, as timeouts and CTRL+C events do not seem to be caught and the cleanup is not done ...
sch_ets_core.sh: trap cleanup EXIT
Some tests time out even after settings:timeout=240, so IMHO this should be taken into account.
Best regards, Mirsad Todorovac
It looks like the test is stuck in ets_test_mixed in classifier_mode. A way to run just this test would be:
TESTS="classifier_mode ets_test_mixed" ./sch_ets.sh
Looking at the code, the only place that I can see that waits on anything is the "{ kill %% && wait %%; } 2>/dev/null" line in stop_traffic() (in lib.sh). Maybe something like this would let us see if that's the case:
modified tools/testing/selftests/net/forwarding/lib.sh @@ -1468,8 +1470,10 @@ start_tcp_traffic() stop_traffic() { + echo killing MZ # Suppress noise from killing mausezahn. { kill %% && wait %%; } 2>/dev/null + echo killed MZ }
FYI, this is the [incomplete] list of the tests that time out even after being assigned long timeout of 240s instead of default.
marvin@defiant:~/linux/kernel/linux_torvalds$ grep TIMEOUT ../kselftest-6.5-rc2-net-forwarding-8.log not ok 13 selftests: net/forwarding: custom_multipath_hash.sh # TIMEOUT 240 seconds not ok 18 selftests: net/forwarding: gre_custom_multipath_hash.sh # TIMEOUT 240 seconds not ok 19 selftests: net/forwarding: gre_inner_v4_multipath.sh # TIMEOUT 240 seconds not ok 21 selftests: net/forwarding: gre_multipath_nh_res.sh # TIMEOUT 240 seconds not ok 22 selftests: net/forwarding: gre_multipath_nh.sh # TIMEOUT 240 seconds not ok 27 selftests: net/forwarding: ip6gre_custom_multipath_hash.sh # TIMEOUT 240 seconds not ok 34 selftests: net/forwarding: ip6gre_inner_v4_multipath.sh # TIMEOUT 240 seconds not ok 58 selftests: net/forwarding: no_forwarding.sh # TIMEOUT 240 seconds not ok 67 selftests: net/forwarding: router_mpath_nh_res.sh # TIMEOUT 240 seconds not ok 68 selftests: net/forwarding: router_mpath_nh.sh # TIMEOUT 240 seconds not ok 70 selftests: net/forwarding: router_multipath.sh # TIMEOUT 240 seconds marvin@defiant:~/linux/kernel/linux_torvalds$
The test seem to be stuck despite adding SIGTERM cleanup to all tests that timed out:
# selftests: net/forwarding: sch_ets.sh # TEST: ping vlan 10 [ OK ] # TEST: ping vlan 11 [ OK ] # TEST: ping vlan 12 [ OK ] # Running in priomap mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 # TEST: bands 1:2 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 2.00 # Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.00 Measured ratio .99 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 1.00 Measured ratio .99 # Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.42 Measured ratio 1.42 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.60 Measured ratio 1.59 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 2 quanta 5000 2500, streams 0 1 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Running in classifier mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1
What is happening is a stuck mousezahn loop:
root 558266 2732 0 01:17 pts/2 00:00:23 mausezahn veth0.10 -p 8000 -A 192.0.2.1 -B 192.0.2.2 -c 0 -a own -b ba:33:37:81:dc:5 root 558273 2732 0 01:17 pts/2 00:00:23 mausezahn veth0.11 -p 8000 -A 192.0.2.17 -B 192.0.2.18 -c 0 -a own -b ba:33:37:81:dc
$ sudo strace -p 558266 ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device) ioctl(5, SIOCGIFINDEX, {ifr_name="veth0.10"}) = -1 ENODEV (No such device)
I am not sure if this will ever complete, or is it a runaway loop?
Probably it is stuck, for nothing was written to the test log for 71 minutes.
I hope we are coming closer to the cause.
Previously I didn't figure out what hanged.
FWIW, I will post the patches for the scripts that timed out for your review.
Hi,
More homework:
I can trace the infinite mausezahn mode to here:
lib.sh 1420 __start_traffic() 1421 { 1422 local pktsize=$1; shift 1423 local proto=$1; shift 1424 local h_in=$1; shift # Where the traffic egresses the host 1425 local sip=$1; shift 1426 local dip=$1; shift 1427 local dmac=$1; shift 1428 1429 $MZ $h_in -p $pktsize -A $sip -B $dip -c 0 \ 1430 -a own -b $dmac -t "$proto" -q "$@" & 1431 sleep 1 1432 }
but I don't know why in this test it runs away for "veth0.10" in an infinite loop and is obviously nowhere killed for hours.
I seem to have no new ideas now so I guess I should take a break.
Kind regards, Mirsad
Mirsad Todorovac mirsad.todorovac@alu.unizg.hr writes:
On 7/20/23 18:25, Mirsad Todorovac wrote:
On 7/20/23 18:07, Mirsad Todorovac wrote:
root@defiant:/home/marvin/linux/kernel/linux_torvalds/tools/testing/selftests/net/forwarding#
That would imply that the reproducer needs to include the previous tests as well.
This is entirely possible, as timeouts and CTRL+C events do not seem to be caught and the cleanup is not done ...
sch_ets_core.sh: trap cleanup EXIT
Some tests time out even after settings:timeout=240, so IMHO this should be taken into account.
Best regards, Mirsad Todorovac
It looks like the test is stuck in ets_test_mixed in classifier_mode. A way to run just this test would be:
TESTS="classifier_mode ets_test_mixed" ./sch_ets.sh
Looking at the code, the only place that I can see that waits on anything is the "{ kill %% && wait %%; } 2>/dev/null" line in stop_traffic() (in lib.sh). Maybe something like this would let us see if that's the case:
modified tools/testing/selftests/net/forwarding/lib.sh @@ -1468,8 +1470,10 @@ start_tcp_traffic() stop_traffic() { + echo killing MZ # Suppress noise from killing mausezahn. { kill %% && wait %%; } 2>/dev/null + echo killed MZ }
FYI, this is the [incomplete] list of the tests that time out even after being assigned long timeout of 240s instead of default. marvin@defiant:~/linux/kernel/linux_torvalds$ grep TIMEOUT ../kselftest-6.5-rc2-net-forwarding-8.log not ok 13 selftests: net/forwarding: custom_multipath_hash.sh # TIMEOUT 240 seconds not ok 18 selftests: net/forwarding: gre_custom_multipath_hash.sh # TIMEOUT 240 seconds not ok 19 selftests: net/forwarding: gre_inner_v4_multipath.sh # TIMEOUT 240 seconds not ok 21 selftests: net/forwarding: gre_multipath_nh_res.sh # TIMEOUT 240 seconds not ok 22 selftests: net/forwarding: gre_multipath_nh.sh # TIMEOUT 240 seconds not ok 27 selftests: net/forwarding: ip6gre_custom_multipath_hash.sh # TIMEOUT 240 seconds not ok 34 selftests: net/forwarding: ip6gre_inner_v4_multipath.sh # TIMEOUT 240 seconds not ok 58 selftests: net/forwarding: no_forwarding.sh # TIMEOUT 240 seconds not ok 67 selftests: net/forwarding: router_mpath_nh_res.sh # TIMEOUT 240 seconds not ok 68 selftests: net/forwarding: router_mpath_nh.sh # TIMEOUT 240 seconds not ok 70 selftests: net/forwarding: router_multipath.sh # TIMEOUT 240 seconds marvin@defiant:~/linux/kernel/linux_torvalds$
The test seem to be stuck despite adding SIGTERM cleanup to all tests that timed out:
# selftests: net/forwarding: sch_ets.sh # TEST: ping vlan 10 [ OK ] # TEST: ping vlan 11 [ OK ] # TEST: ping vlan 12 [ OK ] # Running in priomap mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 1 2 # TEST: bands 1:2 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 2.00 # Testing ets bands 3 quanta 3300 3300 3300, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.00 Measured ratio .99 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 1.00 Measured ratio .99 # Testing ets bands 3 quanta 5000 3500 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.42 Measured ratio 1.42 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 3 quanta 5000 8000 1500, streams 0 1 2 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 1.60 Measured ratio 1.59 # TEST: bands 0:2 [ OK ] # INFO: Expected ratio 3.33 Measured ratio 3.33 # Testing ets bands 2 quanta 5000 2500, streams 0 1 # TEST: bands 0:1 [ OK ] # INFO: Expected ratio 2.00 Measured ratio 1.99 # Running in classifier mode # Testing ets bands 3 strict 3, streams 0 1 # TEST: band 0 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 1 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 3 strict 3, streams 1 2 # TEST: band 1 [ OK ] # INFO: Expected ratio >95% Measured ratio 100.00 # TEST: band 2 [ OK ] # INFO: Expected ratio <5% Measured ratio 0 # Testing ets bands 4 strict 1 quanta 5000 2500 1500, streams 0 1
What is happening is a stuck mousezahn loop:
root 558266 2732 0 01:17 pts/2 00:00:23 mausezahn veth0.10 -p 8000 -A 192.0.2.1 -B 192.0.2.2 -c 0 -a own -b ba:33:37:81:dc:5 root 558273 2732 0 01:17 pts/2 00:00:23 mausezahn veth0.11 -p 8000 -A 192.0.2.17 -B 192.0.2.18 -c 0 -a own -b ba:33:37:81:dc
Sure, the test is stuck and therefore didn't reap its MZ invocations. The test is supposed to invoke stop_traffic(), which then does { kill %% && wait %%; }, which should do it.
The question is, does it reach that statement? If yes, the wait is likely where it's stuck, but why? If not, where else could it be stuck? I don't see any other place where the test loops endlessly or waits.
I am not sure if this will ever complete, or is it a runaway loop?
It won't, that's what the "-c 0" on the command like does.
Probably it is stuck, for nothing was written to the test log for 71 minutes.
I hope we are coming closer to the cause.
Previously I didn't figure out what hanged.
The script did :)
The fact that there's mausezahn running alongside is relevant, but not the cause.
Hi,
On 7/20/23 18:07, Mirsad Todorovac wrote:
On 7/20/23 11:43, Petr Machata wrote:
Mirsad Todorovac mirsad.todorovac@alu.unizg.hr writes:
Using the same config for 6.5-rc2 on Ubuntu 22.04 LTS and 22.10, the execution stop at the exact same line on both boxes (os I reckon it is more than an accident):
# selftests: net/forwarding: sch_ets.sh
I have ran the full test (the net/forwarding part), with only sch_ets.sh being enabled the 'set -x' option, as it was the most interesting part.
This is my best effort at diagnosing the situation.
The hang appears on an unsuspecting command:
[...] # ++ local if_name=veth0 # ++ local add_del=del # ++ local array # ++ shift # ++ shift # ++ array=("${@}") # ++ ip link set dev veth0 down # ++ ip link set dev veth0 nomaster # ++ vrf_destroy vveth0 # ++ local vrf_name=vveth0 # ++ local tb_id # ++ __vrf_td_id_lookup vveth0 # ++ local vrf_name=vveth0 # ++ return 1 # ++ tb_id=1 # ++ ip -6 route del table 1 unreachable default metric 4278198272 # ++ ip -4 route del table 1 unreachable default metric 4278198272 # ++ ip link del dev vveth0 # ++ vrf_cleanup # ++ ip -6 rule add pref 0 table local # ++ ip -6 rule del pref 32765 # ++ ip -4 rule add pref 0 table local # ++ ip -4 rule del pref 32765
Please find the full kselftest execution log attached at your convenience.
Hope this helps, because I do not really see what makes this command hang indefinitely.
Config is the same, kernel still 6.5-rc2, to test in the same conditions.
Best regards, Mirsad Todorovac
linux-kselftest-mirror@lists.linaro.org