The first 100 lines of trace output don't always contain 3 or more distinct events. In busy systems, they may be dominated by repetitive events like sched_stat_runtime, causing the `$count -lt 3` check to fail. Example trace:
$ head -n 100 trace | grep -v ^# systemd-timesyn-266 [006] d.h2. 738.778482: sched_stat_runtime: comm=systemd-timesyn pid=266 runtime=976854 [ns] ftracetest-8751 [001] d.h2. 738.778512: sched_stat_runtime: comm=ftracetest pid=8751 runtime=938335 [ns] systemd-timesyn-266 [006] d.h1. 738.779531: sched_stat_runtime: comm=systemd-timesyn pid=266 runtime=1044284 [ns] ftracetest-8751 [001] d.h2. 738.779541: sched_stat_runtime: comm=ftracetest pid=8751 runtime=1028575 [ns] systemd-1 [007] d.h5. 738.779657: sched_stat_runtime: comm=systemd pid=1 runtime=642624 [ns] [...]
With trace cleared, simply check `$count -eq 0` to confirm subsystem enablement, just like toplevel-enable.tc does.
Fixes: 1a4ea83a6e67 ("selftests/ftrace: Limit length in subsystem-enable tests") Signed-off-by: Tengda Wu wutengda@huaweicloud.com --- .../selftests/ftrace/test.d/event/subsystem-enable.tc | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc b/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc index b7c8f29c09a9..3a28adc7b727 100644 --- a/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc +++ b/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc @@ -19,8 +19,8 @@ echo 'sched:*' > set_event yield
count=`head -n 100 trace | grep -v ^# | awk '{ print $5 }' | sort -u | wc -l` -if [ $count -lt 3 ]; then - fail "at least fork, exec and exit events should be recorded" +if [ $count -eq 0 ]; then + fail "none of scheduler events are recorded" fi
do_reset @@ -30,8 +30,8 @@ echo 1 > events/sched/enable yield
count=`head -n 100 trace | grep -v ^# | awk '{ print $5 }' | sort -u | wc -l` -if [ $count -lt 3 ]; then - fail "at least fork, exec and exit events should be recorded" +if [ $count -eq 0 ]; then + fail "none of scheduler events are recorded" fi
do_reset
On Thu, 10 Jul 2025 13:01:34 +0000 Tengda Wu wutengda@huaweicloud.com wrote:
Fixes: 1a4ea83a6e67 ("selftests/ftrace: Limit length in subsystem-enable tests") Signed-off-by: Tengda Wu wutengda@huaweicloud.com
.../selftests/ftrace/test.d/event/subsystem-enable.tc | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc b/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc index b7c8f29c09a9..3a28adc7b727 100644 --- a/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc +++ b/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc @@ -19,8 +19,8 @@ echo 'sched:*' > set_event yield count=`head -n 100 trace | grep -v ^# | awk '{ print $5 }' | sort -u | wc -l` -if [ $count -lt 3 ]; then
- fail "at least fork, exec and exit events should be recorded"
+if [ $count -eq 0 ]; then
- fail "none of scheduler events are recorded"
fi do_reset @@ -30,8 +30,8 @@ echo 1 > events/sched/enable yield count=`head -n 100 trace | grep -v ^# | awk '{ print $5 }' | sort -u | wc -l` -if [ $count -lt 3 ]; then
- fail "at least fork, exec and exit events should be recorded"
+if [ $count -eq 0 ]; then
- fail "none of scheduler events are recorded"
So if there's a bug that causes the system enable to only enable a single event, this will no longer catch it?
I rather not let the slide.
Can you test this to see if this works for you?
-- Steve
diff --git a/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc b/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc index b7c8f29c09a9..46a9e6d92730 100644 --- a/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc +++ b/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc @@ -14,11 +14,32 @@ fail() { #msg exit_fail }
+check_unique() { + cat trace_pipe | grep -v '^#' | awk ' + BEGIN { cnt = 0; } + { + for (i = 0; i < cnt; i++) { + if (event[i] == $5) { + break; + } + } + if (i == cnt) { + event[cnt++] = $5; + if (cnt > 2) { + exit; + } + } + } + END { + printf "%d", cnt; + }' +} + echo 'sched:*' > set_event
yield
-count=`head -n 100 trace | grep -v ^# | awk '{ print $5 }' | sort -u | wc -l` +count=`check_unique` if [ $count -lt 3 ]; then fail "at least fork, exec and exit events should be recorded" fi @@ -29,7 +50,7 @@ echo 1 > events/sched/enable
yield
-count=`head -n 100 trace | grep -v ^# | awk '{ print $5 }' | sort -u | wc -l` +count=`check_unique` if [ $count -lt 3 ]; then fail "at least fork, exec and exit events should be recorded" fi
On 2025/7/11 3:34, Steven Rostedt wrote:
On Thu, 10 Jul 2025 13:01:34 +0000 Tengda Wu wutengda@huaweicloud.com wrote:
Fixes: 1a4ea83a6e67 ("selftests/ftrace: Limit length in subsystem-enable tests") Signed-off-by: Tengda Wu wutengda@huaweicloud.com
.../selftests/ftrace/test.d/event/subsystem-enable.tc | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc b/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc index b7c8f29c09a9..3a28adc7b727 100644 --- a/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc +++ b/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc @@ -19,8 +19,8 @@ echo 'sched:*' > set_event yield count=`head -n 100 trace | grep -v ^# | awk '{ print $5 }' | sort -u | wc -l` -if [ $count -lt 3 ]; then
- fail "at least fork, exec and exit events should be recorded"
+if [ $count -eq 0 ]; then
- fail "none of scheduler events are recorded"
fi do_reset @@ -30,8 +30,8 @@ echo 1 > events/sched/enable yield count=`head -n 100 trace | grep -v ^# | awk '{ print $5 }' | sort -u | wc -l` -if [ $count -lt 3 ]; then
- fail "at least fork, exec and exit events should be recorded"
+if [ $count -eq 0 ]; then
- fail "none of scheduler events are recorded"
So if there's a bug that causes the system enable to only enable a single event, this will no longer catch it?
I rather not let the slide.
Can you test this to see if this works for you?
-- Steve
diff --git a/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc b/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc index b7c8f29c09a9..46a9e6d92730 100644 --- a/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc +++ b/tools/testing/selftests/ftrace/test.d/event/subsystem-enable.tc @@ -14,11 +14,32 @@ fail() { #msg exit_fail } +check_unique() {
- cat trace_pipe | grep -v '^#' | awk '
- BEGIN { cnt = 0; }
- {
for (i = 0; i < cnt; i++) {
if (event[i] == $5) {
break;
}
}
if (i == cnt) {
event[cnt++] = $5;
if (cnt > 2) {
exit;
}
}
- }
- END {
printf "%d", cnt;
- }'
+}
echo 'sched:*' > set_event yield -count=`head -n 100 trace | grep -v ^# | awk '{ print $5 }' | sort -u | wc -l` +count=`check_unique` if [ $count -lt 3 ]; then fail "at least fork, exec and exit events should be recorded" fi @@ -29,7 +50,7 @@ echo 1 > events/sched/enable yield -count=`head -n 100 trace | grep -v ^# | awk '{ print $5 }' | sort -u | wc -l` +count=`check_unique` if [ $count -lt 3 ]; then fail "at least fork, exec and exit events should be recorded" fi
The patch works well - after ~50 test iterations, we haven't observed any recurrence of the test case failures.
However, I'm concerned that using 'cat trace_pipe' (like the original 'cat trace' method) could bring back the stopping problem [1] on slower systems.
Could a slow trace_pipe reader (slower than sched event generation rate) reintroduce the original race condition?
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Some test details:
$ ./ftracetest -vvv subsystem-enable.tc [...] + echo sched:* + yield + ping 127.0.0.1 -c 1 PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data. 64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.538 ms
--- 127.0.0.1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 1ms rtt min/avg/max/mdev = 0.538/0.538/0.538/0.000 ms + check_unique + cat trace_pipe + grep -v ^# + awk BEGIN { cnt = 0; } { for (i = 0; i < cnt; i++) { if (event[i] == $5) { break; } } if (i == cnt) { event[cnt++] = $5; if (cnt > 2) { exit; } } } END { printf "%d", cnt; } + count=3 + [ 3 -lt 3 ] + do_reset [...]
Regards, Tengda
On July 10, 2025 10:48:54 PM EDT, Tengda Wu wutengda@huaweicloud.com wrote:
The patch works well - after ~50 test iterations, we haven't observed any recurrence of the test case failures.
However, I'm concerned that using 'cat trace_pipe' (like the original 'cat trace' method) could bring back the stopping problem [1] on slower systems.
Could a slow trace_pipe reader (slower than sched event generation rate) reintroduce the original race condition?
Only if it doesn't find three different events, in which case the test would fail regardless.
The awk script exits out as soon as it finds 3: unique events. It won't go forever, even on slower machines.
-- Steve
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Some test details:
$ ./ftracetest -vvv subsystem-enable.tc [...]
- echo sched:*
- yield
- ping 127.0.0.1 -c 1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data. 64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.538 ms
--- 127.0.0.1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 1ms rtt min/avg/max/mdev = 0.538/0.538/0.538/0.000 ms
- check_unique
- cat trace_pipe
- grep -v ^#
- awk BEGIN { cnt = 0; } { for (i = 0; i < cnt; i++) { if (event[i] == $5) { break; } } if (i == cnt) { event[cnt++] = $5; if (cnt > 2) { exit; } } } END { printf "%d", cnt; }
- count=3
- [ 3 -lt 3 ]
- do_reset
[...]
Regards, Tengda
On 2025/7/11 11:22, Steven Rostedt wrote:
On July 10, 2025 10:48:54 PM EDT, Tengda Wu wutengda@huaweicloud.com wrote:
The patch works well - after ~50 test iterations, we haven't observed any recurrence of the test case failures.
However, I'm concerned that using 'cat trace_pipe' (like the original 'cat trace' method) could bring back the stopping problem [1] on slower systems.
Could a slow trace_pipe reader (slower than sched event generation rate) reintroduce the original race condition?
Only if it doesn't find three different events, in which case the test would fail regardless.
The awk script exits out as soon as it finds 3: unique events. It won't go forever, even on slower machines.
-- Steve
Got it, thank you for explaining this. I have no further questions.
-- Tengda
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Some test details:
$ ./ftracetest -vvv subsystem-enable.tc [...]
- echo sched:*
- yield
- ping 127.0.0.1 -c 1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data. 64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.538 ms
--- 127.0.0.1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 1ms rtt min/avg/max/mdev = 0.538/0.538/0.538/0.000 ms
- check_unique
- cat trace_pipe
- grep -v ^#
- awk BEGIN { cnt = 0; } { for (i = 0; i < cnt; i++) { if (event[i] == $5) { break; } } if (i == cnt) { event[cnt++] = $5; if (cnt > 2) { exit; } } } END { printf "%d", cnt; }
- count=3
- [ 3 -lt 3 ]
- do_reset
[...]
Regards, Tengda
linux-kselftest-mirror@lists.linaro.org