One of the things that fp-stress does to stress the floating point context switching is send signals to the test threads it spawns. Currently we do this once per second but as suggested by Mark Rutland if we increase this we can improve the chances of triggering any issues with context switching the signal handling code. Do a quick change to increase the rate of signal delivery, trying to avoid excessive impact on emulated platforms, and a further change to mitigate the impact that this creates during startup.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
Changes in v2:
- Minor clarifications in commit message and log output.
- Link to v1: https://lore.kernel.org/r/20241029-arm64-fp-stress-interval-v1-0-db540abf6dd...

---
Mark Brown (2):
      kselftest/arm64: Increase frequency of signal delivery in fp-stress
      kselftest/arm64: Poll less often while waiting for fp-stress children

 tools/testing/selftests/arm64/fp/fp-stress.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)
---
base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354
change-id: 20241028-arm64-fp-stress-interval-8f5e62c06e12
Best regards,
Currently we only deliver signals to the processes being tested about once a second, meaning that the signal code paths are subject to relatively little stress. Increase this frequency substantially to 25ms intervals, along with some minor refactoring to make this more readily tuneable and maintain the 1s logging interval. This interval was chosen based on some experimentation with emulated platforms to avoid causing so much extra load that the test starts to run into the 45s limit for selftests or generally completely disconnect the timeout numbers from the
We could increase this if we moved the signal generation out of the main supervisor thread, though the percentage of time that we spend interacting with the floating point state is also a consideration.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 tools/testing/selftests/arm64/fp/fp-stress.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/tools/testing/selftests/arm64/fp/fp-stress.c b/tools/testing/selftests/arm64/fp/fp-stress.c
index faac24bdefeb9436e2daf20b7250d0ae25ca23a7..71d02c701bf56be56b7ad00a5f6614e33dc8e01b 100644
--- a/tools/testing/selftests/arm64/fp/fp-stress.c
+++ b/tools/testing/selftests/arm64/fp/fp-stress.c
@@ -28,6 +28,9 @@
 
 #define MAX_VLS 16
 
+#define SIGNAL_INTERVAL_MS 25
+#define LOG_INTERVALS (1000 / SIGNAL_INTERVAL_MS)
+
 struct child_data {
 	char *name, *output;
 	pid_t pid;
@@ -449,7 +452,7 @@ static const struct option options[] = {
 int main(int argc, char **argv)
 {
 	int ret;
-	int timeout = 10;
+	int timeout = 10 * (1000 / SIGNAL_INTERVAL_MS);
 	int cpus, i, j, c;
 	int sve_vl_count, sme_vl_count;
 	bool all_children_started = false;
@@ -505,7 +508,7 @@ int main(int argc, char **argv)
 		       have_sme2 ? "present" : "absent");
 
 	if (timeout > 0)
-		ksft_print_msg("Will run for %ds\n", timeout);
+		ksft_print_msg("Will run for %d\n", timeout);
 	else
 		ksft_print_msg("Will run until terminated\n");
 
@@ -578,14 +581,14 @@ int main(int argc, char **argv)
 			break;
 
 		/*
-		 * Timeout is counted in seconds with no output, the
-		 * tests print during startup then are silent when
-		 * running so this should ensure they all ran enough
-		 * to install the signal handler, this is especially
-		 * useful in emulation where we will both be slow and
-		 * likely to have a large set of VLs.
+		 * Timeout is counted in poll intervals with no
+		 * output, the tests print during startup then are
+		 * silent when running so this should ensure they all
+		 * ran enough to install the signal handler, this is
+		 * especially useful in emulation where we will both
+		 * be slow and likely to have a large set of VLs.
 		 */
-		ret = epoll_wait(epoll_fd, evs, tests, 1000);
+		ret = epoll_wait(epoll_fd, evs, tests, SIGNAL_INTERVAL_MS);
 		if (ret < 0) {
 			if (errno == EINTR)
 				continue;
@@ -625,8 +628,9 @@ int main(int argc, char **argv)
 			all_children_started = true;
 		}
 
-		ksft_print_msg("Sending signals, timeout remaining: %d\n",
-			       timeout);
+		if ((timeout % LOG_INTERVALS) == 0)
+			ksft_print_msg("Sending signals, timeout remaining: %d\n",
+				       timeout);
 
 		for (i = 0; i < num_children; i++)
 			child_tickle(&children[i]);
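As an illustration of the interval arithmetic in the patch above, here is a minimal standalone sketch. The two macros are copied from the patch; the helpers `seconds_to_intervals()`, `should_log()` and `count_log_lines()` are hypothetical names introduced only for this example and are not part of fp-stress.c. The countdown assumes that every epoll_wait() call times out, which is the normal case once the children have gone quiet.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the patch. */
#define SIGNAL_INTERVAL_MS 25
#define LOG_INTERVALS (1000 / SIGNAL_INTERVAL_MS)

/* Hypothetical helper: convert a timeout in seconds into poll intervals. */
int seconds_to_intervals(int seconds)
{
	assert(seconds >= 0);
	return seconds * (1000 / SIGNAL_INTERVAL_MS);
}

/* Hypothetical helper: true on the intervals where the patched loop logs. */
bool should_log(int timeout)
{
	return (timeout % LOG_INTERVALS) == 0;
}

/*
 * Hypothetical helper: count how many log lines a full countdown from
 * `timeout` intervals would produce, assuming one decrement per
 * epoll_wait() timeout and a stop when the counter reaches zero.
 */
int count_log_lines(int timeout)
{
	int lines = 0;

	while (timeout > 0) {
		if (should_log(timeout))
			lines++;
		timeout--;
	}

	return lines;
}
```

Under these assumptions the default 10 second timeout becomes 400 poll intervals, and a log line is printed on every 40th interval, so the log cadence stays at roughly one line per second while signals are sent every 25ms.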
While fp-stress is waiting for children to start it doesn't send any signals to them so there is no need for it to have as short an epoll() timeout as it does when the children are all running. We do still want to have some timeout so that we can log diagnostics about missing children but this can be relatively large. On emulated platforms the overhead of running the supervisor process is quite high, especially during the process of execing the test binaries.
Implement a longer epoll() timeout during the setup phase, using a 5s timeout while waiting for children and switching to the signal raise interval when all the children are started and we start sending signals.
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 tools/testing/selftests/arm64/fp/fp-stress.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/fp/fp-stress.c b/tools/testing/selftests/arm64/fp/fp-stress.c
index 71d02c701bf56be56b7ad00a5f6614e33dc8e01b..4f1ef260ce7a26d24092fe9337714f8c3922070a 100644
--- a/tools/testing/selftests/arm64/fp/fp-stress.c
+++ b/tools/testing/selftests/arm64/fp/fp-stress.c
@@ -453,6 +453,7 @@ int main(int argc, char **argv)
 {
 	int ret;
 	int timeout = 10 * (1000 / SIGNAL_INTERVAL_MS);
+	int poll_interval = 5000;
 	int cpus, i, j, c;
 	int sve_vl_count, sme_vl_count;
 	bool all_children_started = false;
@@ -588,7 +589,7 @@ int main(int argc, char **argv)
 		 * especially useful in emulation where we will both
 		 * be slow and likely to have a large set of VLs.
 		 */
-		ret = epoll_wait(epoll_fd, evs, tests, SIGNAL_INTERVAL_MS);
+		ret = epoll_wait(epoll_fd, evs, tests, poll_interval);
 		if (ret < 0) {
 			if (errno == EINTR)
 				continue;
@@ -626,6 +627,7 @@ int main(int argc, char **argv)
 			}
 
 			all_children_started = true;
+			poll_interval = SIGNAL_INTERVAL_MS;
 		}
 
 		if ((timeout % LOG_INTERVALS) == 0)
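The two-phase timeout described in this patch can be sketched in isolation as below. This is only an illustration under stated assumptions: `struct supervisor_state`, `supervisor_init()`, `supervisor_update()`, `supervisor_wait()` and the `STARTUP_POLL_MS` macro are hypothetical names invented for the example (the patch itself uses a local `poll_interval` variable and a literal 5000), and the real loop in fp-stress.c does considerably more work per iteration.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/epoll.h>

#define SIGNAL_INTERVAL_MS 25	/* from the series */
#define STARTUP_POLL_MS 5000	/* hypothetical name; the patch uses a literal 5000 */

/* Hypothetical container for the two pieces of state the patch touches. */
struct supervisor_state {
	int poll_interval;		/* epoll_wait() timeout in milliseconds */
	bool all_children_started;
};

/* Start in the setup phase with the long timeout. */
void supervisor_init(struct supervisor_state *s)
{
	s->poll_interval = STARTUP_POLL_MS;
	s->all_children_started = false;
}

/*
 * Call after an epoll_wait() timeout. Once every child has been seen,
 * switch to the short signal interval; before that, keep the long one.
 */
void supervisor_update(struct supervisor_state *s, int seen_children,
		       int num_children)
{
	assert(seen_children >= 0 && seen_children <= num_children);

	if (s->all_children_started)
		return;
	if (seen_children != num_children)
		return;

	s->all_children_started = true;
	s->poll_interval = SIGNAL_INTERVAL_MS;
}

/* Wait once using whichever timeout the current phase calls for. */
int supervisor_wait(int epoll_fd, const struct supervisor_state *s)
{
	struct epoll_event ev;

	return epoll_wait(epoll_fd, &ev, 1, s->poll_interval);
}
```

A caller would invoke `supervisor_wait()` in its main loop and call `supervisor_update()` whenever the wait returns 0, so the supervisor wakes only every 5s while children are still starting and every 25ms once signals are being sent.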
On Wed, Oct 30, 2024 at 12:02:01AM +0000, Mark Brown wrote:
> One of the things that fp-stress does to stress the floating point context switching is send signals to the test threads it spawns. Currently we do this once per second but as suggested by Mark Rutland if we increase this we can improve the chances of triggering any issues with context switching the signal handling code. Do a quick change to increase the rate of signal delivery, trying to avoid excessive impact on emulated platforms, and a further change to mitigate the impact that this creates during startup.
>
> Signed-off-by: Mark Brown <broonie@kernel.org>
>
> Changes in v2:
> - Minor clarifications in commit message and log output.
> - Link to v1: https://lore.kernel.org/r/20241029-arm64-fp-stress-interval-v1-0-db540abf6dd...
>
> Mark Brown (2):
>       kselftest/arm64: Increase frequency of signal delivery in fp-stress
>       kselftest/arm64: Poll less often while waiting for fp-stress children
With these changes, I was easily able to reproduce the SVCR=0 bug so:
Acked-by: Will Deacon <will@kernel.org>
for both.
Will
On Wed, 30 Oct 2024 00:02:01 +0000, Mark Brown wrote:
> One of the things that fp-stress does to stress the floating point context switching is send signals to the test threads it spawns. Currently we do this once per second but as suggested by Mark Rutland if we increase this we can improve the chances of triggering any issues with context switching the signal handling code. Do a quick change to increase the rate of signal delivery, trying to avoid excessive impact on emulated platforms, and a further change to mitigate the impact that this creates during startup.
>
> [...]
Applied to arm64 (for-next/kselftest), thanks!
[1/2] kselftest/arm64: Increase frequency of signal delivery in fp-stress
      https://git.kernel.org/arm64/c/a3590d71a1ac
[2/2] kselftest/arm64: Poll less often while waiting for fp-stress children
      https://git.kernel.org/arm64/c/161e9925053c