On Tue, Oct 29, 2024 at 03:43:37PM +0000, Mark Rutland wrote:
On those emulated platforms (FVP?), does this change trigger the failure more often?
Yes.
I gave this a quick test, and with this change, running fp-stress on a defconfig kernel running on 1 CPU triggers the "Bad SVCR: 0" splat in 35/100 runs. Hacking that down to 5ms brings that to 89/100 runs. So even if we have to keep this high for an emulated platform, it'd probably be worth being able to override that as a parameter?
I was getting better numbers than that with the default multi-CPU setups on my particular machine; most runs were showing something IIRC. I do agree that it'd be a useful command line argument to add incrementally.
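For illustration only, a minimal sketch of how such an override might be parsed. The helper name parse_interval_ms() and the idea of passing the interval in milliseconds as a string argument are assumptions, not anything in the patch; bad or out-of-range input falls back to the compiled-in default.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define SIGNAL_INTERVAL_MS 25	/* default from the patch */

/*
 * Hypothetical helper: parse a signal interval in milliseconds from a
 * command line argument, falling back to the default on bad input.
 */
static int parse_interval_ms(const char *arg)
{
	char *end;
	long val;

	if (!arg || !*arg)
		return SIGNAL_INTERVAL_MS;

	errno = 0;
	val = strtol(arg, &end, 10);

	/* Reject trailing junk, non-positive values and overflow. */
	if (errno || *end != '\0' || val <= 0 || val > INT_MAX)
		return SIGNAL_INTERVAL_MS;

	return (int)val;
}
```

With something like this, the 5ms experiment above would be a matter of passing "5" rather than rebuilding, and LOG_INTERVALS would need to be computed at runtime from the parsed value.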
Otherwise, maybe it's worth increasing the timeout for the fp-stress test specifically? The docs at:
https://docs.kernel.org/dev-tools/kselftest.html#timeout-for-selftests
... imply that is possible, but don't say exactly how, and it seems legitimate for a stress test.
IIRC it's per suite, and there's been a bunch of pushback on using it in default configurations since the selftests are expected to run quickly, including in other cases where I'd have said it was a reasonable change to make. Stress tests are not entirely idiomatic for kselftest; it's supposed to run quickly.
+#define SIGNAL_INTERVAL_MS 25
+#define LOG_INTERVALS (1000 / SIGNAL_INTERVAL_MS)
Running this, I see that by default the test logs:
# Will run for 400s
... for a timeout that's actually ~10s, due to the following, which isn't in the diff:
if (timeout > 0)
	ksft_print_msg("Will run for %ds\n", timeout);
... so that probably wants an update to either convert to seconds or be in terms of signals, and likewise for the "timeout remaining" message below.
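As a rough sketch of the "convert to seconds" option, assuming timeout counts signal intervals of SIGNAL_INTERVAL_MS each. The helper names intervals_to_seconds() and print_timeout() are made up for illustration, and printf() stands in for ksft_print_msg() so the snippet builds on its own.

```c
#include <stdio.h>

#define SIGNAL_INTERVAL_MS 25

/*
 * Hypothetical helper: convert a timeout counted in signal intervals
 * into whole seconds (rounding down).
 */
static int intervals_to_seconds(int intervals)
{
	return (intervals * SIGNAL_INTERVAL_MS) / 1000;
}

/* Stand-in for the existing message, using printf() for ksft_print_msg(). */
static void print_timeout(int timeout)
{
	if (timeout > 0)
		printf("Will run for %ds\n", intervals_to_seconds(timeout));
}
```

With the default of 400 intervals at 25ms this gives 10, which matches the ~10s actually observed. The same conversion would apply to the "timeout remaining" message.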
Otherwise, this looks good to me.
Oh, yeah - we should probably just remove the unit from that one. I never see it due to all the spam from the subprocesses starting.