On Thu, Jul 20, 2023 at 07:42:09PM +0100, Mark Brown wrote:
When we collect a signal context with one of the SME modes enabled we will have enabled that mode behind the compiler and libc's back so they may issue some instructions not valid in streaming mode, causing spurious failures.
For the code prior to issuing the BRK to trigger signal handling we need to stay in streaming mode if we were already there since that's a part of the signal context the caller is trying to collect. Unfortunately this code includes a memset() which is likely to be heavily optimised and is likely to use FP instructions incompatible with streaming mode. We can avoid this happening by open coding the memset(), inserting a volatile assembly statement to avoid the compiler recognising what's being done and doing something in optimisation. This code is not performance critical so the inefficiency should not be an issue.
After collecting the context we can simply exit streaming mode, avoiding these issues. Use a full SMSTOP for safety to prevent any issues appearing with ZA.
Reported-by: Will Deacon will@kernel.org Signed-off-by: Mark Brown broonie@kernel.org
Changes in v3:
- Open code OPTIMISER_HIDE_VAR() instead of the memory clobber.
- Link to v2: https://lore.kernel.org/r/20230712-arm64-signal-memcpy-fix-v2-1-494f7025caf6...
Changes in v2:
- Rebase onto v6.5-rc1.
- Link to v1: https://lore.kernel.org/r/20230628-arm64-signal-memcpy-fix-v1-1-db3e0300829e...
.../selftests/arm64/signal/test_signals_utils.h | 25 +++++++++++++++++++++- 1 file changed, 24 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/signal/test_signals_utils.h b/tools/testing/selftests/arm64/signal/test_signals_utils.h index 222093f51b67..c7f5627171dd 100644 --- a/tools/testing/selftests/arm64/signal/test_signals_utils.h +++ b/tools/testing/selftests/arm64/signal/test_signals_utils.h @@ -60,13 +60,25 @@ static __always_inline bool get_current_context(struct tdescr *td, size_t dest_sz) { static volatile bool seen_already;
- int i;
- char *uc = (char *)dest_uc;
assert(td && dest_uc); /* it's a genuine invocation..reinit */ seen_already = 0; td->live_uc_valid = 0; td->live_sz = dest_sz;
- memset(dest_uc, 0x00, td->live_sz);
- /*
* This is a memset() but we don't want the compiler to
* optimise it into either instructions or a library call
* which might be incompatible with streaming mode.
*/
- for (i = 0; i < td->live_sz; i++) {
uc[i] = 0;
__asm__ ("" : "=r" (uc[i]) : "0" (uc[i]));
- }
Looking more closely at this, I see that the bpf and kvm selftests are able to include <linux/compiler.h> directly, so I don't see why we need to open-code this here. I also spotted that we've *already* written our own version of this as the 'curse()' macro in selftests/arm64/bti/compiler.h!
So I think we should either use linux/compiler.h or make our curse macro usable to all the arm64 selftests.
What do you think?
Will