On Wed, Jan 22, 2025 at 11:46:31AM +0000, Marc Zyngier wrote:
On Tue, 21 Jan 2025 15:37:13 +0000, Mark Rutland mark.rutland@arm.com wrote:
Alternatively, we could take the large hammer approach and always save and unbind the host state prior to entering the guest, so that hyp doesn't need to save anything. An unconditional call to fpsimd_save_and_flush_cpu_state() would suffice, and that'd also implicitly fix the SME issue below.
I think I'd rather see that. Even if that costs us a few hundred cycles on vcpu_load(), I would take that any time over the current fragile/broken behaviour.
Cool -- I'll go do that. I'm also happier with that approach.
*
* If hyp code does not save the host state, then the host
* state remains live on the CPU and saved fp_type is
* irrelevant until it is overwritten by a later call to
* fpsimd_save_user_state().
I'm not sure I understand this. If fp_type is irrelevant, surely it is *forever* irrelevant, not until something else happens. Or am I missing something?
Sorry, this was not very clear.
What this is trying to say is that *while the state is live on a CPU* fp_type is irrelevant, and it's only meaningful when saving/restoring state. As above, the only reason to set it here is so that *if* hyp saves and unbinds the state, fp_type will accurately describe what the hyp code saved.
The key thing is that there are two possibilities:
(1) The guest doesn't use FPSIMD/SVE, and no trap is taken to save the host state. In this case, fp_type is not consumed before the next time state has to be written back to memory (the act of which will set fp_type).
So in this case, setting fp_type is redundant but benign.
(2) The guest *does* use FPSIMD/SVE, and a trap is taken to hyp to save the host state. In this case the hyp code will save the task's FPSIMD state to task->thread.uw.fpsimd_state, but will not update task->thread.fp_type accordingly, and:
* If fp_type happened to be FP_STATE_FPSIMD, all is good and a later restore will load the state saved by the hyp code. * If fp_type happened to be FP_STATE_SVE, a later restore will load stale state from task->thread.sve_state.
... does that make sense?
It does now, thanks. But with your above alternative suggestion, this becomes completely moot, right?
Yep.
[...]
So I can:
(a) Add the dependency, as you suggest.
(b) Leave that as-is.
(c) Solve this in a different way so that we don't need a BUILD_BUG() or dependency. e.g. fix the SME case at the same time, at the cost of possibly needing to do a bit more work when backporting.
... any preference?
My preference would be on (c), if at all possible. My understanding is now that the fpsimd_save_and_flush_cpu_state() approach solves all of these problems, at the expense of a bit of overhead.
Did I get that correctly?
Yep -- I'll go spin that now.
Mark.