Switch MSR_IA32_FRED_RSP0 between host and guest in vmx_prepare_switch_to_{host,guest}().
MSR_IA32_FRED_RSP0 is used only during ring 3 event delivery, so KVM, which runs in ring 0, can run safely with the guest's FRED RSP0 loaded, i.e., there is no need to switch between host and guest FRED RSP0 on every VM entry and exit.
KVM should switch to host FRED RSP0 before returning to user level, and switch to guest FRED RSP0 before entering guest mode.
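The lazy switching described in the changelog can be sketched in userspace as follows. This is only a model, not the patch itself: a plain variable stands in for MSR_IA32_FRED_RSP0, and every name below (struct vcpu_model, fake_wrmsr(), model_prepare_switch_to_guest(), model_prepare_switch_to_host()) is hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: a variable stands in for MSR_IA32_FRED_RSP0. */
static uint64_t fake_fred_rsp0_msr;
static unsigned int fake_wrmsr_count;

/* Hypothetical per-vCPU state, not the real struct vcpu_vmx. */
struct vcpu_model {
	uint64_t guest_fred_rsp0;	/* value the guest expects in the MSR */
	uint64_t host_fred_rsp0;	/* current host task's kernel stack top */
	bool guest_state_loaded;	/* true while the guest value is in the MSR */
};

/* Stand-in for a WRMSR; counts writes so the laziness is observable. */
static void fake_wrmsr(uint64_t val)
{
	fake_fred_rsp0_msr = val;
	fake_wrmsr_count++;
}

/* Load the guest value once, before entering guest mode. */
static void model_prepare_switch_to_guest(struct vcpu_model *v)
{
	assert(v != NULL);
	if (v->guest_state_loaded)
		return;		/* already loaded: repeated VM entries cost nothing */
	fake_wrmsr(v->guest_fred_rsp0);
	v->guest_state_loaded = true;
}

/* Restore the host value once, before returning to user level. */
static void model_prepare_switch_to_host(struct vcpu_model *v)
{
	assert(v != NULL);
	if (!v->guest_state_loaded)
		return;
	/* The guest may have changed the MSR, so remember its latest value. */
	v->guest_fred_rsp0 = fake_fred_rsp0_msr;
	fake_wrmsr(v->host_fred_rsp0);
	v->guest_state_loaded = false;
}
```

The point of the model is that the MSR is written only at the two transitions that matter for ring 3 event delivery, not on each VM entry and exit in between.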
Heh, if only KVM had a framework that was specifically designed for context switching MSRs on return to userspace. Translation: please use the user_return_msr() APIs.
IIUC the user return MSR framework works for MSRs that are per CPU constants, but like MSR_KERNEL_GS_BASE, MSR_IA32_FRED_RSP0 is a per *task* constant, thus we can't use it.
Ah, in that case, the changelog is very misleading and needs to be fixed.
I probably should've given more details about how FRED RSPs are used: RSP0 is a bit special because it's a per task constant pointing to the top of the task's kernel stack, while the other RSPs are per CPU constants.
Alternatively, is the desired RSP0 value tracked anywhere other than the MSR?
Yes, it's simply "(unsigned long)task_stack_page(task) + THREAD_SIZE".
E.g. if it's somewhere in task_struct, then kvm_on_user_return() would restore the current task's desired RSP0.
So you're suggesting to extend the framework to allow per task constants?
Even if we don't get fancy, avoiding the RDMSR to get the current task's value would be nice.
TBH, I didn't know RDMSR is NOT a preferred approach. But it does make sense because it costs hundreds of cycles to read from CR2.
And of course this can be easily changed to "(unsigned long)task_stack_page(task) + THREAD_SIZE".
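As a rough illustration of deriving the host value instead of reading it back with RDMSR, here is a userspace sketch of the "(unsigned long)task_stack_page(task) + THREAD_SIZE" computation. The names struct task_model, model_task_stack_page(), and model_task_fred_rsp0() are hypothetical stand-ins, and MODEL_THREAD_SIZE is assumed to be 16 KiB for the example; the real THREAD_SIZE depends on the kernel configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stack size for this model only; the kernel's THREAD_SIZE may differ. */
#define MODEL_THREAD_SIZE (16u * 1024u)

/* Hypothetical stand-in for the part of task_struct that tracks the stack. */
struct task_model {
	void *stack;	/* lowest address of the task's kernel stack */
};

/* Models task_stack_page(): return the base of the task's kernel stack. */
static void *model_task_stack_page(const struct task_model *t)
{
	return t->stack;
}

/*
 * Desired FRED RSP0 for a task: the top of its kernel stack.
 * It is computed from per-task data, so no MSR read is needed.
 */
static uintptr_t model_task_fred_rsp0(const struct task_model *t)
{
	assert(t != NULL && t->stack != NULL);
	return (uintptr_t)model_task_stack_page(t) + MODEL_THREAD_SIZE;
}
```

Because the value depends only on the task, code restoring the host state could compute it for the current task at that moment rather than caching whatever RDMSR returned earlier.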