On Wed, Sep 13, 2023 at 09:00:03PM -0400, Tyler Stachecki wrote:
Live-migrations under qemu result in guest corruption when the following three conditions are all met:
  * The source host CPU supports xfeatures beyond those in the guest CPU's fpstate->user_xfeatures
  * The source kernel emits guest_fpu->user_xfeatures with respect to the host CPU (i.e., it *does not* have the "Fixes:" commit)
  * The destination kernel enforces that the xfeatures in the buffer given to KVM_SET_XSAVE are compatible with the guest CPU (i.e., it *does* have the "Fixes:" commit)
When these conditions are met, the semantic changes to fpstate->user_xfeatures trigger a subtle bug in qemu that results in qemu failing to put the XSAVE architectural state into KVM.
qemu then both ceases to put the remaining (non-XSAVE) x86 architectural state into KVM and makes the fateful mistake of resuming the guest anyway. This usually results in immediate guest corruption, silent or not.
Due to the grave nature of this qemu bug, attempt to retain the behavior of old kernels by clamping the xfeatures specified in the buffer given to KVM_SET_XSAVE so that they align with the guest's fpstate->user_xfeatures instead of returning an error.
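The difference between rejecting and clamping can be shown with a minimal user-space sketch. This is not kernel code: the helper names set_xsave_strict() and set_xsave_clamped() and the XF_* constants are hypothetical, and the -EINVAL return value is an assumption made for illustration. Only the masking step mirrors what the patch does.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative xfeature bits, following the XCR0 bit layout. */
#define XF_X87  (1ULL << 0)
#define XF_SSE  (1ULL << 1)
#define XF_AVX  (1ULL << 2)
#define XF_PKRU (1ULL << 9)

/*
 * Hypothetical "strict" behavior: refuse a buffer whose header names
 * any xfeature the guest CPU was not configured with.
 */
static int set_xsave_strict(uint64_t buf_xfeatures, uint64_t guest_xfeatures)
{
	if (buf_xfeatures & ~guest_xfeatures)
		return -EINVAL;	/* assumed error code, for illustration only */
	return 0;
}

/*
 * Hypothetical "clamped" behavior: drop the bits the guest cannot have,
 * then apply the same check, which can no longer fail on extra bits.
 */
static int set_xsave_clamped(uint64_t *buf_xfeatures, uint64_t guest_xfeatures)
{
	*buf_xfeatures &= guest_xfeatures;
	return set_xsave_strict(*buf_xfeatures, guest_xfeatures);
}
```

With a buffer produced relative to the host CPU (say x87, SSE, AVX and PKRU) and a guest limited to x87, SSE and AVX, the strict variant fails while the clamped variant succeeds and leaves only the guest's bits in the header.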
So, IIUC, the xfeatures from the source guest will be different from the xfeatures of the target (destination) guest. Is that correct?
It does not seem right to me. I mean, from the guest viewpoint, some features will simply vanish during execution, and this could lead to major issues in the guest.
The idea here is that if the target (destination) host can't provide those features for the guest, then migration should fail.
I mean, qemu should fail the migration, and that's the correct behavior. Is that what is happening?
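As a rough sketch of the kind of check being asked about here, assuming the xfeatures are compared as plain bitmasks: the helper migration_xfeatures_ok() is hypothetical and is not taken from qemu or KVM.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true only if every xfeature the guest
 * had on the source is also available to it on the destination.
 * A caller would fail the migration when this returns false.
 */
static bool migration_xfeatures_ok(uint64_t src_guest_xfeatures,
				   uint64_t dst_supported_xfeatures)
{
	return (src_guest_xfeatures & ~dst_supported_xfeatures) == 0;
}
```

Under this check a destination offering a superset of the source's bits is accepted, and one missing any bit is refused rather than silently dropping state.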
Regards,
Leo
Fixes: ad856280ddea ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0")
Cc: stable@vger.kernel.org
Cc: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Tyler Stachecki <tstachecki@bloomberg.net>
---
 arch/x86/kvm/x86.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6c9c81e82e65..baad160b592f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5407,11 +5407,21 @@ static void kvm_vcpu_ioctl_x86_get_xsave2(struct kvm_vcpu *vcpu,
 static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
 					struct kvm_xsave *guest_xsave)
 {
+	union fpregs_state *ustate = (union fpregs_state *) guest_xsave->region;
+	u64 user_xfeatures = vcpu->arch.guest_fpu.fpstate->user_xfeatures;
+
 	if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
 		return 0;
 
-	return fpu_copy_uabi_to_guest_fpstate(&vcpu->arch.guest_fpu,
-					      guest_xsave->region,
+	/*
+	 * In previous kernels, kvm_arch_vcpu_create() set the guest's fpstate
+	 * based on what the host CPU supported. Recent kernels changed this
+	 * and only accept ustate containing xfeatures that the guest CPU is
+	 * capable of supporting.
+	 */
+	ustate->xsave.header.xfeatures &= user_xfeatures;
+
+	return fpu_copy_uabi_to_guest_fpstate(&vcpu->arch.guest_fpu, ustate,
 					      kvm_caps.supported_xcr0,
 					      &vcpu->arch.pkru);
 }
2.30.2