On Thu, Sep 14, 2023 at 05:11:50AM -0400, Tyler Stachecki wrote:
On Thu, Sep 14, 2023 at 04:15:54AM -0300, Leonardo Bras wrote:
So, IIUC, the xfeatures from the source guest will be different than the xfeatures of the target (destination) guest. Is that correct?
Correct.
It does not seem right to me. I mean, from the guest viewpoint, some features will simply vanish during execution, and this could lead to major issues in the guest.
My assumption is that the guest CPU model should confine access to registers that make sense for that (guest) CPU.
e.g., take a host CPU capable of AVX-512 running a guest CPU model that only has AVX-256. If the guest suddenly loses the top 256 bits of %zmm*, it should not really be perceivable as %ymm architecturally remains unchanged.
Though maybe I'm being too rash here? Is there a case where this assumption breaks down?
There is no guarantee that it would be ok to simply remove a feature from the guest. Maybe it's fine in 99% of the cases for a given feature, but it could always go wrong.
The idea here is that if the target (destination) host can't provide those features for the guest, then migration should fail.
I mean, qemu should fail the migration, and that's the correct behavior. Is that what is happening?
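To make the expected check concrete, here is a rough sketch of the subset test on the xfeatures bitmaps. The bit positions follow the XCR0 layout; the helper names (xfeatures_missing_on_dest, check_xfeatures_compatible) are made up for illustration and are not taken from QEMU or the kernel.

```c
#include <stdint.h>
#include <stdio.h>

/* XSAVE state-component bits, as laid out in XCR0. */
#define XFEATURE_MASK_FP        (1ULL << 0)  /* x87 */
#define XFEATURE_MASK_SSE       (1ULL << 1)  /* %xmm */
#define XFEATURE_MASK_YMM       (1ULL << 2)  /* upper halves of %ymm */
#define XFEATURE_MASK_OPMASK    (1ULL << 5)  /* AVX-512 %k0-%k7 */
#define XFEATURE_MASK_ZMM_Hi256 (1ULL << 6)  /* upper 256 bits of %zmm0-15 */
#define XFEATURE_MASK_Hi16_ZMM  (1ULL << 7)  /* %zmm16-31 */

/* Hypothetical helper: features the source state uses that the
 * destination cannot hold. */
static uint64_t xfeatures_missing_on_dest(uint64_t src, uint64_t dst)
{
    return src & ~dst;
}

/* Hypothetical check: 0 if the destination covers the source,
 * -1 if the migration should be refused. */
static int check_xfeatures_compatible(uint64_t src, uint64_t dst)
{
    uint64_t missing = xfeatures_missing_on_dest(src, dst);

    if (missing) {
        fprintf(stderr, "refusing migration: destination lacks xfeatures 0x%llx\n",
                (unsigned long long)missing);
        return -1;
    }
    return 0;
}
```

With an AVX-512 source and an AVX-only destination, the missing set is bits 5 to 7, so the check fails instead of silently dropping state.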
Unfortunately, no, it is not... and that is the biggest concern right now.
I do see some discussion between Peter and you on this topic and see that there was an RFC to implement such behavior stemming from it, here: https://lore.kernel.org/qemu-devel/20220607230645.53950-1-peterx@redhat.com/
... though I do not believe that work ever landed in the tree. Looking at qemu's master branch now, the error from kvm_arch_put_registers is just discarded in do_kvm_cpu_synchronize_post_init...
This is wrong, then. QEMU should abort the migration in this case, so the VM is not lost.
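As a rough sketch of what I mean (not a patch): the return value of kvm_arch_put_registers should be checked and handed back to the caller, so the incoming migration can be failed. The CPUState below is a cut-down stand-in, and kvm_arch_put_registers_stub, put_registers_result and synchronize_post_init_checked are hypothetical names used only for this example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Cut-down stand-in for QEMU's CPUState; only what this sketch needs. */
typedef struct CPUState {
    bool vcpu_dirty;
    int put_registers_result;  /* hypothetical: value the stub returns */
} CPUState;

/* Stub standing in for kvm_arch_put_registers(); the real function
 * pushes the register state into KVM and returns a negative errno
 * on failure. */
static int kvm_arch_put_registers_stub(CPUState *cpu)
{
    return cpu->put_registers_result;
}

/* Hypothetical variant of the post-init sync that reports failure
 * instead of discarding it. */
static int synchronize_post_init_checked(CPUState *cpu)
{
    int ret = kvm_arch_put_registers_stub(cpu);

    if (ret < 0) {
        fprintf(stderr, "failed to put vCPU registers: error %d\n", ret);
        return ret;  /* the caller would fail the incoming migration */
    }
    cpu->vcpu_dirty = false;  /* only mark clean once the state is loaded */
    return 0;
}
```

In the real code the function runs through run_on_cpu and returns void, so the error would have to be carried back some other way; this sketch leaves that part out.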
Of course, with this issue fixed, there is another issue to deal with: VMs running on hosts with older kernels get stuck on hosts without the fixes.
Thanks, Leo
static void do_kvm_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg)
{
    kvm_arch_put_registers(cpu, KVM_PUT_FULL_STATE);
    cpu->vcpu_dirty = false;
}
Best, Tyler