On Sun, Dec 06, 2020 at 05:19:16PM +0100, Thomas Gleixner wrote:
> On Thu, Dec 03 2020 at 19:11, Maxim Levitsky wrote:
> >    case KVM_SET_TSC_STATE: {
> >        struct kvm_tsc_state __user *user_tsc_state = argp;
> >        struct kvm_tsc_state tsc_state;
> >        u64 host_tsc, wall_nsec;
> >        u64 new_guest_tsc, new_guest_tsc_offset;
> >
> >        r = -EFAULT;
> >        if (copy_from_user(&tsc_state, user_tsc_state, sizeof(tsc_state)))
> >            goto out;
> >
> >        kvm_get_walltime(&wall_nsec, &host_tsc);
> >        new_guest_tsc = tsc_state.tsc;
> >
> >        if (tsc_state.flags & KVM_TSC_STATE_TIMESTAMP_VALID) {
> >            s64 diff = wall_nsec - tsc_state.nsec;
> >            if (diff >= 0)
> >                new_guest_tsc += nsec_to_cycles(vcpu, diff);
> >            else
> >                new_guest_tsc -= nsec_to_cycles(vcpu, -diff);
> >        }
> >
> >        new_guest_tsc_offset = new_guest_tsc - kvm_scale_tsc(vcpu, host_tsc);
> >        kvm_vcpu_write_tsc_offset(vcpu, new_guest_tsc_offset);
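For context, the intended use on the destination host after a migration would
look roughly like the sketch below from the VMM side. This is only an
illustration based on the fields and flag visible in the hunk above (.tsc,
.nsec, .flags, KVM_TSC_STATE_TIMESTAMP_VALID); the exact uapi layout, the fd
the ioctl is issued on, and the restore_tsc_state()/saved_* names are
assumptions for the example, not part of the patch:

#include <sys/ioctl.h>
#include <err.h>
#include <linux/types.h>
#include <linux/kvm.h>  /* would carry the new struct/ioctl if the series were merged */

/*
 * Sketch of the destination-side restore after migration. vcpu_fd,
 * saved_guest_tsc and saved_wall_clock_nsec are placeholders for state
 * the VMM carried over from the source host.
 */
static void restore_tsc_state(int vcpu_fd, __u64 saved_guest_tsc,
                              __u64 saved_wall_clock_nsec)
{
        struct kvm_tsc_state state = {
                .tsc   = saved_guest_tsc,       /* guest TSC captured on the source */
                .nsec  = saved_wall_clock_nsec, /* wall clock (ns) at capture time */
                .flags = KVM_TSC_STATE_TIMESTAMP_VALID,
        };

        /*
         * With the flag set, the handler above advances .tsc by the wall
         * clock time that elapsed since .nsec before programming the new
         * offset, so the guest TSC keeps ticking across migration downtime.
         */
        if (ioctl(vcpu_fd, KVM_SET_TSC_STATE, &state) < 0)
                err(1, "KVM_SET_TSC_STATE");
}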
> From a timekeeping POV and the guest's expectation of TSC this is
> fundamentally wrong:
>
>     tsc_guest = scaled(tsc_host) + offset
>
> The TSC has to be viewed system-wide, not per CPU. It is used
> system-wide for timekeeping, and for that to work it has to be
> synchronized. Why would this be different on virt? Just because it's
> virt, or what?
>
> Migration is a guest-wide thing and you're not migrating single vCPUs.
>
> This hackery just papers over the underlying design failure: KVM looks
> at the TSC per vCPU, which is the root cause, and that needs to be
> fixed.
It already does that: the unified TSC offset is kept in kvm->arch.cur_tsc_offset.
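To make the system-wide view concrete, a restore along the lines Thomas is
asking for would take one host TSC / wall clock snapshot per VM and then
program every vCPU from it, roughly like the sketch below. This is purely
illustrative: kvm_set_tsc_state_vm() is a made-up name, locking and error
handling are omitted, and it only reuses the helpers already visible above
(kvm_get_walltime, nsec_to_cycles, kvm_scale_tsc, kvm_vcpu_write_tsc_offset)
plus the stock kvm_for_each_vcpu iterator:

/*
 * Illustration only: compute the restored guest TSC from a single per-VM
 * snapshot and program every vCPU from it, instead of each vCPU taking
 * its own snapshot as in the hunk quoted above.
 */
static void kvm_set_tsc_state_vm(struct kvm *kvm, u64 guest_tsc, u64 nsec)
{
        struct kvm_vcpu *vcpu;
        unsigned long i;
        u64 host_tsc, wall_nsec;

        /* One host TSC / wall clock snapshot for the whole VM ... */
        kvm_get_walltime(&wall_nsec, &host_tsc);

        kvm_for_each_vcpu(i, vcpu, kvm) {
                u64 tsc = guest_tsc;

                /* ... so every vCPU is advanced by the same elapsed time ... */
                if (wall_nsec >= nsec)
                        tsc += nsec_to_cycles(vcpu, wall_nsec - nsec);
                else
                        tsc -= nsec_to_cycles(vcpu, nsec - wall_nsec);

                /* ... and derives its tsc_guest = scaled(tsc_host) + offset
                 * from that one shared snapshot. */
                kvm_vcpu_write_tsc_offset(vcpu, tsc - kvm_scale_tsc(vcpu, host_tsc));
        }
}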