On Mon, Nov 11, 2024, 21:55 Doug Covelli doug.covelli@broadcom.com wrote:
BDOOR_CMD_VCPU_MMIO_HONORS_PAT and BDOOR_CMD_VCPU_LEGACY_X2APIC_OK are not actually backdoor calls - they are flags returned by BDOOR_CMD_GET_VCPU_INFO.
BDOOR_CMD_VCPU_MMIO_HONORS_PAT is only ever set to 1 on ESX, as it is only relevant for PCI passthru, which is not supported on Linux/Windows/macOS. IIRC this was added over 10 years ago for some Infiniband device vendor to use in their driver, although I'm not sure that ever materialized.
Ok. So I guess false is safe.
BDOOR_CMD_VCPU_LEGACY_X2APIC_OK indicates if it is OK to use x2APIC w/o interrupt remapping (e.g. a virtual IOMMU). I'm not sure if KVM supports this, but I think this one can be set to TRUE unconditionally as we have no plans to use KVM_CREATE_IRQCHIP. If anything we would use KVM_CAP_SPLIT_IRQCHIP, although my preference would be to handle all APIC/IOAPIC/PIC emulation ourselves provided we can avoid CR8 exits, but that is another discussion.
Split irqchip should be the best tradeoff. Without it, moves from cr8 stay in the kernel, but moves to cr8 always go to userspace with a KVM_EXIT_SET_TPR exit. You also won't be able to use Intel flexpriority (in-processor accelerated TPR) because KVM does not know which bits are set in IRR. So it will be *really* every move to cr8 that goes to userspace.
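For reference, a minimal sketch of how userspace might turn on split irqchip through KVM_ENABLE_CAP. The function names and the choice of 24 reserved routes are assumptions made for illustration; args[0] is the number of routes reserved for the userspace IOAPIC, and the call has to happen before any vCPU is created.

```c
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/* Routes reserved for the userspace IOAPIC. 24 is an assumption of this
 * sketch (one per pin of a conventional IOAPIC), not a required value. */
#define SKETCH_USERSPACE_IOAPIC_ROUTES 24

/* Fill in the KVM_ENABLE_CAP argument for KVM_CAP_SPLIT_IRQCHIP. */
void fill_split_irqchip_cap(struct kvm_enable_cap *cap)
{
    memset(cap, 0, sizeof(*cap));
    cap->cap = KVM_CAP_SPLIT_IRQCHIP;
    cap->args[0] = SKETCH_USERSPACE_IOAPIC_ROUTES;
}

/* Enable split irqchip on an existing VM file descriptor.
 * Returns 0 on success, -1 with errno set on failure. */
int enable_split_irqchip(int vm_fd)
{
    struct kvm_enable_cap cap;

    fill_split_irqchip_cap(&cap);
    return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}
```

With this in place the local APIC is emulated in the kernel while the IOAPIC and PIC stay in userspace, which is the tradeoff described above.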
For now I think it makes sense to handle BDOOR_CMD_GET_VCPU_INFO at userlevel like we do on Windows and macOS.
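A rough sketch of what composing the BDOOR_CMD_GET_VCPU_INFO flags at userlevel could look like, with MMIO_HONORS_PAT reported as false and LEGACY_X2APIC_OK as true, as discussed above. The macro names, the bit positions, and the helper function are all assumptions made for illustration; the real values need to be taken from the backdoor definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are assumptions of this sketch, not verified values. */
#define SKETCH_VCPU_LEGACY_X2APIC_OK_BIT 3
#define SKETCH_VCPU_MMIO_HONORS_PAT_BIT  4

/* Build the flag word returned for a GET_VCPU_INFO request. */
uint32_t vcpu_info_flags(bool legacy_x2apic_ok, bool mmio_honors_pat)
{
    uint32_t flags = 0;

    if (legacy_x2apic_ok)
        flags |= UINT32_C(1) << SKETCH_VCPU_LEGACY_X2APIC_OK_BIT;
    if (mmio_honors_pat)
        flags |= UINT32_C(1) << SKETCH_VCPU_MMIO_HONORS_PAT_BIT;
    return flags;
}
```

On a KVM host the call would then be vcpu_info_flags(true, false), since PCI passthru is the only case where MMIO_HONORS_PAT matters.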
BDOOR_CMD_GETTIME/BDOOR_CMD_GETTIMEFULL are similar with the former being deprecated in favor of the latter. Both do essentially the same thing which is to return the host OS's time - on Linux this is obtained via gettimeofday. I believe this is mainly used by tools to fix up the VM's time when resuming from suspend. I think it is fine to continue handling these at userlevel.
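As an illustration of the userlevel handling, a minimal sketch that reads the host's wall-clock time with gettimeofday. The struct and function names are hypothetical, and how the result is placed in guest registers is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical container for the host time handed back to the guest. */
struct host_time {
    uint64_t secs;   /* seconds since the Unix epoch */
    uint32_t usecs;  /* microseconds within the current second */
};

/* Read the host OS wall-clock time; no TSC value is involved.
 * Returns 0 on success, -1 if gettimeofday fails. */
int read_host_time(struct host_time *out)
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    out->secs = (uint64_t)tv.tv_sec;
    out->usecs = (uint32_t)tv.tv_usec;
    return 0;
}
```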
As long as the TSC is not involved it should be okay.
Paolo
Anyway, one question apart from this: is the API the same for the I/O port and hypercall backdoors?
Yeah, the calls and arguments are the same. The hypercall based interface is an attempt to modernize the backdoor since, as you pointed out, the I/O based interface is kind of hacky as it bypasses the normal checks for an I/O port access at CPL3. It would be nice to get rid of it, but unfortunately I don't think that will happen in the foreseeable future as there are a lot of existing VMs out there with older SW that still use this interface.
Yeah, but I think it still justifies that the KVM_ENABLE_CAP API can enable the hypercall but not the I/O port.
Paolo