On Wed, Dec 3, 2025 at 1:26 PM David Woodhouse dwmw2@infradead.org wrote:
On Wed, 2025-12-03 at 00:50 +0000, Huang, Kai wrote:
-#define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) -#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) +#define KVM_X2APIC_API_USE_32BIT_IDS (_BITULL(0)) +#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (_BITULL(1)) +#define KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST (_BITULL(2)) +#define KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST (_BITULL(3))
I hate to say, but wants to ask again:
Since it's uAPI, are we expecting the two flags to have impact on in-kernel ioapic?
I think there should no harm to make the two also apply to in-kernel ioapic.
E.g., for now we can reject KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST flag for in-kernel ioapic. In the future, we might add EOI register support to in-kernel ioapic and report supporting suppress EOI broadcast, then we can in-kernel ioapic to honor these two flags too.
I don't think we should leave that to the unspecified 'future'. Let's fix the kernel I/O APIC to support the directed EOI at the same time, rather than having an interim version of KVM which supports the broadcast suppression but *not* the explicit EOI that replaces it.
Since I happened to have the I/O APIC PDFs in my recent history for other reasons, and implemented these extra registers for version 0x20 in another userspace VMM within living memory, I figured I could try to help with the actual implementation (untested, below).
There is some bikeshedding to be done on precisely *how* ->version_id should be set. Maybe we shouldn't have the ->version_id field, and should just check kvm->arch.suppress_eoi_broadcast to see which version to report?
That would make it impossible to use the fixed implementation on the local APIC side, without changing the way the IOAPIC appears to the guest.
There are no parameters that you can use in KVM_CREATE_IRQCHIP, unfortunately, and no checks that (for example) kvm_irqchip.pad or kvm_ioapic_state.pad are zero.
The best possibility that I can think of, is to model it like KVM_CAP_XSAVE2
1) Add a capability KVM_CAP_IRQCHIP2 (x86 only)
2) If reported, kvm_irqchip.pad becomes "flags" (a set of flag bits) and kvm_ioapic_state.pad becomes version_id when returned from KVM_GET_IRQCHIP. Using an anonymous union allows adding the synonyms.
3) On top of this, KVM_SET_IRQCHIP2 is added which checks that kvm_irqchip.flags is zero and that kvm_ioapic_state.version_id is either 0x11 or 0x20.
4) Leave the default to 0x11 for backwards compatibility.
The alternative is to add KVM_ENABLE_CAP(KVM_CAP_IRQCHIP2) but I dislike adding another stateful API.
Paolo