On Wed, 2025-12-03 at 00:50 +0000, Huang, Kai wrote:
-#define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) -#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) +#define KVM_X2APIC_API_USE_32BIT_IDS (_BITULL(0)) +#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (_BITULL(1)) +#define KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST (_BITULL(2)) +#define KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST (_BITULL(3))
I hate to say, but wants to ask again:
Since it's uAPI, are we expecting the two flags to have impact on in-kernel ioapic?
I think there should no harm to make the two also apply to in-kernel ioapic.
E.g., for now we can reject KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST flag for in-kernel ioapic. In the future, we might add EOI register support to in-kernel ioapic and report supporting suppress EOI broadcast, then we can in-kernel ioapic to honor these two flags too.
I don't think we should leave that to the unspecified 'future'. Let's fix the kernel I/O APIC to support the directed EOI at the same time, rather than having an interim version of KVM which supports the broadcast suppression but *not* the explicit EOI that replaces it.
Since I happened to have the I/O APIC PDFs in my recent history for other reasons, and implemented these extra registers for version 0x20 in another userspace VMM within living memory, I figured I could try to help with the actual implementation (untested, below).
There is some bikeshedding to be done on precisely *how* ->version_id should be set. Maybe we shouldn't have the ->version_id field, and should just check kvm->arch.suppress_eoi_broadcast to see which version to report?
While I'm at looking at it, I see there's an ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG) in the middle of kvm_ioapic_update_eoi_one(), a few lines after it drops ioapic->lock and then relocks it again. If it dropped the lock, doesn't that mean the guest could have *changed* the mode in the interim, making that ASSERT() guest-triggerable?
===== From: David Woodhouse dwmw@amazon.co.uk Subject: [PATCH] KVM: x86/ioapic: Implement support for I/O APIC version 0x20 with EOIR
As the weirdness with EOI broadcast suppression is being fixed in KVM, also update the in-kernel I/O APIC to handle the directed EOI which guests will need to use instead.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk --- arch/x86/kvm/ioapic.c | 22 +++++++++++++++++++++- arch/x86/kvm/ioapic.h | 18 +++++++++++------- 2 files changed, 32 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c index 2c2783296aed..e82f74a8e57e 100644 --- a/arch/x86/kvm/ioapic.c +++ b/arch/x86/kvm/ioapic.c @@ -49,7 +49,7 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic) switch (ioapic->ioregsel) { case IOAPIC_REG_VERSION: result = ((((IOAPIC_NUM_PINS - 1) & 0xff) << 16) - | (IOAPIC_VERSION_ID & 0xff)); + | ioapic->version_id); break;
case IOAPIC_REG_APIC_ID: @@ -57,6 +57,10 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic) result = ((ioapic->id & 0xf) << 24); break;
+ case IOAPIC_REG_BOOT_CONFIG: + result = 0x01; /* Processor bus */ + break; + default: { u32 redir_index = (ioapic->ioregsel - 0x10) >> 1; @@ -695,6 +699,21 @@ static int ioapic_mmio_write(struct kvm_vcpu *vcpu, struct kvm_io_device *this, ioapic_write_indirect(ioapic, data); break;
+ case IOAPIC_REG_EOIR: + if (ioapic->version_id >= 0x20) { + u8 vector = data & 0xff; + int i; + + rtc_irq_eoi(ioapic, vcpu, vector); + for (i = 0; i < IOAPIC_NUM_PINS; i++) { + union kvm_ioapic_redirect_entry *ent = &ioapic->redirtbl[i]; + + if (ent->fields.vector != vector) + continue; + kvm_ioapic_update_eoi_one(vcpu, ioapic, ent->fields.trig_mode, i); + } + } + break; default: break; } @@ -734,6 +753,7 @@ int kvm_ioapic_init(struct kvm *kvm) spin_lock_init(&ioapic->lock); INIT_DELAYED_WORK(&ioapic->eoi_inject, kvm_ioapic_eoi_inject_work); INIT_HLIST_HEAD(&ioapic->mask_notifier_list); + ioapic->version_id = IOAPIC_VERSION_ID; kvm->arch.vioapic = ioapic; kvm_ioapic_reset(ioapic); kvm_iodevice_init(&ioapic->dev, &ioapic_mmio_ops); diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h index bf28dbc11ff6..c8e44d726fbe 100644 --- a/arch/x86/kvm/ioapic.h +++ b/arch/x86/kvm/ioapic.h @@ -19,13 +19,16 @@ struct kvm_vcpu; #define IOAPIC_MEM_LENGTH 0x100
/* Direct registers. */ -#define IOAPIC_REG_SELECT 0x00 -#define IOAPIC_REG_WINDOW 0x10 +#define IOAPIC_REG_SELECT 0x00 +#define IOAPIC_REG_WINDOW 0x10 +#define IOAPIC_REG_IRQPA 0x20 +#define IOAPIC_REG_EOIR 0x40
-/* Indirect registers. */ -#define IOAPIC_REG_APIC_ID 0x00 /* x86 IOAPIC only */ -#define IOAPIC_REG_VERSION 0x01 -#define IOAPIC_REG_ARB_ID 0x02 /* x86 IOAPIC only */ +/* INDIRECT registers. */ +#define IOAPIC_REG_APIC_ID 0x00 /* x86 IOAPIC only */ +#define IOAPIC_REG_VERSION 0x01 +#define IOAPIC_REG_ARB_ID 0x02 /* x86 IOAPIC only */ +#define IOAPIC_REG_BOOT_CONFIG 0x03 /* x86 IOAPIC only */
/*ioapic delivery mode*/ #define IOAPIC_FIXED 0x0 @@ -76,7 +79,8 @@ struct kvm_ioapic { u32 ioregsel; u32 id; u32 irr; - u32 pad; + u8 version_id; + u8 pad[3]; union kvm_ioapic_redirect_entry redirtbl[IOAPIC_NUM_PINS]; unsigned long irq_states[IOAPIC_NUM_PINS]; struct kvm_io_device dev;