Add two flags for KVM_CAP_X2APIC_API to allow userspace to control support for Suppress EOI Broadcasts, which KVM completely mishandles. When x2APIC support was first added, KVM incorrectly advertised and "enabled" Suppress EOI Broadcast, without fully supporting the I/O APIC side of the equation, i.e. without adding directed EOI to KVM's in-kernel I/O APIC.
That flaw was carried over to split IRQCHIP support, i.e. KVM advertised support for Suppress EOI Broadcasts irrespective of whether or not the userspace I/O APIC implementation supported directed EOIs. Even worse, KVM didn't actually suppress EOI broadcasts, i.e. userspace VMMs without support for directed EOI came to rely on the "spurious" broadcasts.
KVM "fixed" the in-kernel I/O APIC implementation by completely disabling support for Suppress EOI Broadcasts in commit 0bcc3fb95b97 ("KVM: lapic: stop advertising DIRECTED_EOI when in-kernel IOAPIC is in use"), but didn't do anything to remedy userspace I/O APIC implementations.
KVM's bogus handling of Suppress EOI Broadcast is problematic when the guest relies on interrupts being masked in the I/O APIC until well after the initial local APIC EOI. E.g. Windows with Credential Guard enabled handles interrupts in the following order: 1. Interrupt for L2 arrives. 2. L1 APIC EOIs the interrupt. 3. L1 resumes L2 and injects the interrupt. 4. L2 EOIs after servicing. 5. L1 performs the I/O APIC EOI.
Because KVM EOIs the I/O APIC at step #2, the guest can get an interrupt storm, e.g. if the IRQ line is still asserted and userspace reacts to the EOI by re-injecting the IRQ, because the guest doesn't de-assert the line until step #4, and doesn't expect the interrupt to be re-enabled until step #5.
Unfortunately, simply "fixing" the bug isn't an option, as KVM has no way of knowing if the userspace I/O APIC supports directed EOIs, i.e. suppressing EOI broadcasts would result in interrupts being stuck masked in the userspace I/O APIC due to step #5 being ignored by userspace. And fully disabling support for Suppress EOI Broadcast is also undesirable, as picking up the fix would require a guest reboot, *and* more importantly would change the virtual CPU model exposed to the guest without any buy-in from userspace.
Add two flags to allow userspace to choose exactly how to solve the immediate issue, and in the long term to allow userspace to control the virtual CPU model that is exposed to the guest (KVM should never have enabled support for Suppress EOI Broadcast without a userspace opt-in).
Note, Suppress EOI Broadcasts is defined only in Intel's SDM, not in AMD's APM. But the bit is writable on some AMD CPUs, e.g. Turin, and KVM's ABI is to support Directed EOI (KVM's name) irrespective of guest CPU vendor.
Fixes: 7543a635aa09 ("KVM: x86: Add KVM exit for IOAPIC EOIs") Closes: https://lore.kernel.org/kvm/7D497EF1-607D-4D37-98E7-DAF95F099342@nutanix.com Cc: stable@vger.kernel.org Co-developed-by: Sean Christopherson seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Khushit Shah khushit.shah@nutanix.com --- v3: - Resend with correct patch contents; v2 mail formatting was broken. No functional changes beyond the naming and grammar feedback from v1.
Testing: I ran the tests with QEMU 9.1 and a 6.12 kernel with the patch applied. - With an unmodified QEMU build, KVM's LAPIC SEOIB behavior remains unchanged. - Invoking the x2APIC API with KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK correctly suppresses LAPIC -> IOAPIC EOI broadcasts (verified via KVM tracepoints). - Invoking the x2APIC API with KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST results in SEOIB not being advertised to the guest, as expected (confirmed by checking the LAPIC LVR value inside the guest).
I'll send the corresponding QEMU-side patch shortly. --- Documentation/virt/kvm/api.rst | 14 ++++++++++++-- arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/include/uapi/asm/kvm.h | 6 ++++-- arch/x86/kvm/lapic.c | 13 +++++++++++++ arch/x86/kvm/x86.c | 12 +++++++++--- 5 files changed, 40 insertions(+), 7 deletions(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 57061fa29e6a..4141d2bd8156 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7800,8 +7800,10 @@ Will return -EBUSY if a VCPU has already been created.
Valid feature flags in args[0] are::
- #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) - #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) + #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK (1ULL << 2) + #define KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST (1ULL << 3)
Enabling KVM_X2APIC_API_USE_32BIT_IDS changes the behavior of KVM_SET_GSI_ROUTING, KVM_SIGNAL_MSI, KVM_SET_LAPIC, and KVM_GET_LAPIC, @@ -7814,6 +7816,14 @@ as a broadcast even in x2APIC mode in order to support physical x2APIC without interrupt remapping. This is undesirable in logical mode, where 0xff represents CPUs 0-7 in cluster 0.
+Setting KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK overrides +KVM's quirky behavior of not actually suppressing EOI broadcasts for split IRQ +chips when support for Suppress EOI Broadcasts is advertised to the guest. + +Setting KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST disables support for +Suppress EOI Broadcasts entirely, i.e. instructs KVM to NOT advertise support +to the guest and thus disallow enabling EOI broadcast suppression in SPIV. + 7.8 KVM_CAP_S390_USER_INSTR0 ----------------------------
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 48598d017d6f..f6fdc0842c05 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1480,6 +1480,8 @@ struct kvm_arch {
bool x2apic_format; bool x2apic_broadcast_quirk_disabled; + bool disable_ignore_suppress_eoi_broadcast_quirk; + bool x2apic_disable_suppress_eoi_broadcast;
bool has_mapped_host_mmio; bool guest_can_read_msr_platform_info; diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index d420c9c066d4..7255385b6d80 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -913,8 +913,10 @@ struct kvm_sev_snp_launch_finish { __u64 pad1[4]; };
-#define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) -#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) +#define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) +#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) +#define KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK (1ULL << 2) +#define KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST (1ULL << 3)
struct kvm_hyperv_eventfd { __u32 conn_id; diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 0ae7f913d782..cf8a2162872b 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -562,6 +562,7 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu) * IOAPIC. */ if (guest_cpu_cap_has(vcpu, X86_FEATURE_X2APIC) && + !vcpu->kvm->arch.x2apic_disable_suppress_eoi_broadcast && !ioapic_in_kernel(vcpu->kvm)) v |= APIC_LVR_DIRECTED_EOI; kvm_lapic_set_reg(apic, APIC_LVR, v); @@ -1517,6 +1518,18 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int vector)
/* Request a KVM exit to inform the userspace IOAPIC. */ if (irqchip_split(apic->vcpu->kvm)) { + /* + * Don't exit to userspace if the guest has enabled Directed + * EOI, a.k.a. Suppress EOI Broadcasts, in which case the local + * APIC doesn't broadcast EOIs (the guest must EOI the target + * I/O APIC(s) directly). Ignore the suppression if userspace + * has NOT disabled KVM's quirk (KVM advertised support for + * Suppress EOI Broadcasts without actually suppressing EOIs). + */ + if ((kvm_lapic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI) && + apic->vcpu->kvm->arch.disable_ignore_suppress_eoi_broadcast_quirk) + return; + apic->vcpu->arch.pending_ioapic_eoi = vector; kvm_make_request(KVM_REQ_IOAPIC_EOI_EXIT, apic->vcpu); return; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c9c2aa6f4705..e1b6fe783615 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -121,8 +121,11 @@ static u64 __read_mostly efer_reserved_bits = ~((u64)EFER_SCE);
#define KVM_CAP_PMU_VALID_MASK KVM_PMU_CAP_DISABLE
-#define KVM_X2APIC_API_VALID_FLAGS (KVM_X2APIC_API_USE_32BIT_IDS | \ - KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK) +#define KVM_X2APIC_API_VALID_FLAGS \ + (KVM_X2APIC_API_USE_32BIT_IDS | \ + KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK | \ + KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK | \ + KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST)
static void update_cr8_intercept(struct kvm_vcpu *vcpu); static void process_nmi(struct kvm_vcpu *vcpu); @@ -6782,7 +6785,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, kvm->arch.x2apic_format = true; if (cap->args[0] & KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK) kvm->arch.x2apic_broadcast_quirk_disabled = true; - + if (cap->args[0] & KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK) + kvm->arch.disable_ignore_suppress_eoi_broadcast_quirk = true; + if (cap->args[0] & KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST) + kvm->arch.x2apic_disable_suppress_eoi_broadcast = true; r = 0; break; case KVM_CAP_X86_DISABLE_EXITS:
On Tue, 2025-11-25 at 18:05 +0000, Khushit Shah wrote:
Add two flags for KVM_CAP_X2APIC_API to allow userspace to control support for Suppress EOI Broadcasts, which KVM completely mishandles. When x2APIC support was first added, KVM incorrectly advertised and "enabled" Suppress EOI Broadcast, without fully supporting the I/O APIC side of the equation, i.e. without adding directed EOI to KVM's in-kernel I/O APIC.
That flaw was carried over to split IRQCHIP support, i.e. KVM advertised support for Suppress EOI Broadcasts irrespective of whether or not the userspace I/O APIC implementation supported directed EOIs. Even worse, KVM didn't actually suppress EOI broadcasts, i.e. userspace VMMs without support for directed EOI came to rely on the "spurious" broadcasts.
KVM "fixed" the in-kernel I/O APIC implementation by completely disabling support for Suppress EOI Broadcasts in commit 0bcc3fb95b97 ("KVM: lapic: stop advertising DIRECTED_EOI when in-kernel IOAPIC is in use"), but didn't do anything to remedy userspace I/O APIC implementations.
KVM's bogus handling of Suppress EOI Broadcast is problematic when the guest relies on interrupts being masked in the I/O APIC until well after the initial local APIC EOI. E.g. Windows with Credential Guard enabled handles interrupts in the following order:
- Interrupt for L2 arrives.
- L1 APIC EOIs the interrupt.
- L1 resumes L2 and injects the interrupt.
- L2 EOIs after servicing.
- L1 performs the I/O APIC EOI.
Because KVM EOIs the I/O APIC at step #2, the guest can get an interrupt storm, e.g. if the IRQ line is still asserted and userspace reacts to the EOI by re-injecting the IRQ, because the guest doesn't de-assert the line until step #4, and doesn't expect the interrupt to be re-enabled until step #5.
Unfortunately, simply "fixing" the bug isn't an option, as KVM has no way of knowing if the userspace I/O APIC supports directed EOIs, i.e. suppressing EOI broadcasts would result in interrupts being stuck masked in the userspace I/O APIC due to step #5 being ignored by userspace. And fully disabling support for Suppress EOI Broadcast is also undesirable, as picking up the fix would require a guest reboot, *and* more importantly would change the virtual CPU model exposed to the guest without any buy-in from userspace.
Add two flags to allow userspace to choose exactly how to solve the immediate issue, and in the long term to allow userspace to control the virtual CPU model that is exposed to the guest (KVM should never have enabled support for Suppress EOI Broadcast without a userspace opt-in).
Note, Suppress EOI Broadcasts is defined only in Intel's SDM, not in AMD's APM. But the bit is writable on some AMD CPUs, e.g. Turin, and KVM's ABI is to support Directed EOI (KVM's name) irrespective of guest CPU vendor.
Fixes: 7543a635aa09 ("KVM: x86: Add KVM exit for IOAPIC EOIs") Closes: https://lore.kernel.org/kvm/7D497EF1-607D-4D37-98E7-DAF95F099342@nutanix.com Cc: stable@vger.kernel.org Co-developed-by: Sean Christopherson seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Khushit Shah khushit.shah@nutanix.com
Reviewed-by: Kai Huang kai.huang@intel.com
On Tue, 2025-12-02 at 07:42 -0800, Sean Christopherson wrote:
On Tue, Dec 02, 2025, David Woodhouse wrote:
On Tue, 2025-12-02 at 12:58 +0000, Khushit Shah wrote:
Thanks for the review!
On 2 Dec 2025, at 2:43 PM, David Woodhouse dwmw2@infradead.org wrote:
Firstly, excellent work debugging and diagnosing that!
On Tue, 2025-11-25 at 18:05 +0000, Khushit Shah wrote:
--- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7800,8 +7800,10 @@ Will return -EBUSY if a VCPU has already been created. Valid feature flags in args[0] are:: - #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) - #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) + #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK (1ULL << 2) + #define KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST (1ULL << 3)
I kind of hate these names. This part right here is what we leave behind for future generations, to understand the weird behaviour of KVM. To have "IGNORE" "SUPPRESS" "QUIRK" all in the same flag, quite apart from the length of the token, makes my brain hurt.
...
Could we perhaps call them 'ENABLE_SUPPRESS_EOI_BROADCAST' and 'DISABLE_SUPPRESS_EOI_BROADCAST', with a note saying that modern VMMs should always explicitly enable one or the other, because for historical reasons KVM only *pretends* to support it by default but it doesn't actually work correctly?
I don't disagree on the names being painful, but ENABLE_SUPPRESS_EOI_BROADCAST vs. DISABLE_SUPPRESS_EOI_BROADCAST won't work, and is even more confusing IMO.
I dunno, KVM never actually *did* suppress the EOI broadcast anyway, did it? This fix really *does* enable it — as opposed to just pretending to?
I was thinking along the lines of ...
Setting KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST causes KVM to advertise and correctly implement the Directed EOI feature in the local APIC, suppressing broadcast EOI when the feature is enabled by the guest.
Setting KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST causes KVM not to advertise the Directed EOI feature in the local APIC.
Userspace should explicitly either enable or disable the EOI broadcast using one of the two flags above. For historical compatibility reasons, if neither flag is set then KVM will advertise the feature but will not actually suppress the EOI broadcast, leading to potential IRQ storms in some guest configurations.
On Tue, Dec 02, 2025, David Woodhouse wrote:
On Tue, 2025-12-02 at 08:36 -0800, Sean Christopherson wrote:
Hmm, I suppose that could work for uAPI. Having both an ENABLE and a DISABLE is obviously a bit odd, but slowing down the reader might actually be a good thing in this case. And the documentation should be easy enough to write.
I was worried that having ENABLE and DISABLE controls would lead to confusing code internally, but there's no reason KVM's internal tracking needs to match uAPI.
How about this?
arch/x86/include/asm/kvm_host.h | 7 +++++++ arch/x86/include/uapi/asm/kvm.h | 6 ++++-- arch/x86/kvm/lapic.c | 16 +++++++++++++++- arch/x86/kvm/x86.c | 15 ++++++++++++--- 4 files changed, 38 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 5a3bfa293e8b..b4c41255f01d 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1226,6 +1226,12 @@ enum kvm_irqchip_mode { KVM_IRQCHIP_SPLIT, /* created with KVM_CAP_SPLIT_IRQCHIP */ }; +enum kvm_suppress_eoi_broadcast_mode {
- KVM_SUPPRESS_EOI_QUIRKED,
- KVM_SUPPRESS_EOI_ENABLED,
- KVM_SUPPRESS_EOI_DISABLED,
+};
Looks good. I'd probably call it KVM_SUPPRESS_EOI_LEGACY though?
Why legacy? "Quirk" has specific meaning in KVM: technically broken behavior that is retained as the default for backwards compatibility. "Legacy" does not, outside of a few outliers like HPET crud.
And just for clarity I wouldn't embed the explicit checks against e.g arch.suppress_eoi_broadcast != KVM_SUPPRESS_EOI_LEGACY. I'd make static inline functions like
Ya, definitely no objection,
static inline bool kvm_lapic_advertise_directed_eoi(kvm)
s/directed_eoi/suppress_eoi_broadcast. I want to provide as clear of split as possible between the local APIC feature and the I/O APIC feature.
{ /* Legacy behaviour was to advertise this feature but it didn't * actually work. */ return kvm->arch.suppress_eoi_broadcast != KVM_SUPPRESS_EOI_DISABLED; }
static inline bool kvm_lapic_suppress_directed_eoi(kvm)
Too close to "suppress EOI broadcast", e.g. it would be easy to read this as "suppress EOIs" and invert the polarity. It's wordy, but I think kvm_lapic_ignore_suppress_eoi_broadcast() is the least awful name.
{ /* Legacy behaviour advertised this feature but didn't actually * suppress the EOI. */ return kvm->arch.suppress_eoi_broadcast == KVM_SUPPRESS_EOI_ENABLED; }
Because it keeps the batshittery in one place and clearly documented?
I note your version did actually suppress the broadcast even in the DISABLED case if the guest had managed to set that bit in SPIV, but I don't think it *can* so that difference doesn't matter anyway, right?
Right. If we want to be paranoid, we could WARN_ON_ONCE() in whatever the "ignore broadcast" accessor is called, because it should only be used if the bit is enabled in the local APIC.
On Tue, 2025-12-02 at 14:26 -0800, Sean Christopherson wrote:
On Tue, Dec 02, 2025, David Woodhouse wrote:
On Tue, 2025-12-02 at 08:36 -0800, Sean Christopherson wrote:
Hmm, I suppose that could work for uAPI. Having both an ENABLE and a DISABLE is obviously a bit odd, but slowing down the reader might actually be a good thing in this case. And the documentation should be easy enough to write.
I was worried that having ENABLE and DISABLE controls would lead to confusing code internally, but there's no reason KVM's internal tracking needs to match uAPI.
How about this?
arch/x86/include/asm/kvm_host.h | 7 +++++++ arch/x86/include/uapi/asm/kvm.h | 6 ++++-- arch/x86/kvm/lapic.c | 16 +++++++++++++++- arch/x86/kvm/x86.c | 15 ++++++++++++--- 4 files changed, 38 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 5a3bfa293e8b..b4c41255f01d 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1226,6 +1226,12 @@ enum kvm_irqchip_mode { KVM_IRQCHIP_SPLIT, /* created with KVM_CAP_SPLIT_IRQCHIP */ }; +enum kvm_suppress_eoi_broadcast_mode {
- KVM_SUPPRESS_EOI_QUIRKED,
- KVM_SUPPRESS_EOI_ENABLED,
- KVM_SUPPRESS_EOI_DISABLED,
+};
Looks good. I'd probably call it KVM_SUPPRESS_EOI_LEGACY though?
Why legacy? "Quirk" has specific meaning in KVM: technically broken behavior that is retained as the default for backwards compatibility. "Legacy" does not, outside of a few outliers like HPET crud.
OK, fair enough.
And just for clarity I wouldn't embed the explicit checks against e.g arch.suppress_eoi_broadcast != KVM_SUPPRESS_EOI_LEGACY. I'd make static inline functions like
Ya, definitely no objection,
static inline bool kvm_lapic_advertise_directed_eoi(kvm)
s/directed_eoi/suppress_eoi_broadcast. I want to provide as clear of split as possible between the local APIC feature and the I/O APIC feature.
OK.
And the in-kernel I/O APIC still doesn't have directed EOI, so do we want to prevent userspace from enabling the broadcast suppression? Or just enable the directed EOI support in the kernel I/O APIC if broadcast suppression is enabled?
{ /* Legacy behaviour was to advertise this feature but it didn't * actually work. */ return kvm->arch.suppress_eoi_broadcast != KVM_SUPPRESS_EOI_DISABLED; }
static inline bool kvm_lapic_suppress_directed_eoi(kvm)
Too close to "suppress EOI broadcast", e.g. it would be easy to read this as "suppress EOIs" and invert the polarity. It's wordy, but I think kvm_lapic_ignore_suppress_eoi_broadcast() is the least awful name.
I think I meant to type 'kvm_lapic_suppress_eoi_broadcast' that time anyway, as in *actually* suppress the EOI broadcasting, rather than merely falsely advertising it.
But sure, I'm not too hung up on precisely how we name these helpers but I think it is helpful for them to exist. FWIW I think your name means we have to invert the return value from how I typed it too.
{ /* Legacy behaviour advertised this feature but didn't actually * suppress the EOI. */ return kvm->arch.suppress_eoi_broadcast == KVM_SUPPRESS_EOI_ENABLED; }
Because it keeps the batshittery in one place and clearly documented?
I note your version did actually suppress the broadcast even in the DISABLED case if the guest had managed to set that bit in SPIV, but I don't think it *can* so that difference doesn't matter anyway, right?
Right. If we want to be paranoid, we could WARN_ON_ONCE() in whatever the "ignore broadcast" accessor is called, because it should only be used if the bit is enabled in the local APIC.
I don't think we care; just calling out the finer details so they don't get missed accidentally.
-#define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) -#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) +#define KVM_X2APIC_API_USE_32BIT_IDS (_BITULL(0)) +#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (_BITULL(1)) +#define KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST (_BITULL(2)) +#define KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST (_BITULL(3))
I hate to say, but wants to ask again:
Since it's uAPI, are we expecting the two flags to have impact on in-kernel ioapic?
I think there should no harm to make the two also apply to in-kernel ioapic.
E.g., for now we can reject KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST flag for in-kernel ioapic. In the future, we might add EOI register support to in-kernel ioapic and report supporting suppress EOI broadcast, then we can in-kernel ioapic to honor these two flags too.
On Wed, 2025-12-03 at 00:50 +0000, Huang, Kai wrote:
Since it's uAPI, are we expecting the two flags to have impact on in-kernel ioapic?
I think there should no harm to make the two also apply to in-kernel ioapic.
E.g., for now we can reject KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST flag for in-kernel ioapic. In the future, we might add EOI register support to in-kernel ioapic and report supporting suppress EOI broadcast, then we can in-kernel ioapic to honor these two flags too.
The simplest option is probably to fix the in-kernel I/O APIC at the same time, bumping the version to 0x20 and adding support for the EOI register in the KVM_SUPPRESS_EOI_ENABLED case.
On 2 Dec 2025, at 10:06 PM, Sean Christopherson seanjc@google.com wrote:
- if (cap->args[0] & KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST)
- kvm->arch.suppress_eoi_broadcast = KVM_SUPPRESS_EOI_ENABLED;
- if (cap->args[0] & KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST)
- kvm->arch.suppress_eoi_broadcast = KVM_SUPPRESS_EOI_DISABLED;
Should this also update existing vCPUs by calling kvm_apic_set_version() for all already-created vCPUs? or is the intent that userspace must set these flags before any vCPUs are created? If so, should this warn if called after vcpus are created?
QEMU currently creates vCPUs before initializing the I/O APIC, so deciding KVM’s SEOIB behavior at vCPU initialization time is just a bit awkward. It’s manageable either way, I just want to confirm the uAPI semantics.
For reference, the initial QEMU side patch: https://patchew.org/QEMU/20251126093742.2110483-1-khushit.shah@nutanix.com/
On Wed, 2025-12-03 at 00:50 +0000, Huang, Kai wrote:
-#define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) -#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) +#define KVM_X2APIC_API_USE_32BIT_IDS (_BITULL(0)) +#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (_BITULL(1)) +#define KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST (_BITULL(2)) +#define KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST (_BITULL(3))
I hate to say, but wants to ask again:
Since it's uAPI, are we expecting the two flags to have impact on in-kernel ioapic?
I think there should no harm to make the two also apply to in-kernel ioapic.
E.g., for now we can reject KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST flag for in-kernel ioapic. In the future, we might add EOI register support to in-kernel ioapic and report supporting suppress EOI broadcast, then we can in-kernel ioapic to honor these two flags too.
I don't think we should leave that to the unspecified 'future'. Let's fix the kernel I/O APIC to support the directed EOI at the same time, rather than having an interim version of KVM which supports the broadcast suppression but *not* the explicit EOI that replaces it.
Since I happened to have the I/O APIC PDFs in my recent history for other reasons, and implemented these extra registers for version 0x20 in another userspace VMM within living memory, I figured I could try to help with the actual implementation (untested, below).
There is some bikeshedding to be done on precisely *how* ->version_id should be set. Maybe we shouldn't have the ->version_id field, and should just check kvm->arch.suppress_eoi_broadcast to see which version to report?
While I'm at looking at it, I see there's an ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG) in the middle of kvm_ioapic_update_eoi_one(), a few lines after it drops ioapic->lock and then relocks it again. If it dropped the lock, doesn't that mean the guest could have *changed* the mode in the interim, making that ASSERT() guest-triggerable?
===== From: David Woodhouse dwmw@amazon.co.uk Subject: [PATCH] KVM: x86/ioapic: Implement support for I/O APIC version 0x20 with EOIR
As the weirdness with EOI broadcast suppression is being fixed in KVM, also update the in-kernel I/O APIC to handle the directed EOI which guests will need to use instead.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk --- arch/x86/kvm/ioapic.c | 22 +++++++++++++++++++++- arch/x86/kvm/ioapic.h | 18 +++++++++++------- 2 files changed, 32 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c index 2c2783296aed..e82f74a8e57e 100644 --- a/arch/x86/kvm/ioapic.c +++ b/arch/x86/kvm/ioapic.c @@ -49,7 +49,7 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic) switch (ioapic->ioregsel) { case IOAPIC_REG_VERSION: result = ((((IOAPIC_NUM_PINS - 1) & 0xff) << 16) - | (IOAPIC_VERSION_ID & 0xff)); + | ioapic->version_id); break;
case IOAPIC_REG_APIC_ID: @@ -57,6 +57,10 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic) result = ((ioapic->id & 0xf) << 24); break;
+ case IOAPIC_REG_BOOT_CONFIG: + result = 0x01; /* Processor bus */ + break; + default: { u32 redir_index = (ioapic->ioregsel - 0x10) >> 1; @@ -695,6 +699,21 @@ static int ioapic_mmio_write(struct kvm_vcpu *vcpu, struct kvm_io_device *this, ioapic_write_indirect(ioapic, data); break;
+ case IOAPIC_REG_EOIR: + if (ioapic->version_id >= 0x20) { + u8 vector = data & 0xff; + int i; + + rtc_irq_eoi(ioapic, vcpu, vector); + for (i = 0; i < IOAPIC_NUM_PINS; i++) { + union kvm_ioapic_redirect_entry *ent = &ioapic->redirtbl[i]; + + if (ent->fields.vector != vector) + continue; + kvm_ioapic_update_eoi_one(vcpu, ioapic, ent->fields.trig_mode, i); + } + } + break; default: break; } @@ -734,6 +753,7 @@ int kvm_ioapic_init(struct kvm *kvm) spin_lock_init(&ioapic->lock); INIT_DELAYED_WORK(&ioapic->eoi_inject, kvm_ioapic_eoi_inject_work); INIT_HLIST_HEAD(&ioapic->mask_notifier_list); + ioapic->version_id = IOAPIC_VERSION_ID; kvm->arch.vioapic = ioapic; kvm_ioapic_reset(ioapic); kvm_iodevice_init(&ioapic->dev, &ioapic_mmio_ops); diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h index bf28dbc11ff6..c8e44d726fbe 100644 --- a/arch/x86/kvm/ioapic.h +++ b/arch/x86/kvm/ioapic.h @@ -19,13 +19,16 @@ struct kvm_vcpu; #define IOAPIC_MEM_LENGTH 0x100
/* Direct registers. */ -#define IOAPIC_REG_SELECT 0x00 -#define IOAPIC_REG_WINDOW 0x10 +#define IOAPIC_REG_SELECT 0x00 +#define IOAPIC_REG_WINDOW 0x10 +#define IOAPIC_REG_IRQPA 0x20 +#define IOAPIC_REG_EOIR 0x40
-/* Indirect registers. */ -#define IOAPIC_REG_APIC_ID 0x00 /* x86 IOAPIC only */ -#define IOAPIC_REG_VERSION 0x01 -#define IOAPIC_REG_ARB_ID 0x02 /* x86 IOAPIC only */ +/* INDIRECT registers. */ +#define IOAPIC_REG_APIC_ID 0x00 /* x86 IOAPIC only */ +#define IOAPIC_REG_VERSION 0x01 +#define IOAPIC_REG_ARB_ID 0x02 /* x86 IOAPIC only */ +#define IOAPIC_REG_BOOT_CONFIG 0x03 /* x86 IOAPIC only */
/*ioapic delivery mode*/ #define IOAPIC_FIXED 0x0 @@ -76,7 +79,8 @@ struct kvm_ioapic { u32 ioregsel; u32 id; u32 irr; - u32 pad; + u8 version_id; + u8 pad[3]; union kvm_ioapic_redirect_entry redirtbl[IOAPIC_NUM_PINS]; unsigned long irq_states[IOAPIC_NUM_PINS]; struct kvm_io_device dev;
On Wed, Dec 3, 2025 at 1:26 PM David Woodhouse dwmw2@infradead.org wrote:
On Wed, 2025-12-03 at 00:50 +0000, Huang, Kai wrote:
-#define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) -#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) +#define KVM_X2APIC_API_USE_32BIT_IDS (_BITULL(0)) +#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (_BITULL(1)) +#define KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST (_BITULL(2)) +#define KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST (_BITULL(3))
I hate to say, but wants to ask again:
Since it's uAPI, are we expecting the two flags to have impact on in-kernel ioapic?
I think there should no harm to make the two also apply to in-kernel ioapic.
E.g., for now we can reject KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST flag for in-kernel ioapic. In the future, we might add EOI register support to in-kernel ioapic and report supporting suppress EOI broadcast, then we can in-kernel ioapic to honor these two flags too.
I don't think we should leave that to the unspecified 'future'. Let's fix the kernel I/O APIC to support the directed EOI at the same time, rather than having an interim version of KVM which supports the broadcast suppression but *not* the explicit EOI that replaces it.
Since I happened to have the I/O APIC PDFs in my recent history for other reasons, and implemented these extra registers for version 0x20 in another userspace VMM within living memory, I figured I could try to help with the actual implementation (untested, below).
There is some bikeshedding to be done on precisely *how* ->version_id should be set. Maybe we shouldn't have the ->version_id field, and should just check kvm->arch.suppress_eoi_broadcast to see which version to report?
That would make it impossible to use the fixed implementation on the local APIC side, without changing the way the IOAPIC appears to the guest.
There are no parameters that you can use in KVM_CREATE_IRQCHIP, unfortunately, and no checks that (for example) kvm_irqchip.pad or kvm_ioapic_state.pad are zero.
The best possibility that I can think of, is to model it like KVM_CAP_XSAVE2
1) Add a capability KVM_CAP_IRQCHIP2 (x86 only)
2) If reported, kvm_irqchip.pad becomes "flags" (a set of flag bits) and kvm_ioapic_state.pad becomes version_id when returned from KVM_GET_IRQCHIP. Using an anonymous union allows adding the synonyms.
3) On top of this, KVM_SET_IRQCHIP2 is added which checks that kvm_irqchip.flags is zero and that kvm_ioapic_state.version_id is either 0x11 or 0x20.
4) Leave the default to 0x11 for backwards compatibility.
The alternative is to add KVM_ENABLE_CAP(KVM_CAP_IRQCHIP2) but I dislike adding another stateful API.
Paolo
On Wed, 2025-12-03 at 14:10 +0100, Paolo Bonzini wrote:
On Wed, Dec 3, 2025 at 1:26 PM David Woodhouse dwmw2@infradead.org wrote:
On Wed, 2025-12-03 at 00:50 +0000, Huang, Kai wrote:
-#define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) -#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) +#define KVM_X2APIC_API_USE_32BIT_IDS (_BITULL(0)) +#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (_BITULL(1)) +#define KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST (_BITULL(2)) +#define KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST (_BITULL(3))
I hate to say, but wants to ask again:
Since it's uAPI, are we expecting the two flags to have impact on in-kernel ioapic?
I think there should no harm to make the two also apply to in-kernel ioapic.
E.g., for now we can reject KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST flag for in-kernel ioapic. In the future, we might add EOI register support to in-kernel ioapic and report supporting suppress EOI broadcast, then we can in-kernel ioapic to honor these two flags too.
I don't think we should leave that to the unspecified 'future'. Let's fix the kernel I/O APIC to support the directed EOI at the same time, rather than having an interim version of KVM which supports the broadcast suppression but *not* the explicit EOI that replaces it.
Since I happened to have the I/O APIC PDFs in my recent history for other reasons, and implemented these extra registers for version 0x20 in another userspace VMM within living memory, I figured I could try to help with the actual implementation (untested, below).
There is some bikeshedding to be done on precisely *how* ->version_id should be set. Maybe we shouldn't have the ->version_id field, and should just check kvm->arch.suppress_eoi_broadcast to see which version to report?
That would make it impossible to use the fixed implementation on the local APIC side, without changing the way the IOAPIC appears to the guest.
Yes, but remember that "the fixed implementation on the local APIC side" means precisely that it's fixed to *not* broadcast the EOI. Which means you absolutely *need* to have an I/O APIC capable of receiving the explicit directed EOI, or the EOI will never happen at all.
Which is why it probably makes sense to drop the 'version_id' field from the struct where I'd added it, and just make the code report a hard-coded version based on suppress_eoi_broadcast being enabled:
(kvm->arch.suppress_eoi_broadcast == KVM_SUPPRESS_EOI_ENABLED) ? 0x20: 0x11
So yes, it's a guest-visible change, but only if the VMM explicitly *asks* for the broadcast suppression feature to work, in which case it's *necessary* anyway.
There are no parameters that you can use in KVM_CREATE_IRQCHIP, unfortunately, and no checks that (for example) kvm_irqchip.pad or kvm_ioapic_state.pad are zero.
The best possibility that I can think of, is to model it like KVM_CAP_XSAVE2
Add a capability KVM_CAP_IRQCHIP2 (x86 only)
If reported, kvm_irqchip.pad becomes "flags" (a set of flag bits)
and kvm_ioapic_state.pad becomes version_id when returned from KVM_GET_IRQCHIP. Using an anonymous union allows adding the synonyms.
- On top of this, KVM_SET_IRQCHIP2 is added which checks that
kvm_irqchip.flags is zero and that kvm_ioapic_state.version_id is either 0x11 or 0x20.
- Leave the default to 0x11 for backwards compatibility.
The alternative is to add KVM_ENABLE_CAP(KVM_CAP_IRQCHIP2) but I dislike adding another stateful API.
Yeah. Just gate it on the existing (well, nascent) KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST flag.
On Wed, Dec 3, 2025 at 2:32 PM David Woodhouse dwmw2@infradead.org wrote:
That would make it impossible to use the fixed implementation on the local APIC side, without changing the way the IOAPIC appears to the guest.
Yes, but remember that "the fixed implementation on the local APIC side" means precisely that it's fixed to *not* broadcast the EOI. Which means you absolutely *need* to have an I/O APIC capable of receiving the explicit directed EOI, or the EOI will never happen at all.
Which is why it probably makes sense to drop the 'version_id' field from the struct where I'd added it, and just make the code report a hard-coded version based on suppress_eoi_broadcast being enabled:
(kvm->arch.suppress_eoi_broadcast == KVM_SUPPRESS_EOI_ENABLED) ? 0x20: 0x11
So yes, it's a guest-visible change, but only if the VMM explicitly *asks* for the broadcast suppression feature to work, in which case it's *necessary* anyway.
I see what you mean and I guess you're right... "Setting X will cause the in-kernel IOAPIC to report version 0x20" is as obscure as it gets, but then so is "Setting X will break guests unless you tell in-kernel IOAPIC to report version 0x20".
So this is good, but the docs need to say clearly that this should only be set if either full in-kernel irqchip is in use or, for split irqchip, if the userspace IOAPIC implements directed EOI correctly.
Paolo
On Wed, 2025-12-03 at 14:36 +0100, Paolo Bonzini wrote:
So yes, it's a guest-visible change, but only if the VMM explicitly *asks* for the broadcast suppression feature to work, in which case it's *necessary* anyway.
I see what you mean and I guess you're right... "Setting X will cause the in-kernel IOAPIC to report version 0x20" is as obscure as it gets, but then so is "Setting X will break guests unless you tell in-kernel IOAPIC to report version 0x20".
So this is good, but the docs need to say clearly that this should only be set if either full in-kernel irqchip is in use or, for split irqchip, if the userspace IOAPIC implements directed EOI correctly.
Updated patch below, dropping the struct change and just directly using the helper which I *believe* is going to be called kvm_lapic_ignore_suppress_eoi_broadcast() and have the opposite polarity to the one I proposed upthread. Doesn't build here because of that helper, obvs.
Still untested.
For the documentation then, how about...
Setting KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST causes KVM to advertise and correctly implement the Directed EOI feature in the local APIC, suppressing broadcast EOI when the feature is enabled by the guest. Setting this flag will also cause the in-kernel I/O APIC to advertise version 0x20 with support for the EOI register; a userspace implementation of I/O APIC should also support the same, as some guest operating systems do not check for that feature in the I/O APIC before disabling the broadcast in the local APIC.
Setting KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST causes KVM not to advertise the Directed EOI feature in the local APIC.
Userspace should explicitly either enable or disable the EOI broadcast using one of the two flags above. For historical compatibility reasons, if neither flag is set then KVM will advertise the feature but will not actually suppress the EOI broadcast, leading to potential IRQ storms in some guest configurations.
From: David Woodhouse dwmw@amazon.co.uk Subject: [PATCH] KVM: x86/ioapic: Implement support for I/O APIC version 0x20 with EOIR
As the weirdness with EOI broadcast suppression is being fixed in KVM, also update the in-kernel I/O APIC to handle the directed EOI which guests will need to use instead, when broadcast EOI suppression is fully enabled.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk --- arch/x86/kvm/ioapic.c | 30 ++++++++++++++++++++++++++++-- arch/x86/kvm/ioapic.h | 20 ++++++++++++-------- 2 files changed, 40 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c index 2c2783296aed..0ed84b02c521 100644 --- a/arch/x86/kvm/ioapic.c +++ b/arch/x86/kvm/ioapic.c @@ -48,8 +48,11 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic)
switch (ioapic->ioregsel) { case IOAPIC_REG_VERSION: - result = ((((IOAPIC_NUM_PINS - 1) & 0xff) << 16) - | (IOAPIC_VERSION_ID & 0xff)); + if (kvm_lapic_ignore_suppress_eoi_broadcast(ioapic->kvm)) + result = IOAPIC_VERSION_ID; + else + result = IOAPIC_VERSION_ID_EOIR; + result |= ((IOAPIC_NUM_PINS - 1) & 0xff) << 16; break;
case IOAPIC_REG_APIC_ID: @@ -57,6 +60,10 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic) result = ((ioapic->id & 0xf) << 24); break;
+ case IOAPIC_REG_BOOT_CONFIG: + result = 0x01; /* Processor bus */ + break; + default: { u32 redir_index = (ioapic->ioregsel - 0x10) >> 1; @@ -695,6 +702,25 @@ static int ioapic_mmio_write(struct kvm_vcpu *vcpu, struct kvm_io_device *this, ioapic_write_indirect(ioapic, data); break;
+ case IOAPIC_REG_EOIR: + /* + * The EOIR register is supported (and version 0x20 advertised) + * when userspace explicitly enables broadcast EOI supression. + */ + if (!kvm_lapic_ignore_suppress_eoi_broadcast(vcpu->kvm)) { + u8 vector = data & 0xff; + int i; + + rtc_irq_eoi(ioapic, vcpu, vector); + for (i = 0; i < IOAPIC_NUM_PINS; i++) { + union kvm_ioapic_redirect_entry *ent = &ioapic->redirtbl[i]; + + if (ent->fields.vector != vector) + continue; + kvm_ioapic_update_eoi_one(vcpu, ioapic, ent->fields.trig_mode, i); + } + } + break; default: break; } diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h index bf28dbc11ff6..59d877f5f27b 100644 --- a/arch/x86/kvm/ioapic.h +++ b/arch/x86/kvm/ioapic.h @@ -11,7 +11,8 @@ struct kvm_vcpu;
#define IOAPIC_NUM_PINS KVM_IOAPIC_NUM_PINS #define MAX_NR_RESERVED_IOAPIC_PINS KVM_MAX_IRQ_ROUTES -#define IOAPIC_VERSION_ID 0x11 /* IOAPIC version */ +#define IOAPIC_VERSION_ID 0x11 /* Default IOAPIC version */ +#define IOAPIC_VERSION_ID_EOIR 0x20 /* IOAPIC version with EOIR support */ #define IOAPIC_EDGE_TRIG 0 #define IOAPIC_LEVEL_TRIG 1
@@ -19,13 +20,16 @@ struct kvm_vcpu; #define IOAPIC_MEM_LENGTH 0x100
/* Direct registers. */ -#define IOAPIC_REG_SELECT 0x00 -#define IOAPIC_REG_WINDOW 0x10 - -/* Indirect registers. */ -#define IOAPIC_REG_APIC_ID 0x00 /* x86 IOAPIC only */ -#define IOAPIC_REG_VERSION 0x01 -#define IOAPIC_REG_ARB_ID 0x02 /* x86 IOAPIC only */ +#define IOAPIC_REG_SELECT 0x00 +#define IOAPIC_REG_WINDOW 0x10 +#define IOAPIC_REG_IRQPA 0x20 +#define IOAPIC_REG_EOIR 0x40 /* version 0x20+ only */ + +/* INDIRECT registers. */ +#define IOAPIC_REG_APIC_ID 0x00 /* x86 IOAPIC only */ +#define IOAPIC_REG_VERSION 0x01 +#define IOAPIC_REG_ARB_ID 0x02 /* x86 IOAPIC only */ +#define IOAPIC_REG_BOOT_CONFIG 0x03 /* x86 IOAPIC only */
/*ioapic delivery mode*/ #define IOAPIC_FIXED 0x0
Firstly, excellent work debugging and diagnosing that!
On Tue, 2025-11-25 at 18:05 +0000, Khushit Shah wrote:
--- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7800,8 +7800,10 @@ Will return -EBUSY if a VCPU has already been created. Valid feature flags in args[0] are:: - #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) - #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) + #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK (1ULL << 2) + #define KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST (1ULL << 3)
I kind of hate these names. This part right here is what we leave behind for future generations, to understand the weird behaviour of KVM. To have "IGNORE" "SUPPRESS" "QUIRK" all in the same flag, quite apart from the length of the token, makes my brain hurt.
Enabling KVM_X2APIC_API_USE_32BIT_IDS changes the behavior of KVM_SET_GSI_ROUTING, KVM_SIGNAL_MSI, KVM_SET_LAPIC, and KVM_GET_LAPIC, @@ -7814,6 +7816,14 @@ as a broadcast even in x2APIC mode in order to support physical x2APIC without interrupt remapping. This is undesirable in logical mode, where 0xff represents CPUs 0-7 in cluster 0. +Setting KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK overrides +KVM's quirky behavior of not actually suppressing EOI broadcasts for split IRQ +chips when support for Suppress EOI Broadcasts is advertised to the guest.
This paragraph doesn't actually say what the flag *does*, only the old behaviour that it overrides?
+Setting KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST disables support for +Suppress EOI Broadcasts entirely, i.e. instructs KVM to NOT advertise support +to the guest and thus disallow enabling EOI broadcast suppression in SPIV.
Could we perhaps call them 'ENABLE_SUPPRESS_EOI_BROADCAST' and 'DISABLE_SUPPRESS_EOI_BROADCAST', with a note saying that modern VMMs should always explicitly enable one or the other, because for historical reasons KVM only *pretends* to support it by default but it doesn't actually work correctly?
Thanks for the review!
On 2 Dec 2025, at 2:43 PM, David Woodhouse dwmw2@infradead.org wrote:
Firstly, excellent work debugging and diagnosing that!
On Tue, 2025-11-25 at 18:05 +0000, Khushit Shah wrote:
--- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7800,8 +7800,10 @@ Will return -EBUSY if a VCPU has already been created. Valid feature flags in args[0] are::
- #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0)
- #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1)
- #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0)
- #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1)
- #define KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK (1ULL << 2)
- #define KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST (1ULL << 3)
I kind of hate these names. This part right here is what we leave behind for future generations, to understand the weird behaviour of KVM. To have "IGNORE" "SUPPRESS" "QUIRK" all in the same flag, quite apart from the length of the token, makes my brain hurt.
Yes, I agree the original name is too wordy. How about renaming it to KVM_X2APIC_API_ACTUALLY_SUPPRESS_EOI_BROADCASTS? That makes the intended KVM behaviour clear.
I'm also not very keen on ENABLE_SUPPRESS_EOI_BROADCAST it reads as if KVM is the one enabling the feature, which isn't the case. The guest decides whether to enable suppression; KVM should just advertise the capability correctly and then respect whatever the guest chooses.
On 2 Dec 2025, at 2:43 PM, David Woodhouse dwmw2@infradead.org wrote:
Enabling KVM_X2APIC_API_USE_32BIT_IDS changes the behavior of KVM_SET_GSI_ROUTING, KVM_SIGNAL_MSI, KVM_SET_LAPIC, and KVM_GET_LAPIC, @@ -7814,6 +7816,14 @@ as a broadcast even in x2APIC mode in order to support physical x2APIC without interrupt remapping. This is undesirable in logical mode, where 0xff represents CPUs 0-7 in cluster 0. +Setting KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK overrides +KVM's quirky behavior of not actually suppressing EOI broadcasts for split IRQ +chips when support for Suppress EOI Broadcasts is advertised to the guest.
This paragraph doesn't actually say what the flag *does*, only the old behaviour that it overrides?
Right, good point. I'll update the doc.
On Tue, 2025-12-02 at 12:58 +0000, Khushit Shah wrote:
Thanks for the review!
On 2 Dec 2025, at 2:43 PM, David Woodhouse dwmw2@infradead.org wrote:
Firstly, excellent work debugging and diagnosing that!
On Tue, 2025-11-25 at 18:05 +0000, Khushit Shah wrote:
--- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7800,8 +7800,10 @@ Will return -EBUSY if a VCPU has already been created. Valid feature flags in args[0] are:: - #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) - #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) + #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK (1ULL << 2) + #define KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST (1ULL << 3)
I kind of hate these names. This part right here is what we leave behind for future generations, to understand the weird behaviour of KVM. To have "IGNORE" "SUPPRESS" "QUIRK" all in the same flag, quite apart from the length of the token, makes my brain hurt.
Yes, I agree the original name is too wordy. How about renaming it to KVM_X2APIC_API_ACTUALLY_SUPPRESS_EOI_BROADCASTS? That makes the intended KVM behaviour clear.
I'm also not very keen on ENABLE_SUPPRESS_EOI_BROADCAST it reads as if KVM is the one enabling the feature, which isn't the case. The guest decides whether to enable suppression; KVM should just advertise the capability correctly and then respect whatever the guest chooses.
I think _ENABLE_ for enabling a feature for the guest to optionally use is reasonable enough; we'd generally say '_FORCE_' if we were going to turn it on unconditionally without the guest's knowledge.
Not entirely sure why you're OK with ACTUALLY_SUPPRESS_EOI_BROADCAST when you aren't ok with ENABLE_SUPPRESS_EOI_BROADCAST. In both cases you'd need to append _BUT_ONLY_IF_THE_GUEST_ASKS_FOR_IT if you want to be pedantic. :)
On Tue, Dec 02, 2025, David Woodhouse wrote:
On Tue, 2025-12-02 at 12:58 +0000, Khushit Shah wrote:
Thanks for the review!
On 2 Dec 2025, at 2:43 PM, David Woodhouse dwmw2@infradead.org wrote:
Firstly, excellent work debugging and diagnosing that!
On Tue, 2025-11-25 at 18:05 +0000, Khushit Shah wrote:
--- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7800,8 +7800,10 @@ Will return -EBUSY if a VCPU has already been created. Valid feature flags in args[0] are:: - #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) - #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) + #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK (1ULL << 2) + #define KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST (1ULL << 3)
I kind of hate these names. This part right here is what we leave behind for future generations, to understand the weird behaviour of KVM. To have "IGNORE" "SUPPRESS" "QUIRK" all in the same flag, quite apart from the length of the token, makes my brain hurt.
...
Could we perhaps call them 'ENABLE_SUPPRESS_EOI_BROADCAST' and 'DISABLE_SUPPRESS_EOI_BROADCAST', with a note saying that modern VMMs should always explicitly enable one or the other, because for historical reasons KVM only *pretends* to support it by default but it doesn't actually work correctly?
I don't disagree on the names being painful, but ENABLE_SUPPRESS_EOI_BROADCAST vs. DISABLE_SUPPRESS_EOI_BROADCAST won't work, and is even more confusing IMO.
The issue is that KVM "enables" SUPPRESS_EOI_BROADCAST in that the feature is exposed to the guest and can be enabled in local APICs, and that's one of the behaviors/configurations I want to preserve so that guests don't observe a feature change. Having an on/off switch doesn't work because KVM isn't fully disabling the feature, nor is KVM fully enabling the feature. It's a weird, half-baked state, hence the QUIRK.
More importantly, we can't use ENABLE bits because I want to preserve existing behavior exactly as-is. I.e. userspace needs to opt-in to disabling SUPPRESS_EOI_BROADCAST and/or to disabling IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK.
Yes, I agree the original name is too wordy. How about renaming it to KVM_X2APIC_API_ACTUALLY_SUPPRESS_EOI_BROADCASTS? That makes the intended KVM behaviour clear.
I'm also not very keen on ENABLE_SUPPRESS_EOI_BROADCAST it reads as if KVM is the one enabling the feature, which isn't the case.
Eh, there are myriad things that require enabling all both (or more) sides.
The guest decides whether to enable suppression; KVM should just advertise the capability correctly and then respect whatever the guest chooses.
I think _ENABLE_ for enabling a feature for the guest to optionally use is reasonable enough; we'd generally say '_FORCE_' if we were going to turn it on unconditionally without the guest's knowledge.
Not entirely sure why you're OK with ACTUALLY_SUPPRESS_EOI_BROADCAST when you aren't ok with ENABLE_SUPPRESS_EOI_BROADCAST. In both cases you'd need to append _BUT_ONLY_IF_THE_GUEST_ASKS_FOR_IT if you want to be pedantic. :)
+1, though as above I don't think we can use ENABLE for this particular mess.
On Tue, 2025-12-02 at 07:42 -0800, Sean Christopherson wrote:
The issue is that KVM "enables" SUPPRESS_EOI_BROADCAST in that the feature is exposed to the guest and can be enabled in local APICs, and that's one of the behaviors/configurations I want to preserve so that guests don't observe a feature change. Having an on/off switch doesn't work because KVM isn't fully disabling the feature, nor is KVM fully enabling the feature. It's a weird, half-baked state, hence the QUIRK.
More importantly, we can't use ENABLE bits because I want to preserve existing behavior exactly as-is. I.e. userspace needs to opt-in to disabling SUPPRESS_EOI_BROADCAST and/or to disabling IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK.
To respond to that part directly...
The legacy behaviour we need to preserve is neither fully enabled, nor fully disabled. I agree that the default needs to remain this weird half-baked quirk.
But now we are giving userspace a chance to explicitly choose either a proper ENABLE or a proper DISABLE, and isn't that what those flags should be called?
You have ENABLE, you have DISABLE, and you have a warning that you probably ought to choose one of them because the default you get otherwise is weirdly broken (but there for legacy compatibility).
On Tue, Dec 02, 2025, David Woodhouse wrote:
On Tue, 2025-12-02 at 07:42 -0800, Sean Christopherson wrote:
On Tue, Dec 02, 2025, David Woodhouse wrote:
On Tue, 2025-12-02 at 12:58 +0000, Khushit Shah wrote:
Thanks for the review!
On 2 Dec 2025, at 2:43 PM, David Woodhouse dwmw2@infradead.org wrote:
Firstly, excellent work debugging and diagnosing that!
On Tue, 2025-11-25 at 18:05 +0000, Khushit Shah wrote:
--- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7800,8 +7800,10 @@ Will return -EBUSY if a VCPU has already been created. Valid feature flags in args[0] are:: - #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) - #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) + #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK (1ULL << 2) + #define KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST (1ULL << 3)
I kind of hate these names. This part right here is what we leave behind for future generations, to understand the weird behaviour of KVM. To have "IGNORE" "SUPPRESS" "QUIRK" all in the same flag, quite apart from the length of the token, makes my brain hurt.
...
Could we perhaps call them 'ENABLE_SUPPRESS_EOI_BROADCAST' and 'DISABLE_SUPPRESS_EOI_BROADCAST', with a note saying that modern VMMs should always explicitly enable one or the other, because for historical reasons KVM only *pretends* to support it by default but it doesn't actually work correctly?
I don't disagree on the names being painful, but ENABLE_SUPPRESS_EOI_BROADCAST vs. DISABLE_SUPPRESS_EOI_BROADCAST won't work, and is even more confusing IMO.
I dunno, KVM never actually *did* suppress the EOI broadcast anyway, did it? This fix really *does* enable it — as opposed to just pretending to?
I was thinking along the lines of ...
Setting KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST causes KVM to advertise and correctly implement the Directed EOI feature in the local APIC, suppressing broadcast EOI when the feature is enabled by the guest.
Setting KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST causes KVM not to advertise the Directed EOI feature in the local APIC.
Userspace should explicitly either enable or disable the EOI broadcast using one of the two flags above. For historical compatibility reasons, if neither flag is set then KVM will advertise the feature but will not actually suppress the EOI broadcast, leading to potential IRQ storms in some guest configurations.
Hmm, I suppose that could work for uAPI. Having both an ENABLE and a DISABLE is obviously a bit odd, but slowing down the reader might actually be a good thing in this case. And the documentation should be easy enough to write.
I was worried that having ENABLE and DISABLE controls would lead to confusing code internally, but there's no reason KVM's internal tracking needs to match uAPI.
How about this?
--- arch/x86/include/asm/kvm_host.h | 7 +++++++ arch/x86/include/uapi/asm/kvm.h | 6 ++++-- arch/x86/kvm/lapic.c | 16 +++++++++++++++- arch/x86/kvm/x86.c | 15 ++++++++++++--- 4 files changed, 38 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 5a3bfa293e8b..b4c41255f01d 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1226,6 +1226,12 @@ enum kvm_irqchip_mode { KVM_IRQCHIP_SPLIT, /* created with KVM_CAP_SPLIT_IRQCHIP */ };
+enum kvm_suppress_eoi_broadcast_mode { + KVM_SUPPRESS_EOI_QUIRKED, + KVM_SUPPRESS_EOI_ENABLED, + KVM_SUPPRESS_EOI_DISABLED, +}; + struct kvm_x86_msr_filter { u8 count; bool default_allow:1; @@ -1475,6 +1481,7 @@ struct kvm_arch {
bool x2apic_format; bool x2apic_broadcast_quirk_disabled; + enum kvm_suppress_eoi_broadcast_mode suppress_eoi_broadcast;
bool has_mapped_host_mmio; bool guest_can_read_msr_platform_info; diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 7ceff6583652..bd51596001f8 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -914,8 +914,10 @@ struct kvm_sev_snp_launch_finish { __u64 pad1[4]; };
-#define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) -#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) +#define KVM_X2APIC_API_USE_32BIT_IDS (_BITULL(0)) +#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (_BITULL(1)) +#define KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST (_BITULL(2)) +#define KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST (_BITULL(3))
struct kvm_hyperv_eventfd { __u32 conn_id; diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 1597dd0b0cc6..3f00c9640785 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -562,7 +562,8 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu) * IOAPIC. */ if (guest_cpu_cap_has(vcpu, X86_FEATURE_X2APIC) && - !ioapic_in_kernel(vcpu->kvm)) + !ioapic_in_kernel(vcpu->kvm) && + vcpu->kvm->arch.suppress_eoi_broadcast != KVM_SUPPRESS_EOI_DISABLED) v |= APIC_LVR_DIRECTED_EOI; kvm_lapic_set_reg(apic, APIC_LVR, v); } @@ -1517,6 +1518,19 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int vector)
/* Request a KVM exit to inform the userspace IOAPIC. */ if (irqchip_split(apic->vcpu->kvm)) { + /* + * Don't exit to userspace if the guest has enabled Directed + * EOI, a.k.a. Suppress EOI Broadcasts, in which case the local + * APIC doesn't broadcast EOIs (the guest must EOI the target + * I/O APIC(s) directly). Ignore the suppression if userspace + * has not explictly enabled support (KVM's historical quirky + * behavior is to advertise support for Suppress EOI Broadcasts + * without actually suppressing EOIs). + */ + if ((kvm_lapic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI) && + apic->vcpu->kvm->arch.suppress_eoi_broadcast != KVM_SUPPRESS_EOI_QUIRKED) + return; + apic->vcpu->arch.pending_ioapic_eoi = vector; kvm_make_request(KVM_REQ_IOAPIC_EOI_EXIT, apic->vcpu); return; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 0c6d899d53dd..b36e048c7862 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -121,8 +121,10 @@ static u64 __read_mostly efer_reserved_bits = ~((u64)EFER_SCE);
#define KVM_CAP_PMU_VALID_MASK KVM_PMU_CAP_DISABLE
-#define KVM_X2APIC_API_VALID_FLAGS (KVM_X2APIC_API_USE_32BIT_IDS | \ - KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK) +#define KVM_X2APIC_API_VALID_FLAGS (KVM_X2APIC_API_USE_32BIT_IDS | \ + KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK | \ + KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST | \ + KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST)
static void update_cr8_intercept(struct kvm_vcpu *vcpu); static void process_nmi(struct kvm_vcpu *vcpu); @@ -6739,11 +6741,18 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, if (cap->args[0] & ~KVM_X2APIC_API_VALID_FLAGS) break;
+ if (cap->args[0] & KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST && + cap->args[0] & KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST) + break; + if (cap->args[0] & KVM_X2APIC_API_USE_32BIT_IDS) kvm->arch.x2apic_format = true; if (cap->args[0] & KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK) kvm->arch.x2apic_broadcast_quirk_disabled = true; - + if (cap->args[0] & KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST) + kvm->arch.suppress_eoi_broadcast = KVM_SUPPRESS_EOI_ENABLED; + if (cap->args[0] & KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST) + kvm->arch.suppress_eoi_broadcast = KVM_SUPPRESS_EOI_DISABLED; r = 0; break; case KVM_CAP_X86_DISABLE_EXITS:
base-commit: 6c3373b26189853230552bd3932b3edba5883423 --
On Tue, 2025-12-02 at 08:36 -0800, Sean Christopherson wrote:
Hmm, I suppose that could work for uAPI. Having both an ENABLE and a DISABLE is obviously a bit odd, but slowing down the reader might actually be a good thing in this case. And the documentation should be easy enough to write.
I was worried that having ENABLE and DISABLE controls would lead to confusing code internally, but there's no reason KVM's internal tracking needs to match uAPI.
How about this?
arch/x86/include/asm/kvm_host.h | 7 +++++++ arch/x86/include/uapi/asm/kvm.h | 6 ++++-- arch/x86/kvm/lapic.c | 16 +++++++++++++++- arch/x86/kvm/x86.c | 15 ++++++++++++--- 4 files changed, 38 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 5a3bfa293e8b..b4c41255f01d 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1226,6 +1226,12 @@ enum kvm_irqchip_mode { KVM_IRQCHIP_SPLIT, /* created with KVM_CAP_SPLIT_IRQCHIP */ }; +enum kvm_suppress_eoi_broadcast_mode {
- KVM_SUPPRESS_EOI_QUIRKED,
- KVM_SUPPRESS_EOI_ENABLED,
- KVM_SUPPRESS_EOI_DISABLED,
+};
Looks good. I'd probably call it KVM_SUPPRESS_EOI_LEGACY though?
And just for clarity I wouldn't embed the explicit checks against e.g arch.suppress_eoi_broadcast != KVM_SUPPRESS_EOI_LEGACY. I'd make static inline functions like
static inline bool kvm_lapic_advertise_directed_eoi(kvm) { /* Legacy behaviour was to advertise this feature but it didn't * actually work. */ return kvm->arch.suppress_eoi_broadcast != KVM_SUPPRESS_EOI_DISABLED; }
static inline bool kvm_lapic_suppress_directed_eoi(kvm) { /* Legacy behaviour advertised this feature but didn't actually * suppress the EOI. */ return kvm->arch.suppress_eoi_broadcast == KVM_SUPPRESS_EOI_ENABLED; }
Because it keeps the batshittery in one place and clearly documented?
I note your version did actually suppress the broadcast even in the DISABLED case if the guest had managed to set that bit in SPIV, but I don't think it *can* so that difference doesn't matter anyway, right?
linux-stable-mirror@lists.linaro.org