On 19/12/2025 4:09 pm, Sean Christopherson wrote:
On Fri, Dec 19, 2025, Borislav Petkov wrote:
On December 19, 2025 1:01:31 AM UTC, Ariadne Conill ariadne@ariadne.space wrote:
Xen domU cannot access the given MMIO address for security reasons, resulting in a failed hypercall in ioremap() due to permissions.
Why does that matter though? Ah, because set_pte() assumes success, and so presumably the failed hypercall goes unnoticed and trying to access the MMIO #PFs due to !PRESENT mapping.
Fixes: ab8131028710 ("x86/CPU/AMD: Print the reason for the last reset") Signed-off-by: Ariadne Conill ariadne@ariadne.space Cc: xen-devel@lists.xenproject.org Cc: stable@vger.kernel.org
arch/x86/kernel/cpu/amd.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index a6f88ca1a6b4..99308fba4d7d 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -29,6 +29,8 @@ # include <asm/mmconfig.h> #endif
+#include <xen/xen.h>
#include "cpu.h"
u16 invlpgb_count_max __ro_after_init = 1; @@ -1333,6 +1335,10 @@ static __init int print_s5_reset_status_mmio(void) if (!cpu_feature_enabled(X86_FEATURE_ZEN)) return 0;
- /* Xen PV domU cannot access hardware directly, so bail for domU case */
Heh, Xen on Zen crime.
- if (cpu_feature_enabled(X86_FEATURE_XENPV) && !xen_initial_domain())
return 0;- addr = ioremap(FCH_PM_BASE + FCH_PM_S5_RESET_STATUS, sizeof(value)); if (!addr) return 0;
Sean, looka here. The other hypervisor wants other checks.
Time to whip out the X86_FEATURE_HYPERVISOR check.
LOL, Ariadne, be honest, how much did Boris pay you? :-D
Jokes aside, I suppose I'm fine adding a HYPERVISOR check, but at the same time, how is this not a Xen bug? Refusing to create a mapping because the VM doesn't have a device defined at a given GPA is pretty hostile behavior for a hypervisor.
This is a Xen PV guest. No SVM/VT-x.
A PV Guest (by it's contract with Xen) cannot create mappings to host physical addresses it doesn't own. Xen rejects the attempt, and Linux is ignoring the failure in this case. This has been ABI for 25 years.
From a more practical point of view, because guests can read their own pagetables, Xen can't swap the requested PTE for safe alternative to trap the MMIO access, because that violates Linux's model of what's mapped in this position.
~Andrew
On Fri, Dec 19, 2025, Andrew Cooper wrote:
On 19/12/2025 4:09 pm, Sean Christopherson wrote:
On Fri, Dec 19, 2025, Borislav Petkov wrote:
On December 19, 2025 1:01:31 AM UTC, Ariadne Conill ariadne@ariadne.space wrote:
Xen domU cannot access the given MMIO address for security reasons, resulting in a failed hypercall in ioremap() due to permissions.
Why does that matter though? Ah, because set_pte() assumes success, and so presumably the failed hypercall goes unnoticed and trying to access the MMIO #PFs due to !PRESENT mapping.
Fixes: ab8131028710 ("x86/CPU/AMD: Print the reason for the last reset") Signed-off-by: Ariadne Conill ariadne@ariadne.space Cc: xen-devel@lists.xenproject.org Cc: stable@vger.kernel.org
arch/x86/kernel/cpu/amd.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index a6f88ca1a6b4..99308fba4d7d 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -29,6 +29,8 @@ # include <asm/mmconfig.h> #endif
+#include <xen/xen.h>
#include "cpu.h"
u16 invlpgb_count_max __ro_after_init = 1; @@ -1333,6 +1335,10 @@ static __init int print_s5_reset_status_mmio(void) if (!cpu_feature_enabled(X86_FEATURE_ZEN)) return 0;
- /* Xen PV domU cannot access hardware directly, so bail for domU case */
Heh, Xen on Zen crime.
- if (cpu_feature_enabled(X86_FEATURE_XENPV) && !xen_initial_domain())
return 0;- addr = ioremap(FCH_PM_BASE + FCH_PM_S5_RESET_STATUS, sizeof(value)); if (!addr) return 0;
Sean, looka here. The other hypervisor wants other checks.
Time to whip out the X86_FEATURE_HYPERVISOR check.
LOL, Ariadne, be honest, how much did Boris pay you? :-D
Jokes aside, I suppose I'm fine adding a HYPERVISOR check, but at the same time, how is this not a Xen bug? Refusing to create a mapping because the VM doesn't have a device defined at a given GPA is pretty hostile behavior for a hypervisor.
This is a Xen PV guest. No SVM/VT-x.
A PV Guest (by it's contract with Xen) cannot create mappings to host physical addresses it doesn't own.
Huh, assuming wiki.xenproject.org[*] can be trusted, TIL Xen PV has no notion of a virtual motherboard/chipset:
In a paravirtualized VM, guests run with fully paravirtualized disk and network interfaces; interrupts and timers are paravirtualized; there is no emulated motherboard or device bus; guests boot directly into the kernel in the mode the kernel wishes to run in (32-bit or 64-bit), without needing to start in 16-bit mode or go through a BIOS; all privileged instructions are replaced with paravirtualized equivalents (hypercalls), and access to the page tables was paravirtualized as well.
[*] https://wiki.xenproject.org/wiki/Understanding_the_Virtualization_Spectrum
Xen rejects the attempt, and Linux is ignoring the failure in this case. This has been ABI for 25 years.
Heh, I suspected as much.
From a more practical point of view, because guests can read their own pagetables, Xen can't swap the requested PTE for safe alternative to trap the MMIO access, because that violates Linux's model of what's mapped in this position.
That all makes sense, but shouldn't the inability to ioremap() chipset MMIO be addressed in ioremap()? E.g. after poking around a bit, commit 6a92b11169a6 ("x86/EISA: Don't probe EISA bus for Xen PV guests") fudged around the same underlying flaw in eisa_bus_probe(). Ha! Though maybe that's no longer necessary as of 80a4da05642c ("x86/EISA: Use memremap() to probe for the EISA BIOS signature")?
Anyways, I'm still opposed to using HYPERVISOR. E.g. given that Dom0 can apparently access host MMIO just fine, and that it's perfectly reasonable for a KVM-based VMM to emulate the chipset, assuming a guest doesn't have access to some asset is simply wrong.
But rather than play whack-a-mole, can't we deal with the ignore PTE write failures in ioremap()? E.g. something like this?
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index 12c8180ca1ba..b7e2c752af1d 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -29,6 +29,8 @@ #include <asm/memtype.h> #include <asm/setup.h>
+#include <xen/xen.h> + #include "physaddr.h"
/* @@ -301,6 +303,20 @@ __ioremap_caller(resource_size_t phys_addr, unsigned long size, if (ioremap_page_range(vaddr, vaddr + size, phys_addr, prot)) goto err_free_area;
+ /* + * Verify the range was actually mapped when running as a Xen PV DomU + * guest. Xen PV doesn't emulate a virtual chipset/motherboard, and + * disallows DomU from mapping host physical addresses that the domain + * doesn't own. Unfortunately, the PTE APIs assume success, and so + * Xen's rejection of the mapping is ignored. + */ + if (xen_pv_domain() && !xen_initial_domain()) { + int level; + + if (!lookup_address(vaddr, &level)) + goto err_free_area; + } + ret_addr = (void __iomem *) (vaddr + offset); mmiotrace_ioremap(unaligned_phys_addr, unaligned_size, ret_addr);
linux-stable-mirror@lists.linaro.org