The Processor _PDC buffer bits notify ACPI of the OS capabilities, and so ACPI can adjust the return of other Processor methods taking the OS capabilities into account.
When Linux is running as a Xen dom0, it's the hypervisor the entity in charge of processor power management, and hence Xen needs to make sure the capabilities reported in the _PDC buffer match the capabilities of the driver in Xen.
Introduce a small helper to sanitize the buffer when running as Xen dom0.
Signed-off-by: Roger Pau Monné roger.pau@citrix.com Cc: stable@vger.kernel.org --- arch/x86/include/asm/xen/hypervisor.h | 2 ++ arch/x86/xen/enlighten.c | 17 +++++++++++++++++ drivers/acpi/processor_pdc.c | 8 ++++++++ 3 files changed, 27 insertions(+)
diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h index b9f512138043..b4ed90ef5e68 100644 --- a/arch/x86/include/asm/xen/hypervisor.h +++ b/arch/x86/include/asm/xen/hypervisor.h @@ -63,12 +63,14 @@ void __init mem_map_via_hcall(struct boot_params *boot_params_p);
#ifdef CONFIG_XEN_DOM0 bool __init xen_processor_present(uint32_t acpi_id); +void xen_sanitize_pdc(uint32_t *buf); #else static inline bool xen_processor_present(uint32_t acpi_id) { BUG(); return false; } +static inline void xen_sanitize_pdc(uint32_t *buf) { BUG(); } #endif
#endif /* _ASM_X86_XEN_HYPERVISOR_H */ diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index d4c44361a26c..394dd6675113 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -372,4 +372,21 @@ bool __init xen_processor_present(uint32_t acpi_id)
return false; } + +void xen_sanitize_pdc(uint32_t *buf) +{ + struct xen_platform_op op = { + .cmd = XENPF_set_processor_pminfo, + .interface_version = XENPF_INTERFACE_VERSION, + .u.set_pminfo.id = -1, + .u.set_pminfo.type = XEN_PM_PDC, + }; + int ret; + + set_xen_guest_handle(op.u.set_pminfo.pdc, buf); + ret = HYPERVISOR_platform_op(&op); + if (ret) + pr_info("sanitize of _PDC buffer bits from Xen failed: %d\n", + ret); +} #endif diff --git a/drivers/acpi/processor_pdc.c b/drivers/acpi/processor_pdc.c index 18fb04523f93..58f4c208517a 100644 --- a/drivers/acpi/processor_pdc.c +++ b/drivers/acpi/processor_pdc.c @@ -137,6 +137,14 @@ acpi_processor_eval_pdc(acpi_handle handle, struct acpi_object_list *pdc_in) buffer[2] &= ~(ACPI_PDC_C_C2C3_FFH | ACPI_PDC_C_C1_FFH);
} + if (xen_initial_domain()) + /* + * When Linux is running as Xen dom0 it's the hypervisor the + * entity in charge of the processor power management, and so + * Xen needs to check the OS capabilities reported in the _PDC + * buffer matches what the hypervisor driver supports. + */ + xen_sanitize_pdc((uint32_t *)pdc_in->pointer->buffer.pointer); status = acpi_evaluate_object(handle, "_PDC", pdc_in, NULL);
if (ACPI_FAILURE(status))
On 21.11.2022 11:21, Roger Pau Monne wrote:
--- a/drivers/acpi/processor_pdc.c +++ b/drivers/acpi/processor_pdc.c @@ -137,6 +137,14 @@ acpi_processor_eval_pdc(acpi_handle handle, struct acpi_object_list *pdc_in) buffer[2] &= ~(ACPI_PDC_C_C2C3_FFH | ACPI_PDC_C_C1_FFH); }
- if (xen_initial_domain())
/*
* When Linux is running as Xen dom0 it's the hypervisor the
* entity in charge of the processor power management, and so
* Xen needs to check the OS capabilities reported in the _PDC
* buffer matches what the hypervisor driver supports.
*/
status = acpi_evaluate_object(handle, "_PDC", pdc_in, NULL);xen_sanitize_pdc((uint32_t *)pdc_in->pointer->buffer.pointer);
Again looking at our old XenoLinux forward port we had this inside the earlier if(), as an _alternative_ to the &= (I don't think it's valid to apply both the kernel's and Xen's adjustments). That would also let you use "buffer" rather than re-calculating it via yet another (risky from an abstract pov) cast.
It was the very nature of requiring Xen-specific conditionals which I understand was the reason why so far no attempt was made to get this (incl the corresponding logic for patch 1) into any upstream kernel.
Jan
On 21.11.2022 15:10, Jan Beulich wrote:
On 21.11.2022 11:21, Roger Pau Monne wrote:
--- a/drivers/acpi/processor_pdc.c +++ b/drivers/acpi/processor_pdc.c @@ -137,6 +137,14 @@ acpi_processor_eval_pdc(acpi_handle handle, struct acpi_object_list *pdc_in) buffer[2] &= ~(ACPI_PDC_C_C2C3_FFH | ACPI_PDC_C_C1_FFH); }
- if (xen_initial_domain())
/*
* When Linux is running as Xen dom0 it's the hypervisor the
* entity in charge of the processor power management, and so
* Xen needs to check the OS capabilities reported in the _PDC
* buffer matches what the hypervisor driver supports.
*/
status = acpi_evaluate_object(handle, "_PDC", pdc_in, NULL);xen_sanitize_pdc((uint32_t *)pdc_in->pointer->buffer.pointer);
Again looking at our old XenoLinux forward port we had this inside the earlier if(), as an _alternative_ to the &= (I don't think it's valid to apply both the kernel's and Xen's adjustments). That would also let you use "buffer" rather than re-calculating it via yet another (risky from an abstract pov) cast.
Oh, I notice this can end up being misleading: Besides having it in the earlier if() we had also #ifdef-ed out that if() itself (keeping just the body). The equivalent of this here might then be
if (boot_option_idle_override == IDLE_NOMWAIT || xen_initial_domain()) {
Jan
On Mon, Nov 21, 2022 at 03:10:36PM +0100, Jan Beulich wrote:
On 21.11.2022 11:21, Roger Pau Monne wrote:
--- a/drivers/acpi/processor_pdc.c +++ b/drivers/acpi/processor_pdc.c @@ -137,6 +137,14 @@ acpi_processor_eval_pdc(acpi_handle handle, struct acpi_object_list *pdc_in) buffer[2] &= ~(ACPI_PDC_C_C2C3_FFH | ACPI_PDC_C_C1_FFH); }
- if (xen_initial_domain())
/*
* When Linux is running as Xen dom0 it's the hypervisor the
* entity in charge of the processor power management, and so
* Xen needs to check the OS capabilities reported in the _PDC
* buffer matches what the hypervisor driver supports.
*/
status = acpi_evaluate_object(handle, "_PDC", pdc_in, NULL);xen_sanitize_pdc((uint32_t *)pdc_in->pointer->buffer.pointer);
Again looking at our old XenoLinux forward port we had this inside the earlier if(), as an _alternative_ to the &= (I don't think it's valid to apply both the kernel's and Xen's adjustments). That would also let you use "buffer" rather than re-calculating it via yet another (risky from an abstract pov) cast.
Hm, I've wondered this and decided it wasn't worth to short-circuit the boot_option_idle_override conditional because ACPI_PDC_C_C2C3_FFH and ACPI_PDC_C_C1_FFH will be set anyway by Xen in arch_acpi_set_pdc_bits() as part of ACPI_PDC_C_CAPABILITY_SMP.
I could re-use some of the code in there, but didn't want to make it more difficult to read just for the benefit of reusing buffer.
It was the very nature of requiring Xen-specific conditionals which I understand was the reason why so far no attempt was made to get this (incl the corresponding logic for patch 1) into any upstream kernel.
Yes, well, it's all kind of ugly. Hence my suggestion to simply avoid doing any ACPI Processor object handling in Linux with the native code and handle it all in a Xen specific driver. That requires the Xen driver being able to fetch more data itself form the ACPI Processor methods, but also unties it from the dependency on the data being filled by the generic code, and the 'tricks' is plays into fooling generic code to think certain processors are online.
Thanks, Roger.
Hi, Roger,
On Mon, Nov 21, 2022 at 10:04 AM Roger Pau Monné roger.pau@citrix.com wrote:
On Mon, Nov 21, 2022 at 03:10:36PM +0100, Jan Beulich wrote:
On 21.11.2022 11:21, Roger Pau Monne wrote:
--- a/drivers/acpi/processor_pdc.c +++ b/drivers/acpi/processor_pdc.c @@ -137,6 +137,14 @@ acpi_processor_eval_pdc(acpi_handle handle, struct acpi_object_list *pdc_in) buffer[2] &= ~(ACPI_PDC_C_C2C3_FFH | ACPI_PDC_C_C1_FFH);
}
- if (xen_initial_domain())
/*
* When Linux is running as Xen dom0 it's the hypervisor the
* entity in charge of the processor power management, and so
* Xen needs to check the OS capabilities reported in the _PDC
* buffer matches what the hypervisor driver supports.
*/
status = acpi_evaluate_object(handle, "_PDC", pdc_in, NULL);xen_sanitize_pdc((uint32_t *)pdc_in->pointer->buffer.pointer);
Again looking at our old XenoLinux forward port we had this inside the earlier if(), as an _alternative_ to the &= (I don't think it's valid to apply both the kernel's and Xen's adjustments). That would also let you use "buffer" rather than re-calculating it via yet another (risky from an abstract pov) cast.
Hm, I've wondered this and decided it wasn't worth to short-circuit the boot_option_idle_override conditional because ACPI_PDC_C_C2C3_FFH and ACPI_PDC_C_C1_FFH will be set anyway by Xen in arch_acpi_set_pdc_bits() as part of ACPI_PDC_C_CAPABILITY_SMP.
I could re-use some of the code in there, but didn't want to make it more difficult to read just for the benefit of reusing buffer.
It was the very nature of requiring Xen-specific conditionals which I understand was the reason why so far no attempt was made to get this (incl the corresponding logic for patch 1) into any upstream kernel.
Yes, well, it's all kind of ugly. Hence my suggestion to simply avoid doing any ACPI Processor object handling in Linux with the native code and handle it all in a Xen specific driver. That requires the Xen driver being able to fetch more data itself form the ACPI Processor methods, but also unties it from the dependency on the data being filled by the generic code, and the 'tricks' is plays into fooling generic code to think certain processors are online.
Are you working on this patch anymore? My Xen HWP patches need a Linux patch like this one to set bit 12 in the PDC. I had an affected user test with this patch and it worked, serving as an equivalent of Linux commit a21211672c9a ("ACPI / processor: Request native thermal interrupt handling via _OSC").
Another idea is to use Linux's arch_acpi_set_pdc_bits() to make the hypercall to Xen. It occurs earlier: acpi_processor_set_pdc() acpi_processor_alloc_pdc() acpi_set_pdc_bits() arch_acpi_set_pdc_bits() acpi_processor_eval_pdc()
So the IDLE_NOMWAIT masking in acpi_processor_eval_pdc() would still apply. arch_acpi_set_pdc_bits() is provided the buffer, so it's a little cleaner in that respect.
Thanks, Jason
On Wed, Jun 14, 2023 at 03:57:11PM -0400, Jason Andryuk wrote:
Hi, Roger,
On Mon, Nov 21, 2022 at 10:04 AM Roger Pau Monné roger.pau@citrix.com wrote:
On Mon, Nov 21, 2022 at 03:10:36PM +0100, Jan Beulich wrote:
On 21.11.2022 11:21, Roger Pau Monne wrote:
--- a/drivers/acpi/processor_pdc.c +++ b/drivers/acpi/processor_pdc.c @@ -137,6 +137,14 @@ acpi_processor_eval_pdc(acpi_handle handle, struct acpi_object_list *pdc_in) buffer[2] &= ~(ACPI_PDC_C_C2C3_FFH | ACPI_PDC_C_C1_FFH);
}
- if (xen_initial_domain())
/*
* When Linux is running as Xen dom0 it's the hypervisor the
* entity in charge of the processor power management, and so
* Xen needs to check the OS capabilities reported in the _PDC
* buffer matches what the hypervisor driver supports.
*/
status = acpi_evaluate_object(handle, "_PDC", pdc_in, NULL);xen_sanitize_pdc((uint32_t *)pdc_in->pointer->buffer.pointer);
Again looking at our old XenoLinux forward port we had this inside the earlier if(), as an _alternative_ to the &= (I don't think it's valid to apply both the kernel's and Xen's adjustments). That would also let you use "buffer" rather than re-calculating it via yet another (risky from an abstract pov) cast.
Hm, I've wondered this and decided it wasn't worth to short-circuit the boot_option_idle_override conditional because ACPI_PDC_C_C2C3_FFH and ACPI_PDC_C_C1_FFH will be set anyway by Xen in arch_acpi_set_pdc_bits() as part of ACPI_PDC_C_CAPABILITY_SMP.
I could re-use some of the code in there, but didn't want to make it more difficult to read just for the benefit of reusing buffer.
It was the very nature of requiring Xen-specific conditionals which I understand was the reason why so far no attempt was made to get this (incl the corresponding logic for patch 1) into any upstream kernel.
Yes, well, it's all kind of ugly. Hence my suggestion to simply avoid doing any ACPI Processor object handling in Linux with the native code and handle it all in a Xen specific driver. That requires the Xen driver being able to fetch more data itself form the ACPI Processor methods, but also unties it from the dependency on the data being filled by the generic code, and the 'tricks' is plays into fooling generic code to think certain processors are online.
Are you working on this patch anymore?
Not really, I didn't get any feedback from maintainers (apart from Jans comments, which I do value), and wasn't aware of this causing issues, or being required by any other work, hence I kind of dropped it (I have plenty of other stuff to work on).
My Xen HWP patches need a Linux patch like this one to set bit 12 in the PDC. I had an affected user test with this patch and it worked, serving as an equivalent of Linux commit a21211672c9a ("ACPI / processor: Request native thermal interrupt handling via _OSC").
Another idea is to use Linux's arch_acpi_set_pdc_bits() to make the hypercall to Xen. It occurs earlier: acpi_processor_set_pdc() acpi_processor_alloc_pdc() acpi_set_pdc_bits() arch_acpi_set_pdc_bits() acpi_processor_eval_pdc()
So the IDLE_NOMWAIT masking in acpi_processor_eval_pdc() would still apply. arch_acpi_set_pdc_bits() is provided the buffer, so it's a little cleaner in that respect.
I see. My reasoning for placing the Xen filtering in acpi_processor_eval_pdc() is so that there are no further modifications to the buffer by Linux after the call to sanitize the buffer (XENPF_set_processor_pminfo).
I think if the filtering done by Xen is moved to arch_acpi_set_pdc_bits() we would then need to disable the evaluation of boot_option_idle_override in acpi_processor_eval_pdc() as we don't want dom0 choices affecting the selection of _PDC features done by Xen?
In any case, feel free to pick this patch and re-submit upstream if you want.
Thanks, Roger.
linux-stable-mirror@lists.linaro.org