From: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com>
9636be85cc5b ("x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline") introduces a new cpuhp state for hyperv initialization.
cpuhp_setup_state() returns the state number if state is CPUHP_AP_ONLINE_DYN or CPUHP_BP_PREPARE_DYN and 0 for all other states. For the hyperv case, since a new cpuhp state was introduced it would return 0. However, in hv_machine_shutdown(), the cpuhp_remove_state() call is conditioned upon "hyperv_init_cpuhp > 0". This will never be true and so hv_cpu_die() won't be called on all CPUs. This means the VP assist page won't be reset. When the kexec kernel tries to set up the VP assist page again, the hypervisor corrupts the memory region of the old VP assist page, causing a panic if the kexec kernel is using that memory elsewhere. This was originally fixed in dfe94d4086e4 ("x86/hyperv: Fix kexec panic/hang issues").
Set hyperv_init_cpuhp to CPUHP_AP_HYPERV_ONLINE upon successful setup so that the hyperv cpuhp state is removed correctly on kexec and the necessary cleanup takes place.
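The return-value behaviour described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `model_*` name, the `MODEL_*` enum values and the two `saved_state_*` helpers are hypothetical stand-ins, not kernel code, and error returns from the real cpuhp_setup_state() are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for enum cpuhp_state; the numbers are arbitrary. */
enum model_state {
	MODEL_BP_PREPARE_DYN   = 10,	/* dynamically allocated range */
	MODEL_AP_HYPERV_ONLINE = 20,	/* fixed, statically assigned state */
	MODEL_AP_ONLINE_DYN    = 30,	/* dynamically allocated range */
};

/*
 * Models the convention from the commit message: a request for a dynamic
 * state returns a positive state number, any other state returns 0 on
 * success. Simplification: the requested number itself is returned.
 */
static int model_setup_state(enum model_state state)
{
	if (state == MODEL_AP_ONLINE_DYN || state == MODEL_BP_PREPARE_DYN)
		return (int)state;
	return 0;
}

/* Before the fix: the return value itself is saved in hyperv_init_cpuhp. */
static int saved_state_before_fix(void)
{
	return model_setup_state(MODEL_AP_HYPERV_ONLINE);
}

/* After the fix: the fixed state constant is saved on successful setup. */
static int saved_state_after_fix(void)
{
	if (model_setup_state(MODEL_AP_HYPERV_ONLINE) < 0)
		return 0;	/* setup failed: nothing to remove later */
	return MODEL_AP_HYPERV_ONLINE;
}

/* The guard quoted from hv_machine_shutdown() in this thread. */
static bool shutdown_would_remove_state(bool kexec_in_progress,
					int hyperv_init_cpuhp)
{
	return kexec_in_progress && hyperv_init_cpuhp > 0;
}
```

In this model, `shutdown_would_remove_state(true, saved_state_before_fix())` is false because the saved value is 0, while the same call with `saved_state_after_fix()` is true, which matches the behaviour the patch is aiming for.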
Cc: stable@vger.kernel.org
Fixes: 9636be85cc5b ("x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline")
Signed-off-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com>
---
 arch/x86/hyperv/hv_init.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 17a71e92a343..81d1981a75d1 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -607,7 +607,7 @@ void __init hyperv_init(void)

 	register_syscore_ops(&hv_syscore_ops);

-	hyperv_init_cpuhp = cpuhp;
+	hyperv_init_cpuhp = CPUHP_AP_HYPERV_ONLINE;

 	if (cpuid_ebx(HYPERV_CPUID_FEATURES) & HV_ACCESS_PARTITION_ID)
 		hv_get_partition_id();
@@ -637,7 +637,7 @@ void __init hyperv_init(void)
 clean_guest_os_id:
 	wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
 	hv_ivm_msr_write(HV_X64_MSR_GUEST_OS_ID, 0);
-	cpuhp_remove_state(cpuhp);
+	cpuhp_remove_state(CPUHP_AP_HYPERV_ONLINE);
 free_ghcb_page:
 	free_percpu(hv_ghcb_pg);
 free_vp_assist_page:
Anirudh Rayabharam <anirudh@anirudhrb.com> writes:
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 17a71e92a343..81d1981a75d1 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -607,7 +607,7 @@ void __init hyperv_init(void)
>
>  	register_syscore_ops(&hv_syscore_ops);
>
> -	hyperv_init_cpuhp = cpuhp;
> +	hyperv_init_cpuhp = CPUHP_AP_HYPERV_ONLINE;
Do we really need 'hyperv_init_cpuhp' at all? I.e. post-change (which LGTM btw), I can only see one usage in hv_machine_shutdown():
	if (kexec_in_progress && hyperv_init_cpuhp > 0)
		cpuhp_remove_state(hyperv_init_cpuhp);
and I'm wondering if the 'hyperv_init_cpuhp' check is really needed. The only case where this check would fail is if we're crashing in between ms_hyperv_init_platform() and hyperv_init() afaiu. Does it hurt if we try cpuhp_remove_state() anyway?
>  	if (cpuid_ebx(HYPERV_CPUID_FEATURES) & HV_ACCESS_PARTITION_ID)
>  		hv_get_partition_id();
> @@ -637,7 +637,7 @@ void __init hyperv_init(void)
>  clean_guest_os_id:
>  	wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
>  	hv_ivm_msr_write(HV_X64_MSR_GUEST_OS_ID, 0);
> -	cpuhp_remove_state(cpuhp);
> +	cpuhp_remove_state(CPUHP_AP_HYPERV_ONLINE);
>  free_ghcb_page:
>  	free_percpu(hv_ghcb_pg);
>  free_vp_assist_page:
On Mon, Aug 26, 2024 at 02:36:44PM +0200, Vitaly Kuznetsov wrote:
> Anirudh Rayabharam <anirudh@anirudhrb.com> writes:
> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > index 17a71e92a343..81d1981a75d1 100644
> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -607,7 +607,7 @@ void __init hyperv_init(void)
> >
> >  	register_syscore_ops(&hv_syscore_ops);
> >
> > -	hyperv_init_cpuhp = cpuhp;
> > +	hyperv_init_cpuhp = CPUHP_AP_HYPERV_ONLINE;
> Do we really need 'hyperv_init_cpuhp' at all? I.e. post-change (which LGTM
> btw), I can only see one usage in hv_machine_shutdown():
>
> 	if (kexec_in_progress && hyperv_init_cpuhp > 0)
> 		cpuhp_remove_state(hyperv_init_cpuhp);
>
> and I'm wondering if the 'hyperv_init_cpuhp' check is really needed. The
> only case where this check would fail is if we're crashing in between
> ms_hyperv_init_platform() and hyperv_init() afaiu. Does it
Or if we fail to set up the cpuhp state for some reason but don't actually crash and then later do a kexec?
I guess I was just trying to be extra safe and make sure we have actually set up the cpuhp state before calling cpuhp_remove_state() for it. However, looking elsewhere in the kernel code, I don't see anybody doing this for custom states...
> hurt if we try cpuhp_remove_state() anyway?
cpuhp_invoke_callback() would trigger a WARNING if we try to remove a cpuhp state that was never set up:
	if (cpuhp_step_empty(bringup, step)) {
		WARN_ON_ONCE(1);
		return 0;
	}
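To make the concern about the snippet above concrete, here is a rough userspace model of that check. It is a sketch only: the `model_*` names, the fixed-size step table and the warning counter are assumptions made for illustration, the counter merely stands in for WARN_ON_ONCE(1), and nothing here claims to reproduce the full kernel call path that leads to cpuhp_invoke_callback().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_STATES 4	/* hypothetical, tiny state table */

/* Hypothetical per-state callbacks; NULL means "never set up". */
struct model_step {
	int (*startup)(unsigned int cpu);
	int (*teardown)(unsigned int cpu);
};

static struct model_step model_steps[MODEL_NR_STATES];
static unsigned int model_warnings;		/* stands in for WARN_ON_ONCE(1) */
static unsigned int model_teardown_calls;

/* Example teardown callback that only counts its invocations. */
static int model_teardown(unsigned int cpu)
{
	(void)cpu;
	model_teardown_calls++;
	return 0;
}

/* A step is "empty" if the callback for the requested direction is unset. */
static bool model_step_empty(bool bringup, const struct model_step *step)
{
	return bringup ? step->startup == NULL : step->teardown == NULL;
}

/*
 * Mirrors the shape of the quoted check: an empty step records a warning
 * and returns 0 instead of calling anything.
 * The caller must pass 0 <= state < MODEL_NR_STATES.
 */
static int model_invoke_callback(unsigned int cpu, int state, bool bringup)
{
	const struct model_step *step = &model_steps[state];

	if (model_step_empty(bringup, step)) {
		model_warnings++;
		return 0;
	}
	return bringup ? step->startup(cpu) : step->teardown(cpu);
}
```

In this model, invoking the teardown direction on a state whose callbacks were never installed bumps `model_warnings` and returns 0, whereas a state with `model_teardown` installed runs the callback without a warning. That is the trade-off behind keeping the "hyperv_init_cpuhp > 0" guard.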
Thanks, Anirudh
> >  	if (cpuid_ebx(HYPERV_CPUID_FEATURES) & HV_ACCESS_PARTITION_ID)
> >  		hv_get_partition_id();
> > @@ -637,7 +637,7 @@ void __init hyperv_init(void)
> >  clean_guest_os_id:
> >  	wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
> >  	hv_ivm_msr_write(HV_X64_MSR_GUEST_OS_ID, 0);
> > -	cpuhp_remove_state(cpuhp);
> > +	cpuhp_remove_state(CPUHP_AP_HYPERV_ONLINE);
> >  free_ghcb_page:
> >  	free_percpu(hv_ghcb_pg);
> >  free_vp_assist_page:
> --
> Vitaly