Hello!
The first patch fixes an issue reported by Sami, where linux panic()s when bringing secondary CPUs online. The problem was the Spectre workarounds trying to allocate a new slot for mitigating KVM when those pages are no longer writeable.
While debugging that issue, I spotted the Spectre-BHB KVM mitigation was over-riding the Spectre-v2 KVM Mitigation. It's supposed to happen the other way round.
The backports aren't the same as mainline because the spectre mitigation code was totally rewritten for v5.10, and prior to that the KVM infrastructure is very different.
Thanks,
James Morse (2):
  arm64: Fix panic() when Spectre-v2 causes Spectre-BHB to re-allocate KVM vectors
  arm64: errata: Fix KVM Spectre-v2 mitigation selection for Cortex-A57/A72

 arch/arm64/kernel/cpu_errata.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)
Sami reports that linux panic()s when resuming from suspend to RAM. This is because when CPUs are brought back online, they re-enable any necessary mitigations.
The Spectre-v2 and Spectre-BHB mitigations interact as both need to be done by KVM when exiting a guest. Slots KVM can use as vectors are allocated, and templates for the mitigation are patched into the vector.
This fails if a new slot needs to be allocated once the kernel has finished booting as it is no longer possible to modify KVM's vectors:
| root@adam:/sys/devices/system/cpu/cpu1# echo 1 > online
| Unable to handle kernel write to read-only memory at virtual add>
| Mem abort info:
| ESR = 0x9600004e
| Exception class = DABT (current EL), IL = 32 bits
| SET = 0, FnV = 0
| EA = 0, S1PTW = 0
| Data abort info:
| ISV = 0, ISS = 0x0000004e
| CM = 0, WnR = 1
| swapper pgtable: 4k pages, 48-bit VAs, pgdp = 000000000f07a71c
| [ffff800000b4b800] pgd=00000009ffff8803, pud=00000009ffff7803, p>
| Internal error: Oops: 9600004e [#1] PREEMPT SMP
| Modules linked in:
| Process swapper/1 (pid: 0, stack limit = 0x0000000063153c53)
| CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.19.252-dirty #14
| Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno De>
| pstate: 000001c5 (nzcv dAIF -PAN -UAO)
| pc : __memcpy+0x48/0x180
| lr : __copy_hyp_vect_bpi+0x64/0x90

| Call trace:
| __memcpy+0x48/0x180
| kvm_setup_bhb_slot+0x204/0x2a8
| spectre_bhb_enable_mitigation+0x1b8/0x1d0
| __verify_local_cpu_caps+0x54/0xf0
| check_local_cpu_capabilities+0xc4/0x184
| secondary_start_kernel+0xb0/0x170
| Code: b8404423 b80044c3 36180064 f8408423 (f80084c3)
| ---[ end trace 859bcacb09555348 ]---
| Kernel panic - not syncing: Attempted to kill the idle task!
| SMP: stopping secondary CPUs
| Kernel Offset: disabled
| CPU features: 0x10,25806086
| Memory Limit: none
| ---[ end Kernel panic - not syncing: Attempted to kill the idle ]
This is only a problem on platforms where there is only one CPU that is vulnerable to both Spectre-v2 and Spectre-BHB.
The Spectre-v2 mitigation identifies the slot it can re-use by the CPU's 'fn'. It unconditionally writes the slot number and 'template_start' pointer. The Spectre-BHB mitigation identifies slots it can re-use by the CPU's template_start pointer, which was previously clobbered by the Spectre-v2 mitigation.
When there is only one CPU that is vulnerable to both issues, this causes Spectre-BHB to try to allocate a new slot, which fails.
Change both mitigations to check whether they are changing the slot this CPU uses before writing the percpu variables again.
This issue only exists in the stable backports for Spectre-BHB which have to use totally different infrastructure to mainline.
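For illustration only, here is a small user-space model of the interaction described above. It is a sketch under stated assumptions, not kernel code: cpu_data, alloc_slot(), enable_v2(), enable_bhb() and late_allocs_after_reonline() are made-up names, the per-CPU variables are modelled as a plain array, and the 'guard' flag switches the check added by this patch on or off. CPU 1 is assumed to be the only CPU vulnerable to both issues.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NR_CPUS 2

typedef void (*cb_t)(void);

/* Hypothetical stand-in for the per-CPU bp_hardening_data. */
struct bp_data {
	int slot;
	cb_t fn;
	const char *template_start;
};

static struct bp_data cpu_data[NR_CPUS];
static int next_slot;		/* models the vector slot allocator */
static int late_allocs;		/* allocations attempted after boot */
static bool booted;		/* KVM's vectors are read-only once true */

static void v2_cb(void) { }
static const char v2_tmpl[] = "v2";	/* stands in for the Spectre-v2 template */
static const char bhb_tmpl[] = "bhb";	/* stands in for the Spectre-BHB template */

static int alloc_slot(void)
{
	if (booted)
		late_allocs++;	/* the real kernel faults in memcpy here */
	return next_slot++;
}

/* Spectre-v2: find a slot to re-use by matching on 'fn'. */
static void enable_v2(int cpu, bool guard)
{
	int slot = -1;

	for (int i = 0; i < NR_CPUS; i++) {
		if (cpu_data[i].fn == v2_cb) {
			slot = cpu_data[i].slot;
			break;
		}
	}
	if (slot == -1)
		slot = alloc_slot();

	/* Without the guard, template_start is clobbered on every call. */
	if (!guard || cpu_data[cpu].fn != v2_cb) {
		cpu_data[cpu].slot = slot;
		cpu_data[cpu].fn = v2_cb;
		cpu_data[cpu].template_start = v2_tmpl;
	}
}

/* Spectre-BHB: find a slot to re-use by matching on 'template_start'. */
static void enable_bhb(int cpu, bool guard)
{
	int slot = -1;

	for (int i = 0; i < NR_CPUS; i++) {
		if (cpu_data[i].template_start == bhb_tmpl) {
			slot = cpu_data[i].slot;
			break;
		}
	}
	if (slot == -1)
		slot = alloc_slot();

	if (!guard || cpu_data[cpu].template_start != bhb_tmpl) {
		cpu_data[cpu].slot = slot;
		cpu_data[cpu].template_start = bhb_tmpl;
	}
}

/* Boot CPU 1, then bring it online again; count late allocations. */
static int late_allocs_after_reonline(bool guard)
{
	memset(cpu_data, 0, sizeof(cpu_data));
	next_slot = 0;
	late_allocs = 0;
	booted = false;

	enable_v2(1, guard);
	enable_bhb(1, guard);
	booted = true;
	enable_v2(1, guard);
	enable_bhb(1, guard);

	return late_allocs;
}
```

In this model, calling late_allocs_after_reonline(false) reproduces the late allocation attempt, and late_allocs_after_reonline(true) does not allocate after boot because the Spectre-BHB lookup still finds its template pointer.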
Reported-by: Sami Lee <sami.lee@mediatek.com>
Fixes: 9013fd4bc958 ("arm64: Mitigate spectre style branch history side channels")
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kernel/cpu_errata.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 4c7545cf5a02..2a7c05640b38 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -170,9 +170,12 @@ static void install_bp_hardening_cb(bp_hardening_cb_t fn,
 		__copy_hyp_vect_bpi(slot, hyp_vecs_start, hyp_vecs_end);
 	}
 
-	__this_cpu_write(bp_hardening_data.hyp_vectors_slot, slot);
-	__this_cpu_write(bp_hardening_data.fn, fn);
-	__this_cpu_write(bp_hardening_data.template_start, hyp_vecs_start);
+	if (fn != __this_cpu_read(bp_hardening_data.fn)) {
+		__this_cpu_write(bp_hardening_data.hyp_vectors_slot, slot);
+		__this_cpu_write(bp_hardening_data.fn, fn);
+		__this_cpu_write(bp_hardening_data.template_start,
+				 hyp_vecs_start);
+	}
 	raw_spin_unlock(&bp_lock);
 }
 #else
@@ -1320,8 +1323,11 @@ static void kvm_setup_bhb_slot(const char *hyp_vecs_start)
 		__copy_hyp_vect_bpi(slot, hyp_vecs_start, hyp_vecs_end);
 	}
 
-	__this_cpu_write(bp_hardening_data.hyp_vectors_slot, slot);
-	__this_cpu_write(bp_hardening_data.template_start, hyp_vecs_start);
+	if (hyp_vecs_start != __this_cpu_read(bp_hardening_data.template_start)) {
+		__this_cpu_write(bp_hardening_data.hyp_vectors_slot, slot);
+		__this_cpu_write(bp_hardening_data.template_start,
+				 hyp_vecs_start);
+	}
 	raw_spin_unlock(&bp_lock);
 }
 #else
Both the Spectre-v2 and Spectre-BHB mitigations involve running a sequence immediately after exiting a guest, before any branches. In the stable kernels these sequences are built by copying templates into an empty vector slot.
For Spectre-BHB, Cortex-A57 and A72 require the branchy loop with k=8. If Spectre-v2 needs mitigating at the same time, a firmware call to EL3 is needed. The work EL3 does at this point is also enough to mitigate Spectre-BHB.
When enabling the Spectre-BHB mitigation, spectre_bhb_enable_mitigation() should check if a slot has already been allocated for Spectre-v2, meaning no work is needed for Spectre-BHB.
This check was missed in the earlier backport. Add it.
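As a rough sketch of the selection rule described above (not the kernel code itself): bhb_pick_vector() and the enum names below are hypothetical, 'k' stands for the loop count the system requires, and 'v2_cb_installed' stands for this CPU already having a Spectre-v2 callback recorded in bp_hardening_data.fn. Only the k=8 and k=24 cases shown in this patch are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the possible outcomes. */
enum bhb_vector {
	BHB_KEEP_EXISTING,	/* leave the already selected vector alone */
	BHB_LOOP_K8,		/* install the k=8 loop template */
	BHB_LOOP_K24,		/* install the k=24 loop template */
};

static enum bhb_vector bhb_pick_vector(int k, bool v2_cb_installed)
{
	switch (k) {
	case 8:
		/*
		 * A CPU that already selected the Spectre-v2 vector keeps
		 * it: the firmware call is enough for Spectre-BHB too.
		 */
		return v2_cb_installed ? BHB_KEEP_EXISTING : BHB_LOOP_K8;
	case 24:
		return BHB_LOOP_K24;
	default:
		/* Other values are outside the scope of this sketch. */
		return BHB_KEEP_EXISTING;
	}
}
```

With this rule, bhb_pick_vector(8, true) leaves the Spectre-v2 vector in place, which is the behaviour the missing check is meant to restore.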
Fixes: 9013fd4bc958 ("arm64: Mitigate spectre style branch history side channels")
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kernel/cpu_errata.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 2a7c05640b38..b18f307a3c59 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -1363,7 +1363,13 @@ void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *entry)
 	} else if (spectre_bhb_loop_affected(SCOPE_LOCAL_CPU)) {
 		switch (spectre_bhb_loop_affected(SCOPE_SYSTEM)) {
 		case 8:
-			kvm_setup_bhb_slot(__spectre_bhb_loop_k8_start);
+			/*
+			 * A57/A72-r0 will already have selected the
+			 * spectre-indirect vector, which is sufficient
+			 * for BHB too.
+			 */
+			if (!__this_cpu_read(bp_hardening_data.fn))
+				kvm_setup_bhb_slot(__spectre_bhb_loop_k8_start);
 			break;
 		case 24:
 			kvm_setup_bhb_slot(__spectre_bhb_loop_k24_start);
On Wed, Nov 30, 2022 at 06:28:17PM +0000, James Morse wrote:
> Hello!
>
> The first patch fixes an issue reported by Sami, where linux panic()s when bringing secondary CPUs online. The problem was the Spectre workarounds trying to allocate a new slot for mitigating KVM when those pages are no longer writeable.
>
> While debugging that issue, I spotted the Spectre-BHB KVM mitigation was over-riding the Spectre-v2 KVM Mitigation. It's supposed to happen the other way round.
>
> The backports aren't the same as mainline because the spectre mitigation code was totally rewritten for v5.10, and prior to that the KVM infrastructure is very different.
All now queued up, thanks.
greg k-h