On 14.02.2022 18:38, Borislav Petkov wrote:
On Wed, Feb 09, 2022 at 01:04:36PM -0800, Pawan Gupta wrote:
tsx_clear_cpuid() uses MSR_TSX_FORCE_ABORT to clear CPUID.RTM and CPUID.HLE. Not all CPUs support MSR_TSX_FORCE_ABORT, alternatively use MSR_IA32_TSX_CTRL when supported.
Fixes: 293649307ef9 ("x86/tsx: Clear CPUID bits when TSX always force aborts") Reported-by: kernel test robot lkp@intel.com Tested-by: Neelima Krishnan neelima.krishnan@intel.com Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
<--- I'm assuming this needs to be
?
Yes, this needs to be backported to a few kernels that have the commit 293649307ef9 ("x86/tsx: Clear CPUID bits when TSX always force aborts"). Once this is reviewed, I will send a separate email to stable@ with the list of stable kernels.
@@ -106,13 +110,11 @@ void __init tsx_init(void) int ret;
/*
* Hardware will always abort a TSX transaction if both CPUID bits
* RTM_ALWAYS_ABORT and TSX_FORCE_ABORT are set. In this case, it is
* better not to enumerate CPUID.RTM and CPUID.HLE bits. Clear them
* here.
* Hardware will always abort a TSX transaction when CPUID
* RTM_ALWAYS_ABORT is set. In this case, it is better not to enumerate
*/* CPUID.RTM and CPUID.HLE bits. Clear them here.
- if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT) &&
boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) {
- if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT)) {
So you test here X86_FEATURE_RTM_ALWAYS_ABORT and tsx_clear_cpuid() tests it again. Simplify I guess.
X86_FEATURE_RTM_ALWAYS_ABORT is the precondition for MSR_TFA_TSX_CPUID_CLEAR bit to exist. For current callers of tsx_clear_cpuid() this condition is met, and test for X86_FEATURE_RTM_ALWAYS_ABORT can be removed. But, all the future callers must also have this check, otherwise the MSR write will fault.
tsx_ctrl_state = TSX_CTRL_RTM_ALWAYS_ABORT; tsx_clear_cpuid(); setup_clear_cpu_cap(X86_FEATURE_RTM);
Also, here you clear X86_FEATURE_RTM and X86_FEATURE_HLE but the other caller of tsx_clear_cpuid() - init_intel() - doesn't clear those bits. Why?
Calling setup_clear_cpu_cap() by boot CPU is sufficient to clear these bits for secondary CPUs. Moreover, init_intel() is also called during cpu hotplug, clearing cached CPUID bits will not be safe after boot.
IOW, can we concentrate the whole tsx_clear_cpuid() logic inside it and have callers only call it unconditionally. Then the decision whether to disable TSX and clear bits will happen all solely in that function instead of being spread around the boot code and being inconsistent.
There are certain cases where this will leave the system in an inconsistent state, for example smt toggle after a late microcode update that adds CPUID.RTM_ALWAYS_ABORT=1. During an smt toggle, if we unconditionally clear CPUID.RTM and CPUID.HLE in init_intel(), half of the CPUs will report TSX feature and other half will not.
Thanks, Pawan