On 15.02.2022 00:28, Borislav Petkov wrote:
On Mon, Feb 14, 2022 at 02:41:21PM -0800, Pawan Gupta wrote:
X86_FEATURE_RTM_ALWAYS_ABORT is the precondition for the MSR_TFA_TSX_CPUID_CLEAR bit to exist. For the current callers of tsx_clear_cpuid() this condition is met, so the test for X86_FEATURE_RTM_ALWAYS_ABORT can be removed. But all future callers must also perform this check, otherwise the MSR write will fault.
I meant something like this (completely untested):
diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
index c2343ea911e8..9d08a6b1726a 100644
--- a/arch/x86/kernel/cpu/tsx.c
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -84,7 +84,7 @@ static enum tsx_ctrl_states x86_get_tsx_auto_mode(void)
 	return TSX_CTRL_ENABLE;
 }

-void tsx_clear_cpuid(void)
+bool tsx_clear_cpuid(void)
 {
 	u64 msr;

@@ -97,11 +97,14 @@ void tsx_clear_cpuid(void)
 		rdmsrl(MSR_TSX_FORCE_ABORT, msr);
 		msr |= MSR_TFA_TSX_CPUID_CLEAR;
 		wrmsrl(MSR_TSX_FORCE_ABORT, msr);
+		return true;
 	} else if (tsx_ctrl_is_supported()) {
 		rdmsrl(MSR_IA32_TSX_CTRL, msr);
 		msr |= TSX_CTRL_CPUID_CLEAR;
 		wrmsrl(MSR_IA32_TSX_CTRL, msr);
+		return true;
 	}
+
+	return false;
 }

 void __init tsx_init(void)
@@ -114,9 +117,8 @@ void __init tsx_init(void)
 	 * RTM_ALWAYS_ABORT is set. In this case, it is better not to enumerate
	 * CPUID.RTM and CPUID.HLE bits. Clear them here.
	 */
-	if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT)) {
+	if (tsx_clear_cpuid()) {
 		tsx_ctrl_state = TSX_CTRL_RTM_ALWAYS_ABORT;
-		tsx_clear_cpuid();
 		setup_clear_cpu_cap(X86_FEATURE_RTM);
 		setup_clear_cpu_cap(X86_FEATURE_HLE);
 		return;

This will clear TSX CPUID when X86_FEATURE_RTM_ALWAYS_ABORT=0 also, because we are calling tsx_clear_cpuid() unconditionally, but I'm guessing TSX should be disabled by default during boot only when X86_FEATURE_RTM_ALWAYS_ABORT is set.
That is correct.
If CPUs which support disabling TSX only through MSR_IA32_TSX_CTRL, but don't have MSR_TSX_FORCE_ABORT, also set X86_FEATURE_RTM_ALWAYS_ABORT, then this should work.
There are certain cases where this will leave the system in an inconsistent state, for example an smt toggle after a late microcode update.
What is a "smt toggle"?
Sorry, I should have been more clear. I meant:
# echo off > /sys/devices/system/cpu/smt/control
# echo on > /sys/devices/system/cpu/smt/control
You mean late microcode update and then offlining and onlining all logical CPUs except the BSP which would re-detect CPUID features?
That could also be the problematic case.
That is, a late microcode update that adds CPUID.RTM_ALWAYS_ABORT=1. During an smt toggle, if we unconditionally clear CPUID.RTM and CPUID.HLE in init_intel(), half of the CPUs will report the TSX feature and the other half will not.
That is important and should be documented. Something like this perhaps:
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 8321c43554a1..6c7bca9d6f2e 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -722,6 +722,13 @@ static void init_intel(struct cpuinfo_x86 *c)
 	else if (tsx_ctrl_state == TSX_CTRL_DISABLE)
 		tsx_disable();
 	else if (tsx_ctrl_state == TSX_CTRL_RTM_ALWAYS_ABORT)
+		/*
+		 * This call doesn't clear RTM and HLE X86_FEATURE bits because
+		 * a late microcode reload adding MSR_TSX_FORCE_ABORT can cause
+		 * those bits to get cleared - something which the kernel
+		 * cannot do due to userspace potentially already using said
+		 * features.
+		 */
 		tsx_clear_cpuid();
Thanks, I will add this comment in the next revision.
Pawan