From: Yazen Ghannam <yazen.ghannam(a)amd.com>
The Instruction Fetch (IF) units on AMD Zen-based systems do not
guarantee a synchronous #MC is delivered. Therefore, MCG_STATUS[EIPV|RIPV]
will not be set. However, the microarchitecture does guarantee that the
exception is delivered within the same context. In other words, the
exact rIP is not known, but the context is known to not have changed.
There is no architecturally-defined method to determine this behavior.
The Code Segment (CS) register is always valid on AMD Zen-based IF units
regardless of the value of MCG_STATUS[EIPV|RIPV].
Add a quirk for all current Zen-based systems to save the CS register
for the IF banks.
This is needed to properly determine the context of the error.
Otherwise, the severity grading function will assume the context is
IN_KERNEL due to the m->cs value being 0 (the initialized value). This
leads to unnecessary kernel panics on data poison errors due to the
kernel believing the poison consumption occurred in kernel context.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Yazen Ghannam <yazen.ghannam(a)amd.com>
---
arch/x86/kernel/cpu/mce/amd.c | 17 +++++++++++++++++
arch/x86/kernel/cpu/mce/core.c | 7 +++++++
arch/x86/kernel/cpu/mce/internal.h | 2 ++
3 files changed, 26 insertions(+)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index e486f96b3cb3..141dcdd857b5 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -180,6 +180,23 @@ static struct smca_hwid smca_hwid_mcatypes[] = {
struct smca_bank smca_banks[MAX_NR_BANKS];
EXPORT_SYMBOL_GPL(smca_banks);
+/*
+ * Zen-based Instruction Fetch Units set EIPV=RIPV=0 on poison consumption
+ * errors (XEC = 12). However, the context is still valid, so save the CS
+ * register for later use.
+ */
+void quirk_zen_ifu(int bank, struct mce *m, struct pt_regs *regs)
+{
+ if (smca_get_bank_type(bank) != SMCA_IF)
+ return;
+ if ((m->mcgstatus & (MCG_STATUS_EIPV|MCG_STATUS_RIPV)) != 0)
+ return;
+ if (((m->status >> 16) & 0x1F) != 12)
+ return;
+
+ m->cs = regs->cs;
+}
+
/*
* In SMCA enabled processors, we can have multiple banks for a given IP type.
* So to define a unique name for each bank, we use a temp c-string to append
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index bf7fe87a7e88..308fb644b94a 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1754,6 +1754,13 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
if (c->x86 == 0x15 && c->x86_model <= 0xf)
mce_flags.overflow_recov = 1;
+ if (c->x86 == 0x17 || c->x86 == 0x19)
+ quirk_no_way_out = quirk_zen_ifu;
+ }
+
+ if (c->x86_vendor == X86_VENDOR_HYGON) {
+ if (c->x86 == 0x18)
+ quirk_no_way_out = quirk_zen_ifu;
}
if (c->x86_vendor == X86_VENDOR_INTEL) {
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index 88dcc79cfb07..656d5d6c9783 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -181,8 +181,10 @@ extern struct mca_msr_regs msr_ops;
extern bool filter_mce(struct mce *m);
#ifdef CONFIG_X86_MCE_AMD
+extern void quirk_zen_ifu(int bank, struct mce *m, struct pt_regs *regs);
extern bool amd_filter_mce(struct mce *m);
#else
+#define quirk_zen_ifu NULL
static inline bool amd_filter_mce(struct mce *m) { return false; };
#endif
--
2.25.1
This is the start of the stable review cycle for the 5.10.34 release.
There are 2 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 02 May 2021 14:19:04 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.34-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.34-rc1
Tomas Winkler <tomas.winkler(a)intel.com>
mei: me: add Alder Lake P device id.
Jiri Kosina <jkosina(a)suse.cz>
iwlwifi: Fix softirq/hardirq disabling in iwl_pcie_gen2_enqueue_hcmd()
-------------
Diffstat:
Makefile | 4 ++--
drivers/misc/mei/hw-me-regs.h | 1 +
drivers/misc/mei/pci-me.c | 1 +
drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c | 7 ++++---
4 files changed, 8 insertions(+), 5 deletions(-)