Hi.
I tested this on AMD Ryzen and Intel Broadwell systems and dumped boot_cpu_data before and after a microcode update. On the Intel system I also triggered a fatal MCE with mce-inject to confirm the output from the MCE handling code.
P.
---8<---
On systems where a runtime microcode update has occurred, the microcode version reported in an MCE log record is wrong because boot_cpu_data.microcode is not updated at runtime.
Update boot_cpu_data.microcode when the BSP's microcode is updated.
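
As a rough illustration only (this is a userspace model, not kernel code), the stale-snapshot problem and the fix can be sketched as below. The names cpu_model, boot_cpu_model, apply_rev_stale(), apply_rev_fixed() and reported_rev() are hypothetical stand-ins invented for this sketch; only the "update the boot copy when on the BSP" check mirrors the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU data the kernel keeps. */
struct cpu_model {
	int cpu_index;
	uint32_t microcode;
};

/* Boot-time snapshot of the BSP, playing the role of boot_cpu_data. */
static struct cpu_model boot_cpu_model = { .cpu_index = 0, .microcode = 0x10 };

/* Pre-patch behaviour: only the per-CPU copy sees the new revision. */
static void apply_rev_stale(struct cpu_model *c, uint32_t rev)
{
	assert(c);
	c->microcode = rev;
}

/* Post-patch behaviour: also refresh the snapshot when running on the BSP. */
static void apply_rev_fixed(struct cpu_model *c, uint32_t rev)
{
	assert(c);
	c->microcode = rev;
	if (c->cpu_index == boot_cpu_model.cpu_index)
		boot_cpu_model.microcode = rev;
}

/* What a report built from the snapshot would show. */
static uint32_t reported_rev(void)
{
	return boot_cpu_model.microcode;
}
```

In this model, calling apply_rev_stale() on the BSP leaves reported_rev() at the boot-time value, while apply_rev_fixed() on the BSP brings it up to date; an update applied to any other CPU leaves the snapshot untouched in both variants.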
Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records")
Suggested-by: Borislav Petkov <bp@alien8.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Cc: sironi@amazon.de
Cc: tony.luck@intel.com
Changes in v2: Use mc_amd->hdr.patch_id on AMD
 arch/x86/kernel/cpu/microcode/amd.c   | 4 ++++
 arch/x86/kernel/cpu/microcode/intel.c | 4 ++++
 2 files changed, 8 insertions(+)
diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 0624957aa068..63b072377ba4 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -537,6 +537,10 @@ static enum ucode_state apply_microcode_amd(int cpu)
 	uci->cpu_sig.rev = mc_amd->hdr.patch_id;
 	c->microcode = mc_amd->hdr.patch_id;

+	/* Update boot_cpu_data's revision too, if we're on the BSP: */
+	if (c->cpu_index == boot_cpu_data.cpu_index)
+		boot_cpu_data.microcode = mc_amd->hdr.patch_id;
+
 	return UCODE_UPDATED;
 }
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 97ccf4c3b45b..256d336cbc04 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -851,6 +851,10 @@ static enum ucode_state apply_microcode_intel(int cpu)
 	uci->cpu_sig.rev = rev;
 	c->microcode = rev;

+	/* Update boot_cpu_data's revision too, if we're on the BSP: */
+	if (c->cpu_index == boot_cpu_data.cpu_index)
+		boot_cpu_data.microcode = rev;
+
 	return UCODE_UPDATED;
 }
--
2.17.0
After this patch, do we preserve the original microcode version somewhere? If not, why not? It is sometimes useful when debugging a crash caused by faulty microcode.
Thanks.