Greetings,
this series is the backport of 19 upstream patches to add the current s390 spectre mitigation to kernel version 4.9.
It follows the x86 approach with array_index_nospec for the v1 spectre attack and retpoline/expoline for v2. As a fallback there is the ppa-12/ppa-13 based defense which requires an micro-code update.
Christian Borntraeger (3): KVM: s390: wire up bpb feature KVM: s390: force bp isolation for VSIE s390/entry.S: fix spurious zeroing of r0
Eugeniu Rosca (1): s390: Replace IS_ENABLED(EXPOLINE_*) with IS_ENABLED(CONFIG_EXPOLINE_*)
Heiko Carstens (1): s390: enable CPU alternatives unconditionally
Martin Schwidefsky (13): s390: scrub registers on kernel entry and KVM exit s390: add optimized array_index_mask_nospec s390/alternative: use a copy of the facility bit mask s390: add options to change branch prediction behaviour for the kernel s390: run user space and KVM guests with modified branch prediction s390: introduce execute-trampolines for branches s390: do not bypass BPENTER for interrupt system calls s390: move nobp parameter functions to nospec-branch.c s390: add automatic detection of the spectre defense s390: report spectre mitigation via syslog s390: add sysfs attributes for spectre s390: correct nospec auto detection init order s390: correct module section names for expoline code revert
Vasily Gorbik (1): s390: introduce CPU alternatives
Documentation/kernel-parameters.txt | 3 + arch/s390/Kconfig | 47 +++++++ arch/s390/Makefile | 10 ++ arch/s390/include/asm/alternative.h | 149 ++++++++++++++++++++ arch/s390/include/asm/barrier.h | 24 ++++ arch/s390/include/asm/facility.h | 18 +++ arch/s390/include/asm/kvm_host.h | 3 +- arch/s390/include/asm/lowcore.h | 7 +- arch/s390/include/asm/nospec-branch.h | 17 +++ arch/s390/include/asm/processor.h | 4 + arch/s390/include/asm/thread_info.h | 4 + arch/s390/include/uapi/asm/kvm.h | 5 +- arch/s390/kernel/Makefile | 6 +- arch/s390/kernel/alternative.c | 112 +++++++++++++++ arch/s390/kernel/early.c | 5 + arch/s390/kernel/entry.S | 250 ++++++++++++++++++++++++++++++---- arch/s390/kernel/ipl.c | 1 + arch/s390/kernel/module.c | 65 ++++++++- arch/s390/kernel/nospec-branch.c | 169 +++++++++++++++++++++++ arch/s390/kernel/processor.c | 18 +++ arch/s390/kernel/setup.c | 14 +- arch/s390/kernel/smp.c | 7 +- arch/s390/kernel/vmlinux.lds.S | 37 +++++ arch/s390/kvm/kvm-s390.c | 13 +- arch/s390/kvm/vsie.c | 30 ++++ drivers/s390/char/Makefile | 2 + include/uapi/linux/kvm.h | 1 + 27 files changed, 984 insertions(+), 37 deletions(-) create mode 100644 arch/s390/include/asm/alternative.h create mode 100644 arch/s390/include/asm/nospec-branch.h create mode 100644 arch/s390/kernel/alternative.c create mode 100644 arch/s390/kernel/nospec-branch.c
From: Vasily Gorbik gor@linux.vnet.ibm.com
[ Upstream commit 686140a1a9c41d85a4212a1c26d671139b76404b ]
Implement CPU alternatives, which allows to optionally patch newer instructions at runtime, based on CPU facilities availability.
A new kernel boot parameter "noaltinstr" disables patching.
Current implementation is derived from x86 alternatives. Although ideal instructions padding (when altinstr is longer then oldinstr) is added at compile time, and no oldinstr nops optimization has to be done at runtime. Also couple of compile time sanity checks are done: 1. oldinstr and altinstr must be <= 254 bytes long, 2. oldinstr and altinstr must not have an odd length.
alternative(oldinstr, altinstr, facility); alternative_2(oldinstr, altinstr1, facility1, altinstr2, facility2);
Both compile time and runtime padding consists of either 6/4/2 bytes nop or a jump (brcl) + 2 bytes nop filler if padding is longer then 6 bytes.
.altinstructions and .altinstr_replacement sections are part of __init_begin : __init_end region and are freed after initialization.
Signed-off-by: Vasily Gorbik gor@linux.vnet.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- Documentation/kernel-parameters.txt | 3 + arch/s390/Kconfig | 17 ++++ arch/s390/include/asm/alternative.h | 163 ++++++++++++++++++++++++++++++++++++ arch/s390/kernel/Makefile | 1 + arch/s390/kernel/alternative.c | 110 ++++++++++++++++++++++++ arch/s390/kernel/module.c | 17 ++++ arch/s390/kernel/setup.c | 3 + arch/s390/kernel/vmlinux.lds.S | 23 +++++ 8 files changed, 337 insertions(+) create mode 100644 arch/s390/include/asm/alternative.h create mode 100644 arch/s390/kernel/alternative.c
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 466c039c622b..5f9e51436a99 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2640,6 +2640,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
noalign [KNL,ARM]
+ noaltinstr [S390] Disables alternative instructions patching + (CPU alternatives feature). + noapic [SMP,APIC] Tells the kernel to not make use of any IOAPICs that may be present in the system.
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 9aa0d04c9dcc..f3e43262db9d 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -704,6 +704,22 @@ config SECCOMP
If unsure, say Y.
+config ALTERNATIVES + def_bool y + prompt "Patch optimized instructions for running CPU type" + help + When enabled the kernel code is compiled with additional + alternative instructions blocks optimized for newer CPU types. + These alternative instructions blocks are patched at kernel boot + time when running CPU supports them. This mechanism is used to + optimize some critical code paths (i.e. spinlocks) for newer CPUs + even if kernel is build to support older machine generations. + + This mechanism could be disabled by appending "noaltinstr" + option to the kernel command line. + + If unsure, say Y. + endmenu
menu "Power Management" @@ -753,6 +769,7 @@ config PFAULT config SHARED_KERNEL bool "VM shared kernel support" depends on !JUMP_LABEL + depends on !ALTERNATIVES help Select this option, if you want to share the text segment of the Linux kernel between different VM guests. This reduces memory diff --git a/arch/s390/include/asm/alternative.h b/arch/s390/include/asm/alternative.h new file mode 100644 index 000000000000..6c268f6a51d3 --- /dev/null +++ b/arch/s390/include/asm/alternative.h @@ -0,0 +1,163 @@ +#ifndef _ASM_S390_ALTERNATIVE_H +#define _ASM_S390_ALTERNATIVE_H + +#ifndef __ASSEMBLY__ + +#include <linux/types.h> +#include <linux/stddef.h> +#include <linux/stringify.h> + +struct alt_instr { + s32 instr_offset; /* original instruction */ + s32 repl_offset; /* offset to replacement instruction */ + u16 facility; /* facility bit set for replacement */ + u8 instrlen; /* length of original instruction */ + u8 replacementlen; /* length of new instruction */ +} __packed; + +#ifdef CONFIG_ALTERNATIVES +extern void apply_alternative_instructions(void); +extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end); +#else +static inline void apply_alternative_instructions(void) {}; +static inline void apply_alternatives(struct alt_instr *start, + struct alt_instr *end) {}; +#endif +/* + * |661: |662: |6620 |663: + * +-----------+---------------------+ + * | oldinstr | oldinstr_padding | + * | +----------+----------+ + * | | | | + * | | >6 bytes |6/4/2 nops| + * | |6 bytes jg-----------> + * +-----------+---------------------+ + * ^^ static padding ^^ + * + * .altinstr_replacement section + * +---------------------+-----------+ + * |6641: |6651: + * | alternative instr 1 | + * +-----------+---------+- - - - - -+ + * |6642: |6652: | + * | alternative instr 2 | padding + * +---------------------+- - - - - -+ + * ^ runtime ^ + * + * .altinstructions section + * +---------------------------------+ + * | alt_instr entries for each | + * | alternative instr | + * +---------------------------------+ + */ + +#define b_altinstr(num) "664"#num +#define e_altinstr(num) "665"#num + +#define e_oldinstr_pad_end "663" +#define oldinstr_len "662b-661b" +#define oldinstr_total_len e_oldinstr_pad_end"b-661b" +#define altinstr_len(num) e_altinstr(num)"b-"b_altinstr(num)"b" +#define oldinstr_pad_len(num) \ + "-(((" altinstr_len(num) ")-(" oldinstr_len ")) > 0) * " \ + "((" altinstr_len(num) ")-(" oldinstr_len "))" + +#define INSTR_LEN_SANITY_CHECK(len) \ + ".if " len " > 254\n" \ + "\t.error "cpu alternatives does not support instructions " \ + "blocks > 254 bytes"\n" \ + ".endif\n" \ + ".if (" len ") %% 2\n" \ + "\t.error "cpu alternatives instructions length is odd"\n" \ + ".endif\n" + +#define OLDINSTR_PADDING(oldinstr, num) \ + ".if " oldinstr_pad_len(num) " > 6\n" \ + "\tjg " e_oldinstr_pad_end "f\n" \ + "6620:\n" \ + "\t.fill (" oldinstr_pad_len(num) " - (6620b-662b)) / 2, 2, 0x0700\n" \ + ".else\n" \ + "\t.fill " oldinstr_pad_len(num) " / 6, 6, 0xc0040000\n" \ + "\t.fill " oldinstr_pad_len(num) " %% 6 / 4, 4, 0x47000000\n" \ + "\t.fill " oldinstr_pad_len(num) " %% 6 %% 4 / 2, 2, 0x0700\n" \ + ".endif\n" + +#define OLDINSTR(oldinstr, num) \ + "661:\n\t" oldinstr "\n662:\n" \ + OLDINSTR_PADDING(oldinstr, num) \ + e_oldinstr_pad_end ":\n" \ + INSTR_LEN_SANITY_CHECK(oldinstr_len) + +#define OLDINSTR_2(oldinstr, num1, num2) \ + "661:\n\t" oldinstr "\n662:\n" \ + ".if " altinstr_len(num1) " < " altinstr_len(num2) "\n" \ + OLDINSTR_PADDING(oldinstr, num2) \ + ".else\n" \ + OLDINSTR_PADDING(oldinstr, num1) \ + ".endif\n" \ + e_oldinstr_pad_end ":\n" \ + INSTR_LEN_SANITY_CHECK(oldinstr_len) + +#define ALTINSTR_ENTRY(facility, num) \ + "\t.long 661b - .\n" /* old instruction */ \ + "\t.long " b_altinstr(num)"b - .\n" /* alt instruction */ \ + "\t.word " __stringify(facility) "\n" /* facility bit */ \ + "\t.byte " oldinstr_total_len "\n" /* source len */ \ + "\t.byte " altinstr_len(num) "\n" /* alt instruction len */ + +#define ALTINSTR_REPLACEMENT(altinstr, num) /* replacement */ \ + b_altinstr(num)":\n\t" altinstr "\n" e_altinstr(num) ":\n" \ + INSTR_LEN_SANITY_CHECK(altinstr_len(num)) + +#ifdef CONFIG_ALTERNATIVES +/* alternative assembly primitive: */ +#define ALTERNATIVE(oldinstr, altinstr, facility) \ + ".pushsection .altinstr_replacement, "ax"\n" \ + ALTINSTR_REPLACEMENT(altinstr, 1) \ + ".popsection\n" \ + OLDINSTR(oldinstr, 1) \ + ".pushsection .altinstructions,"a"\n" \ + ALTINSTR_ENTRY(facility, 1) \ + ".popsection\n" + +#define ALTERNATIVE_2(oldinstr, altinstr1, facility1, altinstr2, facility2)\ + ".pushsection .altinstr_replacement, "ax"\n" \ + ALTINSTR_REPLACEMENT(altinstr1, 1) \ + ALTINSTR_REPLACEMENT(altinstr2, 2) \ + ".popsection\n" \ + OLDINSTR_2(oldinstr, 1, 2) \ + ".pushsection .altinstructions,"a"\n" \ + ALTINSTR_ENTRY(facility1, 1) \ + ALTINSTR_ENTRY(facility2, 2) \ + ".popsection\n" +#else +/* Alternative instructions are disabled, let's put just oldinstr in */ +#define ALTERNATIVE(oldinstr, altinstr, facility) \ + oldinstr "\n" + +#define ALTERNATIVE_2(oldinstr, altinstr1, facility1, altinstr2, facility2) \ + oldinstr "\n" +#endif + +/* + * Alternative instructions for different CPU types or capabilities. + * + * This allows to use optimized instructions even on generic binary + * kernels. + * + * oldinstr is padded with jump and nops at compile time if altinstr is + * longer. altinstr is padded with jump and nops at run-time during patching. + * + * For non barrier like inlines please define new variants + * without volatile and memory clobber. + */ +#define alternative(oldinstr, altinstr, facility) \ + asm volatile(ALTERNATIVE(oldinstr, altinstr, facility) : : : "memory") + +#define alternative_2(oldinstr, altinstr1, facility1, altinstr2, facility2) \ + asm volatile(ALTERNATIVE_2(oldinstr, altinstr1, facility1, \ + altinstr2, facility2) ::: "memory") + +#endif /* __ASSEMBLY__ */ + +#endif /* _ASM_S390_ALTERNATIVE_H */ diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile index 1f0fe98f6db9..383855ebbfb0 100644 --- a/arch/s390/kernel/Makefile +++ b/arch/s390/kernel/Makefile @@ -75,6 +75,7 @@ obj-$(CONFIG_KPROBES) += kprobes.o obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o obj-$(CONFIG_CRASH_DUMP) += crash_dump.o obj-$(CONFIG_UPROBES) += uprobes.o +obj-$(CONFIG_ALTERNATIVES) += alternative.o
obj-$(CONFIG_PERF_EVENTS) += perf_event.o perf_cpum_cf.o perf_cpum_sf.o obj-$(CONFIG_PERF_EVENTS) += perf_cpum_cf_events.o diff --git a/arch/s390/kernel/alternative.c b/arch/s390/kernel/alternative.c new file mode 100644 index 000000000000..315986a06cf5 --- /dev/null +++ b/arch/s390/kernel/alternative.c @@ -0,0 +1,110 @@ +#include <linux/module.h> +#include <asm/alternative.h> +#include <asm/facility.h> + +#define MAX_PATCH_LEN (255 - 1) + +static int __initdata_or_module alt_instr_disabled; + +static int __init disable_alternative_instructions(char *str) +{ + alt_instr_disabled = 1; + return 0; +} + +early_param("noaltinstr", disable_alternative_instructions); + +struct brcl_insn { + u16 opc; + s32 disp; +} __packed; + +static u16 __initdata_or_module nop16 = 0x0700; +static u32 __initdata_or_module nop32 = 0x47000000; +static struct brcl_insn __initdata_or_module nop48 = { + 0xc004, 0 +}; + +static const void *nops[] __initdata_or_module = { + &nop16, + &nop32, + &nop48 +}; + +static void __init_or_module add_jump_padding(void *insns, unsigned int len) +{ + struct brcl_insn brcl = { + 0xc0f4, + len / 2 + }; + + memcpy(insns, &brcl, sizeof(brcl)); + insns += sizeof(brcl); + len -= sizeof(brcl); + + while (len > 0) { + memcpy(insns, &nop16, 2); + insns += 2; + len -= 2; + } +} + +static void __init_or_module add_padding(void *insns, unsigned int len) +{ + if (len > 6) + add_jump_padding(insns, len); + else if (len >= 2) + memcpy(insns, nops[len / 2 - 1], len); +} + +static void __init_or_module __apply_alternatives(struct alt_instr *start, + struct alt_instr *end) +{ + struct alt_instr *a; + u8 *instr, *replacement; + u8 insnbuf[MAX_PATCH_LEN]; + + /* + * The scan order should be from start to end. A later scanned + * alternative code can overwrite previously scanned alternative code. + */ + for (a = start; a < end; a++) { + int insnbuf_sz = 0; + + instr = (u8 *)&a->instr_offset + a->instr_offset; + replacement = (u8 *)&a->repl_offset + a->repl_offset; + + if (!test_facility(a->facility)) + continue; + + if (unlikely(a->instrlen % 2 || a->replacementlen % 2)) { + WARN_ONCE(1, "cpu alternatives instructions length is " + "odd, skipping patching\n"); + continue; + } + + memcpy(insnbuf, replacement, a->replacementlen); + insnbuf_sz = a->replacementlen; + + if (a->instrlen > a->replacementlen) { + add_padding(insnbuf + a->replacementlen, + a->instrlen - a->replacementlen); + insnbuf_sz += a->instrlen - a->replacementlen; + } + + s390_kernel_write(instr, insnbuf, insnbuf_sz); + } +} + +void __init_or_module apply_alternatives(struct alt_instr *start, + struct alt_instr *end) +{ + if (!alt_instr_disabled) + __apply_alternatives(start, end); +} + +extern struct alt_instr __alt_instructions[], __alt_instructions_end[]; +void __init apply_alternative_instructions(void) +{ + apply_alternatives(__alt_instructions, __alt_instructions_end); +} diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c index fbc07891f9e7..186fdb7f0f3d 100644 --- a/arch/s390/kernel/module.c +++ b/arch/s390/kernel/module.c @@ -31,6 +31,7 @@ #include <linux/kernel.h> #include <linux/moduleloader.h> #include <linux/bug.h> +#include <asm/alternative.h>
#if 0 #define DEBUGP printk @@ -428,6 +429,22 @@ int module_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs, struct module *me) { + const Elf_Shdr *s; + char *secstrings; + + if (IS_ENABLED(CONFIG_ALTERNATIVES)) { + secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset; + for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) { + if (!strcmp(".altinstructions", + secstrings + s->sh_name)) { + /* patch .altinstructions */ + void *aseg = (void *)s->sh_addr; + + apply_alternatives(aseg, aseg + s->sh_size); + } + } + } + jump_label_apply_nops(me); return 0; } diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index e974e53ab597..c3bdb3e80380 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -63,6 +63,7 @@ #include <asm/sclp.h> #include <asm/sysinfo.h> #include <asm/numa.h> +#include <asm/alternative.h> #include "entry.h"
/* @@ -931,6 +932,8 @@ void __init setup_arch(char **cmdline_p) conmode_default(); set_preferred_console();
+ apply_alternative_instructions(); + /* Setup zfcpdump support */ setup_zfcpdump();
diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S index 115bda280d50..b8ec50cb1b6f 100644 --- a/arch/s390/kernel/vmlinux.lds.S +++ b/arch/s390/kernel/vmlinux.lds.S @@ -99,6 +99,29 @@ SECTIONS EXIT_DATA }
+ /* + * struct alt_inst entries. From the header (alternative.h): + * "Alternative instructions for different CPU types or capabilities" + * Think locking instructions on spinlocks. + * Note, that it is a part of __init region. + */ + . = ALIGN(8); + .altinstructions : { + __alt_instructions = .; + *(.altinstructions) + __alt_instructions_end = .; + } + + /* + * And here are the replacement instructions. The linker sticks + * them as binary blobs. The .altinstructions has enough data to + * get the address and the length of them to patch the kernel safely. + * Note, that it is a part of __init region. + */ + .altinstr_replacement : { + *(.altinstr_replacement) + } + /* early.c uses stsi, which requires page aligned data. */ . = ALIGN(PAGE_SIZE); INIT_DATA_SECTION(0x100)
From: Heiko Carstens heiko.carstens@de.ibm.com
[ Upstream commit 049a2c2d486e8cc82c5cd79fa479c5b105b109e9 ]
Remove the CPU_ALTERNATIVES config option and enable the code unconditionally. The config option was only added to avoid a conflict with the named saved segment support. Since that code is gone there is no reason to keep the CPU_ALTERNATIVES config option.
Just enable it unconditionally to also reduce the number of config options and make it less likely that something breaks.
Signed-off-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/Kconfig | 16 ---------------- arch/s390/include/asm/alternative.h | 20 +++----------------- arch/s390/kernel/Makefile | 3 +-- arch/s390/kernel/module.c | 15 ++++++--------- 4 files changed, 10 insertions(+), 44 deletions(-)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index f3e43262db9d..f5a154d438ee 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -704,22 +704,6 @@ config SECCOMP
If unsure, say Y.
-config ALTERNATIVES - def_bool y - prompt "Patch optimized instructions for running CPU type" - help - When enabled the kernel code is compiled with additional - alternative instructions blocks optimized for newer CPU types. - These alternative instructions blocks are patched at kernel boot - time when running CPU supports them. This mechanism is used to - optimize some critical code paths (i.e. spinlocks) for newer CPUs - even if kernel is build to support older machine generations. - - This mechanism could be disabled by appending "noaltinstr" - option to the kernel command line. - - If unsure, say Y. - endmenu
menu "Power Management" diff --git a/arch/s390/include/asm/alternative.h b/arch/s390/include/asm/alternative.h index 6c268f6a51d3..a72002056b54 100644 --- a/arch/s390/include/asm/alternative.h +++ b/arch/s390/include/asm/alternative.h @@ -15,14 +15,9 @@ struct alt_instr { u8 replacementlen; /* length of new instruction */ } __packed;
-#ifdef CONFIG_ALTERNATIVES -extern void apply_alternative_instructions(void); -extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end); -#else -static inline void apply_alternative_instructions(void) {}; -static inline void apply_alternatives(struct alt_instr *start, - struct alt_instr *end) {}; -#endif +void apply_alternative_instructions(void); +void apply_alternatives(struct alt_instr *start, struct alt_instr *end); + /* * |661: |662: |6620 |663: * +-----------+---------------------+ @@ -109,7 +104,6 @@ static inline void apply_alternatives(struct alt_instr *start, b_altinstr(num)":\n\t" altinstr "\n" e_altinstr(num) ":\n" \ INSTR_LEN_SANITY_CHECK(altinstr_len(num))
-#ifdef CONFIG_ALTERNATIVES /* alternative assembly primitive: */ #define ALTERNATIVE(oldinstr, altinstr, facility) \ ".pushsection .altinstr_replacement, "ax"\n" \ @@ -130,14 +124,6 @@ static inline void apply_alternatives(struct alt_instr *start, ALTINSTR_ENTRY(facility1, 1) \ ALTINSTR_ENTRY(facility2, 2) \ ".popsection\n" -#else -/* Alternative instructions are disabled, let's put just oldinstr in */ -#define ALTERNATIVE(oldinstr, altinstr, facility) \ - oldinstr "\n" - -#define ALTERNATIVE_2(oldinstr, altinstr1, facility1, altinstr2, facility2) \ - oldinstr "\n" -#endif
/* * Alternative instructions for different CPU types or capabilities. diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile index 383855ebbfb0..d4b73f28c395 100644 --- a/arch/s390/kernel/Makefile +++ b/arch/s390/kernel/Makefile @@ -57,7 +57,7 @@ obj-y += processor.o sys_s390.o ptrace.o signal.o cpcmd.o ebcdic.o nmi.o obj-y += debug.o irq.o ipl.o dis.o diag.o sclp.o vdso.o als.o obj-y += sysinfo.o jump_label.o lgr.o os_info.o machine_kexec.o pgm_check.o obj-y += runtime_instr.o cache.o fpu.o dumpstack.o -obj-y += entry.o reipl.o relocate_kernel.o +obj-y += entry.o reipl.o relocate_kernel.o alternative.o
extra-y += head.o head64.o vmlinux.lds
@@ -75,7 +75,6 @@ obj-$(CONFIG_KPROBES) += kprobes.o obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o obj-$(CONFIG_CRASH_DUMP) += crash_dump.o obj-$(CONFIG_UPROBES) += uprobes.o -obj-$(CONFIG_ALTERNATIVES) += alternative.o
obj-$(CONFIG_PERF_EVENTS) += perf_event.o perf_cpum_cf.o perf_cpum_sf.o obj-$(CONFIG_PERF_EVENTS) += perf_cpum_cf_events.o diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c index 186fdb7f0f3d..6f68259bf80e 100644 --- a/arch/s390/kernel/module.c +++ b/arch/s390/kernel/module.c @@ -432,16 +432,13 @@ int module_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *s; char *secstrings;
- if (IS_ENABLED(CONFIG_ALTERNATIVES)) { - secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset; - for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) { - if (!strcmp(".altinstructions", - secstrings + s->sh_name)) { - /* patch .altinstructions */ - void *aseg = (void *)s->sh_addr; + secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset; + for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) { + if (!strcmp(".altinstructions", secstrings + s->sh_name)) { + /* patch .altinstructions */ + void *aseg = (void *)s->sh_addr;
- apply_alternatives(aseg, aseg + s->sh_size); - } + apply_alternatives(aseg, aseg + s->sh_size); } }
From: Christian Borntraeger borntraeger@de.ibm.com
[ Upstream commit 35b3fde6203b932b2b1a5b53b3d8808abc9c4f60 ]
The new firmware interfaces for branch prediction behaviour changes are transparently available for the guest. Nevertheless, there is new state attached that should be migrated and properly resetted. Provide a mechanism for handling reset, migration and VSIE.
Signed-off-by: Christian Borntraeger borntraeger@de.ibm.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Cornelia Huck cohuck@redhat.com [Changed capability number to 152. - Radim] Signed-off-by: Radim Krčmář rkrcmar@redhat.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/include/asm/kvm_host.h | 3 ++- arch/s390/include/uapi/asm/kvm.h | 5 ++++- arch/s390/kvm/kvm-s390.c | 13 ++++++++++++- arch/s390/kvm/vsie.c | 10 ++++++++++ include/uapi/linux/kvm.h | 1 + 5 files changed, 29 insertions(+), 3 deletions(-)
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h index a41faf34b034..5792590d0e7c 100644 --- a/arch/s390/include/asm/kvm_host.h +++ b/arch/s390/include/asm/kvm_host.h @@ -181,7 +181,8 @@ struct kvm_s390_sie_block { __u16 ipa; /* 0x0056 */ __u32 ipb; /* 0x0058 */ __u32 scaoh; /* 0x005c */ - __u8 reserved60; /* 0x0060 */ +#define FPF_BPBC 0x20 + __u8 fpf; /* 0x0060 */ __u8 ecb; /* 0x0061 */ __u8 ecb2; /* 0x0062 */ #define ECB3_AES 0x04 diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h index a2ffec4139ad..81c02e198527 100644 --- a/arch/s390/include/uapi/asm/kvm.h +++ b/arch/s390/include/uapi/asm/kvm.h @@ -197,6 +197,7 @@ struct kvm_guest_debug_arch { #define KVM_SYNC_VRS (1UL << 6) #define KVM_SYNC_RICCB (1UL << 7) #define KVM_SYNC_FPRS (1UL << 8) +#define KVM_SYNC_BPBC (1UL << 10) /* definition of registers in kvm_run */ struct kvm_sync_regs { __u64 prefix; /* prefix register */ @@ -217,7 +218,9 @@ struct kvm_sync_regs { }; __u8 reserved[512]; /* for future vector expansion */ __u32 fpc; /* valid on KVM_SYNC_VRS or KVM_SYNC_FPRS */ - __u8 padding[52]; /* riccb needs to be 64byte aligned */ + __u8 bpbc : 1; /* bp mode */ + __u8 reserved2 : 7; + __u8 padding1[51]; /* riccb needs to be 64byte aligned */ __u8 riccb[64]; /* runtime instrumentation controls block */ };
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index a70ff09b4982..2032ab81b2d7 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -401,6 +401,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_S390_RI: r = test_facility(64); break; + case KVM_CAP_S390_BPB: + r = test_facility(82); + break; default: r = 0; } @@ -1713,6 +1716,8 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) kvm_s390_set_prefix(vcpu, 0); if (test_kvm_facility(vcpu->kvm, 64)) vcpu->run->kvm_valid_regs |= KVM_SYNC_RICCB; + if (test_kvm_facility(vcpu->kvm, 82)) + vcpu->run->kvm_valid_regs |= KVM_SYNC_BPBC; /* fprs can be synchronized via vrs, even if the guest has no vx. With * MACHINE_HAS_VX, (load|store)_fpu_regs() will work with vrs format. */ @@ -1829,7 +1834,6 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) if (test_fp_ctl(current->thread.fpu.fpc)) /* User space provided an invalid FPC, let's clear it */ current->thread.fpu.fpc = 0; - save_access_regs(vcpu->arch.host_acrs); restore_access_regs(vcpu->run->s.regs.acrs); gmap_enable(vcpu->arch.enabled_gmap); @@ -1877,6 +1881,7 @@ static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu) current->thread.fpu.fpc = 0; vcpu->arch.sie_block->gbea = 1; vcpu->arch.sie_block->pp = 0; + vcpu->arch.sie_block->fpf &= ~FPF_BPBC; vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID; kvm_clear_async_pf_completion_queue(vcpu); if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm)) @@ -2744,6 +2749,11 @@ static void sync_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) if (riccb->valid) vcpu->arch.sie_block->ecb3 |= 0x01; } + if ((kvm_run->kvm_dirty_regs & KVM_SYNC_BPBC) && + test_kvm_facility(vcpu->kvm, 82)) { + vcpu->arch.sie_block->fpf &= ~FPF_BPBC; + vcpu->arch.sie_block->fpf |= kvm_run->s.regs.bpbc ? FPF_BPBC : 0; + }
kvm_run->kvm_dirty_regs = 0; } @@ -2762,6 +2772,7 @@ static void store_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) kvm_run->s.regs.pft = vcpu->arch.pfault_token; kvm_run->s.regs.pfs = vcpu->arch.pfault_select; kvm_run->s.regs.pfc = vcpu->arch.pfault_compare; + kvm_run->s.regs.bpbc = (vcpu->arch.sie_block->fpf & FPF_BPBC) == FPF_BPBC; }
int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c index d8673e243f13..e56b7724054c 100644 --- a/arch/s390/kvm/vsie.c +++ b/arch/s390/kvm/vsie.c @@ -217,6 +217,12 @@ static void unshadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page) memcpy(scb_o->gcr, scb_s->gcr, 128); scb_o->pp = scb_s->pp;
+ /* branch prediction */ + if (test_kvm_facility(vcpu->kvm, 82)) { + scb_o->fpf &= ~FPF_BPBC; + scb_o->fpf |= scb_s->fpf & FPF_BPBC; + } + /* interrupt intercept */ switch (scb_s->icptcode) { case ICPT_PROGI: @@ -259,6 +265,7 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page) scb_s->ecb3 = 0; scb_s->ecd = 0; scb_s->fac = 0; + scb_s->fpf = 0;
rc = prepare_cpuflags(vcpu, vsie_page); if (rc) @@ -316,6 +323,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page) prefix_unmapped(vsie_page); scb_s->ecb |= scb_o->ecb & 0x10U; } + /* branch prediction */ + if (test_kvm_facility(vcpu->kvm, 82)) + scb_s->fpf |= scb_o->fpf & FPF_BPBC; /* SIMD */ if (test_kvm_facility(vcpu->kvm, 129)) { scb_s->eca |= scb_o->eca & 0x00020000U; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 4ee67cb99143..05b9bb63dbec 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -870,6 +870,7 @@ struct kvm_ppc_smmu_info { #define KVM_CAP_S390_USER_INSTR0 130 #define KVM_CAP_MSI_DEVID 131 #define KVM_CAP_PPC_HTM 132 +#define KVM_CAP_S390_BPB 152
#ifdef KVM_CAP_IRQ_ROUTING
[ Upstream commit 7041d28115e91f2144f811ffe8a195c696b1e1d0 ]
Clear all user space registers on entry to the kernel and all KVM guest registers on KVM guest exit if the register does not contain either a parameter or a result value.
Reviewed-by: Christian Borntraeger borntraeger@de.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/entry.S | 47 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 47 insertions(+)
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 3bc2825173ef..04bff4d0e8f2 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -255,6 +255,12 @@ ENTRY(sie64a) sie_exit: lg %r14,__SF_EMPTY+8(%r15) # load guest register save area stmg %r0,%r13,0(%r14) # save guest gprs 0-13 + xgr %r0,%r0 # clear guest registers to + xgr %r1,%r1 # prevent speculative use + xgr %r2,%r2 + xgr %r3,%r3 + xgr %r4,%r4 + xgr %r5,%r5 lmg %r6,%r14,__SF_GPRS(%r15) # restore kernel registers lg %r2,__SF_EMPTY+16(%r15) # return exit reason code br %r14 @@ -290,6 +296,8 @@ ENTRY(system_call) .Lsysc_vtime: UPDATE_VTIME %r10,%r13,__LC_SYNC_ENTER_TIMER stmg %r0,%r7,__PT_R0(%r11) + # clear user controlled register to prevent speculative use + xgr %r0,%r0 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC @@ -517,6 +525,15 @@ ENTRY(pgm_check_handler) mvc __THREAD_trap_tdb(256,%r14),0(%r13) 3: la %r11,STACK_FRAME_OVERHEAD(%r15) stmg %r0,%r7,__PT_R0(%r11) + # clear user controlled registers to prevent speculative use + xgr %r0,%r0 + xgr %r1,%r1 + xgr %r2,%r2 + xgr %r3,%r3 + xgr %r4,%r4 + xgr %r5,%r5 + xgr %r6,%r6 + xgr %r7,%r7 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC stmg %r8,%r9,__PT_PSW(%r11) mvc __PT_INT_CODE(4,%r11),__LC_PGM_ILC @@ -580,6 +597,16 @@ ENTRY(io_int_handler) lmg %r8,%r9,__LC_IO_OLD_PSW SWITCH_ASYNC __LC_SAVE_AREA_ASYNC,__LC_ASYNC_ENTER_TIMER stmg %r0,%r7,__PT_R0(%r11) + # clear user controlled registers to prevent speculative use + xgr %r0,%r0 + xgr %r1,%r1 + xgr %r2,%r2 + xgr %r3,%r3 + xgr %r4,%r4 + xgr %r5,%r5 + xgr %r6,%r6 + xgr %r7,%r7 + xgr %r10,%r10 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_ASYNC stmg %r8,%r9,__PT_PSW(%r11) mvc __PT_INT_CODE(12,%r11),__LC_SUBCHANNEL_ID @@ -755,6 +782,16 @@ ENTRY(ext_int_handler) lmg %r8,%r9,__LC_EXT_OLD_PSW SWITCH_ASYNC __LC_SAVE_AREA_ASYNC,__LC_ASYNC_ENTER_TIMER stmg %r0,%r7,__PT_R0(%r11) + # clear user controlled registers to prevent speculative use + xgr %r0,%r0 + xgr %r1,%r1 + xgr %r2,%r2 + xgr %r3,%r3 + xgr %r4,%r4 + xgr %r5,%r5 + xgr %r6,%r6 + xgr %r7,%r7 + xgr %r10,%r10 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_ASYNC stmg %r8,%r9,__PT_PSW(%r11) lghi %r1,__LC_EXT_PARAMS2 @@ -925,6 +962,16 @@ ENTRY(mcck_int_handler) .Lmcck_skip: lghi %r14,__LC_GPREGS_SAVE_AREA+64 stmg %r0,%r7,__PT_R0(%r11) + # clear user controlled registers to prevent speculative use + xgr %r0,%r0 + xgr %r1,%r1 + xgr %r2,%r2 + xgr %r3,%r3 + xgr %r4,%r4 + xgr %r5,%r5 + xgr %r6,%r6 + xgr %r7,%r7 + xgr %r10,%r10 mvc __PT_R8(64,%r11),0(%r14) stmg %r8,%r9,__PT_PSW(%r11) xc __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
[ Upstream commit e2dd833389cc4069a96b57bdd24227b5f52288f5 ]
Add an optimized version of the array_index_mask_nospec function for s390 based on a compare and a subtract with borrow.
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/include/asm/barrier.h | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h index 5c8db3ce61c8..03b2e5bf1206 100644 --- a/arch/s390/include/asm/barrier.h +++ b/arch/s390/include/asm/barrier.h @@ -48,6 +48,30 @@ do { \ #define __smp_mb__before_atomic() barrier() #define __smp_mb__after_atomic() barrier()
+/** + * array_index_mask_nospec - generate a mask for array_idx() that is + * ~0UL when the bounds check succeeds and 0 otherwise + * @index: array element index + * @size: number of elements in array + */ +#define array_index_mask_nospec array_index_mask_nospec +static inline unsigned long array_index_mask_nospec(unsigned long index, + unsigned long size) +{ + unsigned long mask; + + if (__builtin_constant_p(size) && size > 0) { + asm(" clgr %2,%1\n" + " slbgr %0,%0\n" + :"=d" (mask) : "d" (size-1), "d" (index) :"cc"); + return mask; + } + asm(" clgr %1,%2\n" + " slbgr %0,%0\n" + :"=d" (mask) : "d" (size), "d" (index) :"cc"); + return ~mask; +} + #include <asm-generic/barrier.h>
#endif /* __ASM_BARRIER_H */
[ Upstream commit cf1489984641369611556bf00c48f945c77bcf02 ]
To be able to switch off specific CPU alternatives with kernel parameters make a copy of the facility bit mask provided by STFLE and use the copy for the decision to apply an alternative.
Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Cornelia Huck cohuck@redhat.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/include/asm/facility.h | 18 ++++++++++++++++++ arch/s390/include/asm/lowcore.h | 3 ++- arch/s390/kernel/alternative.c | 3 ++- arch/s390/kernel/early.c | 3 +++ arch/s390/kernel/setup.c | 4 +++- arch/s390/kernel/smp.c | 4 +++- 6 files changed, 31 insertions(+), 4 deletions(-)
diff --git a/arch/s390/include/asm/facility.h b/arch/s390/include/asm/facility.h index 09b406db7529..7a8a1457dbb8 100644 --- a/arch/s390/include/asm/facility.h +++ b/arch/s390/include/asm/facility.h @@ -17,6 +17,24 @@
#define MAX_FACILITY_BIT (256*8) /* stfle_fac_list has 256 bytes */
+static inline void __set_facility(unsigned long nr, void *facilities) +{ + unsigned char *ptr = (unsigned char *) facilities; + + if (nr >= MAX_FACILITY_BIT) + return; + ptr[nr >> 3] |= 0x80 >> (nr & 7); +} + +static inline void __clear_facility(unsigned long nr, void *facilities) +{ + unsigned char *ptr = (unsigned char *) facilities; + + if (nr >= MAX_FACILITY_BIT) + return; + ptr[nr >> 3] &= ~(0x80 >> (nr & 7)); +} + static inline int __test_facility(unsigned long nr, void *facilities) { unsigned char *ptr; diff --git a/arch/s390/include/asm/lowcore.h b/arch/s390/include/asm/lowcore.h index 7b93b78f423c..d52e7efea7d6 100644 --- a/arch/s390/include/asm/lowcore.h +++ b/arch/s390/include/asm/lowcore.h @@ -150,7 +150,8 @@ struct lowcore { __u8 pad_0x0e20[0x0f00-0x0e20]; /* 0x0e20 */
/* Extended facility list */ - __u64 stfle_fac_list[32]; /* 0x0f00 */ + __u64 stfle_fac_list[16]; /* 0x0f00 */ + __u64 alt_stfle_fac_list[16]; /* 0x0f80 */ __u8 pad_0x1000[0x11b0-0x1000]; /* 0x1000 */
/* Pointer to vector register save area */ diff --git a/arch/s390/kernel/alternative.c b/arch/s390/kernel/alternative.c index 315986a06cf5..f5060af61176 100644 --- a/arch/s390/kernel/alternative.c +++ b/arch/s390/kernel/alternative.c @@ -74,7 +74,8 @@ static void __init_or_module __apply_alternatives(struct alt_instr *start, instr = (u8 *)&a->instr_offset + a->instr_offset; replacement = (u8 *)&a->repl_offset + a->repl_offset;
- if (!test_facility(a->facility)) + if (!__test_facility(a->facility, + S390_lowcore.alt_stfle_fac_list)) continue;
if (unlikely(a->instrlen % 2 || a->replacementlen % 2)) { diff --git a/arch/s390/kernel/early.c b/arch/s390/kernel/early.c index 62578989c74d..171aaa221e4b 100644 --- a/arch/s390/kernel/early.c +++ b/arch/s390/kernel/early.c @@ -299,6 +299,9 @@ static noinline __init void setup_facility_list(void) { stfle(S390_lowcore.stfle_fac_list, ARRAY_SIZE(S390_lowcore.stfle_fac_list)); + memcpy(S390_lowcore.alt_stfle_fac_list, + S390_lowcore.stfle_fac_list, + sizeof(S390_lowcore.alt_stfle_fac_list)); }
static __init void detect_diag9c(void) diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index c3bdb3e80380..433b380896b9 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -336,7 +336,9 @@ static void __init setup_lowcore(void) lc->machine_flags = S390_lowcore.machine_flags; lc->stfl_fac_list = S390_lowcore.stfl_fac_list; memcpy(lc->stfle_fac_list, S390_lowcore.stfle_fac_list, - MAX_FACILITY_BIT/8); + sizeof(lc->stfle_fac_list)); + memcpy(lc->alt_stfle_fac_list, S390_lowcore.alt_stfle_fac_list, + sizeof(lc->alt_stfle_fac_list)); if (MACHINE_HAS_VX) lc->vector_save_area_addr = (unsigned long) &lc->vector_save_area; diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c index 35531fe1c5ea..63c8a840d56b 100644 --- a/arch/s390/kernel/smp.c +++ b/arch/s390/kernel/smp.c @@ -253,7 +253,9 @@ static void pcpu_prepare_secondary(struct pcpu *pcpu, int cpu) __ctl_store(lc->cregs_save_area, 0, 15); save_access_regs((unsigned int *) lc->access_regs_save_area); memcpy(lc->stfle_fac_list, S390_lowcore.stfle_fac_list, - MAX_FACILITY_BIT/8); + sizeof(lc->stfle_fac_list)); + memcpy(lc->alt_stfle_fac_list, S390_lowcore.alt_stfle_fac_list, + sizeof(lc->alt_stfle_fac_list)); }
static void pcpu_attach_task(struct pcpu *pcpu, struct task_struct *tsk)
[ Upstream commit d768bd892fc8f066cd3aa000eb1867bcf32db0ee ]
Add the PPA instruction to the system entry and exit path to switch the kernel to a different branch prediction behaviour. The instructions are added via CPU alternatives and can be disabled with the "nospec" or the "nobp=0" kernel parameter. If the default behaviour selected with CONFIG_KERNEL_NOBP is set to "n" then the "nobp=1" parameter can be used to enable the changed kernel branch prediction.
Acked-by: Cornelia Huck cohuck@redhat.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/Kconfig | 17 ++++++++++++++ arch/s390/include/asm/processor.h | 1 + arch/s390/kernel/alternative.c | 23 +++++++++++++++++++ arch/s390/kernel/early.c | 2 ++ arch/s390/kernel/entry.S | 48 +++++++++++++++++++++++++++++++++++++++ arch/s390/kernel/ipl.c | 1 + arch/s390/kernel/smp.c | 2 ++ 7 files changed, 94 insertions(+)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index f5a154d438ee..02eb7a91ad35 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -704,6 +704,23 @@ config SECCOMP
If unsure, say Y.
+config KERNEL_NOBP + def_bool n + prompt "Enable modified branch prediction for the kernel by default" + help + If this option is selected the kernel will switch to a modified + branch prediction mode if the firmware interface is available. + The modified branch prediction mode improves the behaviour in + regard to speculative execution. + + With the option enabled the kernel parameter "nobp=0" or "nospec" + can be used to run the kernel in the normal branch prediction mode. + + With the option disabled the modified branch prediction mode is + enabled with the "nobp=1" kernel parameter. + + If unsure, say N. + endmenu
menu "Power Management" diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h index 6bcbbece082b..44de5d0014c5 100644 --- a/arch/s390/include/asm/processor.h +++ b/arch/s390/include/asm/processor.h @@ -84,6 +84,7 @@ void cpu_detect_mhz_feature(void); extern const struct seq_operations cpuinfo_op; extern int sysctl_ieee_emulation_warnings; extern void execve_tail(void); +extern void __bpon(void);
/* * User space process size: 2GB for 31 bit, 4TB or 8PT for 64 bit. diff --git a/arch/s390/kernel/alternative.c b/arch/s390/kernel/alternative.c index f5060af61176..ed21de98b79e 100644 --- a/arch/s390/kernel/alternative.c +++ b/arch/s390/kernel/alternative.c @@ -14,6 +14,29 @@ static int __init disable_alternative_instructions(char *str)
early_param("noaltinstr", disable_alternative_instructions);
+static int __init nobp_setup_early(char *str) +{ + bool enabled; + int rc; + + rc = kstrtobool(str, &enabled); + if (rc) + return rc; + if (enabled && test_facility(82)) + __set_facility(82, S390_lowcore.alt_stfle_fac_list); + else + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); + return 0; +} +early_param("nobp", nobp_setup_early); + +static int __init nospec_setup_early(char *str) +{ + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); + return 0; +} +early_param("nospec", nospec_setup_early); + struct brcl_insn { u16 opc; s32 disp; diff --git a/arch/s390/kernel/early.c b/arch/s390/kernel/early.c index 171aaa221e4b..0c7a7d5d95f1 100644 --- a/arch/s390/kernel/early.c +++ b/arch/s390/kernel/early.c @@ -302,6 +302,8 @@ static noinline __init void setup_facility_list(void) memcpy(S390_lowcore.alt_stfle_fac_list, S390_lowcore.stfle_fac_list, sizeof(S390_lowcore.alt_stfle_fac_list)); + if (!IS_ENABLED(CONFIG_KERNEL_NOBP)) + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); }
static __init void detect_diag9c(void) diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 04bff4d0e8f2..bfcf04de5808 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -163,6 +163,34 @@ _PIF_WORK = (_PIF_PER_TRAP) tm off+\addr, \mask .endm
+ .macro BPOFF + .pushsection .altinstr_replacement, "ax" +660: .long 0xb2e8c000 + .popsection +661: .long 0x47000000 + .pushsection .altinstructions, "a" + .long 661b - . + .long 660b - . + .word 82 + .byte 4 + .byte 4 + .popsection + .endm + + .macro BPON + .pushsection .altinstr_replacement, "ax" +662: .long 0xb2e8d000 + .popsection +663: .long 0x47000000 + .pushsection .altinstructions, "a" + .long 663b - . + .long 662b - . + .word 82 + .byte 4 + .byte 4 + .popsection + .endm + .section .kprobes.text, "ax" .Ldummy: /* @@ -175,6 +203,11 @@ _PIF_WORK = (_PIF_PER_TRAP) */ nop 0
+ENTRY(__bpon) + .globl __bpon + BPON + br %r14 + /* * Scheduler resume function, called by switch_to * gpr2 = (task_struct *) prev @@ -234,7 +267,10 @@ ENTRY(sie64a) jnz .Lsie_skip TSTMSK __LC_CPU_FLAGS,_CIF_FPU jo .Lsie_skip # exit if fp/vx regs changed + BPON sie 0(%r14) +.Lsie_exit: + BPOFF .Lsie_skip: ni __SIE_PROG0C+3(%r14),0xfe # no longer in SIE lctlg %c1,%c1,__LC_USER_ASCE # load primary asce @@ -286,6 +322,7 @@ ENTRY(system_call) stpt __LC_SYNC_ENTER_TIMER .Lsysc_stmg: stmg %r8,%r15,__LC_SAVE_AREA_SYNC + BPOFF lg %r10,__LC_LAST_BREAK lg %r12,__LC_THREAD_INFO lghi %r14,_PIF_SYSCALL @@ -332,6 +369,7 @@ ENTRY(system_call) jnz .Lsysc_work # check for work TSTMSK __LC_CPU_FLAGS,_CIF_WORK jnz .Lsysc_work + BPON .Lsysc_restore: lg %r14,__LC_VDSO_PER_CPU lmg %r0,%r10,__PT_R0(%r11) @@ -492,6 +530,7 @@ ENTRY(kernel_thread_starter)
ENTRY(pgm_check_handler) stpt __LC_SYNC_ENTER_TIMER + BPOFF stmg %r8,%r15,__LC_SAVE_AREA_SYNC lg %r10,__LC_LAST_BREAK lg %r12,__LC_THREAD_INFO @@ -590,6 +629,7 @@ ENTRY(pgm_check_handler) ENTRY(io_int_handler) STCK __LC_INT_CLOCK stpt __LC_ASYNC_ENTER_TIMER + BPOFF stmg %r8,%r15,__LC_SAVE_AREA_ASYNC lg %r10,__LC_LAST_BREAK lg %r12,__LC_THREAD_INFO @@ -641,9 +681,13 @@ ENTRY(io_int_handler) lg %r14,__LC_VDSO_PER_CPU lmg %r0,%r10,__PT_R0(%r11) mvc __LC_RETURN_PSW(16),__PT_PSW(%r11) + tm __PT_PSW+1(%r11),0x01 # returning to user ? + jno .Lio_exit_kernel + BPON .Lio_exit_timer: stpt __LC_EXIT_TIMER mvc __VDSO_ECTG_BASE(16,%r14),__LC_EXIT_TIMER +.Lio_exit_kernel: lmg %r11,%r15,__PT_R11(%r11) lpswe __LC_RETURN_PSW .Lio_done: @@ -775,6 +819,7 @@ ENTRY(io_int_handler) ENTRY(ext_int_handler) STCK __LC_INT_CLOCK stpt __LC_ASYNC_ENTER_TIMER + BPOFF stmg %r8,%r15,__LC_SAVE_AREA_ASYNC lg %r10,__LC_LAST_BREAK lg %r12,__LC_THREAD_INFO @@ -824,6 +869,7 @@ ENTRY(psw_idle) .Lpsw_idle_stcctm: #endif oi __LC_CPU_FLAGS+7,_CIF_ENABLED_WAIT + BPON STCK __CLOCK_IDLE_ENTER(%r2) stpt __TIMER_IDLE_ENTER(%r2) .Lpsw_idle_lpsw: @@ -931,6 +977,7 @@ load_fpu_regs: */ ENTRY(mcck_int_handler) STCK __LC_MCCK_CLOCK + BPOFF la %r1,4095 # revalidate r1 spt __LC_CPU_TIMER_SAVE_AREA-4095(%r1) # revalidate cpu timer lmg %r0,%r15,__LC_GPREGS_SAVE_AREA-4095(%r1)# revalidate gprs @@ -997,6 +1044,7 @@ ENTRY(mcck_int_handler) mvc __LC_RETURN_MCCK_PSW(16),__PT_PSW(%r11) # move return PSW tm __LC_RETURN_MCCK_PSW+1,0x01 # returning to user ? jno 0f + BPON stpt __LC_EXIT_TIMER mvc __VDSO_ECTG_BASE(16,%r14),__LC_EXIT_TIMER 0: lmg %r11,%r15,__PT_R11(%r11) diff --git a/arch/s390/kernel/ipl.c b/arch/s390/kernel/ipl.c index 39127b691b78..df49f2a1a7e5 100644 --- a/arch/s390/kernel/ipl.c +++ b/arch/s390/kernel/ipl.c @@ -563,6 +563,7 @@ static struct kset *ipl_kset;
static void __ipl_run(void *unused) { + __bpon(); diag308(DIAG308_LOAD_CLEAR, NULL); if (MACHINE_IS_VM) __cpcmd("IPL", NULL, 0, NULL); diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c index 63c8a840d56b..b34ff0798d38 100644 --- a/arch/s390/kernel/smp.c +++ b/arch/s390/kernel/smp.c @@ -304,6 +304,7 @@ static void pcpu_delegate(struct pcpu *pcpu, void (*func)(void *), mem_assign_absolute(lc->restart_fn, (unsigned long) func); mem_assign_absolute(lc->restart_data, (unsigned long) data); mem_assign_absolute(lc->restart_source, source_cpu); + __bpon(); asm volatile( "0: sigp 0,%0,%2 # sigp restart to target cpu\n" " brc 2,0b # busy, try again\n" @@ -877,6 +878,7 @@ void __cpu_die(unsigned int cpu) void __noreturn cpu_die(void) { idle_task_exit(); + __bpon(); pcpu_sigp_retry(pcpu_devices + smp_processor_id(), SIGP_STOP, 0); for (;;) ; }
[ Upstream commit 6b73044b2b0081ee3dd1cd6eaab7dee552601efb ]
Define TIF_ISOLATE_BP and TIF_ISOLATE_BP_GUEST and add the necessary plumbing in entry.S to be able to run user space and KVM guests with limited branch prediction.
To switch a user space process to limited branch prediction the s390_isolate_bp() function has to be call, and to run a vCPU of a KVM guest associated with the current task with limited branch prediction call s390_isolate_bp_guest().
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/include/asm/processor.h | 3 +++ arch/s390/include/asm/thread_info.h | 4 +++ arch/s390/kernel/entry.S | 49 ++++++++++++++++++++++++++++++++++--- arch/s390/kernel/processor.c | 18 ++++++++++++++ 4 files changed, 70 insertions(+), 4 deletions(-)
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h index 44de5d0014c5..d5842126ec70 100644 --- a/arch/s390/include/asm/processor.h +++ b/arch/s390/include/asm/processor.h @@ -360,6 +360,9 @@ extern void memcpy_absolute(void *, void *, size_t); memcpy_absolute(&(dest), &__tmp, sizeof(__tmp)); \ }
+extern int s390_isolate_bp(void); +extern int s390_isolate_bp_guest(void); + #endif /* __ASSEMBLY__ */
#endif /* __ASM_S390_PROCESSOR_H */ diff --git a/arch/s390/include/asm/thread_info.h b/arch/s390/include/asm/thread_info.h index f15c0398c363..84f2ae44b4e9 100644 --- a/arch/s390/include/asm/thread_info.h +++ b/arch/s390/include/asm/thread_info.h @@ -79,6 +79,8 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src); #define TIF_SECCOMP 5 /* secure computing */ #define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */ #define TIF_UPROBE 7 /* breakpointed or single-stepping */ +#define TIF_ISOLATE_BP 8 /* Run process with isolated BP */ +#define TIF_ISOLATE_BP_GUEST 9 /* Run KVM guests with isolated BP */ #define TIF_31BIT 16 /* 32bit process */ #define TIF_MEMDIE 17 /* is terminating due to OOM killer */ #define TIF_RESTORE_SIGMASK 18 /* restore signal mask in do_signal() */ @@ -94,6 +96,8 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src); #define _TIF_SECCOMP _BITUL(TIF_SECCOMP) #define _TIF_SYSCALL_TRACEPOINT _BITUL(TIF_SYSCALL_TRACEPOINT) #define _TIF_UPROBE _BITUL(TIF_UPROBE) +#define _TIF_ISOLATE_BP _BITUL(TIF_ISOLATE_BP) +#define _TIF_ISOLATE_BP_GUEST _BITUL(TIF_ISOLATE_BP_GUEST) #define _TIF_31BIT _BITUL(TIF_31BIT) #define _TIF_SINGLE_STEP _BITUL(TIF_SINGLE_STEP)
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index bfcf04de5808..0dd787aabac7 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -105,6 +105,7 @@ _PIF_WORK = (_PIF_PER_TRAP) j 3f 1: LAST_BREAK %r14 UPDATE_VTIME %r14,%r15,\timer + BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP 2: lg %r15,__LC_ASYNC_STACK # load async stack 3: la %r11,STACK_FRAME_OVERHEAD(%r15) .endm @@ -191,6 +192,40 @@ _PIF_WORK = (_PIF_PER_TRAP) .popsection .endm
+ .macro BPENTER tif_ptr,tif_mask + .pushsection .altinstr_replacement, "ax" +662: .word 0xc004, 0x0000, 0x0000 # 6 byte nop + .word 0xc004, 0x0000, 0x0000 # 6 byte nop + .popsection +664: TSTMSK \tif_ptr,\tif_mask + jz . + 8 + .long 0xb2e8d000 + .pushsection .altinstructions, "a" + .long 664b - . + .long 662b - . + .word 82 + .byte 12 + .byte 12 + .popsection + .endm + + .macro BPEXIT tif_ptr,tif_mask + TSTMSK \tif_ptr,\tif_mask + .pushsection .altinstr_replacement, "ax" +662: jnz . + 8 + .long 0xb2e8d000 + .popsection +664: jz . + 8 + .long 0xb2e8c000 + .pushsection .altinstructions, "a" + .long 664b - . + .long 662b - . + .word 82 + .byte 8 + .byte 8 + .popsection + .endm + .section .kprobes.text, "ax" .Ldummy: /* @@ -248,9 +283,11 @@ ENTRY(__switch_to) */ ENTRY(sie64a) stmg %r6,%r14,__SF_GPRS(%r15) # save kernel registers + lg %r12,__LC_CURRENT stg %r2,__SF_EMPTY(%r15) # save control block pointer stg %r3,__SF_EMPTY+8(%r15) # save guest register save area xc __SF_EMPTY+16(8,%r15),__SF_EMPTY+16(%r15) # reason code = 0 + mvc __SF_EMPTY+24(8,%r15),__TI_flags(%r12) # copy thread flags TSTMSK __LC_CPU_FLAGS,_CIF_FPU # load guest fp/vx registers ? jno .Lsie_load_guest_gprs brasl %r14,load_fpu_regs # load guest fp/vx regs @@ -267,10 +304,11 @@ ENTRY(sie64a) jnz .Lsie_skip TSTMSK __LC_CPU_FLAGS,_CIF_FPU jo .Lsie_skip # exit if fp/vx regs changed - BPON + BPEXIT __SF_EMPTY+24(%r15),(_TIF_ISOLATE_BP|_TIF_ISOLATE_BP_GUEST) sie 0(%r14) .Lsie_exit: BPOFF + BPENTER __SF_EMPTY+24(%r15),(_TIF_ISOLATE_BP|_TIF_ISOLATE_BP_GUEST) .Lsie_skip: ni __SIE_PROG0C+3(%r14),0xfe # no longer in SIE lctlg %c1,%c1,__LC_USER_ASCE # load primary asce @@ -332,6 +370,7 @@ ENTRY(system_call) LAST_BREAK %r13 .Lsysc_vtime: UPDATE_VTIME %r10,%r13,__LC_SYNC_ENTER_TIMER + BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP stmg %r0,%r7,__PT_R0(%r11) # clear user controlled register to prevent speculative use xgr %r0,%r0 @@ -369,7 +408,7 @@ ENTRY(system_call) jnz .Lsysc_work # check for work TSTMSK __LC_CPU_FLAGS,_CIF_WORK jnz .Lsysc_work - BPON + BPEXIT __TI_flags(%r12),_TIF_ISOLATE_BP .Lsysc_restore: lg %r14,__LC_VDSO_PER_CPU lmg %r0,%r10,__PT_R0(%r11) @@ -555,6 +594,7 @@ ENTRY(pgm_check_handler) j 3f 2: LAST_BREAK %r14 UPDATE_VTIME %r14,%r15,__LC_SYNC_ENTER_TIMER + BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP lg %r15,__LC_KERNEL_STACK lg %r14,__TI_task(%r12) aghi %r14,__TASK_thread # pointer to thread_struct @@ -683,7 +723,7 @@ ENTRY(io_int_handler) mvc __LC_RETURN_PSW(16),__PT_PSW(%r11) tm __PT_PSW+1(%r11),0x01 # returning to user ? jno .Lio_exit_kernel - BPON + BPEXIT __TI_flags(%r12),_TIF_ISOLATE_BP .Lio_exit_timer: stpt __LC_EXIT_TIMER mvc __VDSO_ECTG_BASE(16,%r14),__LC_EXIT_TIMER @@ -1044,7 +1084,7 @@ ENTRY(mcck_int_handler) mvc __LC_RETURN_MCCK_PSW(16),__PT_PSW(%r11) # move return PSW tm __LC_RETURN_MCCK_PSW+1,0x01 # returning to user ? jno 0f - BPON + BPEXIT __TI_flags(%r12),_TIF_ISOLATE_BP stpt __LC_EXIT_TIMER mvc __VDSO_ECTG_BASE(16,%r14),__LC_EXIT_TIMER 0: lmg %r11,%r15,__PT_R11(%r11) @@ -1165,6 +1205,7 @@ cleanup_critical: .quad .Lsie_done
.Lcleanup_sie: + BPENTER __SF_EMPTY+24(%r15),(_TIF_ISOLATE_BP|_TIF_ISOLATE_BP_GUEST) lg %r9,__SF_EMPTY(%r15) # get control block pointer ni __SIE_PROG0C+3(%r9),0xfe # no longer in SIE lctlg %c1,%c1,__LC_USER_ASCE # load primary asce diff --git a/arch/s390/kernel/processor.c b/arch/s390/kernel/processor.c index 81d0808085e6..d856263fd768 100644 --- a/arch/s390/kernel/processor.c +++ b/arch/s390/kernel/processor.c @@ -179,3 +179,21 @@ const struct seq_operations cpuinfo_op = { .stop = c_stop, .show = show_cpuinfo, }; + +int s390_isolate_bp(void) +{ + if (!test_facility(82)) + return -EOPNOTSUPP; + set_thread_flag(TIF_ISOLATE_BP); + return 0; +} +EXPORT_SYMBOL(s390_isolate_bp); + +int s390_isolate_bp_guest(void) +{ + if (!test_facility(82)) + return -EOPNOTSUPP; + set_thread_flag(TIF_ISOLATE_BP_GUEST); + return 0; +} +EXPORT_SYMBOL(s390_isolate_bp_guest);
[ Upstream commit f19fbd5ed642dc31c809596412dab1ed56f2f156 ]
Add CONFIG_EXPOLINE to enable the use of the new -mindirect-branch= and -mfunction_return= compiler options to create a kernel fortified against the specte v2 attack.
With CONFIG_EXPOLINE=y all indirect branches will be issued with an execute type instruction. For z10 or newer the EXRL instruction will be used, for older machines the EX instruction. The typical indirect call
basr %r14,%r1
is replaced with a PC relative call to a new thunk
brasl %r14,__s390x_indirect_jump_r1
The thunk contains the EXRL/EX instruction to the indirect branch
__s390x_indirect_jump_r1: exrl 0,0f j . 0: br %r1
The detour via the execute type instruction has a performance impact. To get rid of the detour the new kernel parameter "nospectre_v2" and "spectre_v2=[on,off,auto]" can be used. If the parameter is specified the kernel and module code will be patched at runtime.
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/Kconfig | 28 +++++++++ arch/s390/Makefile | 10 +++ arch/s390/include/asm/lowcore.h | 4 +- arch/s390/include/asm/nospec-branch.h | 18 ++++++ arch/s390/kernel/Makefile | 4 ++ arch/s390/kernel/entry.S | 113 ++++++++++++++++++++++++++-------- arch/s390/kernel/module.c | 62 ++++++++++++++++--- arch/s390/kernel/nospec-branch.c | 101 ++++++++++++++++++++++++++++++ arch/s390/kernel/setup.c | 4 ++ arch/s390/kernel/smp.c | 1 + arch/s390/kernel/vmlinux.lds.S | 14 +++++ drivers/s390/char/Makefile | 2 + 12 files changed, 326 insertions(+), 35 deletions(-) create mode 100644 arch/s390/include/asm/nospec-branch.h create mode 100644 arch/s390/kernel/nospec-branch.c
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 02eb7a91ad35..c88bb4a50db7 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -721,6 +721,34 @@ config KERNEL_NOBP
If unsure, say N.
+config EXPOLINE + def_bool n + prompt "Avoid speculative indirect branches in the kernel" + help + Compile the kernel with the expoline compiler options to guard + against kernel-to-user data leaks by avoiding speculative indirect + branches. + Requires a compiler with -mindirect-branch=thunk support for full + protection. The kernel may run slower. + + If unsure, say N. + +choice + prompt "Expoline default" + depends on EXPOLINE + default EXPOLINE_FULL + +config EXPOLINE_OFF + bool "spectre_v2=off" + +config EXPOLINE_MEDIUM + bool "spectre_v2=auto" + +config EXPOLINE_FULL + bool "spectre_v2=on" + +endchoice + endmenu
menu "Power Management" diff --git a/arch/s390/Makefile b/arch/s390/Makefile index 54e00526b8df..d241a9fddf43 100644 --- a/arch/s390/Makefile +++ b/arch/s390/Makefile @@ -79,6 +79,16 @@ ifeq ($(call cc-option-yn,-mwarn-dynamicstack),y) cflags-$(CONFIG_WARN_DYNAMIC_STACK) += -mwarn-dynamicstack endif
+ifdef CONFIG_EXPOLINE + ifeq ($(call cc-option-yn,$(CC_FLAGS_MARCH) -mindirect-branch=thunk),y) + CC_FLAGS_EXPOLINE := -mindirect-branch=thunk + CC_FLAGS_EXPOLINE += -mfunction-return=thunk + CC_FLAGS_EXPOLINE += -mindirect-branch-table + export CC_FLAGS_EXPOLINE + cflags-y += $(CC_FLAGS_EXPOLINE) + endif +endif + ifdef CONFIG_FUNCTION_TRACER # make use of hotpatch feature if the compiler supports it cc_hotpatch := -mhotpatch=0,3 diff --git a/arch/s390/include/asm/lowcore.h b/arch/s390/include/asm/lowcore.h index d52e7efea7d6..ad4e0cee1557 100644 --- a/arch/s390/include/asm/lowcore.h +++ b/arch/s390/include/asm/lowcore.h @@ -135,7 +135,9 @@ struct lowcore { /* Per cpu primary space access list */ __u32 paste[16]; /* 0x0400 */
- __u8 pad_0x04c0[0x0e00-0x0440]; /* 0x0440 */ + /* br %r1 trampoline */ + __u16 br_r1_trampoline; /* 0x0440 */ + __u8 pad_0x0442[0x0e00-0x0442]; /* 0x0442 */
/* * 0xe00 contains the address of the IPL Parameter Information diff --git a/arch/s390/include/asm/nospec-branch.h b/arch/s390/include/asm/nospec-branch.h new file mode 100644 index 000000000000..7df48e5cf36f --- /dev/null +++ b/arch/s390/include/asm/nospec-branch.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_S390_EXPOLINE_H +#define _ASM_S390_EXPOLINE_H + +#ifndef __ASSEMBLY__ + +#include <linux/types.h> + +extern int nospec_call_disable; +extern int nospec_return_disable; + +void nospec_init_branches(void); +void nospec_call_revert(s32 *start, s32 *end); +void nospec_return_revert(s32 *start, s32 *end); + +#endif /* __ASSEMBLY__ */ + +#endif /* _ASM_S390_EXPOLINE_H */ diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile index d4b73f28c395..1c3e5d529cc1 100644 --- a/arch/s390/kernel/Makefile +++ b/arch/s390/kernel/Makefile @@ -42,6 +42,7 @@ ifneq ($(CC_FLAGS_MARCH),-march=z900) CFLAGS_REMOVE_sclp.o += $(CC_FLAGS_MARCH) CFLAGS_sclp.o += -march=z900 CFLAGS_REMOVE_als.o += $(CC_FLAGS_MARCH) +CFLAGS_REMOVE_als.o += $(CC_FLAGS_EXPOLINE) CFLAGS_als.o += -march=z900 AFLAGS_REMOVE_head.o += $(CC_FLAGS_MARCH) AFLAGS_head.o += -march=z900 @@ -61,6 +62,9 @@ obj-y += entry.o reipl.o relocate_kernel.o alternative.o
extra-y += head.o head64.o vmlinux.lds
+obj-$(CONFIG_EXPOLINE) += nospec-branch.o +CFLAGS_REMOVE_expoline.o += $(CC_FLAGS_EXPOLINE) + obj-$(CONFIG_MODULES) += module.o obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_SCHED_TOPOLOGY) += topology.o diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 0dd787aabac7..8cbe75d7353d 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -226,6 +226,68 @@ _PIF_WORK = (_PIF_PER_TRAP) .popsection .endm
+#ifdef CONFIG_EXPOLINE + + .macro GEN_BR_THUNK name,reg,tmp + .section .text.\name,"axG",@progbits,\name,comdat + .globl \name + .hidden \name + .type \name,@function +\name: + .cfi_startproc +#ifdef CONFIG_HAVE_MARCH_Z10_FEATURES + exrl 0,0f +#else + larl \tmp,0f + ex 0,0(\tmp) +#endif + j . +0: br \reg + .cfi_endproc + .endm + + GEN_BR_THUNK __s390x_indirect_jump_r1use_r9,%r9,%r1 + GEN_BR_THUNK __s390x_indirect_jump_r1use_r14,%r14,%r1 + GEN_BR_THUNK __s390x_indirect_jump_r11use_r14,%r14,%r11 + + .macro BASR_R14_R9 +0: brasl %r14,__s390x_indirect_jump_r1use_r9 + .pushsection .s390_indirect_branches,"a",@progbits + .long 0b-. + .popsection + .endm + + .macro BR_R1USE_R14 +0: jg __s390x_indirect_jump_r1use_r14 + .pushsection .s390_indirect_branches,"a",@progbits + .long 0b-. + .popsection + .endm + + .macro BR_R11USE_R14 +0: jg __s390x_indirect_jump_r11use_r14 + .pushsection .s390_indirect_branches,"a",@progbits + .long 0b-. + .popsection + .endm + +#else /* CONFIG_EXPOLINE */ + + .macro BASR_R14_R9 + basr %r14,%r9 + .endm + + .macro BR_R1USE_R14 + br %r14 + .endm + + .macro BR_R11USE_R14 + br %r14 + .endm + +#endif /* CONFIG_EXPOLINE */ + + .section .kprobes.text, "ax" .Ldummy: /* @@ -241,7 +303,7 @@ _PIF_WORK = (_PIF_PER_TRAP) ENTRY(__bpon) .globl __bpon BPON - br %r14 + BR_R1USE_R14
/* * Scheduler resume function, called by switch_to @@ -269,9 +331,9 @@ ENTRY(__switch_to) mvc __LC_CURRENT_PID(4,%r0),__TASK_pid(%r3) # store pid of next lmg %r6,%r15,__SF_GPRS(%r15) # load gprs of next task TSTMSK __LC_MACHINE_FLAGS,MACHINE_FLAG_LPP - bzr %r14 + jz 0f .insn s,0xb2800000,__LC_LPP # set program parameter - br %r14 +0: BR_R1USE_R14
.L__critical_start:
@@ -337,7 +399,7 @@ sie_exit: xgr %r5,%r5 lmg %r6,%r14,__SF_GPRS(%r15) # restore kernel registers lg %r2,__SF_EMPTY+16(%r15) # return exit reason code - br %r14 + BR_R1USE_R14 .Lsie_fault: lghi %r14,-EFAULT stg %r14,__SF_EMPTY+16(%r15) # set exit reason code @@ -396,7 +458,7 @@ ENTRY(system_call) lgf %r9,0(%r8,%r10) # get system call add. TSTMSK __TI_flags(%r12),_TIF_TRACE jnz .Lsysc_tracesys - basr %r14,%r9 # call sys_xxxx + BASR_R14_R9 # call sys_xxxx stg %r2,__PT_R2(%r11) # store return value
.Lsysc_return: @@ -536,7 +598,7 @@ ENTRY(system_call) lmg %r3,%r7,__PT_R3(%r11) stg %r7,STACK_FRAME_OVERHEAD(%r15) lg %r2,__PT_ORIG_GPR2(%r11) - basr %r14,%r9 # call sys_xxx + BASR_R14_R9 # call sys_xxx stg %r2,__PT_R2(%r11) # store return value .Lsysc_tracenogo: TSTMSK __TI_flags(%r12),_TIF_TRACE @@ -560,7 +622,7 @@ ENTRY(ret_from_fork) lmg %r9,%r10,__PT_R9(%r11) # load gprs ENTRY(kernel_thread_starter) la %r2,0(%r10) - basr %r14,%r9 + BASR_R14_R9 j .Lsysc_tracenogo
/* @@ -634,9 +696,9 @@ ENTRY(pgm_check_handler) nill %r10,0x007f sll %r10,2 je .Lpgm_return - lgf %r1,0(%r10,%r1) # load address of handler routine + lgf %r9,0(%r10,%r1) # load address of handler routine lgr %r2,%r11 # pass pointer to pt_regs - basr %r14,%r1 # branch to interrupt-handler + BASR_R14_R9 # branch to interrupt-handler .Lpgm_return: LOCKDEP_SYS_EXIT tm __PT_PSW+1(%r11),0x01 # returning to user ? @@ -914,7 +976,7 @@ ENTRY(psw_idle) stpt __TIMER_IDLE_ENTER(%r2) .Lpsw_idle_lpsw: lpswe __SF_EMPTY(%r15) - br %r14 + BR_R1USE_R14 .Lpsw_idle_end:
/* @@ -928,7 +990,7 @@ ENTRY(save_fpu_regs) lg %r2,__LC_CURRENT aghi %r2,__TASK_thread TSTMSK __LC_CPU_FLAGS,_CIF_FPU - bor %r14 + jo .Lsave_fpu_regs_exit stfpc __THREAD_FPU_fpc(%r2) .Lsave_fpu_regs_fpc_end: lg %r3,__THREAD_FPU_regs(%r2) @@ -958,7 +1020,8 @@ ENTRY(save_fpu_regs) std 15,120(%r3) .Lsave_fpu_regs_done: oi __LC_CPU_FLAGS+7,_CIF_FPU - br %r14 +.Lsave_fpu_regs_exit: + BR_R1USE_R14 .Lsave_fpu_regs_end: #if IS_ENABLED(CONFIG_KVM) EXPORT_SYMBOL(save_fpu_regs) @@ -978,7 +1041,7 @@ load_fpu_regs: lg %r4,__LC_CURRENT aghi %r4,__TASK_thread TSTMSK __LC_CPU_FLAGS,_CIF_FPU - bnor %r14 + jno .Lload_fpu_regs_exit lfpc __THREAD_FPU_fpc(%r4) TSTMSK __LC_MACHINE_FLAGS,MACHINE_FLAG_VX lg %r4,__THREAD_FPU_regs(%r4) # %r4 <- reg save area @@ -1007,7 +1070,8 @@ load_fpu_regs: ld 15,120(%r4) .Lload_fpu_regs_done: ni __LC_CPU_FLAGS+7,255-_CIF_FPU - br %r14 +.Lload_fpu_regs_exit: + BR_R1USE_R14 .Lload_fpu_regs_end:
.L__critical_end: @@ -1180,7 +1244,7 @@ cleanup_critical: jl 0f clg %r9,BASED(.Lcleanup_table+104) # .Lload_fpu_regs_end jl .Lcleanup_load_fpu_regs -0: br %r14 +0: BR_R11USE_R14
.align 8 .Lcleanup_table: @@ -1210,7 +1274,7 @@ cleanup_critical: ni __SIE_PROG0C+3(%r9),0xfe # no longer in SIE lctlg %c1,%c1,__LC_USER_ASCE # load primary asce larl %r9,sie_exit # skip forward to sie_exit - br %r14 + BR_R11USE_R14 #endif
.Lcleanup_system_call: @@ -1267,7 +1331,7 @@ cleanup_critical: stg %r15,56(%r11) # r15 stack pointer # set new psw address and exit larl %r9,.Lsysc_do_svc - br %r14 + BR_R11USE_R14 .Lcleanup_system_call_insn: .quad system_call .quad .Lsysc_stmg @@ -1277,7 +1341,7 @@ cleanup_critical:
.Lcleanup_sysc_tif: larl %r9,.Lsysc_tif - br %r14 + BR_R11USE_R14
.Lcleanup_sysc_restore: # check if stpt has been executed @@ -1294,14 +1358,14 @@ cleanup_critical: mvc 0(64,%r11),__PT_R8(%r9) lmg %r0,%r7,__PT_R0(%r9) 1: lmg %r8,%r9,__LC_RETURN_PSW - br %r14 + BR_R11USE_R14 .Lcleanup_sysc_restore_insn: .quad .Lsysc_exit_timer .quad .Lsysc_done - 4
.Lcleanup_io_tif: larl %r9,.Lio_tif - br %r14 + BR_R11USE_R14
.Lcleanup_io_restore: # check if stpt has been executed @@ -1315,7 +1379,7 @@ cleanup_critical: mvc 0(64,%r11),__PT_R8(%r9) lmg %r0,%r7,__PT_R0(%r9) 1: lmg %r8,%r9,__LC_RETURN_PSW - br %r14 + BR_R11USE_R14 .Lcleanup_io_restore_insn: .quad .Lio_exit_timer .quad .Lio_done - 4 @@ -1368,17 +1432,17 @@ cleanup_critical: # prepare return psw nihh %r8,0xfcfd # clear irq & wait state bits lg %r9,48(%r11) # return from psw_idle - br %r14 + BR_R11USE_R14 .Lcleanup_idle_insn: .quad .Lpsw_idle_lpsw
.Lcleanup_save_fpu_regs: larl %r9,save_fpu_regs - br %r14 + BR_R11USE_R14
.Lcleanup_load_fpu_regs: larl %r9,load_fpu_regs - br %r14 + BR_R11USE_R14
/* * Integer constants @@ -1394,7 +1458,6 @@ cleanup_critical: .Lsie_critical_length: .quad .Lsie_done - .Lsie_gmap #endif - .section .rodata, "a" #define SYSCALL(esame,emu) .long esame .globl sys_call_table diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c index 6f68259bf80e..f461b09b3e85 100644 --- a/arch/s390/kernel/module.c +++ b/arch/s390/kernel/module.c @@ -32,6 +32,8 @@ #include <linux/moduleloader.h> #include <linux/bug.h> #include <asm/alternative.h> +#include <asm/nospec-branch.h> +#include <asm/facility.h>
#if 0 #define DEBUGP printk @@ -168,7 +170,11 @@ int module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs, me->arch.got_offset = me->core_layout.size; me->core_layout.size += me->arch.got_size; me->arch.plt_offset = me->core_layout.size; - me->core_layout.size += me->arch.plt_size; + if (me->arch.plt_size) { + if (IS_ENABLED(CONFIG_EXPOLINE) && !nospec_call_disable) + me->arch.plt_size += PLT_ENTRY_SIZE; + me->core_layout.size += me->arch.plt_size; + } return 0; }
@@ -322,9 +328,21 @@ static int apply_rela(Elf_Rela *rela, Elf_Addr base, Elf_Sym *symtab, unsigned int *ip; ip = me->core_layout.base + me->arch.plt_offset + info->plt_offset; - ip[0] = 0x0d10e310; /* basr 1,0; lg 1,10(1); br 1 */ - ip[1] = 0x100a0004; - ip[2] = 0x07f10000; + ip[0] = 0x0d10e310; /* basr 1,0 */ + ip[1] = 0x100a0004; /* lg 1,10(1) */ + if (IS_ENABLED(CONFIG_EXPOLINE) && + !nospec_call_disable) { + unsigned int *ij; + ij = me->core_layout.base + + me->arch.plt_offset + + me->arch.plt_size - PLT_ENTRY_SIZE; + ip[2] = 0xa7f40000 + /* j __jump_r1 */ + (unsigned int)(u16) + (((unsigned long) ij - 8 - + (unsigned long) ip) / 2); + } else { + ip[2] = 0x07f10000; /* br %r1 */ + } ip[3] = (unsigned int) (val >> 32); ip[4] = (unsigned int) val; info->plt_initialized = 1; @@ -430,16 +448,42 @@ int module_finalize(const Elf_Ehdr *hdr, struct module *me) { const Elf_Shdr *s; - char *secstrings; + char *secstrings, *secname; + void *aseg; + + if (IS_ENABLED(CONFIG_EXPOLINE) && + !nospec_call_disable && me->arch.plt_size) { + unsigned int *ij; + + ij = me->core_layout.base + me->arch.plt_offset + + me->arch.plt_size - PLT_ENTRY_SIZE; + if (test_facility(35)) { + ij[0] = 0xc6000000; /* exrl %r0,.+10 */ + ij[1] = 0x0005a7f4; /* j . */ + ij[2] = 0x000007f1; /* br %r1 */ + } else { + ij[0] = 0x44000000 | (unsigned int) + offsetof(struct lowcore, br_r1_trampoline); + ij[1] = 0xa7f40000; /* j . */ + } + }
secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset; for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) { - if (!strcmp(".altinstructions", secstrings + s->sh_name)) { - /* patch .altinstructions */ - void *aseg = (void *)s->sh_addr; + aseg = (void *) s->sh_addr; + secname = secstrings + s->sh_name;
+ if (!strcmp(".altinstructions", secname)) + /* patch .altinstructions */ apply_alternatives(aseg, aseg + s->sh_size); - } + + if (IS_ENABLED(CONFIG_EXPOLINE) && + (!strcmp(".nospec_call_table", secname))) + nospec_call_revert(aseg, aseg + s->sh_size); + + if (IS_ENABLED(CONFIG_EXPOLINE) && + (!strcmp(".nospec_return_table", secname))) + nospec_return_revert(aseg, aseg + s->sh_size); }
jump_label_apply_nops(me); diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c new file mode 100644 index 000000000000..86ee26a612cf --- /dev/null +++ b/arch/s390/kernel/nospec-branch.c @@ -0,0 +1,101 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/module.h> +#include <asm/facility.h> +#include <asm/nospec-branch.h> + +int nospec_call_disable = IS_ENABLED(EXPOLINE_OFF); +int nospec_return_disable = !IS_ENABLED(EXPOLINE_FULL); + +static int __init nospectre_v2_setup_early(char *str) +{ + nospec_call_disable = 1; + nospec_return_disable = 1; + return 0; +} +early_param("nospectre_v2", nospectre_v2_setup_early); + +static int __init spectre_v2_setup_early(char *str) +{ + if (str && !strncmp(str, "on", 2)) { + nospec_call_disable = 0; + nospec_return_disable = 0; + } + if (str && !strncmp(str, "off", 3)) { + nospec_call_disable = 1; + nospec_return_disable = 1; + } + if (str && !strncmp(str, "auto", 4)) { + nospec_call_disable = 0; + nospec_return_disable = 1; + } + return 0; +} +early_param("spectre_v2", spectre_v2_setup_early); + +static void __init_or_module __nospec_revert(s32 *start, s32 *end) +{ + enum { BRCL_EXPOLINE, BRASL_EXPOLINE } type; + u8 *instr, *thunk, *br; + u8 insnbuf[6]; + s32 *epo; + + /* Second part of the instruction replace is always a nop */ + memcpy(insnbuf + 2, (char[]) { 0x47, 0x00, 0x00, 0x00 }, 4); + for (epo = start; epo < end; epo++) { + instr = (u8 *) epo + *epo; + if (instr[0] == 0xc0 && (instr[1] & 0x0f) == 0x04) + type = BRCL_EXPOLINE; /* brcl instruction */ + else if (instr[0] == 0xc0 && (instr[1] & 0x0f) == 0x05) + type = BRASL_EXPOLINE; /* brasl instruction */ + else + continue; + thunk = instr + (*(int *)(instr + 2)) * 2; + if (thunk[0] == 0xc6 && thunk[1] == 0x00) + /* exrl %r0,<target-br> */ + br = thunk + (*(int *)(thunk + 2)) * 2; + else if (thunk[0] == 0xc0 && (thunk[1] & 0x0f) == 0x00 && + thunk[6] == 0x44 && thunk[7] == 0x00 && + (thunk[8] & 0x0f) == 0x00 && thunk[9] == 0x00 && + (thunk[1] & 0xf0) == (thunk[8] & 0xf0)) + /* larl %rx,<target br> + ex %r0,0(%rx) */ + br = thunk + (*(int *)(thunk + 2)) * 2; + else + continue; + if (br[0] != 0x07 || (br[1] & 0xf0) != 0xf0) + continue; + switch (type) { + case BRCL_EXPOLINE: + /* brcl to thunk, replace with br + nop */ + insnbuf[0] = br[0]; + insnbuf[1] = (instr[1] & 0xf0) | (br[1] & 0x0f); + break; + case BRASL_EXPOLINE: + /* brasl to thunk, replace with basr + nop */ + insnbuf[0] = 0x0d; + insnbuf[1] = (instr[1] & 0xf0) | (br[1] & 0x0f); + break; + } + + s390_kernel_write(instr, insnbuf, 6); + } +} + +void __init_or_module nospec_call_revert(s32 *start, s32 *end) +{ + if (nospec_call_disable) + __nospec_revert(start, end); +} + +void __init_or_module nospec_return_revert(s32 *start, s32 *end) +{ + if (nospec_return_disable) + __nospec_revert(start, end); +} + +extern s32 __nospec_call_start[], __nospec_call_end[]; +extern s32 __nospec_return_start[], __nospec_return_end[]; +void __init nospec_init_branches(void) +{ + nospec_call_revert(__nospec_call_start, __nospec_call_end); + nospec_return_revert(__nospec_return_start, __nospec_return_end); +} diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index 433b380896b9..bc25366f0244 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -64,6 +64,7 @@ #include <asm/sysinfo.h> #include <asm/numa.h> #include <asm/alternative.h> +#include <asm/nospec-branch.h> #include "entry.h"
/* @@ -375,6 +376,7 @@ static void __init setup_lowcore(void) #ifdef CONFIG_SMP lc->spinlock_lockval = arch_spin_lockval(0); #endif + lc->br_r1_trampoline = 0x07f1; /* br %r1 */
set_prefix((u32)(unsigned long) lc); lowcore_ptr[0] = lc; @@ -935,6 +937,8 @@ void __init setup_arch(char **cmdline_p) set_preferred_console();
apply_alternative_instructions(); + if (IS_ENABLED(CONFIG_EXPOLINE)) + nospec_init_branches();
/* Setup zfcpdump support */ setup_zfcpdump(); diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c index b34ff0798d38..0a31110f41f6 100644 --- a/arch/s390/kernel/smp.c +++ b/arch/s390/kernel/smp.c @@ -205,6 +205,7 @@ static int pcpu_alloc_lowcore(struct pcpu *pcpu, int cpu) lc->panic_stack = panic_stack + PANIC_FRAME_OFFSET; lc->cpu_nr = cpu; lc->spinlock_lockval = arch_spin_lockval(cpu); + lc->br_r1_trampoline = 0x07f1; /* br %r1 */ if (MACHINE_HAS_VX) lc->vector_save_area_addr = (unsigned long) &lc->vector_save_area; diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S index b8ec50cb1b6f..dd96b467946b 100644 --- a/arch/s390/kernel/vmlinux.lds.S +++ b/arch/s390/kernel/vmlinux.lds.S @@ -122,6 +122,20 @@ SECTIONS *(.altinstr_replacement) }
+ /* + * Table with the patch locations to undo expolines + */ + .nospec_call_table : { + __nospec_call_start = . ; + *(.s390_indirect*) + __nospec_call_end = . ; + } + .nospec_return_table : { + __nospec_return_start = . ; + *(.s390_return*) + __nospec_return_end = . ; + } + /* early.c uses stsi, which requires page aligned data. */ . = ALIGN(PAGE_SIZE); INIT_DATA_SECTION(0x100) diff --git a/drivers/s390/char/Makefile b/drivers/s390/char/Makefile index 41e28b23b26a..8ac27efe34fc 100644 --- a/drivers/s390/char/Makefile +++ b/drivers/s390/char/Makefile @@ -2,6 +2,8 @@ # S/390 character devices #
+CFLAGS_REMOVE_sclp_early_core.o += $(CC_FLAGS_EXPOLINE) + obj-y += ctrlchar.o keyboard.o defkeymap.o sclp.o sclp_rw.o sclp_quiesce.o \ sclp_cmd.o sclp_config.o sclp_cpi_sys.o sclp_ocf.o sclp_ctl.o \ sclp_early.o
From: Christian Borntraeger borntraeger@de.ibm.com
[ Upstream commit f315104ad8b0c32be13eac628569ae707c332cb5 ]
If the guest runs with bp isolation when doing a SIE instruction, we must also run the nested guest with bp isolation when emulating that SIE instruction. This is done by activating BPBC in the lpar, which acts as an override for lower level guests.
Signed-off-by: Christian Borntraeger borntraeger@de.ibm.com Reviewed-by: Janosch Frank frankja@linux.vnet.ibm.com Reviewed-by: David Hildenbrand david@redhat.com Signed-off-by: Christian Borntraeger borntraeger@de.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kvm/vsie.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c index e56b7724054c..ced6c9b8f04d 100644 --- a/arch/s390/kvm/vsie.c +++ b/arch/s390/kvm/vsie.c @@ -764,6 +764,7 @@ static int do_vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page) { struct kvm_s390_sie_block *scb_s = &vsie_page->scb_s; struct kvm_s390_sie_block *scb_o = vsie_page->scb_o; + int guest_bp_isolation; int rc;
handle_last_fault(vcpu, vsie_page); @@ -774,6 +775,20 @@ static int do_vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page) s390_handle_mcck();
srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx); + + /* save current guest state of bp isolation override */ + guest_bp_isolation = test_thread_flag(TIF_ISOLATE_BP_GUEST); + + /* + * The guest is running with BPBC, so we have to force it on for our + * nested guest. This is done by enabling BPBC globally, so the BPBC + * control in the SCB (which the nested guest can modify) is simply + * ignored. + */ + if (test_kvm_facility(vcpu->kvm, 82) && + vcpu->arch.sie_block->fpf & FPF_BPBC) + set_thread_flag(TIF_ISOLATE_BP_GUEST); + local_irq_disable(); guest_enter_irqoff(); local_irq_enable(); @@ -783,6 +798,11 @@ static int do_vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page) local_irq_disable(); guest_exit_irqoff(); local_irq_enable(); + + /* restore guest state for bp isolation override */ + if (!guest_bp_isolation) + clear_thread_flag(TIF_ISOLATE_BP_GUEST); + vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
if (rc > 0)
From: Eugeniu Rosca erosca@de.adit-jv.com
[ Upstream commit 2cb370d615e9fbed9e95ed222c2c8f337181aa90 ]
I've accidentally stumbled upon the IS_ENABLED(EXPOLINE_*) lines, which obviously always evaluate to false. Fix this.
Fixes: f19fbd5ed642 ("s390: introduce execute-trampolines for branches") Signed-off-by: Eugeniu Rosca erosca@de.adit-jv.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/nospec-branch.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c index 86ee26a612cf..57f55c24c21c 100644 --- a/arch/s390/kernel/nospec-branch.c +++ b/arch/s390/kernel/nospec-branch.c @@ -3,8 +3,8 @@ #include <asm/facility.h> #include <asm/nospec-branch.h>
-int nospec_call_disable = IS_ENABLED(EXPOLINE_OFF); -int nospec_return_disable = !IS_ENABLED(EXPOLINE_FULL); +int nospec_call_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF); +int nospec_return_disable = !IS_ENABLED(CONFIG_EXPOLINE_FULL);
static int __init nospectre_v2_setup_early(char *str) {
[ Upstream commit d5feec04fe578c8dbd9e2e1439afc2f0af761ed4 ]
The system call path can be interrupted before the switch back to the standard branch prediction with BPENTER has been done. The critical section cleanup code skips forward to .Lsysc_do_svc and bypasses the BPENTER. In this case the kernel and all subsequent code will run with the limited branch prediction.
Fixes: eacf67eb9b32 ("s390: run user space and KVM guests with modified branch prediction") Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/entry.S | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 8cbe75d7353d..c3e28af207cf 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -1316,7 +1316,8 @@ cleanup_critical: srag %r9,%r9,23 jz 0f mvc __TI_last_break(8,%r12),16(%r11) -0: # set up saved register r11 +0: BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP + # set up saved register r11 lg %r15,__LC_KERNEL_STACK la %r9,STACK_FRAME_OVERHEAD(%r15) stg %r9,24(%r11) # r11 pt_regs pointer
From: Christian Borntraeger borntraeger@de.ibm.com
[ Upstream commit d3f468963cd6fd6d2aa5e26aed8b24232096d0e1 ]
when a system call is interrupted we might call the critical section cleanup handler that re-does some of the operations. When we are between .Lsysc_vtime and .Lsysc_do_svc we might also redo the saving of the problem state registers r0-r7:
.Lcleanup_system_call: [...] 0: # update accounting time stamp mvc __LC_LAST_UPDATE_TIMER(8),__LC_SYNC_ENTER_TIMER # set up saved register r11 lg %r15,__LC_KERNEL_STACK la %r9,STACK_FRAME_OVERHEAD(%r15) stg %r9,24(%r11) # r11 pt_regs pointer # fill pt_regs mvc __PT_R8(64,%r9),__LC_SAVE_AREA_SYNC ---> stmg %r0,%r7,__PT_R0(%r9)
The problem is now, that we might have already zeroed out r0. The fix is to move the zeroing of r0 after sysc_do_svc.
Reported-by: Farhan Ali alifm@linux.vnet.ibm.com Fixes: 7041d28115e91 ("s390: scrub registers on kernel entry and KVM exit") Signed-off-by: Christian Borntraeger borntraeger@de.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/entry.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index c3e28af207cf..1996afeb2e81 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -434,13 +434,13 @@ ENTRY(system_call) UPDATE_VTIME %r10,%r13,__LC_SYNC_ENTER_TIMER BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP stmg %r0,%r7,__PT_R0(%r11) - # clear user controlled register to prevent speculative use - xgr %r0,%r0 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC stg %r14,__PT_FLAGS(%r11) .Lsysc_do_svc: + # clear user controlled register to prevent speculative use + xgr %r0,%r0 lg %r10,__TI_sysc_table(%r12) # address of system call table llgh %r8,__PT_INT_CODE+2(%r11) slag %r8,%r8,2 # shift and test for svc 0
[ Upstream commit b2e2f43a01bace1a25bdbae04c9f9846882b727a ]
Keep the code for the nobp parameter handling with the code for expolines. Both are related to the spectre v2 mitigation.
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/Makefile | 4 ++-- arch/s390/kernel/alternative.c | 23 ----------------------- arch/s390/kernel/nospec-branch.c | 27 +++++++++++++++++++++++++++ 3 files changed, 29 insertions(+), 25 deletions(-)
diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile index 1c3e5d529cc1..0501cac2ab95 100644 --- a/arch/s390/kernel/Makefile +++ b/arch/s390/kernel/Makefile @@ -59,11 +59,11 @@ obj-y += debug.o irq.o ipl.o dis.o diag.o sclp.o vdso.o als.o obj-y += sysinfo.o jump_label.o lgr.o os_info.o machine_kexec.o pgm_check.o obj-y += runtime_instr.o cache.o fpu.o dumpstack.o obj-y += entry.o reipl.o relocate_kernel.o alternative.o +obj-y += nospec-branch.o
extra-y += head.o head64.o vmlinux.lds
-obj-$(CONFIG_EXPOLINE) += nospec-branch.o -CFLAGS_REMOVE_expoline.o += $(CC_FLAGS_EXPOLINE) +CFLAGS_REMOVE_nospec-branch.o += $(CC_FLAGS_EXPOLINE)
obj-$(CONFIG_MODULES) += module.o obj-$(CONFIG_SMP) += smp.o diff --git a/arch/s390/kernel/alternative.c b/arch/s390/kernel/alternative.c index ed21de98b79e..f5060af61176 100644 --- a/arch/s390/kernel/alternative.c +++ b/arch/s390/kernel/alternative.c @@ -14,29 +14,6 @@ static int __init disable_alternative_instructions(char *str)
early_param("noaltinstr", disable_alternative_instructions);
-static int __init nobp_setup_early(char *str) -{ - bool enabled; - int rc; - - rc = kstrtobool(str, &enabled); - if (rc) - return rc; - if (enabled && test_facility(82)) - __set_facility(82, S390_lowcore.alt_stfle_fac_list); - else - __clear_facility(82, S390_lowcore.alt_stfle_fac_list); - return 0; -} -early_param("nobp", nobp_setup_early); - -static int __init nospec_setup_early(char *str) -{ - __clear_facility(82, S390_lowcore.alt_stfle_fac_list); - return 0; -} -early_param("nospec", nospec_setup_early); - struct brcl_insn { u16 opc; s32 disp; diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c index 57f55c24c21c..3ce59d044a4d 100644 --- a/arch/s390/kernel/nospec-branch.c +++ b/arch/s390/kernel/nospec-branch.c @@ -3,6 +3,31 @@ #include <asm/facility.h> #include <asm/nospec-branch.h>
+static int __init nobp_setup_early(char *str) +{ + bool enabled; + int rc; + + rc = kstrtobool(str, &enabled); + if (rc) + return rc; + if (enabled && test_facility(82)) + __set_facility(82, S390_lowcore.alt_stfle_fac_list); + else + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); + return 0; +} +early_param("nobp", nobp_setup_early); + +static int __init nospec_setup_early(char *str) +{ + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); + return 0; +} +early_param("nospec", nospec_setup_early); + +#ifdef CONFIG_EXPOLINE + int nospec_call_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF); int nospec_return_disable = !IS_ENABLED(CONFIG_EXPOLINE_FULL);
@@ -99,3 +124,5 @@ void __init nospec_init_branches(void) nospec_call_revert(__nospec_call_start, __nospec_call_end); nospec_return_revert(__nospec_return_start, __nospec_return_end); } + +#endif /* CONFIG_EXPOLINE */
[ Upstream commit 6e179d64126b909f0b288fa63cdbf07c531e9b1d ]
Automatically decide between nobp vs. expolines if the spectre_v2=auto kernel parameter is specified or CONFIG_EXPOLINE_AUTO=y is set.
The decision made at boot time due to CONFIG_EXPOLINE_AUTO=y being set can be overruled with the nobp, nospec and spectre_v2 kernel parameters.
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/Kconfig | 2 +- arch/s390/Makefile | 2 +- arch/s390/include/asm/nospec-branch.h | 6 ++-- arch/s390/kernel/alternative.c | 1 + arch/s390/kernel/module.c | 11 +++--- arch/s390/kernel/nospec-branch.c | 68 +++++++++++++++++++++-------------- 6 files changed, 52 insertions(+), 38 deletions(-)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index c88bb4a50db7..fbb8fca6f9c2 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -741,7 +741,7 @@ choice config EXPOLINE_OFF bool "spectre_v2=off"
-config EXPOLINE_MEDIUM +config EXPOLINE_AUTO bool "spectre_v2=auto"
config EXPOLINE_FULL diff --git a/arch/s390/Makefile b/arch/s390/Makefile index d241a9fddf43..bef67c0f63e2 100644 --- a/arch/s390/Makefile +++ b/arch/s390/Makefile @@ -85,7 +85,7 @@ ifdef CONFIG_EXPOLINE CC_FLAGS_EXPOLINE += -mfunction-return=thunk CC_FLAGS_EXPOLINE += -mindirect-branch-table export CC_FLAGS_EXPOLINE - cflags-y += $(CC_FLAGS_EXPOLINE) + cflags-y += $(CC_FLAGS_EXPOLINE) -DCC_USING_EXPOLINE endif endif
diff --git a/arch/s390/include/asm/nospec-branch.h b/arch/s390/include/asm/nospec-branch.h index 7df48e5cf36f..35bf28fe4c64 100644 --- a/arch/s390/include/asm/nospec-branch.h +++ b/arch/s390/include/asm/nospec-branch.h @@ -6,12 +6,10 @@
#include <linux/types.h>
-extern int nospec_call_disable; -extern int nospec_return_disable; +extern int nospec_disable;
void nospec_init_branches(void); -void nospec_call_revert(s32 *start, s32 *end); -void nospec_return_revert(s32 *start, s32 *end); +void nospec_revert(s32 *start, s32 *end);
#endif /* __ASSEMBLY__ */
diff --git a/arch/s390/kernel/alternative.c b/arch/s390/kernel/alternative.c index f5060af61176..b57b293998dc 100644 --- a/arch/s390/kernel/alternative.c +++ b/arch/s390/kernel/alternative.c @@ -1,6 +1,7 @@ #include <linux/module.h> #include <asm/alternative.h> #include <asm/facility.h> +#include <asm/nospec-branch.h>
#define MAX_PATCH_LEN (255 - 1)
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c index f461b09b3e85..f3f7321bef1a 100644 --- a/arch/s390/kernel/module.c +++ b/arch/s390/kernel/module.c @@ -171,7 +171,7 @@ int module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs, me->core_layout.size += me->arch.got_size; me->arch.plt_offset = me->core_layout.size; if (me->arch.plt_size) { - if (IS_ENABLED(CONFIG_EXPOLINE) && !nospec_call_disable) + if (IS_ENABLED(CONFIG_EXPOLINE) && !nospec_disable) me->arch.plt_size += PLT_ENTRY_SIZE; me->core_layout.size += me->arch.plt_size; } @@ -330,8 +330,7 @@ static int apply_rela(Elf_Rela *rela, Elf_Addr base, Elf_Sym *symtab, info->plt_offset; ip[0] = 0x0d10e310; /* basr 1,0 */ ip[1] = 0x100a0004; /* lg 1,10(1) */ - if (IS_ENABLED(CONFIG_EXPOLINE) && - !nospec_call_disable) { + if (IS_ENABLED(CONFIG_EXPOLINE) && !nospec_disable) { unsigned int *ij; ij = me->core_layout.base + me->arch.plt_offset + @@ -452,7 +451,7 @@ int module_finalize(const Elf_Ehdr *hdr, void *aseg;
if (IS_ENABLED(CONFIG_EXPOLINE) && - !nospec_call_disable && me->arch.plt_size) { + !nospec_disable && me->arch.plt_size) { unsigned int *ij;
ij = me->core_layout.base + me->arch.plt_offset + @@ -479,11 +478,11 @@ int module_finalize(const Elf_Ehdr *hdr,
if (IS_ENABLED(CONFIG_EXPOLINE) && (!strcmp(".nospec_call_table", secname))) - nospec_call_revert(aseg, aseg + s->sh_size); + nospec_revert(aseg, aseg + s->sh_size);
if (IS_ENABLED(CONFIG_EXPOLINE) && (!strcmp(".nospec_return_table", secname))) - nospec_return_revert(aseg, aseg + s->sh_size); + nospec_revert(aseg, aseg + s->sh_size); }
jump_label_apply_nops(me); diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c index 3ce59d044a4d..daa36a38a8b7 100644 --- a/arch/s390/kernel/nospec-branch.c +++ b/arch/s390/kernel/nospec-branch.c @@ -11,10 +11,17 @@ static int __init nobp_setup_early(char *str) rc = kstrtobool(str, &enabled); if (rc) return rc; - if (enabled && test_facility(82)) + if (enabled && test_facility(82)) { + /* + * The user explicitely requested nobp=1, enable it and + * disable the expoline support. + */ __set_facility(82, S390_lowcore.alt_stfle_fac_list); - else + if (IS_ENABLED(CONFIG_EXPOLINE)) + nospec_disable = 1; + } else { __clear_facility(82, S390_lowcore.alt_stfle_fac_list); + } return 0; } early_param("nobp", nobp_setup_early); @@ -28,31 +35,46 @@ early_param("nospec", nospec_setup_early);
#ifdef CONFIG_EXPOLINE
-int nospec_call_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF); -int nospec_return_disable = !IS_ENABLED(CONFIG_EXPOLINE_FULL); +int nospec_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF);
static int __init nospectre_v2_setup_early(char *str) { - nospec_call_disable = 1; - nospec_return_disable = 1; + nospec_disable = 1; return 0; } early_param("nospectre_v2", nospectre_v2_setup_early);
+static int __init spectre_v2_auto_early(void) +{ + if (IS_ENABLED(CC_USING_EXPOLINE)) { + /* + * The kernel has been compiled with expolines. + * Keep expolines enabled and disable nobp. + */ + nospec_disable = 0; + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); + } + /* + * If the kernel has not been compiled with expolines the + * nobp setting decides what is done, this depends on the + * CONFIG_KERNEL_NP option and the nobp/nospec parameters. + */ + return 0; +} +#ifdef CONFIG_EXPOLINE_AUTO +early_initcall(spectre_v2_auto_early); +#endif + static int __init spectre_v2_setup_early(char *str) { if (str && !strncmp(str, "on", 2)) { - nospec_call_disable = 0; - nospec_return_disable = 0; - } - if (str && !strncmp(str, "off", 3)) { - nospec_call_disable = 1; - nospec_return_disable = 1; - } - if (str && !strncmp(str, "auto", 4)) { - nospec_call_disable = 0; - nospec_return_disable = 1; + nospec_disable = 0; + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); } + if (str && !strncmp(str, "off", 3)) + nospec_disable = 1; + if (str && !strncmp(str, "auto", 4)) + spectre_v2_auto_early(); return 0; } early_param("spectre_v2", spectre_v2_setup_early); @@ -105,15 +127,9 @@ static void __init_or_module __nospec_revert(s32 *start, s32 *end) } }
-void __init_or_module nospec_call_revert(s32 *start, s32 *end) -{ - if (nospec_call_disable) - __nospec_revert(start, end); -} - -void __init_or_module nospec_return_revert(s32 *start, s32 *end) +void __init_or_module nospec_revert(s32 *start, s32 *end) { - if (nospec_return_disable) + if (nospec_disable) __nospec_revert(start, end); }
@@ -121,8 +137,8 @@ extern s32 __nospec_call_start[], __nospec_call_end[]; extern s32 __nospec_return_start[], __nospec_return_end[]; void __init nospec_init_branches(void) { - nospec_call_revert(__nospec_call_start, __nospec_call_end); - nospec_return_revert(__nospec_return_start, __nospec_return_end); + nospec_revert(__nospec_call_start, __nospec_call_end); + nospec_revert(__nospec_return_start, __nospec_return_end); }
#endif /* CONFIG_EXPOLINE */
[ Upstream commit bc035599718412cfba9249aa713f90ef13f13ee9 ]
Add a boot message if either of the spectre defenses is active. The message is "Spectre V2 mitigation: execute trampolines." or "Spectre V2 mitigation: limited branch prediction."
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/nospec-branch.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c index daa36a38a8b7..7387ebace891 100644 --- a/arch/s390/kernel/nospec-branch.c +++ b/arch/s390/kernel/nospec-branch.c @@ -33,6 +33,16 @@ static int __init nospec_setup_early(char *str) } early_param("nospec", nospec_setup_early);
+static int __init nospec_report(void) +{ + if (IS_ENABLED(CC_USING_EXPOLINE) && !nospec_disable) + pr_info("Spectre V2 mitigation: execute trampolines.\n"); + if (__test_facility(82, S390_lowcore.alt_stfle_fac_list)) + pr_info("Spectre V2 mitigation: limited branch prediction.\n"); + return 0; +} +arch_initcall(nospec_report); + #ifdef CONFIG_EXPOLINE
int nospec_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF);
[ Upstream commit d424986f1d6b16079b3231db0314923f4f8deed1 ]
Set CONFIG_GENERIC_CPU_VULNERABILITIES and provide the two functions cpu_show_spectre_v1 and cpu_show_spectre_v2 to report the spectre mitigations.
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/Kconfig | 1 + arch/s390/kernel/nospec-branch.c | 19 +++++++++++++++++++ 2 files changed, 20 insertions(+)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index fbb8fca6f9c2..1c4a595e8224 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -118,6 +118,7 @@ config S390 select GENERIC_CLOCKEVENTS select GENERIC_CPU_AUTOPROBE select GENERIC_CPU_DEVICES if !SMP + select GENERIC_CPU_VULNERABILITIES select GENERIC_FIND_FIRST_BIT select GENERIC_SMP_IDLE_THREAD select GENERIC_TIME_VSYSCALL diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c index 7387ebace891..73c06d42792d 100644 --- a/arch/s390/kernel/nospec-branch.c +++ b/arch/s390/kernel/nospec-branch.c @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 #include <linux/module.h> +#include <linux/device.h> #include <asm/facility.h> #include <asm/nospec-branch.h>
@@ -43,6 +44,24 @@ static int __init nospec_report(void) } arch_initcall(nospec_report);
+#ifdef CONFIG_SYSFS +ssize_t cpu_show_spectre_v1(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "Mitigation: __user pointer sanitization\n"); +} + +ssize_t cpu_show_spectre_v2(struct device *dev, + struct device_attribute *attr, char *buf) +{ + if (IS_ENABLED(CC_USING_EXPOLINE) && !nospec_disable) + return sprintf(buf, "Mitigation: execute trampolines\n"); + if (__test_facility(82, S390_lowcore.alt_stfle_fac_list)) + return sprintf(buf, "Mitigation: limited branch prediction.\n"); + return sprintf(buf, "Vulnerable\n"); +} +#endif + #ifdef CONFIG_EXPOLINE
int nospec_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF);
[ Upstream commit 6a3d1e81a434fc311f224b8be77258bafc18ccc6 ]
With CONFIG_EXPOLINE_AUTO=y the call of spectre_v2_auto_early() via early_initcall is done *after* the early_param functions. This overwrites any settings done with the nobp/no_spectre_v2/spectre_v2 parameters. The code patching for the kernel is done after the evaluation of the early parameters but before the early_initcall is done. The end result is a kernel image that is patched correctly but the kernel modules are not.
Make sure that the nospec auto detection function is called before the early parameters are evaluated and before the code patching is done.
Fixes: 6e179d64126b ("s390: add automatic detection of the spectre defense") Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/include/asm/nospec-branch.h | 1 + arch/s390/kernel/nospec-branch.c | 8 ++------ arch/s390/kernel/setup.c | 3 +++ 3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/s390/include/asm/nospec-branch.h b/arch/s390/include/asm/nospec-branch.h index 35bf28fe4c64..b4bd8c41e9d3 100644 --- a/arch/s390/include/asm/nospec-branch.h +++ b/arch/s390/include/asm/nospec-branch.h @@ -9,6 +9,7 @@ extern int nospec_disable;
void nospec_init_branches(void); +void nospec_auto_detect(void); void nospec_revert(s32 *start, s32 *end);
#endif /* __ASSEMBLY__ */ diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c index 73c06d42792d..9f3b5b382743 100644 --- a/arch/s390/kernel/nospec-branch.c +++ b/arch/s390/kernel/nospec-branch.c @@ -73,7 +73,7 @@ static int __init nospectre_v2_setup_early(char *str) } early_param("nospectre_v2", nospectre_v2_setup_early);
-static int __init spectre_v2_auto_early(void) +void __init nospec_auto_detect(void) { if (IS_ENABLED(CC_USING_EXPOLINE)) { /* @@ -88,11 +88,7 @@ static int __init spectre_v2_auto_early(void) * nobp setting decides what is done, this depends on the * CONFIG_KERNEL_NP option and the nobp/nospec parameters. */ - return 0; } -#ifdef CONFIG_EXPOLINE_AUTO -early_initcall(spectre_v2_auto_early); -#endif
static int __init spectre_v2_setup_early(char *str) { @@ -103,7 +99,7 @@ static int __init spectre_v2_setup_early(char *str) if (str && !strncmp(str, "off", 3)) nospec_disable = 1; if (str && !strncmp(str, "auto", 4)) - spectre_v2_auto_early(); + nospec_auto_detect(); return 0; } early_param("spectre_v2", spectre_v2_setup_early); diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index bc25366f0244..feb9d97a9d14 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -876,6 +876,9 @@ void __init setup_arch(char **cmdline_p) init_mm.end_data = (unsigned long) &_edata; init_mm.brk = (unsigned long) &_end;
+ if (IS_ENABLED(CONFIG_EXPOLINE_AUTO)) + nospec_auto_detect(); + parse_early_param(); #ifdef CONFIG_CRASH_DUMP /* Deactivate elfcorehdr= kernel parameter */
[ Upstream commit 6cf09958f32b9667bb3ebadf74367c791112771b ]
The main linker script vmlinux.lds.S for the kernel image merges the expoline code patch tables into two section ".nospec_call_table" and ".nospec_return_table". This is *not* done for the modules, there the sections retain their original names as generated by gcc: ".s390_indirect_call", ".s390_return_mem" and ".s390_return_reg".
The module_finalize code has to check for the compiler generated section names, otherwise no code patching is done. This slows down the module code in case of "spectre_v2=off".
Cc: stable@vger.kernel.org # 4.16 Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches") Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/module.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c index f3f7321bef1a..64ccfdf96b32 100644 --- a/arch/s390/kernel/module.c +++ b/arch/s390/kernel/module.c @@ -477,11 +477,11 @@ int module_finalize(const Elf_Ehdr *hdr, apply_alternatives(aseg, aseg + s->sh_size);
if (IS_ENABLED(CONFIG_EXPOLINE) && - (!strcmp(".nospec_call_table", secname))) + (!strncmp(".s390_indirect", secname, 14))) nospec_revert(aseg, aseg + s->sh_size);
if (IS_ENABLED(CONFIG_EXPOLINE) && - (!strcmp(".nospec_return_table", secname))) + (!strncmp(".s390_return", secname, 12))) nospec_revert(aseg, aseg + s->sh_size); }
On Fri, Apr 27, 2018 at 07:36:38AM +0200, Martin Schwidefsky wrote:
Greetings,
this series is the backport of 19 upstream patches to add the current s390 spectre mitigation to kernel version 4.9.
It follows the x86 approach with array_index_nospec for the v1 spectre attack and retpoline/expoline for v2. As a fallback there is the ppa-12/ppa-13 based defense which requires an micro-code update.
All now applied, thanks!
greg k-h
linux-stable-mirror@lists.linaro.org