Greetings,
this series is the backport of 18 upstream patches to add the current s390 spectre mitigation to kernel version 4.14. One less patch than 4.14 and 4.9 as there is no nested KVM guest support (aka vsie) in 4.4.
It follows the x86 approach with array_index_nospec for the v1 spectre attack and retpoline/expoline for v2. As a fallback there is the ppa-12/ppa-13 based defense which requires an micro-code update.
Christian Borntraeger (2): KVM: s390: wire up bpb feature s390/entry.S: fix spurious zeroing of r0
Eugeniu Rosca (1): s390: Replace IS_ENABLED(EXPOLINE_*) with IS_ENABLED(CONFIG_EXPOLINE_*)
Heiko Carstens (1): s390: enable CPU alternatives unconditionally
Martin Schwidefsky (13): s390: scrub registers on kernel entry and KVM exit s390: add optimized array_index_mask_nospec s390/alternative: use a copy of the facility bit mask s390: add options to change branch prediction behaviour for the kernel s390: run user space and KVM guests with modified branch prediction s390: introduce execute-trampolines for branches s390: do not bypass BPENTER for interrupt system calls s390: move nobp parameter functions to nospec-branch.c s390: add automatic detection of the spectre defense s390: report spectre mitigation via syslog s390: add sysfs attributes for spectre s390: correct nospec auto detection init order s390: correct module section names for expoline code revert
Vasily Gorbik (1): s390: introduce CPU alternatives
Documentation/kernel-parameters.txt | 3 + arch/s390/Kconfig | 47 +++++++ arch/s390/Makefile | 10 ++ arch/s390/include/asm/alternative.h | 149 ++++++++++++++++++++ arch/s390/include/asm/barrier.h | 24 ++++ arch/s390/include/asm/facility.h | 18 +++ arch/s390/include/asm/kvm_host.h | 3 +- arch/s390/include/asm/lowcore.h | 7 +- arch/s390/include/asm/nospec-branch.h | 17 +++ arch/s390/include/asm/processor.h | 4 + arch/s390/include/asm/thread_info.h | 4 + arch/s390/include/uapi/asm/kvm.h | 3 + arch/s390/kernel/Makefile | 5 +- arch/s390/kernel/alternative.c | 112 +++++++++++++++ arch/s390/kernel/early.c | 5 + arch/s390/kernel/entry.S | 250 ++++++++++++++++++++++++++++++---- arch/s390/kernel/ipl.c | 1 + arch/s390/kernel/module.c | 65 ++++++++- arch/s390/kernel/nospec-branch.c | 169 +++++++++++++++++++++++ arch/s390/kernel/processor.c | 18 +++ arch/s390/kernel/setup.c | 14 +- arch/s390/kernel/smp.c | 7 +- arch/s390/kernel/vmlinux.lds.S | 37 +++++ arch/s390/kvm/kvm-s390.c | 13 +- drivers/s390/char/Makefile | 2 + include/uapi/linux/kvm.h | 1 + 26 files changed, 952 insertions(+), 36 deletions(-) create mode 100644 arch/s390/include/asm/alternative.h create mode 100644 arch/s390/include/asm/nospec-branch.h create mode 100644 arch/s390/kernel/alternative.c create mode 100644 arch/s390/kernel/nospec-branch.c
From: Vasily Gorbik gor@linux.vnet.ibm.com
[ Upstream commit 686140a1a9c41d85a4212a1c26d671139b76404b ]
Implement CPU alternatives, which allows to optionally patch newer instructions at runtime, based on CPU facilities availability.
A new kernel boot parameter "noaltinstr" disables patching.
Current implementation is derived from x86 alternatives. Although ideal instructions padding (when altinstr is longer then oldinstr) is added at compile time, and no oldinstr nops optimization has to be done at runtime. Also couple of compile time sanity checks are done: 1. oldinstr and altinstr must be <= 254 bytes long, 2. oldinstr and altinstr must not have an odd length.
alternative(oldinstr, altinstr, facility); alternative_2(oldinstr, altinstr1, facility1, altinstr2, facility2);
Both compile time and runtime padding consists of either 6/4/2 bytes nop or a jump (brcl) + 2 bytes nop filler if padding is longer then 6 bytes.
.altinstructions and .altinstr_replacement sections are part of __init_begin : __init_end region and are freed after initialization.
Signed-off-by: Vasily Gorbik gor@linux.vnet.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- Documentation/kernel-parameters.txt | 3 + arch/s390/Kconfig | 17 ++++ arch/s390/include/asm/alternative.h | 163 ++++++++++++++++++++++++++++++++++++ arch/s390/kernel/Makefile | 1 + arch/s390/kernel/alternative.c | 110 ++++++++++++++++++++++++ arch/s390/kernel/module.c | 17 ++++ arch/s390/kernel/setup.c | 3 + arch/s390/kernel/vmlinux.lds.S | 23 +++++ 8 files changed, 337 insertions(+) create mode 100644 arch/s390/include/asm/alternative.h create mode 100644 arch/s390/kernel/alternative.c
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index f53ef1ac3122..4df6bd7d01ed 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2402,6 +2402,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
noalign [KNL,ARM]
+ noaltinstr [S390] Disables alternative instructions patching + (CPU alternatives feature). + noapic [SMP,APIC] Tells the kernel to not make use of any IOAPICs that may be present in the system.
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 2ee95ece0498..89eb54f9f7d6 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -705,6 +705,22 @@ config SECCOMP
If unsure, say Y.
+config ALTERNATIVES + def_bool y + prompt "Patch optimized instructions for running CPU type" + help + When enabled the kernel code is compiled with additional + alternative instructions blocks optimized for newer CPU types. + These alternative instructions blocks are patched at kernel boot + time when running CPU supports them. This mechanism is used to + optimize some critical code paths (i.e. spinlocks) for newer CPUs + even if kernel is build to support older machine generations. + + This mechanism could be disabled by appending "noaltinstr" + option to the kernel command line. + + If unsure, say Y. + endmenu
menu "Power Management" @@ -754,6 +770,7 @@ config PFAULT config SHARED_KERNEL bool "VM shared kernel support" depends on !JUMP_LABEL + depends on !ALTERNATIVES help Select this option, if you want to share the text segment of the Linux kernel between different VM guests. This reduces memory diff --git a/arch/s390/include/asm/alternative.h b/arch/s390/include/asm/alternative.h new file mode 100644 index 000000000000..6c268f6a51d3 --- /dev/null +++ b/arch/s390/include/asm/alternative.h @@ -0,0 +1,163 @@ +#ifndef _ASM_S390_ALTERNATIVE_H +#define _ASM_S390_ALTERNATIVE_H + +#ifndef __ASSEMBLY__ + +#include <linux/types.h> +#include <linux/stddef.h> +#include <linux/stringify.h> + +struct alt_instr { + s32 instr_offset; /* original instruction */ + s32 repl_offset; /* offset to replacement instruction */ + u16 facility; /* facility bit set for replacement */ + u8 instrlen; /* length of original instruction */ + u8 replacementlen; /* length of new instruction */ +} __packed; + +#ifdef CONFIG_ALTERNATIVES +extern void apply_alternative_instructions(void); +extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end); +#else +static inline void apply_alternative_instructions(void) {}; +static inline void apply_alternatives(struct alt_instr *start, + struct alt_instr *end) {}; +#endif +/* + * |661: |662: |6620 |663: + * +-----------+---------------------+ + * | oldinstr | oldinstr_padding | + * | +----------+----------+ + * | | | | + * | | >6 bytes |6/4/2 nops| + * | |6 bytes jg-----------> + * +-----------+---------------------+ + * ^^ static padding ^^ + * + * .altinstr_replacement section + * +---------------------+-----------+ + * |6641: |6651: + * | alternative instr 1 | + * +-----------+---------+- - - - - -+ + * |6642: |6652: | + * | alternative instr 2 | padding + * +---------------------+- - - - - -+ + * ^ runtime ^ + * + * .altinstructions section + * +---------------------------------+ + * | alt_instr entries for each | + * | alternative instr | + * +---------------------------------+ + */ + +#define b_altinstr(num) "664"#num +#define e_altinstr(num) "665"#num + +#define e_oldinstr_pad_end "663" +#define oldinstr_len "662b-661b" +#define oldinstr_total_len e_oldinstr_pad_end"b-661b" +#define altinstr_len(num) e_altinstr(num)"b-"b_altinstr(num)"b" +#define oldinstr_pad_len(num) \ + "-(((" altinstr_len(num) ")-(" oldinstr_len ")) > 0) * " \ + "((" altinstr_len(num) ")-(" oldinstr_len "))" + +#define INSTR_LEN_SANITY_CHECK(len) \ + ".if " len " > 254\n" \ + "\t.error "cpu alternatives does not support instructions " \ + "blocks > 254 bytes"\n" \ + ".endif\n" \ + ".if (" len ") %% 2\n" \ + "\t.error "cpu alternatives instructions length is odd"\n" \ + ".endif\n" + +#define OLDINSTR_PADDING(oldinstr, num) \ + ".if " oldinstr_pad_len(num) " > 6\n" \ + "\tjg " e_oldinstr_pad_end "f\n" \ + "6620:\n" \ + "\t.fill (" oldinstr_pad_len(num) " - (6620b-662b)) / 2, 2, 0x0700\n" \ + ".else\n" \ + "\t.fill " oldinstr_pad_len(num) " / 6, 6, 0xc0040000\n" \ + "\t.fill " oldinstr_pad_len(num) " %% 6 / 4, 4, 0x47000000\n" \ + "\t.fill " oldinstr_pad_len(num) " %% 6 %% 4 / 2, 2, 0x0700\n" \ + ".endif\n" + +#define OLDINSTR(oldinstr, num) \ + "661:\n\t" oldinstr "\n662:\n" \ + OLDINSTR_PADDING(oldinstr, num) \ + e_oldinstr_pad_end ":\n" \ + INSTR_LEN_SANITY_CHECK(oldinstr_len) + +#define OLDINSTR_2(oldinstr, num1, num2) \ + "661:\n\t" oldinstr "\n662:\n" \ + ".if " altinstr_len(num1) " < " altinstr_len(num2) "\n" \ + OLDINSTR_PADDING(oldinstr, num2) \ + ".else\n" \ + OLDINSTR_PADDING(oldinstr, num1) \ + ".endif\n" \ + e_oldinstr_pad_end ":\n" \ + INSTR_LEN_SANITY_CHECK(oldinstr_len) + +#define ALTINSTR_ENTRY(facility, num) \ + "\t.long 661b - .\n" /* old instruction */ \ + "\t.long " b_altinstr(num)"b - .\n" /* alt instruction */ \ + "\t.word " __stringify(facility) "\n" /* facility bit */ \ + "\t.byte " oldinstr_total_len "\n" /* source len */ \ + "\t.byte " altinstr_len(num) "\n" /* alt instruction len */ + +#define ALTINSTR_REPLACEMENT(altinstr, num) /* replacement */ \ + b_altinstr(num)":\n\t" altinstr "\n" e_altinstr(num) ":\n" \ + INSTR_LEN_SANITY_CHECK(altinstr_len(num)) + +#ifdef CONFIG_ALTERNATIVES +/* alternative assembly primitive: */ +#define ALTERNATIVE(oldinstr, altinstr, facility) \ + ".pushsection .altinstr_replacement, "ax"\n" \ + ALTINSTR_REPLACEMENT(altinstr, 1) \ + ".popsection\n" \ + OLDINSTR(oldinstr, 1) \ + ".pushsection .altinstructions,"a"\n" \ + ALTINSTR_ENTRY(facility, 1) \ + ".popsection\n" + +#define ALTERNATIVE_2(oldinstr, altinstr1, facility1, altinstr2, facility2)\ + ".pushsection .altinstr_replacement, "ax"\n" \ + ALTINSTR_REPLACEMENT(altinstr1, 1) \ + ALTINSTR_REPLACEMENT(altinstr2, 2) \ + ".popsection\n" \ + OLDINSTR_2(oldinstr, 1, 2) \ + ".pushsection .altinstructions,"a"\n" \ + ALTINSTR_ENTRY(facility1, 1) \ + ALTINSTR_ENTRY(facility2, 2) \ + ".popsection\n" +#else +/* Alternative instructions are disabled, let's put just oldinstr in */ +#define ALTERNATIVE(oldinstr, altinstr, facility) \ + oldinstr "\n" + +#define ALTERNATIVE_2(oldinstr, altinstr1, facility1, altinstr2, facility2) \ + oldinstr "\n" +#endif + +/* + * Alternative instructions for different CPU types or capabilities. + * + * This allows to use optimized instructions even on generic binary + * kernels. + * + * oldinstr is padded with jump and nops at compile time if altinstr is + * longer. altinstr is padded with jump and nops at run-time during patching. + * + * For non barrier like inlines please define new variants + * without volatile and memory clobber. + */ +#define alternative(oldinstr, altinstr, facility) \ + asm volatile(ALTERNATIVE(oldinstr, altinstr, facility) : : : "memory") + +#define alternative_2(oldinstr, altinstr1, facility1, altinstr2, facility2) \ + asm volatile(ALTERNATIVE_2(oldinstr, altinstr1, facility1, \ + altinstr2, facility2) ::: "memory") + +#endif /* __ASSEMBLY__ */ + +#endif /* _ASM_S390_ALTERNATIVE_H */ diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile index dc167a23b920..c1a0075e1384 100644 --- a/arch/s390/kernel/Makefile +++ b/arch/s390/kernel/Makefile @@ -62,6 +62,7 @@ obj-$(CONFIG_KPROBES) += kprobes.o obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o obj-$(CONFIG_CRASH_DUMP) += crash_dump.o obj-$(CONFIG_UPROBES) += uprobes.o +obj-$(CONFIG_ALTERNATIVES) += alternative.o
obj-$(CONFIG_PERF_EVENTS) += perf_event.o perf_cpum_cf.o perf_cpum_sf.o obj-$(CONFIG_PERF_EVENTS) += perf_cpum_cf_events.o diff --git a/arch/s390/kernel/alternative.c b/arch/s390/kernel/alternative.c new file mode 100644 index 000000000000..315986a06cf5 --- /dev/null +++ b/arch/s390/kernel/alternative.c @@ -0,0 +1,110 @@ +#include <linux/module.h> +#include <asm/alternative.h> +#include <asm/facility.h> + +#define MAX_PATCH_LEN (255 - 1) + +static int __initdata_or_module alt_instr_disabled; + +static int __init disable_alternative_instructions(char *str) +{ + alt_instr_disabled = 1; + return 0; +} + +early_param("noaltinstr", disable_alternative_instructions); + +struct brcl_insn { + u16 opc; + s32 disp; +} __packed; + +static u16 __initdata_or_module nop16 = 0x0700; +static u32 __initdata_or_module nop32 = 0x47000000; +static struct brcl_insn __initdata_or_module nop48 = { + 0xc004, 0 +}; + +static const void *nops[] __initdata_or_module = { + &nop16, + &nop32, + &nop48 +}; + +static void __init_or_module add_jump_padding(void *insns, unsigned int len) +{ + struct brcl_insn brcl = { + 0xc0f4, + len / 2 + }; + + memcpy(insns, &brcl, sizeof(brcl)); + insns += sizeof(brcl); + len -= sizeof(brcl); + + while (len > 0) { + memcpy(insns, &nop16, 2); + insns += 2; + len -= 2; + } +} + +static void __init_or_module add_padding(void *insns, unsigned int len) +{ + if (len > 6) + add_jump_padding(insns, len); + else if (len >= 2) + memcpy(insns, nops[len / 2 - 1], len); +} + +static void __init_or_module __apply_alternatives(struct alt_instr *start, + struct alt_instr *end) +{ + struct alt_instr *a; + u8 *instr, *replacement; + u8 insnbuf[MAX_PATCH_LEN]; + + /* + * The scan order should be from start to end. A later scanned + * alternative code can overwrite previously scanned alternative code. + */ + for (a = start; a < end; a++) { + int insnbuf_sz = 0; + + instr = (u8 *)&a->instr_offset + a->instr_offset; + replacement = (u8 *)&a->repl_offset + a->repl_offset; + + if (!test_facility(a->facility)) + continue; + + if (unlikely(a->instrlen % 2 || a->replacementlen % 2)) { + WARN_ONCE(1, "cpu alternatives instructions length is " + "odd, skipping patching\n"); + continue; + } + + memcpy(insnbuf, replacement, a->replacementlen); + insnbuf_sz = a->replacementlen; + + if (a->instrlen > a->replacementlen) { + add_padding(insnbuf + a->replacementlen, + a->instrlen - a->replacementlen); + insnbuf_sz += a->instrlen - a->replacementlen; + } + + s390_kernel_write(instr, insnbuf, insnbuf_sz); + } +} + +void __init_or_module apply_alternatives(struct alt_instr *start, + struct alt_instr *end) +{ + if (!alt_instr_disabled) + __apply_alternatives(start, end); +} + +extern struct alt_instr __alt_instructions[], __alt_instructions_end[]; +void __init apply_alternative_instructions(void) +{ + apply_alternatives(__alt_instructions, __alt_instructions_end); +} diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c index 0c1a679314dd..85c1ba09eae0 100644 --- a/arch/s390/kernel/module.c +++ b/arch/s390/kernel/module.c @@ -31,6 +31,7 @@ #include <linux/kernel.h> #include <linux/moduleloader.h> #include <linux/bug.h> +#include <asm/alternative.h>
#if 0 #define DEBUGP printk @@ -424,6 +425,22 @@ int module_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs, struct module *me) { + const Elf_Shdr *s; + char *secstrings; + + if (IS_ENABLED(CONFIG_ALTERNATIVES)) { + secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset; + for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) { + if (!strcmp(".altinstructions", + secstrings + s->sh_name)) { + /* patch .altinstructions */ + void *aseg = (void *)s->sh_addr; + + apply_alternatives(aseg, aseg + s->sh_size); + } + } + } + jump_label_apply_nops(me); vfree(me->arch.syminfo); me->arch.syminfo = NULL; diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index d097d71685df..136caf3128f0 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -63,6 +63,7 @@ #include <asm/sclp.h> #include <asm/sysinfo.h> #include <asm/numa.h> +#include <asm/alternative.h> #include "entry.h"
/* @@ -893,6 +894,8 @@ void __init setup_arch(char **cmdline_p) conmode_default(); set_preferred_console();
+ apply_alternative_instructions(); + /* Setup zfcpdump support */ setup_zfcpdump();
diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S index 6c553f6e791a..6cff68a84bb3 100644 --- a/arch/s390/kernel/vmlinux.lds.S +++ b/arch/s390/kernel/vmlinux.lds.S @@ -78,6 +78,29 @@ SECTIONS EXIT_DATA }
+ /* + * struct alt_inst entries. From the header (alternative.h): + * "Alternative instructions for different CPU types or capabilities" + * Think locking instructions on spinlocks. + * Note, that it is a part of __init region. + */ + . = ALIGN(8); + .altinstructions : { + __alt_instructions = .; + *(.altinstructions) + __alt_instructions_end = .; + } + + /* + * And here are the replacement instructions. The linker sticks + * them as binary blobs. The .altinstructions has enough data to + * get the address and the length of them to patch the kernel safely. + * Note, that it is a part of __init region. + */ + .altinstr_replacement : { + *(.altinstr_replacement) + } + /* early.c uses stsi, which requires page aligned data. */ . = ALIGN(PAGE_SIZE); INIT_DATA_SECTION(0x100)
From: Heiko Carstens heiko.carstens@de.ibm.com
[ Upstream commit 049a2c2d486e8cc82c5cd79fa479c5b105b109e9 ]
Remove the CPU_ALTERNATIVES config option and enable the code unconditionally. The config option was only added to avoid a conflict with the named saved segment support. Since that code is gone there is no reason to keep the CPU_ALTERNATIVES config option.
Just enable it unconditionally to also reduce the number of config options and make it less likely that something breaks.
Signed-off-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/Kconfig | 16 ---------------- arch/s390/include/asm/alternative.h | 20 +++----------------- arch/s390/kernel/Makefile | 3 +-- arch/s390/kernel/module.c | 15 ++++++--------- 4 files changed, 10 insertions(+), 44 deletions(-)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 89eb54f9f7d6..04f0d2e89186 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -705,22 +705,6 @@ config SECCOMP
If unsure, say Y.
-config ALTERNATIVES - def_bool y - prompt "Patch optimized instructions for running CPU type" - help - When enabled the kernel code is compiled with additional - alternative instructions blocks optimized for newer CPU types. - These alternative instructions blocks are patched at kernel boot - time when running CPU supports them. This mechanism is used to - optimize some critical code paths (i.e. spinlocks) for newer CPUs - even if kernel is build to support older machine generations. - - This mechanism could be disabled by appending "noaltinstr" - option to the kernel command line. - - If unsure, say Y. - endmenu
menu "Power Management" diff --git a/arch/s390/include/asm/alternative.h b/arch/s390/include/asm/alternative.h index 6c268f6a51d3..a72002056b54 100644 --- a/arch/s390/include/asm/alternative.h +++ b/arch/s390/include/asm/alternative.h @@ -15,14 +15,9 @@ struct alt_instr { u8 replacementlen; /* length of new instruction */ } __packed;
-#ifdef CONFIG_ALTERNATIVES -extern void apply_alternative_instructions(void); -extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end); -#else -static inline void apply_alternative_instructions(void) {}; -static inline void apply_alternatives(struct alt_instr *start, - struct alt_instr *end) {}; -#endif +void apply_alternative_instructions(void); +void apply_alternatives(struct alt_instr *start, struct alt_instr *end); + /* * |661: |662: |6620 |663: * +-----------+---------------------+ @@ -109,7 +104,6 @@ static inline void apply_alternatives(struct alt_instr *start, b_altinstr(num)":\n\t" altinstr "\n" e_altinstr(num) ":\n" \ INSTR_LEN_SANITY_CHECK(altinstr_len(num))
-#ifdef CONFIG_ALTERNATIVES /* alternative assembly primitive: */ #define ALTERNATIVE(oldinstr, altinstr, facility) \ ".pushsection .altinstr_replacement, "ax"\n" \ @@ -130,14 +124,6 @@ static inline void apply_alternatives(struct alt_instr *start, ALTINSTR_ENTRY(facility1, 1) \ ALTINSTR_ENTRY(facility2, 2) \ ".popsection\n" -#else -/* Alternative instructions are disabled, let's put just oldinstr in */ -#define ALTERNATIVE(oldinstr, altinstr, facility) \ - oldinstr "\n" - -#define ALTERNATIVE_2(oldinstr, altinstr1, facility1, altinstr2, facility2) \ - oldinstr "\n" -#endif
/* * Alternative instructions for different CPU types or capabilities. diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile index c1a0075e1384..a4c95c847905 100644 --- a/arch/s390/kernel/Makefile +++ b/arch/s390/kernel/Makefile @@ -44,7 +44,7 @@ obj-y += processor.o sys_s390.o ptrace.o signal.o cpcmd.o ebcdic.o nmi.o obj-y += debug.o irq.o ipl.o dis.o diag.o sclp.o vdso.o obj-y += sysinfo.o jump_label.o lgr.o os_info.o machine_kexec.o pgm_check.o obj-y += runtime_instr.o cache.o dumpstack.o -obj-y += entry.o reipl.o relocate_kernel.o +obj-y += entry.o reipl.o relocate_kernel.o alternative.o
extra-y += head.o head64.o vmlinux.lds
@@ -62,7 +62,6 @@ obj-$(CONFIG_KPROBES) += kprobes.o obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o obj-$(CONFIG_CRASH_DUMP) += crash_dump.o obj-$(CONFIG_UPROBES) += uprobes.o -obj-$(CONFIG_ALTERNATIVES) += alternative.o
obj-$(CONFIG_PERF_EVENTS) += perf_event.o perf_cpum_cf.o perf_cpum_sf.o obj-$(CONFIG_PERF_EVENTS) += perf_cpum_cf_events.o diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c index 85c1ba09eae0..06c7e82342cd 100644 --- a/arch/s390/kernel/module.c +++ b/arch/s390/kernel/module.c @@ -428,16 +428,13 @@ int module_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *s; char *secstrings;
- if (IS_ENABLED(CONFIG_ALTERNATIVES)) { - secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset; - for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) { - if (!strcmp(".altinstructions", - secstrings + s->sh_name)) { - /* patch .altinstructions */ - void *aseg = (void *)s->sh_addr; + secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset; + for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) { + if (!strcmp(".altinstructions", secstrings + s->sh_name)) { + /* patch .altinstructions */ + void *aseg = (void *)s->sh_addr;
- apply_alternatives(aseg, aseg + s->sh_size); - } + apply_alternatives(aseg, aseg + s->sh_size); } }
From: Christian Borntraeger borntraeger@de.ibm.com
[ Upstream commit 35b3fde6203b932b2b1a5b53b3d8808abc9c4f60 ]
The new firmware interfaces for branch prediction behaviour changes are transparently available for the guest. Nevertheless, there is new state attached that should be migrated and properly resetted. Provide a mechanism for handling reset, migration and VSIE.
Signed-off-by: Christian Borntraeger borntraeger@de.ibm.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Cornelia Huck cohuck@redhat.com [Changed capability number to 152. - Radim] Signed-off-by: Radim Krčmář rkrcmar@redhat.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/include/asm/kvm_host.h | 3 ++- arch/s390/include/uapi/asm/kvm.h | 3 +++ arch/s390/kvm/kvm-s390.c | 13 ++++++++++++- include/uapi/linux/kvm.h | 1 + 4 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h index e9a983f40a24..7d9c5917da2b 100644 --- a/arch/s390/include/asm/kvm_host.h +++ b/arch/s390/include/asm/kvm_host.h @@ -136,7 +136,8 @@ struct kvm_s390_sie_block { __u16 ipa; /* 0x0056 */ __u32 ipb; /* 0x0058 */ __u32 scaoh; /* 0x005c */ - __u8 reserved60; /* 0x0060 */ +#define FPF_BPBC 0x20 + __u8 fpf; /* 0x0060 */ __u8 ecb; /* 0x0061 */ __u8 ecb2; /* 0x0062 */ #define ECB3_AES 0x04 diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h index ef1a5fcc6c66..3aec52253c04 100644 --- a/arch/s390/include/uapi/asm/kvm.h +++ b/arch/s390/include/uapi/asm/kvm.h @@ -151,6 +151,7 @@ struct kvm_guest_debug_arch { #define KVM_SYNC_ARCH0 (1UL << 4) #define KVM_SYNC_PFAULT (1UL << 5) #define KVM_SYNC_VRS (1UL << 6) +#define KVM_SYNC_BPBC (1UL << 10) /* definition of registers in kvm_run */ struct kvm_sync_regs { __u64 prefix; /* prefix register */ @@ -168,6 +169,8 @@ struct kvm_sync_regs { __u64 vrs[32][2]; /* vector registers */ __u8 reserved[512]; /* for future vector expansion */ __u32 fpc; /* only valid with vector registers */ + __u8 bpbc : 1; /* bp mode */ + __u8 reserved2 : 7; };
#define KVM_REG_S390_TODPR (KVM_REG_S390 | KVM_REG_SIZE_U32 | 0x1) diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 23e3f5d77a24..3166516639d6 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -257,6 +257,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_S390_VECTOR_REGISTERS: r = MACHINE_HAS_VX; break; + case KVM_CAP_S390_BPB: + r = test_facility(82); + break; default: r = 0; } @@ -1262,6 +1265,8 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) KVM_SYNC_CRS | KVM_SYNC_ARCH0 | KVM_SYNC_PFAULT; + if (test_kvm_facility(vcpu->kvm, 82)) + vcpu->run->kvm_valid_regs |= KVM_SYNC_BPBC; if (test_kvm_facility(vcpu->kvm, 129)) vcpu->run->kvm_valid_regs |= KVM_SYNC_VRS;
@@ -1286,7 +1291,6 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) if (test_fp_ctl(current->thread.fpu.fpc)) /* User space provided an invalid FPC, let's clear it */ current->thread.fpu.fpc = 0; - save_access_regs(vcpu->arch.host_acrs); restore_access_regs(vcpu->run->s.regs.acrs); gmap_enable(vcpu->arch.gmap); @@ -1327,6 +1331,7 @@ static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu) current->thread.fpu.fpc = 0; vcpu->arch.sie_block->gbea = 1; vcpu->arch.sie_block->pp = 0; + vcpu->arch.sie_block->fpf &= ~FPF_BPBC; vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID; kvm_clear_async_pf_completion_queue(vcpu); if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm)) @@ -2162,6 +2167,7 @@ static void store_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) kvm_run->s.regs.pft = vcpu->arch.pfault_token; kvm_run->s.regs.pfs = vcpu->arch.pfault_select; kvm_run->s.regs.pfc = vcpu->arch.pfault_compare; + kvm_run->s.regs.bpbc = (vcpu->arch.sie_block->fpf & FPF_BPBC) == FPF_BPBC; }
int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) @@ -2194,6 +2200,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) kvm_run->exit_reason = KVM_EXIT_INTR; rc = -EINTR; } + if ((kvm_run->kvm_dirty_regs & KVM_SYNC_BPBC) && + test_kvm_facility(vcpu->kvm, 82)) { + vcpu->arch.sie_block->fpf &= ~FPF_BPBC; + vcpu->arch.sie_block->fpf |= kvm_run->s.regs.bpbc ? FPF_BPBC : 0; + }
if (guestdbg_exit_pending(vcpu) && !rc) { kvm_s390_prepare_debug_exit(vcpu); diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 03f3618612aa..376d0ab5b9f2 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -831,6 +831,7 @@ struct kvm_ppc_smmu_info { #define KVM_CAP_GUEST_DEBUG_HW_WPS 120 #define KVM_CAP_SPLIT_IRQCHIP 121 #define KVM_CAP_IOEVENTFD_ANY_LENGTH 122 +#define KVM_CAP_S390_BPB 152
#ifdef KVM_CAP_IRQ_ROUTING
sorry this mail should have only gone to me...no idea why suppress-cc=all did not work.
On 04/27/2018 06:13 PM, Christian Borntraeger wrote:
From: Martin Schwidefsky schwidefsky@de.ibm.com
From: Christian Borntraeger borntraeger@de.ibm.com
[ Upstream commit 35b3fde6203b932b2b1a5b53b3d8808abc9c4f60 ]
The new firmware interfaces for branch prediction behaviour changes are transparently available for the guest. Nevertheless, there is new state attached that should be migrated and properly resetted. Provide a mechanism for handling reset, migration and VSIE.
Signed-off-by: Christian Borntraeger borntraeger@de.ibm.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Cornelia Huck cohuck@redhat.com [Changed capability number to 152. - Radim] Signed-off-by: Radim Krčmář rkrcmar@redhat.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
arch/s390/include/asm/kvm_host.h | 3 ++- arch/s390/include/uapi/asm/kvm.h | 3 +++ arch/s390/kvm/kvm-s390.c | 13 ++++++++++++- include/uapi/linux/kvm.h | 1 + 4 files changed, 18 insertions(+), 2 deletions(-)
Index: linux/arch/s390/include/asm/kvm_host.h
--- linux.orig/arch/s390/include/asm/kvm_host.h +++ linux/arch/s390/include/asm/kvm_host.h @@ -136,7 +136,8 @@ struct kvm_s390_sie_block { __u16 ipa; /* 0x0056 */ __u32 ipb; /* 0x0058 */ __u32 scaoh; /* 0x005c */
- __u8 reserved60; /* 0x0060 */
+#define FPF_BPBC 0x20
- __u8 fpf; /* 0x0060 */ __u8 ecb; /* 0x0061 */ __u8 ecb2; /* 0x0062 */
#define ECB3_AES 0x04 Index: linux/arch/s390/include/uapi/asm/kvm.h =================================================================== --- linux.orig/arch/s390/include/uapi/asm/kvm.h +++ linux/arch/s390/include/uapi/asm/kvm.h @@ -151,6 +151,7 @@ struct kvm_guest_debug_arch { #define KVM_SYNC_ARCH0 (1UL << 4) #define KVM_SYNC_PFAULT (1UL << 5) #define KVM_SYNC_VRS (1UL << 6) +#define KVM_SYNC_BPBC (1UL << 10) /* definition of registers in kvm_run */ struct kvm_sync_regs { __u64 prefix; /* prefix register */ @@ -168,6 +169,8 @@ struct kvm_sync_regs { __u64 vrs[32][2]; /* vector registers */ __u8 reserved[512]; /* for future vector expansion */ __u32 fpc; /* only valid with vector registers */
- __u8 bpbc : 1; /* bp mode */
- __u8 reserved2 : 7;
}; #define KVM_REG_S390_TODPR (KVM_REG_S390 | KVM_REG_SIZE_U32 | 0x1) Index: linux/arch/s390/kvm/kvm-s390.c =================================================================== --- linux.orig/arch/s390/kvm/kvm-s390.c +++ linux/arch/s390/kvm/kvm-s390.c @@ -257,6 +257,9 @@ int kvm_vm_ioctl_check_extension(struct case KVM_CAP_S390_VECTOR_REGISTERS: r = MACHINE_HAS_VX; break;
- case KVM_CAP_S390_BPB:
r = test_facility(82);
default: r = 0; }break;
@@ -1264,6 +1267,8 @@ int kvm_arch_vcpu_init(struct kvm_vcpu * KVM_SYNC_PFAULT; if (test_kvm_facility(vcpu->kvm, 129)) vcpu->run->kvm_valid_regs |= KVM_SYNC_VRS;
- if (test_kvm_facility(vcpu->kvm, 82))
vcpu->run->kvm_valid_regs |= KVM_SYNC_BPBC;
if (kvm_is_ucontrol(vcpu->kvm)) return __kvm_ucontrol_vcpu_init(vcpu); @@ -1327,6 +1332,7 @@ static void kvm_s390_vcpu_initial_reset( current->thread.fpu.fpc = 0; vcpu->arch.sie_block->gbea = 1; vcpu->arch.sie_block->pp = 0;
- vcpu->arch.sie_block->fpf &= ~FPF_BPBC; vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID; kvm_clear_async_pf_completion_queue(vcpu); if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm))
@@ -2145,6 +2151,11 @@ static void sync_regs(struct kvm_vcpu *v if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID) kvm_clear_async_pf_completion_queue(vcpu); }
- if ((kvm_run->kvm_dirty_regs & KVM_SYNC_BPBC) &&
test_kvm_facility(vcpu->kvm, 82)) {
vcpu->arch.sie_block->fpf &= ~FPF_BPBC;
vcpu->arch.sie_block->fpf |= kvm_run->s.regs.bpbc ? FPF_BPBC : 0;
- } kvm_run->kvm_dirty_regs = 0;
} @@ -2162,6 +2173,7 @@ static void store_regs(struct kvm_vcpu * kvm_run->s.regs.pft = vcpu->arch.pfault_token; kvm_run->s.regs.pfs = vcpu->arch.pfault_select; kvm_run->s.regs.pfc = vcpu->arch.pfault_compare;
- kvm_run->s.regs.bpbc = (vcpu->arch.sie_block->fpf & FPF_BPBC) == FPF_BPBC;
} int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) Index: linux/include/uapi/linux/kvm.h =================================================================== --- linux.orig/include/uapi/linux/kvm.h +++ linux/include/uapi/linux/kvm.h @@ -831,6 +831,7 @@ struct kvm_ppc_smmu_info { #define KVM_CAP_GUEST_DEBUG_HW_WPS 120 #define KVM_CAP_SPLIT_IRQCHIP 121 #define KVM_CAP_IOEVENTFD_ANY_LENGTH 122 +#define KVM_CAP_S390_BPB 152 #ifdef KVM_CAP_IRQ_ROUTING
[ Upstream commit 7041d28115e91f2144f811ffe8a195c696b1e1d0 ]
Clear all user space registers on entry to the kernel and all KVM guest registers on KVM guest exit if the register does not contain either a parameter or a result value.
Reviewed-by: Christian Borntraeger borntraeger@de.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/entry.S | 47 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 47 insertions(+)
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 4612ed7ec2e5..245de839380f 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -244,6 +244,12 @@ ENTRY(sie64a) sie_exit: lg %r14,__SF_EMPTY+8(%r15) # load guest register save area stmg %r0,%r13,0(%r14) # save guest gprs 0-13 + xgr %r0,%r0 # clear guest registers to + xgr %r1,%r1 # prevent speculative use + xgr %r2,%r2 + xgr %r3,%r3 + xgr %r4,%r4 + xgr %r5,%r5 lmg %r6,%r14,__SF_GPRS(%r15) # restore kernel registers lg %r2,__SF_EMPTY+16(%r15) # return exit reason code br %r14 @@ -277,6 +283,8 @@ ENTRY(system_call) .Lsysc_vtime: UPDATE_VTIME %r10,%r13,__LC_SYNC_ENTER_TIMER stmg %r0,%r7,__PT_R0(%r11) + # clear user controlled register to prevent speculative use + xgr %r0,%r0 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC @@ -504,6 +512,15 @@ ENTRY(pgm_check_handler) mvc __THREAD_trap_tdb(256,%r14),0(%r13) 3: la %r11,STACK_FRAME_OVERHEAD(%r15) stmg %r0,%r7,__PT_R0(%r11) + # clear user controlled registers to prevent speculative use + xgr %r0,%r0 + xgr %r1,%r1 + xgr %r2,%r2 + xgr %r3,%r3 + xgr %r4,%r4 + xgr %r5,%r5 + xgr %r6,%r6 + xgr %r7,%r7 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC stmg %r8,%r9,__PT_PSW(%r11) mvc __PT_INT_CODE(4,%r11),__LC_PGM_ILC @@ -567,6 +584,16 @@ ENTRY(io_int_handler) lmg %r8,%r9,__LC_IO_OLD_PSW SWITCH_ASYNC __LC_SAVE_AREA_ASYNC,__LC_ASYNC_ENTER_TIMER stmg %r0,%r7,__PT_R0(%r11) + # clear user controlled registers to prevent speculative use + xgr %r0,%r0 + xgr %r1,%r1 + xgr %r2,%r2 + xgr %r3,%r3 + xgr %r4,%r4 + xgr %r5,%r5 + xgr %r6,%r6 + xgr %r7,%r7 + xgr %r10,%r10 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_ASYNC stmg %r8,%r9,__PT_PSW(%r11) mvc __PT_INT_CODE(12,%r11),__LC_SUBCHANNEL_ID @@ -742,6 +769,16 @@ ENTRY(ext_int_handler) lmg %r8,%r9,__LC_EXT_OLD_PSW SWITCH_ASYNC __LC_SAVE_AREA_ASYNC,__LC_ASYNC_ENTER_TIMER stmg %r0,%r7,__PT_R0(%r11) + # clear user controlled registers to prevent speculative use + xgr %r0,%r0 + xgr %r1,%r1 + xgr %r2,%r2 + xgr %r3,%r3 + xgr %r4,%r4 + xgr %r5,%r5 + xgr %r6,%r6 + xgr %r7,%r7 + xgr %r10,%r10 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_ASYNC stmg %r8,%r9,__PT_PSW(%r11) lghi %r1,__LC_EXT_PARAMS2 @@ -908,6 +945,16 @@ ENTRY(mcck_int_handler) .Lmcck_skip: lghi %r14,__LC_GPREGS_SAVE_AREA+64 stmg %r0,%r7,__PT_R0(%r11) + # clear user controlled registers to prevent speculative use + xgr %r0,%r0 + xgr %r1,%r1 + xgr %r2,%r2 + xgr %r3,%r3 + xgr %r4,%r4 + xgr %r5,%r5 + xgr %r6,%r6 + xgr %r7,%r7 + xgr %r10,%r10 mvc __PT_R8(64,%r11),0(%r14) stmg %r8,%r9,__PT_PSW(%r11) xc __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
[ Upstream commit e2dd833389cc4069a96b57bdd24227b5f52288f5 ]
Add an optimized version of the array_index_mask_nospec function for s390 based on a compare and a subtract with borrow.
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/include/asm/barrier.h | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h index d68e11e0df5e..e903b28e7358 100644 --- a/arch/s390/include/asm/barrier.h +++ b/arch/s390/include/asm/barrier.h @@ -53,4 +53,28 @@ do { \ ___p1; \ })
+/** + * array_index_mask_nospec - generate a mask for array_idx() that is + * ~0UL when the bounds check succeeds and 0 otherwise + * @index: array element index + * @size: number of elements in array + */ +#define array_index_mask_nospec array_index_mask_nospec +static inline unsigned long array_index_mask_nospec(unsigned long index, + unsigned long size) +{ + unsigned long mask; + + if (__builtin_constant_p(size) && size > 0) { + asm(" clgr %2,%1\n" + " slbgr %0,%0\n" + :"=d" (mask) : "d" (size-1), "d" (index) :"cc"); + return mask; + } + asm(" clgr %1,%2\n" + " slbgr %0,%0\n" + :"=d" (mask) : "d" (size), "d" (index) :"cc"); + return ~mask; +} + #endif /* __ASM_BARRIER_H */
[ Upstream commit cf1489984641369611556bf00c48f945c77bcf02 ]
To be able to switch off specific CPU alternatives with kernel parameters make a copy of the facility bit mask provided by STFLE and use the copy for the decision to apply an alternative.
Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Cornelia Huck cohuck@redhat.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/include/asm/facility.h | 18 ++++++++++++++++++ arch/s390/include/asm/lowcore.h | 3 ++- arch/s390/kernel/alternative.c | 3 ++- arch/s390/kernel/early.c | 3 +++ arch/s390/kernel/setup.c | 4 +++- arch/s390/kernel/smp.c | 4 +++- 6 files changed, 31 insertions(+), 4 deletions(-)
diff --git a/arch/s390/include/asm/facility.h b/arch/s390/include/asm/facility.h index 0aa6a7ed95a3..155fcc7bcba6 100644 --- a/arch/s390/include/asm/facility.h +++ b/arch/s390/include/asm/facility.h @@ -13,6 +13,24 @@
#define MAX_FACILITY_BIT (256*8) /* stfle_fac_list has 256 bytes */
+static inline void __set_facility(unsigned long nr, void *facilities) +{ + unsigned char *ptr = (unsigned char *) facilities; + + if (nr >= MAX_FACILITY_BIT) + return; + ptr[nr >> 3] |= 0x80 >> (nr & 7); +} + +static inline void __clear_facility(unsigned long nr, void *facilities) +{ + unsigned char *ptr = (unsigned char *) facilities; + + if (nr >= MAX_FACILITY_BIT) + return; + ptr[nr >> 3] &= ~(0x80 >> (nr & 7)); +} + static inline int __test_facility(unsigned long nr, void *facilities) { unsigned char *ptr; diff --git a/arch/s390/include/asm/lowcore.h b/arch/s390/include/asm/lowcore.h index afe1cfebf1a4..2f8809671151 100644 --- a/arch/s390/include/asm/lowcore.h +++ b/arch/s390/include/asm/lowcore.h @@ -170,7 +170,8 @@ struct _lowcore { __u8 pad_0x0e20[0x0f00-0x0e20]; /* 0x0e20 */
/* Extended facility list */ - __u64 stfle_fac_list[32]; /* 0x0f00 */ + __u64 stfle_fac_list[16]; /* 0x0f00 */ + __u64 alt_stfle_fac_list[16]; /* 0x0f80 */ __u8 pad_0x1000[0x11b0-0x1000]; /* 0x1000 */
/* Pointer to vector register save area */ diff --git a/arch/s390/kernel/alternative.c b/arch/s390/kernel/alternative.c index 315986a06cf5..f5060af61176 100644 --- a/arch/s390/kernel/alternative.c +++ b/arch/s390/kernel/alternative.c @@ -74,7 +74,8 @@ static void __init_or_module __apply_alternatives(struct alt_instr *start, instr = (u8 *)&a->instr_offset + a->instr_offset; replacement = (u8 *)&a->repl_offset + a->repl_offset;
- if (!test_facility(a->facility)) + if (!__test_facility(a->facility, + S390_lowcore.alt_stfle_fac_list)) continue;
if (unlikely(a->instrlen % 2 || a->replacementlen % 2)) { diff --git a/arch/s390/kernel/early.c b/arch/s390/kernel/early.c index ee7b8e7ca4f8..88d0923fe244 100644 --- a/arch/s390/kernel/early.c +++ b/arch/s390/kernel/early.c @@ -279,6 +279,9 @@ static noinline __init void setup_facility_list(void) { stfle(S390_lowcore.stfle_fac_list, ARRAY_SIZE(S390_lowcore.stfle_fac_list)); + memcpy(S390_lowcore.alt_stfle_fac_list, + S390_lowcore.stfle_fac_list, + sizeof(S390_lowcore.alt_stfle_fac_list)); }
static __init void detect_diag9c(void) diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index 136caf3128f0..e5627d4d910c 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -334,7 +334,9 @@ static void __init setup_lowcore(void) lc->machine_flags = S390_lowcore.machine_flags; lc->stfl_fac_list = S390_lowcore.stfl_fac_list; memcpy(lc->stfle_fac_list, S390_lowcore.stfle_fac_list, - MAX_FACILITY_BIT/8); + sizeof(lc->stfle_fac_list)); + memcpy(lc->alt_stfle_fac_list, S390_lowcore.alt_stfle_fac_list, + sizeof(lc->alt_stfle_fac_list)); if (MACHINE_HAS_VX) lc->vector_save_area_addr = (unsigned long) &lc->vector_save_area; diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c index 9062df575afe..adf25c7a1750 100644 --- a/arch/s390/kernel/smp.c +++ b/arch/s390/kernel/smp.c @@ -250,7 +250,9 @@ static void pcpu_prepare_secondary(struct pcpu *pcpu, int cpu) __ctl_store(lc->cregs_save_area, 0, 15); save_access_regs((unsigned int *) lc->access_regs_save_area); memcpy(lc->stfle_fac_list, S390_lowcore.stfle_fac_list, - MAX_FACILITY_BIT/8); + sizeof(lc->stfle_fac_list)); + memcpy(lc->alt_stfle_fac_list, S390_lowcore.alt_stfle_fac_list, + sizeof(lc->alt_stfle_fac_list)); }
static void pcpu_attach_task(struct pcpu *pcpu, struct task_struct *tsk)
[ Upstream commit d768bd892fc8f066cd3aa000eb1867bcf32db0ee ]
Add the PPA instruction to the system entry and exit path to switch the kernel to a different branch prediction behaviour. The instructions are added via CPU alternatives and can be disabled with the "nospec" or the "nobp=0" kernel parameter. If the default behaviour selected with CONFIG_KERNEL_NOBP is set to "n" then the "nobp=1" parameter can be used to enable the changed kernel branch prediction.
Acked-by: Cornelia Huck cohuck@redhat.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/Kconfig | 17 ++++++++++++++ arch/s390/include/asm/processor.h | 1 + arch/s390/kernel/alternative.c | 23 +++++++++++++++++++ arch/s390/kernel/early.c | 2 ++ arch/s390/kernel/entry.S | 48 +++++++++++++++++++++++++++++++++++++++ arch/s390/kernel/ipl.c | 1 + arch/s390/kernel/smp.c | 2 ++ 7 files changed, 94 insertions(+)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 04f0d2e89186..391255169396 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -705,6 +705,23 @@ config SECCOMP
If unsure, say Y.
+config KERNEL_NOBP + def_bool n + prompt "Enable modified branch prediction for the kernel by default" + help + If this option is selected the kernel will switch to a modified + branch prediction mode if the firmware interface is available. + The modified branch prediction mode improves the behaviour in + regard to speculative execution. + + With the option enabled the kernel parameter "nobp=0" or "nospec" + can be used to run the kernel in the normal branch prediction mode. + + With the option disabled the modified branch prediction mode is + enabled with the "nobp=1" kernel parameter. + + If unsure, say N. + endmenu
menu "Power Management" diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h index c61ed7890cef..d5431b366e1a 100644 --- a/arch/s390/include/asm/processor.h +++ b/arch/s390/include/asm/processor.h @@ -69,6 +69,7 @@ extern void s390_adjust_jiffies(void); extern const struct seq_operations cpuinfo_op; extern int sysctl_ieee_emulation_warnings; extern void execve_tail(void); +extern void __bpon(void);
/* * User space process size: 2GB for 31 bit, 4TB or 8PT for 64 bit. diff --git a/arch/s390/kernel/alternative.c b/arch/s390/kernel/alternative.c index f5060af61176..ed21de98b79e 100644 --- a/arch/s390/kernel/alternative.c +++ b/arch/s390/kernel/alternative.c @@ -14,6 +14,29 @@ static int __init disable_alternative_instructions(char *str)
early_param("noaltinstr", disable_alternative_instructions);
+static int __init nobp_setup_early(char *str) +{ + bool enabled; + int rc; + + rc = kstrtobool(str, &enabled); + if (rc) + return rc; + if (enabled && test_facility(82)) + __set_facility(82, S390_lowcore.alt_stfle_fac_list); + else + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); + return 0; +} +early_param("nobp", nobp_setup_early); + +static int __init nospec_setup_early(char *str) +{ + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); + return 0; +} +early_param("nospec", nospec_setup_early); + struct brcl_insn { u16 opc; s32 disp; diff --git a/arch/s390/kernel/early.c b/arch/s390/kernel/early.c index 88d0923fe244..8eccead675d4 100644 --- a/arch/s390/kernel/early.c +++ b/arch/s390/kernel/early.c @@ -282,6 +282,8 @@ static noinline __init void setup_facility_list(void) memcpy(S390_lowcore.alt_stfle_fac_list, S390_lowcore.stfle_fac_list, sizeof(S390_lowcore.alt_stfle_fac_list)); + if (!IS_ENABLED(CONFIG_KERNEL_NOBP)) + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); }
static __init void detect_diag9c(void) diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 245de839380f..42072fa4a82a 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -162,8 +162,41 @@ _PIF_WORK = (_PIF_PER_TRAP) tm off+\addr, \mask .endm
+ .macro BPOFF + .pushsection .altinstr_replacement, "ax" +660: .long 0xb2e8c000 + .popsection +661: .long 0x47000000 + .pushsection .altinstructions, "a" + .long 661b - . + .long 660b - . + .word 82 + .byte 4 + .byte 4 + .popsection + .endm + + .macro BPON + .pushsection .altinstr_replacement, "ax" +662: .long 0xb2e8d000 + .popsection +663: .long 0x47000000 + .pushsection .altinstructions, "a" + .long 663b - . + .long 662b - . + .word 82 + .byte 4 + .byte 4 + .popsection + .endm + .section .kprobes.text, "ax"
+ENTRY(__bpon) + .globl __bpon + BPON + br %r14 + /* * Scheduler resume function, called by switch_to * gpr2 = (task_struct *) prev @@ -223,7 +256,10 @@ ENTRY(sie64a) jnz .Lsie_skip TSTMSK __LC_CPU_FLAGS,_CIF_FPU jo .Lsie_skip # exit if fp/vx regs changed + BPON sie 0(%r14) +.Lsie_exit: + BPOFF .Lsie_skip: ni __SIE_PROG0C+3(%r14),0xfe # no longer in SIE lctlg %c1,%c1,__LC_USER_ASCE # load primary asce @@ -273,6 +309,7 @@ ENTRY(system_call) stpt __LC_SYNC_ENTER_TIMER .Lsysc_stmg: stmg %r8,%r15,__LC_SAVE_AREA_SYNC + BPOFF lg %r10,__LC_LAST_BREAK lg %r12,__LC_THREAD_INFO lghi %r14,_PIF_SYSCALL @@ -319,6 +356,7 @@ ENTRY(system_call) jnz .Lsysc_work # check for work TSTMSK __LC_CPU_FLAGS,_CIF_WORK jnz .Lsysc_work + BPON .Lsysc_restore: lg %r14,__LC_VDSO_PER_CPU lmg %r0,%r10,__PT_R0(%r11) @@ -479,6 +517,7 @@ ENTRY(kernel_thread_starter)
ENTRY(pgm_check_handler) stpt __LC_SYNC_ENTER_TIMER + BPOFF stmg %r8,%r15,__LC_SAVE_AREA_SYNC lg %r10,__LC_LAST_BREAK lg %r12,__LC_THREAD_INFO @@ -577,6 +616,7 @@ ENTRY(pgm_check_handler) ENTRY(io_int_handler) STCK __LC_INT_CLOCK stpt __LC_ASYNC_ENTER_TIMER + BPOFF stmg %r8,%r15,__LC_SAVE_AREA_ASYNC lg %r10,__LC_LAST_BREAK lg %r12,__LC_THREAD_INFO @@ -628,9 +668,13 @@ ENTRY(io_int_handler) lg %r14,__LC_VDSO_PER_CPU lmg %r0,%r10,__PT_R0(%r11) mvc __LC_RETURN_PSW(16),__PT_PSW(%r11) + tm __PT_PSW+1(%r11),0x01 # returning to user ? + jno .Lio_exit_kernel + BPON .Lio_exit_timer: stpt __LC_EXIT_TIMER mvc __VDSO_ECTG_BASE(16,%r14),__LC_EXIT_TIMER +.Lio_exit_kernel: lmg %r11,%r15,__PT_R11(%r11) lpswe __LC_RETURN_PSW .Lio_done: @@ -762,6 +806,7 @@ ENTRY(io_int_handler) ENTRY(ext_int_handler) STCK __LC_INT_CLOCK stpt __LC_ASYNC_ENTER_TIMER + BPOFF stmg %r8,%r15,__LC_SAVE_AREA_ASYNC lg %r10,__LC_LAST_BREAK lg %r12,__LC_THREAD_INFO @@ -810,6 +855,7 @@ ENTRY(psw_idle) .insn rsy,0xeb0000000017,%r1,5,__SF_EMPTY+16(%r15) .Lpsw_idle_stcctm: #endif + BPON STCK __CLOCK_IDLE_ENTER(%r2) stpt __TIMER_IDLE_ENTER(%r2) .Lpsw_idle_lpsw: @@ -914,6 +960,7 @@ load_fpu_regs: */ ENTRY(mcck_int_handler) STCK __LC_MCCK_CLOCK + BPOFF la %r1,4095 # revalidate r1 spt __LC_CPU_TIMER_SAVE_AREA-4095(%r1) # revalidate cpu timer lmg %r0,%r15,__LC_GPREGS_SAVE_AREA-4095(%r1)# revalidate gprs @@ -980,6 +1027,7 @@ ENTRY(mcck_int_handler) mvc __LC_RETURN_MCCK_PSW(16),__PT_PSW(%r11) # move return PSW tm __LC_RETURN_MCCK_PSW+1,0x01 # returning to user ? jno 0f + BPON stpt __LC_EXIT_TIMER mvc __VDSO_ECTG_BASE(16,%r14),__LC_EXIT_TIMER 0: lmg %r11,%r15,__PT_R11(%r11) diff --git a/arch/s390/kernel/ipl.c b/arch/s390/kernel/ipl.c index 42570d8fb265..6fff0c827558 100644 --- a/arch/s390/kernel/ipl.c +++ b/arch/s390/kernel/ipl.c @@ -563,6 +563,7 @@ static struct kset *ipl_kset;
static void __ipl_run(void *unused) { + __bpon(); diag308(DIAG308_IPL, NULL); if (MACHINE_IS_VM) __cpcmd("IPL", NULL, 0, NULL); diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c index adf25c7a1750..6de44b6bbb0d 100644 --- a/arch/s390/kernel/smp.c +++ b/arch/s390/kernel/smp.c @@ -301,6 +301,7 @@ static void pcpu_delegate(struct pcpu *pcpu, void (*func)(void *), mem_assign_absolute(lc->restart_fn, (unsigned long) func); mem_assign_absolute(lc->restart_data, (unsigned long) data); mem_assign_absolute(lc->restart_source, source_cpu); + __bpon(); asm volatile( "0: sigp 0,%0,%2 # sigp restart to target cpu\n" " brc 2,0b # busy, try again\n" @@ -890,6 +891,7 @@ void __cpu_die(unsigned int cpu) void __noreturn cpu_die(void) { idle_task_exit(); + __bpon(); pcpu_sigp_retry(pcpu_devices + smp_processor_id(), SIGP_STOP, 0); for (;;) ; }
[ Upstream commit 6b73044b2b0081ee3dd1cd6eaab7dee552601efb ]
Define TIF_ISOLATE_BP and TIF_ISOLATE_BP_GUEST and add the necessary plumbing in entry.S to be able to run user space and KVM guests with limited branch prediction.
To switch a user space process to limited branch prediction the s390_isolate_bp() function has to be call, and to run a vCPU of a KVM guest associated with the current task with limited branch prediction call s390_isolate_bp_guest().
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/include/asm/processor.h | 3 +++ arch/s390/include/asm/thread_info.h | 4 +++ arch/s390/kernel/entry.S | 49 ++++++++++++++++++++++++++++++++++--- arch/s390/kernel/processor.c | 18 ++++++++++++++ 4 files changed, 70 insertions(+), 4 deletions(-)
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h index d5431b366e1a..f915a0f1b0fc 100644 --- a/arch/s390/include/asm/processor.h +++ b/arch/s390/include/asm/processor.h @@ -316,6 +316,9 @@ extern void memcpy_absolute(void *, void *, size_t); memcpy_absolute(&(dest), &__tmp, sizeof(__tmp)); \ }
+extern int s390_isolate_bp(void); +extern int s390_isolate_bp_guest(void); + #endif /* __ASSEMBLY__ */
#endif /* __ASM_S390_PROCESSOR_H */ diff --git a/arch/s390/include/asm/thread_info.h b/arch/s390/include/asm/thread_info.h index 692b9247c019..b2504163c8fa 100644 --- a/arch/s390/include/asm/thread_info.h +++ b/arch/s390/include/asm/thread_info.h @@ -78,6 +78,8 @@ void arch_release_task_struct(struct task_struct *tsk); #define TIF_SECCOMP 5 /* secure computing */ #define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */ #define TIF_UPROBE 7 /* breakpointed or single-stepping */ +#define TIF_ISOLATE_BP 8 /* Run process with isolated BP */ +#define TIF_ISOLATE_BP_GUEST 9 /* Run KVM guests with isolated BP */ #define TIF_31BIT 16 /* 32bit process */ #define TIF_MEMDIE 17 /* is terminating due to OOM killer */ #define TIF_RESTORE_SIGMASK 18 /* restore signal mask in do_signal() */ @@ -93,6 +95,8 @@ void arch_release_task_struct(struct task_struct *tsk); #define _TIF_SECCOMP _BITUL(TIF_SECCOMP) #define _TIF_SYSCALL_TRACEPOINT _BITUL(TIF_SYSCALL_TRACEPOINT) #define _TIF_UPROBE _BITUL(TIF_UPROBE) +#define _TIF_ISOLATE_BP _BITUL(TIF_ISOLATE_BP) +#define _TIF_ISOLATE_BP_GUEST _BITUL(TIF_ISOLATE_BP_GUEST) #define _TIF_31BIT _BITUL(TIF_31BIT) #define _TIF_SINGLE_STEP _BITUL(TIF_SINGLE_STEP)
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 42072fa4a82a..7fdf7c2efaf6 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -104,6 +104,7 @@ _PIF_WORK = (_PIF_PER_TRAP) j 3f 1: LAST_BREAK %r14 UPDATE_VTIME %r14,%r15,\timer + BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP 2: lg %r15,__LC_ASYNC_STACK # load async stack 3: la %r11,STACK_FRAME_OVERHEAD(%r15) .endm @@ -190,6 +191,40 @@ _PIF_WORK = (_PIF_PER_TRAP) .popsection .endm
+ .macro BPENTER tif_ptr,tif_mask + .pushsection .altinstr_replacement, "ax" +662: .word 0xc004, 0x0000, 0x0000 # 6 byte nop + .word 0xc004, 0x0000, 0x0000 # 6 byte nop + .popsection +664: TSTMSK \tif_ptr,\tif_mask + jz . + 8 + .long 0xb2e8d000 + .pushsection .altinstructions, "a" + .long 664b - . + .long 662b - . + .word 82 + .byte 12 + .byte 12 + .popsection + .endm + + .macro BPEXIT tif_ptr,tif_mask + TSTMSK \tif_ptr,\tif_mask + .pushsection .altinstr_replacement, "ax" +662: jnz . + 8 + .long 0xb2e8d000 + .popsection +664: jz . + 8 + .long 0xb2e8c000 + .pushsection .altinstructions, "a" + .long 664b - . + .long 662b - . + .word 82 + .byte 8 + .byte 8 + .popsection + .endm + .section .kprobes.text, "ax"
ENTRY(__bpon) @@ -237,9 +272,11 @@ ENTRY(__switch_to) */ ENTRY(sie64a) stmg %r6,%r14,__SF_GPRS(%r15) # save kernel registers + lg %r12,__LC_CURRENT stg %r2,__SF_EMPTY(%r15) # save control block pointer stg %r3,__SF_EMPTY+8(%r15) # save guest register save area xc __SF_EMPTY+16(8,%r15),__SF_EMPTY+16(%r15) # reason code = 0 + mvc __SF_EMPTY+24(8,%r15),__TI_flags(%r12) # copy thread flags TSTMSK __LC_CPU_FLAGS,_CIF_FPU # load guest fp/vx registers ? jno .Lsie_load_guest_gprs brasl %r14,load_fpu_regs # load guest fp/vx regs @@ -256,10 +293,11 @@ ENTRY(sie64a) jnz .Lsie_skip TSTMSK __LC_CPU_FLAGS,_CIF_FPU jo .Lsie_skip # exit if fp/vx regs changed - BPON + BPEXIT __SF_EMPTY+24(%r15),(_TIF_ISOLATE_BP|_TIF_ISOLATE_BP_GUEST) sie 0(%r14) .Lsie_exit: BPOFF + BPENTER __SF_EMPTY+24(%r15),(_TIF_ISOLATE_BP|_TIF_ISOLATE_BP_GUEST) .Lsie_skip: ni __SIE_PROG0C+3(%r14),0xfe # no longer in SIE lctlg %c1,%c1,__LC_USER_ASCE # load primary asce @@ -319,6 +357,7 @@ ENTRY(system_call) LAST_BREAK %r13 .Lsysc_vtime: UPDATE_VTIME %r10,%r13,__LC_SYNC_ENTER_TIMER + BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP stmg %r0,%r7,__PT_R0(%r11) # clear user controlled register to prevent speculative use xgr %r0,%r0 @@ -356,7 +395,7 @@ ENTRY(system_call) jnz .Lsysc_work # check for work TSTMSK __LC_CPU_FLAGS,_CIF_WORK jnz .Lsysc_work - BPON + BPEXIT __TI_flags(%r12),_TIF_ISOLATE_BP .Lsysc_restore: lg %r14,__LC_VDSO_PER_CPU lmg %r0,%r10,__PT_R0(%r11) @@ -542,6 +581,7 @@ ENTRY(pgm_check_handler) j 3f 2: LAST_BREAK %r14 UPDATE_VTIME %r14,%r15,__LC_SYNC_ENTER_TIMER + BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP lg %r15,__LC_KERNEL_STACK lg %r14,__TI_task(%r12) aghi %r14,__TASK_thread # pointer to thread_struct @@ -670,7 +710,7 @@ ENTRY(io_int_handler) mvc __LC_RETURN_PSW(16),__PT_PSW(%r11) tm __PT_PSW+1(%r11),0x01 # returning to user ? jno .Lio_exit_kernel - BPON + BPEXIT __TI_flags(%r12),_TIF_ISOLATE_BP .Lio_exit_timer: stpt __LC_EXIT_TIMER mvc __VDSO_ECTG_BASE(16,%r14),__LC_EXIT_TIMER @@ -1027,7 +1067,7 @@ ENTRY(mcck_int_handler) mvc __LC_RETURN_MCCK_PSW(16),__PT_PSW(%r11) # move return PSW tm __LC_RETURN_MCCK_PSW+1,0x01 # returning to user ? jno 0f - BPON + BPEXIT __TI_flags(%r12),_TIF_ISOLATE_BP stpt __LC_EXIT_TIMER mvc __VDSO_ECTG_BASE(16,%r14),__LC_EXIT_TIMER 0: lmg %r11,%r15,__PT_R11(%r11) @@ -1148,6 +1188,7 @@ cleanup_critical: .quad .Lsie_done
.Lcleanup_sie: + BPENTER __SF_EMPTY+24(%r15),(_TIF_ISOLATE_BP|_TIF_ISOLATE_BP_GUEST) lg %r9,__SF_EMPTY(%r15) # get control block pointer ni __SIE_PROG0C+3(%r9),0xfe # no longer in SIE lctlg %c1,%c1,__LC_USER_ASCE # load primary asce diff --git a/arch/s390/kernel/processor.c b/arch/s390/kernel/processor.c index 7ce00e7a709a..ab236bd970bb 100644 --- a/arch/s390/kernel/processor.c +++ b/arch/s390/kernel/processor.c @@ -13,6 +13,7 @@ #include <linux/cpu.h> #include <asm/diag.h> #include <asm/elf.h> +#include <asm/facility.h> #include <asm/lowcore.h> #include <asm/param.h> #include <asm/smp.h> @@ -113,3 +114,20 @@ const struct seq_operations cpuinfo_op = { .show = show_cpuinfo, };
+int s390_isolate_bp(void) +{ + if (!test_facility(82)) + return -EOPNOTSUPP; + set_thread_flag(TIF_ISOLATE_BP); + return 0; +} +EXPORT_SYMBOL(s390_isolate_bp); + +int s390_isolate_bp_guest(void) +{ + if (!test_facility(82)) + return -EOPNOTSUPP; + set_thread_flag(TIF_ISOLATE_BP_GUEST); + return 0; +} +EXPORT_SYMBOL(s390_isolate_bp_guest);
[ Upstream commit f19fbd5ed642dc31c809596412dab1ed56f2f156 ]
Add CONFIG_EXPOLINE to enable the use of the new -mindirect-branch= and -mfunction_return= compiler options to create a kernel fortified against the specte v2 attack.
With CONFIG_EXPOLINE=y all indirect branches will be issued with an execute type instruction. For z10 or newer the EXRL instruction will be used, for older machines the EX instruction. The typical indirect call
basr %r14,%r1
is replaced with a PC relative call to a new thunk
brasl %r14,__s390x_indirect_jump_r1
The thunk contains the EXRL/EX instruction to the indirect branch
__s390x_indirect_jump_r1: exrl 0,0f j . 0: br %r1
The detour via the execute type instruction has a performance impact. To get rid of the detour the new kernel parameter "nospectre_v2" and "spectre_v2=[on,off,auto]" can be used. If the parameter is specified the kernel and module code will be patched at runtime.
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/Kconfig | 28 +++++++++ arch/s390/Makefile | 10 +++ arch/s390/include/asm/lowcore.h | 4 +- arch/s390/include/asm/nospec-branch.h | 18 ++++++ arch/s390/kernel/Makefile | 3 + arch/s390/kernel/entry.S | 113 ++++++++++++++++++++++++++-------- arch/s390/kernel/module.c | 62 ++++++++++++++++--- arch/s390/kernel/nospec-branch.c | 101 ++++++++++++++++++++++++++++++ arch/s390/kernel/setup.c | 4 ++ arch/s390/kernel/smp.c | 1 + arch/s390/kernel/vmlinux.lds.S | 14 +++++ drivers/s390/char/Makefile | 2 + 12 files changed, 325 insertions(+), 35 deletions(-) create mode 100644 arch/s390/include/asm/nospec-branch.h create mode 100644 arch/s390/kernel/nospec-branch.c
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 391255169396..89bab36b98ac 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -722,6 +722,34 @@ config KERNEL_NOBP
If unsure, say N.
+config EXPOLINE + def_bool n + prompt "Avoid speculative indirect branches in the kernel" + help + Compile the kernel with the expoline compiler options to guard + against kernel-to-user data leaks by avoiding speculative indirect + branches. + Requires a compiler with -mindirect-branch=thunk support for full + protection. The kernel may run slower. + + If unsure, say N. + +choice + prompt "Expoline default" + depends on EXPOLINE + default EXPOLINE_FULL + +config EXPOLINE_OFF + bool "spectre_v2=off" + +config EXPOLINE_MEDIUM + bool "spectre_v2=auto" + +config EXPOLINE_FULL + bool "spectre_v2=on" + +endchoice + endmenu
menu "Power Management" diff --git a/arch/s390/Makefile b/arch/s390/Makefile index e8d4423e4f85..9ff5714eafaf 100644 --- a/arch/s390/Makefile +++ b/arch/s390/Makefile @@ -77,6 +77,16 @@ ifeq ($(call cc-option-yn,-mwarn-dynamicstack),y) cflags-$(CONFIG_WARN_DYNAMIC_STACK) += -mwarn-dynamicstack endif
+ifdef CONFIG_EXPOLINE + ifeq ($(call cc-option-yn,$(CC_FLAGS_MARCH) -mindirect-branch=thunk),y) + CC_FLAGS_EXPOLINE := -mindirect-branch=thunk + CC_FLAGS_EXPOLINE += -mfunction-return=thunk + CC_FLAGS_EXPOLINE += -mindirect-branch-table + export CC_FLAGS_EXPOLINE + cflags-y += $(CC_FLAGS_EXPOLINE) + endif +endif + ifdef CONFIG_FUNCTION_TRACER # make use of hotpatch feature if the compiler supports it cc_hotpatch := -mhotpatch=0,3 diff --git a/arch/s390/include/asm/lowcore.h b/arch/s390/include/asm/lowcore.h index 2f8809671151..8520c23e419b 100644 --- a/arch/s390/include/asm/lowcore.h +++ b/arch/s390/include/asm/lowcore.h @@ -155,7 +155,9 @@ struct _lowcore { /* Per cpu primary space access list */ __u32 paste[16]; /* 0x0400 */
- __u8 pad_0x04c0[0x0e00-0x0440]; /* 0x0440 */ + /* br %r1 trampoline */ + __u16 br_r1_trampoline; /* 0x0440 */ + __u8 pad_0x0442[0x0e00-0x0442]; /* 0x0442 */
/* * 0xe00 contains the address of the IPL Parameter Information diff --git a/arch/s390/include/asm/nospec-branch.h b/arch/s390/include/asm/nospec-branch.h new file mode 100644 index 000000000000..7df48e5cf36f --- /dev/null +++ b/arch/s390/include/asm/nospec-branch.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_S390_EXPOLINE_H +#define _ASM_S390_EXPOLINE_H + +#ifndef __ASSEMBLY__ + +#include <linux/types.h> + +extern int nospec_call_disable; +extern int nospec_return_disable; + +void nospec_init_branches(void); +void nospec_call_revert(s32 *start, s32 *end); +void nospec_return_revert(s32 *start, s32 *end); + +#endif /* __ASSEMBLY__ */ + +#endif /* _ASM_S390_EXPOLINE_H */ diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile index a4c95c847905..9c9cf5331d6f 100644 --- a/arch/s390/kernel/Makefile +++ b/arch/s390/kernel/Makefile @@ -48,6 +48,9 @@ obj-y += entry.o reipl.o relocate_kernel.o alternative.o
extra-y += head.o head64.o vmlinux.lds
+obj-$(CONFIG_EXPOLINE) += nospec-branch.o +CFLAGS_REMOVE_expoline.o += $(CC_FLAGS_EXPOLINE) + obj-$(CONFIG_MODULES) += s390_ksyms.o module.o obj-$(CONFIG_SMP) += smp.o obj-$(CONFIG_SCHED_BOOK) += topology.o diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 7fdf7c2efaf6..beda1c33ca12 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -225,12 +225,74 @@ _PIF_WORK = (_PIF_PER_TRAP) .popsection .endm
+#ifdef CONFIG_EXPOLINE + + .macro GEN_BR_THUNK name,reg,tmp + .section .text.\name,"axG",@progbits,\name,comdat + .globl \name + .hidden \name + .type \name,@function +\name: + .cfi_startproc +#ifdef CONFIG_HAVE_MARCH_Z10_FEATURES + exrl 0,0f +#else + larl \tmp,0f + ex 0,0(\tmp) +#endif + j . +0: br \reg + .cfi_endproc + .endm + + GEN_BR_THUNK __s390x_indirect_jump_r1use_r9,%r9,%r1 + GEN_BR_THUNK __s390x_indirect_jump_r1use_r14,%r14,%r1 + GEN_BR_THUNK __s390x_indirect_jump_r11use_r14,%r14,%r11 + + .macro BASR_R14_R9 +0: brasl %r14,__s390x_indirect_jump_r1use_r9 + .pushsection .s390_indirect_branches,"a",@progbits + .long 0b-. + .popsection + .endm + + .macro BR_R1USE_R14 +0: jg __s390x_indirect_jump_r1use_r14 + .pushsection .s390_indirect_branches,"a",@progbits + .long 0b-. + .popsection + .endm + + .macro BR_R11USE_R14 +0: jg __s390x_indirect_jump_r11use_r14 + .pushsection .s390_indirect_branches,"a",@progbits + .long 0b-. + .popsection + .endm + +#else /* CONFIG_EXPOLINE */ + + .macro BASR_R14_R9 + basr %r14,%r9 + .endm + + .macro BR_R1USE_R14 + br %r14 + .endm + + .macro BR_R11USE_R14 + br %r14 + .endm + +#endif /* CONFIG_EXPOLINE */ + + .section .kprobes.text, "ax"
ENTRY(__bpon) .globl __bpon BPON - br %r14 + BR_R1USE_R14
/* * Scheduler resume function, called by switch_to @@ -258,9 +320,9 @@ ENTRY(__switch_to) mvc __LC_CURRENT_PID(4,%r0),__TASK_pid(%r3) # store pid of next lmg %r6,%r15,__SF_GPRS(%r15) # load gprs of next task TSTMSK __LC_MACHINE_FLAGS,MACHINE_FLAG_LPP - bzr %r14 + jz 0f .insn s,0xb2800000,__LC_LPP # set program parameter - br %r14 +0: BR_R1USE_R14
.L__critical_start:
@@ -326,7 +388,7 @@ sie_exit: xgr %r5,%r5 lmg %r6,%r14,__SF_GPRS(%r15) # restore kernel registers lg %r2,__SF_EMPTY+16(%r15) # return exit reason code - br %r14 + BR_R1USE_R14 .Lsie_fault: lghi %r14,-EFAULT stg %r14,__SF_EMPTY+16(%r15) # set exit reason code @@ -383,7 +445,7 @@ ENTRY(system_call) lgf %r9,0(%r8,%r10) # get system call add. TSTMSK __TI_flags(%r12),_TIF_TRACE jnz .Lsysc_tracesys - basr %r14,%r9 # call sys_xxxx + BASR_R14_R9 # call sys_xxxx stg %r2,__PT_R2(%r11) # store return value
.Lsysc_return: @@ -523,7 +585,7 @@ ENTRY(system_call) lmg %r3,%r7,__PT_R3(%r11) stg %r7,STACK_FRAME_OVERHEAD(%r15) lg %r2,__PT_ORIG_GPR2(%r11) - basr %r14,%r9 # call sys_xxx + BASR_R14_R9 # call sys_xxx stg %r2,__PT_R2(%r11) # store return value .Lsysc_tracenogo: TSTMSK __TI_flags(%r12),_TIF_TRACE @@ -547,7 +609,7 @@ ENTRY(ret_from_fork) lmg %r9,%r10,__PT_R9(%r11) # load gprs ENTRY(kernel_thread_starter) la %r2,0(%r10) - basr %r14,%r9 + BASR_R14_R9 j .Lsysc_tracenogo
/* @@ -621,9 +683,9 @@ ENTRY(pgm_check_handler) nill %r10,0x007f sll %r10,2 je .Lpgm_return - lgf %r1,0(%r10,%r1) # load address of handler routine + lgf %r9,0(%r10,%r1) # load address of handler routine lgr %r2,%r11 # pass pointer to pt_regs - basr %r14,%r1 # branch to interrupt-handler + BASR_R14_R9 # branch to interrupt-handler .Lpgm_return: LOCKDEP_SYS_EXIT tm __PT_PSW+1(%r11),0x01 # returning to user ? @@ -900,7 +962,7 @@ ENTRY(psw_idle) stpt __TIMER_IDLE_ENTER(%r2) .Lpsw_idle_lpsw: lpswe __SF_EMPTY(%r15) - br %r14 + BR_R1USE_R14 .Lpsw_idle_end:
/* @@ -914,7 +976,7 @@ ENTRY(save_fpu_regs) lg %r2,__LC_CURRENT aghi %r2,__TASK_thread TSTMSK __LC_CPU_FLAGS,_CIF_FPU - bor %r14 + jo .Lsave_fpu_regs_exit stfpc __THREAD_FPU_fpc(%r2) .Lsave_fpu_regs_fpc_end: lg %r3,__THREAD_FPU_regs(%r2) @@ -944,7 +1006,8 @@ ENTRY(save_fpu_regs) std 15,120(%r3) .Lsave_fpu_regs_done: oi __LC_CPU_FLAGS+7,_CIF_FPU - br %r14 +.Lsave_fpu_regs_exit: + BR_R1USE_R14 .Lsave_fpu_regs_end:
/* @@ -961,7 +1024,7 @@ load_fpu_regs: lg %r4,__LC_CURRENT aghi %r4,__TASK_thread TSTMSK __LC_CPU_FLAGS,_CIF_FPU - bnor %r14 + jno .Lload_fpu_regs_exit lfpc __THREAD_FPU_fpc(%r4) TSTMSK __LC_MACHINE_FLAGS,MACHINE_FLAG_VX lg %r4,__THREAD_FPU_regs(%r4) # %r4 <- reg save area @@ -990,7 +1053,8 @@ load_fpu_regs: ld 15,120(%r4) .Lload_fpu_regs_done: ni __LC_CPU_FLAGS+7,255-_CIF_FPU - br %r14 +.Lload_fpu_regs_exit: + BR_R1USE_R14 .Lload_fpu_regs_end:
.L__critical_end: @@ -1163,7 +1227,7 @@ cleanup_critical: jl 0f clg %r9,BASED(.Lcleanup_table+104) # .Lload_fpu_regs_end jl .Lcleanup_load_fpu_regs -0: br %r14 +0: BR_R11USE_R14
.align 8 .Lcleanup_table: @@ -1193,7 +1257,7 @@ cleanup_critical: ni __SIE_PROG0C+3(%r9),0xfe # no longer in SIE lctlg %c1,%c1,__LC_USER_ASCE # load primary asce larl %r9,sie_exit # skip forward to sie_exit - br %r14 + BR_R11USE_R14 #endif
.Lcleanup_system_call: @@ -1250,7 +1314,7 @@ cleanup_critical: stg %r15,56(%r11) # r15 stack pointer # set new psw address and exit larl %r9,.Lsysc_do_svc - br %r14 + BR_R11USE_R14 .Lcleanup_system_call_insn: .quad system_call .quad .Lsysc_stmg @@ -1260,7 +1324,7 @@ cleanup_critical:
.Lcleanup_sysc_tif: larl %r9,.Lsysc_tif - br %r14 + BR_R11USE_R14
.Lcleanup_sysc_restore: # check if stpt has been executed @@ -1277,14 +1341,14 @@ cleanup_critical: mvc 0(64,%r11),__PT_R8(%r9) lmg %r0,%r7,__PT_R0(%r9) 1: lmg %r8,%r9,__LC_RETURN_PSW - br %r14 + BR_R11USE_R14 .Lcleanup_sysc_restore_insn: .quad .Lsysc_exit_timer .quad .Lsysc_done - 4
.Lcleanup_io_tif: larl %r9,.Lio_tif - br %r14 + BR_R11USE_R14
.Lcleanup_io_restore: # check if stpt has been executed @@ -1298,7 +1362,7 @@ cleanup_critical: mvc 0(64,%r11),__PT_R8(%r9) lmg %r0,%r7,__PT_R0(%r9) 1: lmg %r8,%r9,__LC_RETURN_PSW - br %r14 + BR_R11USE_R14 .Lcleanup_io_restore_insn: .quad .Lio_exit_timer .quad .Lio_done - 4 @@ -1350,17 +1414,17 @@ cleanup_critical: # prepare return psw nihh %r8,0xfcfd # clear irq & wait state bits lg %r9,48(%r11) # return from psw_idle - br %r14 + BR_R11USE_R14 .Lcleanup_idle_insn: .quad .Lpsw_idle_lpsw
.Lcleanup_save_fpu_regs: larl %r9,save_fpu_regs - br %r14 + BR_R11USE_R14
.Lcleanup_load_fpu_regs: larl %r9,load_fpu_regs - br %r14 + BR_R11USE_R14
/* * Integer constants @@ -1376,7 +1440,6 @@ cleanup_critical: .Lsie_critical_length: .quad .Lsie_done - .Lsie_gmap #endif - .section .rodata, "a" #define SYSCALL(esame,emu) .long esame .globl sys_call_table diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c index 06c7e82342cd..86b49f209251 100644 --- a/arch/s390/kernel/module.c +++ b/arch/s390/kernel/module.c @@ -32,6 +32,8 @@ #include <linux/moduleloader.h> #include <linux/bug.h> #include <asm/alternative.h> +#include <asm/nospec-branch.h> +#include <asm/facility.h>
#if 0 #define DEBUGP printk @@ -164,7 +166,11 @@ int module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs, me->arch.got_offset = me->core_size; me->core_size += me->arch.got_size; me->arch.plt_offset = me->core_size; - me->core_size += me->arch.plt_size; + if (me->arch.plt_size) { + if (IS_ENABLED(CONFIG_EXPOLINE) && !nospec_call_disable) + me->arch.plt_size += PLT_ENTRY_SIZE; + me->core_size += me->arch.plt_size; + } return 0; }
@@ -318,9 +324,21 @@ static int apply_rela(Elf_Rela *rela, Elf_Addr base, Elf_Sym *symtab, unsigned int *ip; ip = me->module_core + me->arch.plt_offset + info->plt_offset; - ip[0] = 0x0d10e310; /* basr 1,0; lg 1,10(1); br 1 */ - ip[1] = 0x100a0004; - ip[2] = 0x07f10000; + ip[0] = 0x0d10e310; /* basr 1,0 */ + ip[1] = 0x100a0004; /* lg 1,10(1) */ + if (IS_ENABLED(CONFIG_EXPOLINE) && + !nospec_call_disable) { + unsigned int *ij; + ij = me->module_core + + me->arch.plt_offset + + me->arch.plt_size - PLT_ENTRY_SIZE; + ip[2] = 0xa7f40000 + /* j __jump_r1 */ + (unsigned int)(u16) + (((unsigned long) ij - 8 - + (unsigned long) ip) / 2); + } else { + ip[2] = 0x07f10000; /* br %r1 */ + } ip[3] = (unsigned int) (val >> 32); ip[4] = (unsigned int) val; info->plt_initialized = 1; @@ -426,16 +444,42 @@ int module_finalize(const Elf_Ehdr *hdr, struct module *me) { const Elf_Shdr *s; - char *secstrings; + char *secstrings, *secname; + void *aseg; + + if (IS_ENABLED(CONFIG_EXPOLINE) && + !nospec_call_disable && me->arch.plt_size) { + unsigned int *ij; + + ij = me->module_core + me->arch.plt_offset + + me->arch.plt_size - PLT_ENTRY_SIZE; + if (test_facility(35)) { + ij[0] = 0xc6000000; /* exrl %r0,.+10 */ + ij[1] = 0x0005a7f4; /* j . */ + ij[2] = 0x000007f1; /* br %r1 */ + } else { + ij[0] = 0x44000000 | (unsigned int) + offsetof(struct _lowcore, br_r1_trampoline); + ij[1] = 0xa7f40000; /* j . */ + } + }
secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset; for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) { - if (!strcmp(".altinstructions", secstrings + s->sh_name)) { - /* patch .altinstructions */ - void *aseg = (void *)s->sh_addr; + aseg = (void *) s->sh_addr; + secname = secstrings + s->sh_name;
+ if (!strcmp(".altinstructions", secname)) + /* patch .altinstructions */ apply_alternatives(aseg, aseg + s->sh_size); - } + + if (IS_ENABLED(CONFIG_EXPOLINE) && + (!strcmp(".nospec_call_table", secname))) + nospec_call_revert(aseg, aseg + s->sh_size); + + if (IS_ENABLED(CONFIG_EXPOLINE) && + (!strcmp(".nospec_return_table", secname))) + nospec_return_revert(aseg, aseg + s->sh_size); }
jump_label_apply_nops(me); diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c new file mode 100644 index 000000000000..86ee26a612cf --- /dev/null +++ b/arch/s390/kernel/nospec-branch.c @@ -0,0 +1,101 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/module.h> +#include <asm/facility.h> +#include <asm/nospec-branch.h> + +int nospec_call_disable = IS_ENABLED(EXPOLINE_OFF); +int nospec_return_disable = !IS_ENABLED(EXPOLINE_FULL); + +static int __init nospectre_v2_setup_early(char *str) +{ + nospec_call_disable = 1; + nospec_return_disable = 1; + return 0; +} +early_param("nospectre_v2", nospectre_v2_setup_early); + +static int __init spectre_v2_setup_early(char *str) +{ + if (str && !strncmp(str, "on", 2)) { + nospec_call_disable = 0; + nospec_return_disable = 0; + } + if (str && !strncmp(str, "off", 3)) { + nospec_call_disable = 1; + nospec_return_disable = 1; + } + if (str && !strncmp(str, "auto", 4)) { + nospec_call_disable = 0; + nospec_return_disable = 1; + } + return 0; +} +early_param("spectre_v2", spectre_v2_setup_early); + +static void __init_or_module __nospec_revert(s32 *start, s32 *end) +{ + enum { BRCL_EXPOLINE, BRASL_EXPOLINE } type; + u8 *instr, *thunk, *br; + u8 insnbuf[6]; + s32 *epo; + + /* Second part of the instruction replace is always a nop */ + memcpy(insnbuf + 2, (char[]) { 0x47, 0x00, 0x00, 0x00 }, 4); + for (epo = start; epo < end; epo++) { + instr = (u8 *) epo + *epo; + if (instr[0] == 0xc0 && (instr[1] & 0x0f) == 0x04) + type = BRCL_EXPOLINE; /* brcl instruction */ + else if (instr[0] == 0xc0 && (instr[1] & 0x0f) == 0x05) + type = BRASL_EXPOLINE; /* brasl instruction */ + else + continue; + thunk = instr + (*(int *)(instr + 2)) * 2; + if (thunk[0] == 0xc6 && thunk[1] == 0x00) + /* exrl %r0,<target-br> */ + br = thunk + (*(int *)(thunk + 2)) * 2; + else if (thunk[0] == 0xc0 && (thunk[1] & 0x0f) == 0x00 && + thunk[6] == 0x44 && thunk[7] == 0x00 && + (thunk[8] & 0x0f) == 0x00 && thunk[9] == 0x00 && + (thunk[1] & 0xf0) == (thunk[8] & 0xf0)) + /* larl %rx,<target br> + ex %r0,0(%rx) */ + br = thunk + (*(int *)(thunk + 2)) * 2; + else + continue; + if (br[0] != 0x07 || (br[1] & 0xf0) != 0xf0) + continue; + switch (type) { + case BRCL_EXPOLINE: + /* brcl to thunk, replace with br + nop */ + insnbuf[0] = br[0]; + insnbuf[1] = (instr[1] & 0xf0) | (br[1] & 0x0f); + break; + case BRASL_EXPOLINE: + /* brasl to thunk, replace with basr + nop */ + insnbuf[0] = 0x0d; + insnbuf[1] = (instr[1] & 0xf0) | (br[1] & 0x0f); + break; + } + + s390_kernel_write(instr, insnbuf, 6); + } +} + +void __init_or_module nospec_call_revert(s32 *start, s32 *end) +{ + if (nospec_call_disable) + __nospec_revert(start, end); +} + +void __init_or_module nospec_return_revert(s32 *start, s32 *end) +{ + if (nospec_return_disable) + __nospec_revert(start, end); +} + +extern s32 __nospec_call_start[], __nospec_call_end[]; +extern s32 __nospec_return_start[], __nospec_return_end[]; +void __init nospec_init_branches(void) +{ + nospec_call_revert(__nospec_call_start, __nospec_call_end); + nospec_return_revert(__nospec_return_start, __nospec_return_end); +} diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index e5627d4d910c..959e7646cec7 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -64,6 +64,7 @@ #include <asm/sysinfo.h> #include <asm/numa.h> #include <asm/alternative.h> +#include <asm/nospec-branch.h> #include "entry.h"
/* @@ -373,6 +374,7 @@ static void __init setup_lowcore(void) #ifdef CONFIG_SMP lc->spinlock_lockval = arch_spin_lockval(0); #endif + lc->br_r1_trampoline = 0x07f1; /* br %r1 */
set_prefix((u32)(unsigned long) lc); lowcore_ptr[0] = lc; @@ -897,6 +899,8 @@ void __init setup_arch(char **cmdline_p) set_preferred_console();
apply_alternative_instructions(); + if (IS_ENABLED(CONFIG_EXPOLINE)) + nospec_init_branches();
/* Setup zfcpdump support */ setup_zfcpdump(); diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c index 6de44b6bbb0d..77f4f334a465 100644 --- a/arch/s390/kernel/smp.c +++ b/arch/s390/kernel/smp.c @@ -200,6 +200,7 @@ static int pcpu_alloc_lowcore(struct pcpu *pcpu, int cpu) lc->panic_stack = panic_stack + PANIC_FRAME_OFFSET; lc->cpu_nr = cpu; lc->spinlock_lockval = arch_spin_lockval(cpu); + lc->br_r1_trampoline = 0x07f1; /* br %r1 */ if (MACHINE_HAS_VX) lc->vector_save_area_addr = (unsigned long) &lc->vector_save_area; diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S index 6cff68a84bb3..a4ae08e416e6 100644 --- a/arch/s390/kernel/vmlinux.lds.S +++ b/arch/s390/kernel/vmlinux.lds.S @@ -101,6 +101,20 @@ SECTIONS *(.altinstr_replacement) }
+ /* + * Table with the patch locations to undo expolines + */ + .nospec_call_table : { + __nospec_call_start = . ; + *(.s390_indirect*) + __nospec_call_end = . ; + } + .nospec_return_table : { + __nospec_return_start = . ; + *(.s390_return*) + __nospec_return_end = . ; + } + /* early.c uses stsi, which requires page aligned data. */ . = ALIGN(PAGE_SIZE); INIT_DATA_SECTION(0x100) diff --git a/drivers/s390/char/Makefile b/drivers/s390/char/Makefile index 6fa9364d1c07..835f1054976b 100644 --- a/drivers/s390/char/Makefile +++ b/drivers/s390/char/Makefile @@ -2,6 +2,8 @@ # S/390 character devices #
+CFLAGS_REMOVE_sclp_early_core.o += $(CC_FLAGS_EXPOLINE) + obj-y += ctrlchar.o keyboard.o defkeymap.o sclp.o sclp_rw.o sclp_quiesce.o \ sclp_cmd.o sclp_config.o sclp_cpi_sys.o sclp_ocf.o sclp_ctl.o \ sclp_early.o
From: Eugeniu Rosca erosca@de.adit-jv.com
[ Upstream commit 2cb370d615e9fbed9e95ed222c2c8f337181aa90 ]
I've accidentally stumbled upon the IS_ENABLED(EXPOLINE_*) lines, which obviously always evaluate to false. Fix this.
Fixes: f19fbd5ed642 ("s390: introduce execute-trampolines for branches") Signed-off-by: Eugeniu Rosca erosca@de.adit-jv.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/nospec-branch.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c index 86ee26a612cf..57f55c24c21c 100644 --- a/arch/s390/kernel/nospec-branch.c +++ b/arch/s390/kernel/nospec-branch.c @@ -3,8 +3,8 @@ #include <asm/facility.h> #include <asm/nospec-branch.h>
-int nospec_call_disable = IS_ENABLED(EXPOLINE_OFF); -int nospec_return_disable = !IS_ENABLED(EXPOLINE_FULL); +int nospec_call_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF); +int nospec_return_disable = !IS_ENABLED(CONFIG_EXPOLINE_FULL);
static int __init nospectre_v2_setup_early(char *str) {
[ Upstream commit d5feec04fe578c8dbd9e2e1439afc2f0af761ed4 ]
The system call path can be interrupted before the switch back to the standard branch prediction with BPENTER has been done. The critical section cleanup code skips forward to .Lsysc_do_svc and bypasses the BPENTER. In this case the kernel and all subsequent code will run with the limited branch prediction.
Fixes: eacf67eb9b32 ("s390: run user space and KVM guests with modified branch prediction") Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/entry.S | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index beda1c33ca12..2c71cdf2a4f0 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -1299,7 +1299,8 @@ cleanup_critical: srag %r9,%r9,23 jz 0f mvc __TI_last_break(8,%r12),16(%r11) -0: # set up saved register r11 +0: BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP + # set up saved register r11 lg %r15,__LC_KERNEL_STACK la %r9,STACK_FRAME_OVERHEAD(%r15) stg %r9,24(%r11) # r11 pt_regs pointer
From: Christian Borntraeger borntraeger@de.ibm.com
[ Upstream commit d3f468963cd6fd6d2aa5e26aed8b24232096d0e1 ]
when a system call is interrupted we might call the critical section cleanup handler that re-does some of the operations. When we are between .Lsysc_vtime and .Lsysc_do_svc we might also redo the saving of the problem state registers r0-r7:
.Lcleanup_system_call: [...] 0: # update accounting time stamp mvc __LC_LAST_UPDATE_TIMER(8),__LC_SYNC_ENTER_TIMER # set up saved register r11 lg %r15,__LC_KERNEL_STACK la %r9,STACK_FRAME_OVERHEAD(%r15) stg %r9,24(%r11) # r11 pt_regs pointer # fill pt_regs mvc __PT_R8(64,%r9),__LC_SAVE_AREA_SYNC ---> stmg %r0,%r7,__PT_R0(%r9)
The problem is now, that we might have already zeroed out r0. The fix is to move the zeroing of r0 after sysc_do_svc.
Reported-by: Farhan Ali alifm@linux.vnet.ibm.com Fixes: 7041d28115e91 ("s390: scrub registers on kernel entry and KVM exit") Signed-off-by: Christian Borntraeger borntraeger@de.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/entry.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 2c71cdf2a4f0..c63730326215 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -421,13 +421,13 @@ ENTRY(system_call) UPDATE_VTIME %r10,%r13,__LC_SYNC_ENTER_TIMER BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP stmg %r0,%r7,__PT_R0(%r11) - # clear user controlled register to prevent speculative use - xgr %r0,%r0 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC stg %r14,__PT_FLAGS(%r11) .Lsysc_do_svc: + # clear user controlled register to prevent speculative use + xgr %r0,%r0 lg %r10,__TI_sysc_table(%r12) # address of system call table llgh %r8,__PT_INT_CODE+2(%r11) slag %r8,%r8,2 # shift and test for svc 0
[ Upstream commit b2e2f43a01bace1a25bdbae04c9f9846882b727a ]
Keep the code for the nobp parameter handling with the code for expolines. Both are related to the spectre v2 mitigation.
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/Makefile | 4 ++-- arch/s390/kernel/alternative.c | 23 ----------------------- arch/s390/kernel/nospec-branch.c | 27 +++++++++++++++++++++++++++ 3 files changed, 29 insertions(+), 25 deletions(-)
diff --git a/arch/s390/kernel/Makefile b/arch/s390/kernel/Makefile index 9c9cf5331d6f..8ccfbf22ecbb 100644 --- a/arch/s390/kernel/Makefile +++ b/arch/s390/kernel/Makefile @@ -45,11 +45,11 @@ obj-y += debug.o irq.o ipl.o dis.o diag.o sclp.o vdso.o obj-y += sysinfo.o jump_label.o lgr.o os_info.o machine_kexec.o pgm_check.o obj-y += runtime_instr.o cache.o dumpstack.o obj-y += entry.o reipl.o relocate_kernel.o alternative.o +obj-y += nospec-branch.o
extra-y += head.o head64.o vmlinux.lds
-obj-$(CONFIG_EXPOLINE) += nospec-branch.o -CFLAGS_REMOVE_expoline.o += $(CC_FLAGS_EXPOLINE) +CFLAGS_REMOVE_nospec-branch.o += $(CC_FLAGS_EXPOLINE)
obj-$(CONFIG_MODULES) += s390_ksyms.o module.o obj-$(CONFIG_SMP) += smp.o diff --git a/arch/s390/kernel/alternative.c b/arch/s390/kernel/alternative.c index ed21de98b79e..f5060af61176 100644 --- a/arch/s390/kernel/alternative.c +++ b/arch/s390/kernel/alternative.c @@ -14,29 +14,6 @@ static int __init disable_alternative_instructions(char *str)
early_param("noaltinstr", disable_alternative_instructions);
-static int __init nobp_setup_early(char *str) -{ - bool enabled; - int rc; - - rc = kstrtobool(str, &enabled); - if (rc) - return rc; - if (enabled && test_facility(82)) - __set_facility(82, S390_lowcore.alt_stfle_fac_list); - else - __clear_facility(82, S390_lowcore.alt_stfle_fac_list); - return 0; -} -early_param("nobp", nobp_setup_early); - -static int __init nospec_setup_early(char *str) -{ - __clear_facility(82, S390_lowcore.alt_stfle_fac_list); - return 0; -} -early_param("nospec", nospec_setup_early); - struct brcl_insn { u16 opc; s32 disp; diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c index 57f55c24c21c..3ce59d044a4d 100644 --- a/arch/s390/kernel/nospec-branch.c +++ b/arch/s390/kernel/nospec-branch.c @@ -3,6 +3,31 @@ #include <asm/facility.h> #include <asm/nospec-branch.h>
+static int __init nobp_setup_early(char *str) +{ + bool enabled; + int rc; + + rc = kstrtobool(str, &enabled); + if (rc) + return rc; + if (enabled && test_facility(82)) + __set_facility(82, S390_lowcore.alt_stfle_fac_list); + else + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); + return 0; +} +early_param("nobp", nobp_setup_early); + +static int __init nospec_setup_early(char *str) +{ + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); + return 0; +} +early_param("nospec", nospec_setup_early); + +#ifdef CONFIG_EXPOLINE + int nospec_call_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF); int nospec_return_disable = !IS_ENABLED(CONFIG_EXPOLINE_FULL);
@@ -99,3 +124,5 @@ void __init nospec_init_branches(void) nospec_call_revert(__nospec_call_start, __nospec_call_end); nospec_return_revert(__nospec_return_start, __nospec_return_end); } + +#endif /* CONFIG_EXPOLINE */
[ Upstream commit 6e179d64126b909f0b288fa63cdbf07c531e9b1d ]
Automatically decide between nobp vs. expolines if the spectre_v2=auto kernel parameter is specified or CONFIG_EXPOLINE_AUTO=y is set.
The decision made at boot time due to CONFIG_EXPOLINE_AUTO=y being set can be overruled with the nobp, nospec and spectre_v2 kernel parameters.
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/Kconfig | 2 +- arch/s390/Makefile | 2 +- arch/s390/include/asm/nospec-branch.h | 6 ++-- arch/s390/kernel/alternative.c | 1 + arch/s390/kernel/module.c | 11 +++--- arch/s390/kernel/nospec-branch.c | 68 +++++++++++++++++++++-------------- 6 files changed, 52 insertions(+), 38 deletions(-)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 89bab36b98ac..75c4a78d2e1c 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -742,7 +742,7 @@ choice config EXPOLINE_OFF bool "spectre_v2=off"
-config EXPOLINE_MEDIUM +config EXPOLINE_AUTO bool "spectre_v2=auto"
config EXPOLINE_FULL diff --git a/arch/s390/Makefile b/arch/s390/Makefile index 9ff5714eafaf..d924f9b6dc73 100644 --- a/arch/s390/Makefile +++ b/arch/s390/Makefile @@ -83,7 +83,7 @@ ifdef CONFIG_EXPOLINE CC_FLAGS_EXPOLINE += -mfunction-return=thunk CC_FLAGS_EXPOLINE += -mindirect-branch-table export CC_FLAGS_EXPOLINE - cflags-y += $(CC_FLAGS_EXPOLINE) + cflags-y += $(CC_FLAGS_EXPOLINE) -DCC_USING_EXPOLINE endif endif
diff --git a/arch/s390/include/asm/nospec-branch.h b/arch/s390/include/asm/nospec-branch.h index 7df48e5cf36f..35bf28fe4c64 100644 --- a/arch/s390/include/asm/nospec-branch.h +++ b/arch/s390/include/asm/nospec-branch.h @@ -6,12 +6,10 @@
#include <linux/types.h>
-extern int nospec_call_disable; -extern int nospec_return_disable; +extern int nospec_disable;
void nospec_init_branches(void); -void nospec_call_revert(s32 *start, s32 *end); -void nospec_return_revert(s32 *start, s32 *end); +void nospec_revert(s32 *start, s32 *end);
#endif /* __ASSEMBLY__ */
diff --git a/arch/s390/kernel/alternative.c b/arch/s390/kernel/alternative.c index f5060af61176..b57b293998dc 100644 --- a/arch/s390/kernel/alternative.c +++ b/arch/s390/kernel/alternative.c @@ -1,6 +1,7 @@ #include <linux/module.h> #include <asm/alternative.h> #include <asm/facility.h> +#include <asm/nospec-branch.h>
#define MAX_PATCH_LEN (255 - 1)
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c index 86b49f209251..2fdfa64ba014 100644 --- a/arch/s390/kernel/module.c +++ b/arch/s390/kernel/module.c @@ -167,7 +167,7 @@ int module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs, me->core_size += me->arch.got_size; me->arch.plt_offset = me->core_size; if (me->arch.plt_size) { - if (IS_ENABLED(CONFIG_EXPOLINE) && !nospec_call_disable) + if (IS_ENABLED(CONFIG_EXPOLINE) && !nospec_disable) me->arch.plt_size += PLT_ENTRY_SIZE; me->core_size += me->arch.plt_size; } @@ -326,8 +326,7 @@ static int apply_rela(Elf_Rela *rela, Elf_Addr base, Elf_Sym *symtab, info->plt_offset; ip[0] = 0x0d10e310; /* basr 1,0 */ ip[1] = 0x100a0004; /* lg 1,10(1) */ - if (IS_ENABLED(CONFIG_EXPOLINE) && - !nospec_call_disable) { + if (IS_ENABLED(CONFIG_EXPOLINE) && !nospec_disable) { unsigned int *ij; ij = me->module_core + me->arch.plt_offset + @@ -448,7 +447,7 @@ int module_finalize(const Elf_Ehdr *hdr, void *aseg;
if (IS_ENABLED(CONFIG_EXPOLINE) && - !nospec_call_disable && me->arch.plt_size) { + !nospec_disable && me->arch.plt_size) { unsigned int *ij;
ij = me->module_core + me->arch.plt_offset + @@ -475,11 +474,11 @@ int module_finalize(const Elf_Ehdr *hdr,
if (IS_ENABLED(CONFIG_EXPOLINE) && (!strcmp(".nospec_call_table", secname))) - nospec_call_revert(aseg, aseg + s->sh_size); + nospec_revert(aseg, aseg + s->sh_size);
if (IS_ENABLED(CONFIG_EXPOLINE) && (!strcmp(".nospec_return_table", secname))) - nospec_return_revert(aseg, aseg + s->sh_size); + nospec_revert(aseg, aseg + s->sh_size); }
jump_label_apply_nops(me); diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c index 3ce59d044a4d..daa36a38a8b7 100644 --- a/arch/s390/kernel/nospec-branch.c +++ b/arch/s390/kernel/nospec-branch.c @@ -11,10 +11,17 @@ static int __init nobp_setup_early(char *str) rc = kstrtobool(str, &enabled); if (rc) return rc; - if (enabled && test_facility(82)) + if (enabled && test_facility(82)) { + /* + * The user explicitely requested nobp=1, enable it and + * disable the expoline support. + */ __set_facility(82, S390_lowcore.alt_stfle_fac_list); - else + if (IS_ENABLED(CONFIG_EXPOLINE)) + nospec_disable = 1; + } else { __clear_facility(82, S390_lowcore.alt_stfle_fac_list); + } return 0; } early_param("nobp", nobp_setup_early); @@ -28,31 +35,46 @@ early_param("nospec", nospec_setup_early);
#ifdef CONFIG_EXPOLINE
-int nospec_call_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF); -int nospec_return_disable = !IS_ENABLED(CONFIG_EXPOLINE_FULL); +int nospec_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF);
static int __init nospectre_v2_setup_early(char *str) { - nospec_call_disable = 1; - nospec_return_disable = 1; + nospec_disable = 1; return 0; } early_param("nospectre_v2", nospectre_v2_setup_early);
+static int __init spectre_v2_auto_early(void) +{ + if (IS_ENABLED(CC_USING_EXPOLINE)) { + /* + * The kernel has been compiled with expolines. + * Keep expolines enabled and disable nobp. + */ + nospec_disable = 0; + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); + } + /* + * If the kernel has not been compiled with expolines the + * nobp setting decides what is done, this depends on the + * CONFIG_KERNEL_NP option and the nobp/nospec parameters. + */ + return 0; +} +#ifdef CONFIG_EXPOLINE_AUTO +early_initcall(spectre_v2_auto_early); +#endif + static int __init spectre_v2_setup_early(char *str) { if (str && !strncmp(str, "on", 2)) { - nospec_call_disable = 0; - nospec_return_disable = 0; - } - if (str && !strncmp(str, "off", 3)) { - nospec_call_disable = 1; - nospec_return_disable = 1; - } - if (str && !strncmp(str, "auto", 4)) { - nospec_call_disable = 0; - nospec_return_disable = 1; + nospec_disable = 0; + __clear_facility(82, S390_lowcore.alt_stfle_fac_list); } + if (str && !strncmp(str, "off", 3)) + nospec_disable = 1; + if (str && !strncmp(str, "auto", 4)) + spectre_v2_auto_early(); return 0; } early_param("spectre_v2", spectre_v2_setup_early); @@ -105,15 +127,9 @@ static void __init_or_module __nospec_revert(s32 *start, s32 *end) } }
-void __init_or_module nospec_call_revert(s32 *start, s32 *end) -{ - if (nospec_call_disable) - __nospec_revert(start, end); -} - -void __init_or_module nospec_return_revert(s32 *start, s32 *end) +void __init_or_module nospec_revert(s32 *start, s32 *end) { - if (nospec_return_disable) + if (nospec_disable) __nospec_revert(start, end); }
@@ -121,8 +137,8 @@ extern s32 __nospec_call_start[], __nospec_call_end[]; extern s32 __nospec_return_start[], __nospec_return_end[]; void __init nospec_init_branches(void) { - nospec_call_revert(__nospec_call_start, __nospec_call_end); - nospec_return_revert(__nospec_return_start, __nospec_return_end); + nospec_revert(__nospec_call_start, __nospec_call_end); + nospec_revert(__nospec_return_start, __nospec_return_end); }
#endif /* CONFIG_EXPOLINE */
[ Upstream commit bc035599718412cfba9249aa713f90ef13f13ee9 ]
Add a boot message if either of the spectre defenses is active. The message is "Spectre V2 mitigation: execute trampolines." or "Spectre V2 mitigation: limited branch prediction."
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/nospec-branch.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c index daa36a38a8b7..7387ebace891 100644 --- a/arch/s390/kernel/nospec-branch.c +++ b/arch/s390/kernel/nospec-branch.c @@ -33,6 +33,16 @@ static int __init nospec_setup_early(char *str) } early_param("nospec", nospec_setup_early);
+static int __init nospec_report(void) +{ + if (IS_ENABLED(CC_USING_EXPOLINE) && !nospec_disable) + pr_info("Spectre V2 mitigation: execute trampolines.\n"); + if (__test_facility(82, S390_lowcore.alt_stfle_fac_list)) + pr_info("Spectre V2 mitigation: limited branch prediction.\n"); + return 0; +} +arch_initcall(nospec_report); + #ifdef CONFIG_EXPOLINE
int nospec_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF);
[ Upstream commit d424986f1d6b16079b3231db0314923f4f8deed1 ]
Set CONFIG_GENERIC_CPU_VULNERABILITIES and provide the two functions cpu_show_spectre_v1 and cpu_show_spectre_v2 to report the spectre mitigations.
Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/Kconfig | 1 + arch/s390/kernel/nospec-branch.c | 19 +++++++++++++++++++ 2 files changed, 20 insertions(+)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 75c4a78d2e1c..9bdaeb38a768 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -111,6 +111,7 @@ config S390 select GENERIC_CLOCKEVENTS select GENERIC_CPU_AUTOPROBE select GENERIC_CPU_DEVICES if !SMP + select GENERIC_CPU_VULNERABILITIES select GENERIC_FIND_FIRST_BIT select GENERIC_SMP_IDLE_THREAD select GENERIC_TIME_VSYSCALL diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c index 7387ebace891..73c06d42792d 100644 --- a/arch/s390/kernel/nospec-branch.c +++ b/arch/s390/kernel/nospec-branch.c @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 #include <linux/module.h> +#include <linux/device.h> #include <asm/facility.h> #include <asm/nospec-branch.h>
@@ -43,6 +44,24 @@ static int __init nospec_report(void) } arch_initcall(nospec_report);
+#ifdef CONFIG_SYSFS +ssize_t cpu_show_spectre_v1(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sprintf(buf, "Mitigation: __user pointer sanitization\n"); +} + +ssize_t cpu_show_spectre_v2(struct device *dev, + struct device_attribute *attr, char *buf) +{ + if (IS_ENABLED(CC_USING_EXPOLINE) && !nospec_disable) + return sprintf(buf, "Mitigation: execute trampolines\n"); + if (__test_facility(82, S390_lowcore.alt_stfle_fac_list)) + return sprintf(buf, "Mitigation: limited branch prediction.\n"); + return sprintf(buf, "Vulnerable\n"); +} +#endif + #ifdef CONFIG_EXPOLINE
int nospec_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF);
[ Upstream commit 6a3d1e81a434fc311f224b8be77258bafc18ccc6 ]
With CONFIG_EXPOLINE_AUTO=y the call of spectre_v2_auto_early() via early_initcall is done *after* the early_param functions. This overwrites any settings done with the nobp/no_spectre_v2/spectre_v2 parameters. The code patching for the kernel is done after the evaluation of the early parameters but before the early_initcall is done. The end result is a kernel image that is patched correctly but the kernel modules are not.
Make sure that the nospec auto detection function is called before the early parameters are evaluated and before the code patching is done.
Fixes: 6e179d64126b ("s390: add automatic detection of the spectre defense") Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/include/asm/nospec-branch.h | 1 + arch/s390/kernel/nospec-branch.c | 8 ++------ arch/s390/kernel/setup.c | 3 +++ 3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/s390/include/asm/nospec-branch.h b/arch/s390/include/asm/nospec-branch.h index 35bf28fe4c64..b4bd8c41e9d3 100644 --- a/arch/s390/include/asm/nospec-branch.h +++ b/arch/s390/include/asm/nospec-branch.h @@ -9,6 +9,7 @@ extern int nospec_disable;
void nospec_init_branches(void); +void nospec_auto_detect(void); void nospec_revert(s32 *start, s32 *end);
#endif /* __ASSEMBLY__ */ diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c index 73c06d42792d..9f3b5b382743 100644 --- a/arch/s390/kernel/nospec-branch.c +++ b/arch/s390/kernel/nospec-branch.c @@ -73,7 +73,7 @@ static int __init nospectre_v2_setup_early(char *str) } early_param("nospectre_v2", nospectre_v2_setup_early);
-static int __init spectre_v2_auto_early(void) +void __init nospec_auto_detect(void) { if (IS_ENABLED(CC_USING_EXPOLINE)) { /* @@ -88,11 +88,7 @@ static int __init spectre_v2_auto_early(void) * nobp setting decides what is done, this depends on the * CONFIG_KERNEL_NP option and the nobp/nospec parameters. */ - return 0; } -#ifdef CONFIG_EXPOLINE_AUTO -early_initcall(spectre_v2_auto_early); -#endif
static int __init spectre_v2_setup_early(char *str) { @@ -103,7 +99,7 @@ static int __init spectre_v2_setup_early(char *str) if (str && !strncmp(str, "off", 3)) nospec_disable = 1; if (str && !strncmp(str, "auto", 4)) - spectre_v2_auto_early(); + nospec_auto_detect(); return 0; } early_param("spectre_v2", spectre_v2_setup_early); diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index 959e7646cec7..e7a43a30e3ff 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -846,6 +846,9 @@ void __init setup_arch(char **cmdline_p) init_mm.end_data = (unsigned long) &_edata; init_mm.brk = (unsigned long) &_end;
+ if (IS_ENABLED(CONFIG_EXPOLINE_AUTO)) + nospec_auto_detect(); + parse_early_param(); os_info_init(); setup_ipl();
[ Upstream commit 6cf09958f32b9667bb3ebadf74367c791112771b ]
The main linker script vmlinux.lds.S for the kernel image merges the expoline code patch tables into two section ".nospec_call_table" and ".nospec_return_table". This is *not* done for the modules, there the sections retain their original names as generated by gcc: ".s390_indirect_call", ".s390_return_mem" and ".s390_return_reg".
The module_finalize code has to check for the compiler generated section names, otherwise no code patching is done. This slows down the module code in case of "spectre_v2=off".
Cc: stable@vger.kernel.org # 4.16 Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches") Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com --- arch/s390/kernel/module.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kernel/module.c b/arch/s390/kernel/module.c index 2fdfa64ba014..9bd1933848b8 100644 --- a/arch/s390/kernel/module.c +++ b/arch/s390/kernel/module.c @@ -473,11 +473,11 @@ int module_finalize(const Elf_Ehdr *hdr, apply_alternatives(aseg, aseg + s->sh_size);
if (IS_ENABLED(CONFIG_EXPOLINE) && - (!strcmp(".nospec_call_table", secname))) + (!strncmp(".s390_indirect", secname, 14))) nospec_revert(aseg, aseg + s->sh_size);
if (IS_ENABLED(CONFIG_EXPOLINE) && - (!strcmp(".nospec_return_table", secname))) + (!strncmp(".s390_return", secname, 12))) nospec_revert(aseg, aseg + s->sh_size); }
On Fri, Apr 27, 2018 at 07:36:44AM +0200, Martin Schwidefsky wrote:
Greetings,
this series is the backport of 18 upstream patches to add the current s390 spectre mitigation to kernel version 4.14. One less patch than 4.14 and 4.9 as there is no nested KVM guest support (aka vsie) in 4.4.
It follows the x86 approach with array_index_nospec for the v1 spectre attack and retpoline/expoline for v2. As a fallback there is the ppa-12/ppa-13 based defense which requires an micro-code update.
All now applied, thanks!
greg k-h
linux-stable-mirror@lists.linaro.org