This is the start of the stable review cycle for the 4.14.15 release. There are 89 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Jan 24 08:39:25 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.15-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.14.15-rc1
Yan Markman ymarkman@marvell.com net: mvpp2: do not disable GMAC padding
Kirill A. Shutemov kirill.shutemov@linux.intel.com mm, page_vma_mapped: Drop faulty pointer arithmetics in check_pte()
Tom Lendacky thomas.lendacky@amd.com x86/mm: Rework wbinvd, hlt operation in stop_this_cpu()
Andi Kleen ak@linux.intel.com x86/retpoline: Optimize inline assembler for vmexit_fill_RSB
zhenwei.pi zhenwei.pi@youruncloud.com x86/pti: Document fix wrong index
Masami Hiramatsu mhiramat@kernel.org kprobes/x86: Disable optimizing on the function jumps to indirect thunk
Masami Hiramatsu mhiramat@kernel.org kprobes/x86: Blacklist indirect thunk functions for kprobes
Masami Hiramatsu mhiramat@kernel.org retpoline: Introduce start/end markers of indirect thunk
Thomas Gleixner tglx@linutronix.de x86/mce: Make machine check speculation protected
Marc Zyngier marc.zyngier@arm.com arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
Punit Agrawal punit.agrawal@arm.com KVM: arm/arm64: Check pagesize when allocating a hugepage at Stage 2
James Hogan jhogan@kernel.org MIPS: CM: Drop WARN_ON(vp != 0)
Lorenzo Pieralisi lorenzo.pieralisi@arm.com alpha/PCI: Fix noname IRQ level detection
Laura Abbott labbott@redhat.com x86: Use __nostackprotect for sme_encrypt_kernel
Wei Yongjun weiyongjun1@huawei.com dm crypt: fix error return code in crypt_ctr()
Ondrej Kozina okozina@redhat.com dm crypt: wipe kernel key copy after IV initialization
Milan Broz gmazyland@gmail.com dm crypt: fix crash by adding missing check for auth key size
Mikulas Patocka mpatocka@redhat.com dm integrity: don't store cipher request on the stack
Dennis Yang dennisyang@qnap.com dm thin metadata: THIN_MAX_CONCURRENT_LOCKS should be 6
Joe Thornber thornber@redhat.com dm btree: fix serious bug in btree_split_beneath()
Rob Clark rclark@redhat.com drm/vmwgfx: fix memory corruption with legacy/sou connectors
Sergey Senozhatsky sergey.senozhatsky.work@gmail.com workqueue: avoid hard lockups in show_workqueue_state()
Hannes Reinecke hare@suse.de scsi: libsas: Disable asynchronous aborts for SATA devices
Xinyu Lin xinyu0123@gmail.com libata: apply MAX_SEC_1024 to all LITEON EP1 series devices
Alexey Dobriyan adobriyan@gmail.com proc: fix coredump vs read /proc/*/stat race
Xi Kangjie imxikangjie@gmail.com scripts/gdb/linux/tasks.py: fix get_thread_info
Jeremy Compostella jeremy.compostella@intel.com i2c: core-smbus: prevent stack corruption on read I2C_BLOCK_DATA
Marc Kleine-Budde mkl@pengutronix.de can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once
Marc Kleine-Budde mkl@pengutronix.de can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once
Stephane Grosjean s.grosjean@peak-system.com can: peak: fix potential bug in packet fragmentation
Thomas Petazzoni thomas.petazzoni@free-electrons.com ARM: dts: kirkwood: fix pin-muxing of MPP7 on OpenBlocks A7
Maxime Ripard maxime.ripard@free-electrons.com ARM: sunxi_defconfig: Enable CMA
Gregory CLEMENT gregory.clement@free-electrons.com ARM64: dts: marvell: armada-cp110: Fix clock resources for various node
Arnd Bergmann arnd@arndb.de phy: work around 'phys' references to usb-nop-xceiv devices
Steven Rostedt (VMware) rostedt@goodmis.org tracing: Fix converting enum's from the map in trace_event_eval_update()
Johan Hovold johan@kernel.org Input: twl4030-vibra - fix sibling-node lookup
Johan Hovold johan@kernel.org Input: twl6040-vibra - fix child-node lookup
Johan Hovold johan@kernel.org Input: 88pm860x-ts - fix child-node lookup
Nick Desaulniers nick.desaulniers@gmail.com Input: synaptics-rmi4 - prevent UAF reported by KASAN
Nir Perry nirperry@gmail.com Input: ALPS - fix multi-touch decoding on SS4 plus touchpads
Tom Lendacky thomas.lendacky@amd.com x86/mm: Encrypt the initrd earlier for BSP microcode update
Tero Kristo t-kristo@ti.com ARM: OMAP3: hwmod_data: add missing module_offs for MMC3
Tom Lendacky thomas.lendacky@amd.com x86/mm: Prepare sme_encrypt_kernel() for PAGE aligned encryption
Tom Lendacky thomas.lendacky@amd.com x86/mm: Centralize PMD flags in sme_encrypt_kernel()
Tom Lendacky thomas.lendacky@amd.com x86/mm: Use a struct to reduce parameters for SME PGD mapping
Tom Lendacky thomas.lendacky@amd.com x86/mm: Clean up register saving in the __enc_copy() assembly code
Thomas Gleixner tglx@linutronix.de x86/apic/vector: Fix off by one in error path
Joe Lawrence joe.lawrence@redhat.com pipe: avoid round_pipe_size() nr_pages overflow on 32-bit
Len Brown len.brown@intel.com x86/tsc: Fix erroneous TSC rate on Skylake Xeon
Len Brown len.brown@intel.com x86/tsc: Future-proof native_calibrate_tsc()
Andi Kleen ak@linux.intel.com x86/idt: Mark IDT tables __initconst
Eric W. Biederman ebiederm@xmission.com x86/mm/pkeys: Fix fill_sig_info_pkey
Thomas Gleixner tglx@linutronix.de x86/intel_rdt/cqm: Prevent use after free
Andi Kleen ak@linux.intel.com module: Add retpoline tag to VERMAGIC
Paolo Bonzini pbonzini@redhat.com x86/cpufeature: Move processor tracing out of scattered features
Josh Poimboeuf jpoimboe@redhat.com objtool: Improve error message for bad file argument
Tom Lendacky thomas.lendacky@amd.com x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros
David Woodhouse dwmw@amazon.co.uk x86/retpoline: Fill RSB on context switch for affected CPUs
Andrey Ryabinin aryabinin@virtuozzo.com x86/kasan: Panic if there is not enough memory to boot
Benoît Thébaudeau benoit.thebaudeau.dev@gmail.com mmc: sdhci-esdhc-imx: Fix i.MX53 eSDHCv3 clock
Josh Poimboeuf jpoimboe@redhat.com objtool: Fix seg fault with gold linker
Josh Snyder joshs@netflix.com delayacct: Account blkio completion on the correct task
Sagi Grimberg sagi@grimberg.me iser-target: Fix possible use-after-free in connection establishment error
Eric Biggers ebiggers@google.com af_key: fix buffer overread in parse_exthdrs()
Eric Biggers ebiggers@google.com af_key: fix buffer overread in verify_address_len()
Thomas Gleixner tglx@linutronix.de timers: Unconditionally check deferrable base
Leon Romanovsky leonro@mellanox.com RDMA/mlx5: Fix out-of-bound access while querying AH
Dan Carpenter dan.carpenter@oracle.com IB/hfi1: Prevent a NULL dereference
Takashi Iwai tiwai@suse.de ALSA: hda - Apply the existing quirk to iMac 14,1
Takashi Iwai tiwai@suse.de ALSA: hda - Apply headphone noise quirk for another Dell XPS 13 variant
Takashi Iwai tiwai@suse.de ALSA: pcm: Remove yet superfluous WARN_ON()
Takashi Iwai tiwai@suse.de ALSA: seq: Make ioctls race-free
Li Jinyue lijinyue@huawei.com futex: Prevent overflow by strengthen input validation
Peter Zijlstra peterz@infradead.org futex: Avoid violating the 10th rule of futex
Oliver O'Halloran oohall@gmail.com powerpc/powernv: Check device-tree for RFI flush settings
Michael Neuling mikey@neuling.org powerpc/pseries: Query hypervisor for RFI flush settings
Michael Ellerman mpe@ellerman.id.au powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti
Michael Ellerman mpe@ellerman.id.au powerpc/64s: Add support for RFI flush of L1-D cache
Nicholas Piggin npiggin@gmail.com powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL
Nicholas Piggin npiggin@gmail.com powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL
Nicholas Piggin npiggin@gmail.com powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL
Nicholas Piggin npiggin@gmail.com powerpc/64s: Simple RFI macro conversions
Nicholas Piggin npiggin@gmail.com powerpc/64: Add macros for annotating the destination of rfid/hrfid
Michael Neuling mikey@neuling.org powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper
Simon Ser contact@emersion.fr objtool: Fix seg fault caused by missing parameter
Lukas Bulwahn lukas.bulwahn@gmail.com objtool: Fix Clang enum conversion warning
Simon Ser contact@emersion.fr objtool: Fix seg fault with clang-compiled objects
Rob Clark robdclark@gmail.com drm/nouveau/disp/gf119: add missing drive vfunc ptr
Andrew Morton akpm@linux-foundation.org tools/objtool/Makefile: don't assume sync-check.sh is executable
-------------
Diffstat:
 Documentation/x86/pti.txt                          |   2 +-
 Makefile                                           |   4 +-
 arch/alpha/kernel/sys_sio.c                        |  35 +-
 arch/arm/boot/dts/kirkwood-openblocks_a7.dts       |  10 +-
 arch/arm/configs/sunxi_defconfig                   |   2 +
 arch/arm/mach-omap2/omap_hwmod_3xxx_data.c         |   1 +
 .../boot/dts/marvell/armada-cp110-master.dtsi      |  13 +-
 .../arm64/boot/dts/marvell/armada-cp110-slave.dtsi |   9 +-
 arch/arm64/kvm/handle_exit.c                       |   4 +-
 arch/mips/kernel/mips-cm.c                         |   1 -
 arch/powerpc/include/asm/exception-64e.h           |   6 +
 arch/powerpc/include/asm/exception-64s.h           |  57 +++-
 arch/powerpc/include/asm/feature-fixups.h          |  13 +
 arch/powerpc/include/asm/hvcall.h                  |  17 +
 arch/powerpc/include/asm/paca.h                    |  10 +
 arch/powerpc/include/asm/plpar_wrappers.h          |  14 +
 arch/powerpc/include/asm/setup.h                   |  13 +
 arch/powerpc/kernel/asm-offsets.c                  |   5 +
 arch/powerpc/kernel/entry_64.S                     |  44 ++-
 arch/powerpc/kernel/exceptions-64s.S               | 135 +++++++-
 arch/powerpc/kernel/setup_64.c                     | 101 ++++++
 arch/powerpc/kernel/vmlinux.lds.S                  |   9 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S            |   7 +-
 arch/powerpc/kvm/book3s_rmhandlers.S               |   7 +-
 arch/powerpc/kvm/book3s_segment.S                  |   4 +-
 arch/powerpc/lib/feature-fixups.c                  |  41 +++
 arch/powerpc/platforms/powernv/setup.c             |  49 +++
 arch/powerpc/platforms/pseries/setup.c             |  35 ++
 arch/x86/entry/entry_32.S                          |  11 +
 arch/x86/entry/entry_64.S                          |  13 +-
 arch/x86/include/asm/cpufeatures.h                 |   3 +-
 arch/x86/include/asm/mem_encrypt.h                 |   4 +-
 arch/x86/include/asm/nospec-branch.h               |  16 +-
 arch/x86/include/asm/traps.h                       |   1 +
 arch/x86/kernel/apic/vector.c                      |   7 +-
 arch/x86/kernel/cpu/bugs.c                         |  36 +++
 arch/x86/kernel/cpu/intel_rdt.c                    |   8 +-
 arch/x86/kernel/cpu/mcheck/mce.c                   |   5 +
 arch/x86/kernel/cpu/scattered.c                    |   1 -
 arch/x86/kernel/head64.c                           |   4 +-
 arch/x86/kernel/idt.c                              |  12 +-
 arch/x86/kernel/kprobes/opt.c                      |  23 +-
 arch/x86/kernel/process.c                          |  25 +-
 arch/x86/kernel/setup.c                            |   8 -
 arch/x86/kernel/tsc.c                              |   3 +-
 arch/x86/kernel/vmlinux.lds.S                      |   6 +
 arch/x86/lib/retpoline.S                           |   5 +-
 arch/x86/mm/fault.c                                |   7 +-
 arch/x86/mm/kasan_init_64.c                        |  24 +-
 arch/x86/mm/mem_encrypt.c                          | 356 +++++++++++++++------
 arch/x86/mm/mem_encrypt_boot.S                     |  80 ++---
 drivers/ata/libata-core.c                          |   1 +
 .../gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c    |   1 +
 drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c                |   4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c               |   4 +-
 drivers/i2c/i2c-core-smbus.c                       |  13 +-
 drivers/infiniband/hw/hfi1/file_ops.c              |   4 +-
 drivers/infiniband/hw/mlx5/qp.c                    |   7 +-
 drivers/infiniband/ulp/isert/ib_isert.c            |   1 +
 drivers/input/misc/twl4030-vibra.c                 |   6 +-
 drivers/input/misc/twl6040-vibra.c                 |   3 +-
 drivers/input/mouse/alps.c                         |  23 +-
 drivers/input/mouse/alps.h                         |  10 +-
 drivers/input/rmi4/rmi_driver.c                    |   4 +-
 drivers/input/touchscreen/88pm860x-ts.c            |  16 +-
 drivers/md/dm-crypt.c                              |  20 +-
 drivers/md/dm-integrity.c                          |  49 ++-
 drivers/md/dm-thin-metadata.c                      |   6 +-
 drivers/md/persistent-data/dm-btree.c              |  19 +-
 drivers/mmc/host/sdhci-esdhc-imx.c                 |  14 +
 drivers/net/can/usb/peak_usb/pcan_usb_fd.c         |  21 +-
 drivers/net/ethernet/marvell/mvpp2.c               |   9 -
 drivers/phy/phy-core.c                             |   4 +
 drivers/scsi/libsas/sas_scsi_host.c                |  17 +-
 fs/pipe.c                                          |  17 +-
 fs/proc/array.c                                    |   7 +-
 include/linux/delayacct.h                          |   8 +-
 include/linux/swapops.h                            |  21 ++
 include/linux/vermagic.h                           |   8 +-
 kernel/delayacct.c                                 |  42 ++-
 kernel/futex.c                                     |  86 ++++-
 kernel/locking/rtmutex.c                           |  26 +-
 kernel/locking/rtmutex_common.h                    |   1 +
 kernel/sched/core.c                                |   6 +-
 kernel/time/timer.c                                |   2 +-
 kernel/trace/trace_events.c                        |  16 +-
 kernel/workqueue.c                                 |  13 +
 mm/page_vma_mapped.c                               |  63 ++--
 net/can/af_can.c                                   |  36 +--
 net/key/af_key.c                                   |   8 +
 scripts/Makefile.build                             |  14 +-
 scripts/gdb/linux/tasks.py                         |   2 +
 sound/core/pcm_lib.c                               |   1 -
 sound/core/seq/seq_clientmgr.c                     |   3 +
 sound/core/seq/seq_clientmgr.h                     |   1 +
 sound/pci/hda/patch_cirrus.c                       |   1 +
 sound/pci/hda/patch_realtek.c                      |   1 +
 tools/objtool/Makefile                             |   2 +-
 tools/objtool/arch/x86/decode.c                    |   2 +-
 tools/objtool/builtin-orc.c                        |   4 +-
 tools/objtool/elf.c                                |   4 +-
 tools/objtool/orc_gen.c                            |   2 +
 virt/kvm/arm/mmu.c                                 |   2 +-
 103 files changed, 1514 insertions(+), 447 deletions(-)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rob Clark robdclark@gmail.com
commit 1b5c7ef3d0d0610bda9b63263f7c5b7178d11015 upstream.
Fixes broken dp on GF119:
 Call Trace:
  ? nvkm_dp_train_drive+0x183/0x2c0 [nouveau]
  nvkm_dp_acquire+0x4f3/0xcd0 [nouveau]
  nv50_disp_super_2_2+0x5d/0x470 [nouveau]
  ? nvkm_devinit_pll_set+0xf/0x20 [nouveau]
  gf119_disp_super+0x19c/0x2f0 [nouveau]
  process_one_work+0x193/0x3c0
  worker_thread+0x35/0x3b0
  kthread+0x125/0x140
  ? process_one_work+0x3c0/0x3c0
  ? kthread_park+0x60/0x60
  ret_from_fork+0x25/0x30
 Code: Bad RIP value.
 RIP: (null) RSP: ffffb1e243e4bc38
 CR2: 0000000000000000
Fixes: af85389c614a ("drm/nouveau/disp: shuffle functions around")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103421
Signed-off-by: Rob Clark robdclark@gmail.com
Signed-off-by: Ben Skeggs bskeggs@redhat.com
Cc: Sven Joachim svenjoac@gmx.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c
@@ -174,6 +174,7 @@ gf119_sor = {
 	.links = gf119_sor_dp_links,
 	.power = g94_sor_dp_power,
 	.pattern = gf119_sor_dp_pattern,
+	.drive = gf119_sor_dp_drive,
 	.vcpi = gf119_sor_dp_vcpi,
 	.audio = gf119_sor_dp_audio,
 	.audio_sym = gf119_sor_dp_audio_sym,
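The "RIP: (null)" oops above is what calling through an unset function pointer looks like. A minimal standalone C sketch of that failure mode (the struct and names here are illustrative stand-ins, not nouveau's real types):

#include <stdio.h>

/* An ops table in the style of gf119_sor: any member not named in the
 * initializer is implicitly NULL. */
struct sor_ops {
	void (*pattern)(void);
	void (*drive)(void);	/* the member the fix adds back */
};

static void gf119_pattern(void) { puts("pattern"); }

static const struct sor_ops gf119 = {
	.pattern = gf119_pattern,
	/* .drive left out -> NULL, like the missing vfunc ptr */
};

int main(void)
{
	gf119.pattern();
	if (gf119.drive)	/* an unguarded gf119.drive() here jumps to NULL */
		gf119.drive();
	else
		puts("drive is NULL: the class of crash fixed above");
	return 0;
}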
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Simon Ser contact@emersion.fr
commit ce90aaf5cde4ce057b297bb6c955caf16ef00ee6 upstream.
Fix a seg fault which happens when an input file provided to 'objtool orc generate' doesn't have a '.shstrtab' section (for instance, object files produced by clang don't have this section).
Signed-off-by: Simon Ser contact@emersion.fr
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com
Cc: Linus Torvalds torvalds@linux-foundation.org
Cc: Peter Zijlstra peterz@infradead.org
Cc: Thomas Gleixner tglx@linutronix.de
Link: http://lkml.kernel.org/r/c0f2231683e9bed40fac1f13ce2c33b8389854bc.1514666459...
Signed-off-by: Ingo Molnar mingo@kernel.org
Cc: Guenter Roeck linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 tools/objtool/orc_gen.c | 2 ++
 1 file changed, 2 insertions(+)

--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -165,6 +165,8 @@ int create_orc_sections(struct objtool_f

 	/* create .orc_unwind_ip and .rela.orc_unwind_ip sections */
 	sec = elf_create_section(file->elf, ".orc_unwind_ip", sizeof(int), idx);
+	if (!sec)
+		return -1;

 	ip_relasec = elf_create_rela_section(file->elf, sec);
 	if (!ip_relasec)
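The shape of the fix as a standalone sketch: treat section creation as fallible and bail out before a NULL pointer is passed on. create_section() below is a hypothetical stand-in for objtool's elf_create_section(), which can fail when the input object lacks the supporting sections (such as .shstrtab) needed to add a new one:

#include <stdio.h>

struct section { const char *name; };

/* Stand-in that fails the way elf_create_section() can. */
static struct section *create_section(const char *name)
{
	(void)name;
	return NULL;	/* e.g. no .shstrtab to record the new name in */
}

int main(void)
{
	struct section *sec = create_section(".orc_unwind_ip");

	if (!sec) {	/* the guard this patch adds */
		fprintf(stderr, "objtool: failed to create section\n");
		return 1;
	}
	printf("created %s\n", sec->name);
	return 0;
}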
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lukas Bulwahn lukas.bulwahn@gmail.com
commit e7e83dd3ff1dd2f9e60213f6eedc7e5b08192062 upstream.
Fix the following Clang enum conversion warning:
  arch/x86/decode.c:141:20: error: implicit conversion from enumeration type
  'enum op_src_type' to different enumeration type 'enum op_dest_type'
  [-Werror,-Wenum-conversion]
                  op->dest.type = OP_SRC_REG;
                                ~ ^~~~~~~~~~
It just happened to work before because OP_SRC_REG and OP_DEST_REG have the same value.
Signed-off-by: Lukas Bulwahn lukas.bulwahn@gmail.com
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com
Reviewed-by: Nicholas Mc Guire der.herr@hofr.at
Reviewed-by: Nick Desaulniers nick.desaulniers@gmail.com
Cc: Jiri Slaby jslaby@suse.cz
Cc: Linus Torvalds torvalds@linux-foundation.org
Cc: Peter Zijlstra peterz@infradead.org
Cc: Thomas Gleixner tglx@linutronix.de
Fixes: baa41469a7b9 ("objtool: Implement stack validation 2.0")
Link: http://lkml.kernel.org/r/b4156c5738bae781c392e7a3691aed4514ebbdf2.1514323568...
Signed-off-by: Ingo Molnar mingo@kernel.org
Cc: Guenter Roeck linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 tools/objtool/arch/x86/decode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -138,7 +138,7 @@ int arch_decode_instruction(struct elf *
 			*type = INSN_STACK;
 			op->src.type = OP_SRC_ADD;
 			op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
-			op->dest.type = OP_SRC_REG;
+			op->dest.type = OP_DEST_REG;
 			op->dest.reg = CFI_SP;
 		}
 		break;
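Why the bug was latent until Clang flagged it, in a compilable standalone example; the enumerators below are trimmed stand-ins for objtool's real definitions:

/* Distinct enum types whose first members happen to share the value 0. */
enum op_src_type  { OP_SRC_REG, OP_SRC_ADD };
enum op_dest_type { OP_DEST_REG, OP_DEST_PUSH };

struct op { enum op_dest_type dest; };

int main(void)
{
	struct op op;

	op.dest = OP_SRC_REG;	/* wrong enum: clang warns (-Wenum-conversion),
				   but the value equals OP_DEST_REG, so it "worked" */
	op.dest = OP_DEST_REG;	/* the corrected assignment */
	return op.dest;
}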
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Simon Ser contact@emersion.fr
commit d89e426499cf36b96161bd32970d6783f1fbcb0e upstream.
Fix a seg fault when no parameter is provided to 'objtool orc'.
Signed-off-by: Simon Ser contact@emersion.fr
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com
Cc: Linus Torvalds torvalds@linux-foundation.org
Cc: Peter Zijlstra peterz@infradead.org
Cc: Thomas Gleixner tglx@linutronix.de
Link: http://lkml.kernel.org/r/9172803ec7ebb72535bcd0b7f966ae96d515968e.1514666459...
Signed-off-by: Ingo Molnar mingo@kernel.org
Cc: Guenter Roeck linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 tools/objtool/builtin-orc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/tools/objtool/builtin-orc.c
+++ b/tools/objtool/builtin-orc.c
@@ -44,6 +44,9 @@ int cmd_orc(int argc, const char **argv)
 	const char *objname;

 	argc--; argv++;
+	if (argc <= 0)
+		usage_with_options(orc_usage, check_options);
+
 	if (!strncmp(argv[0], "gen", 3)) {
 		argc = parse_options(argc, argv, check_options, orc_usage, 0);
 		if (argc != 1)
@@ -52,7 +55,6 @@ int cmd_orc(int argc, const char **argv)
 		objname = argv[0];

 		return check(objname, no_fp, no_unreachable, true);
-
 	}

 	if (!strcmp(argv[0], "dump")) {
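The added guard in standalone userspace-tool form, as a hedged sketch; usage() stands in for objtool's usage_with_options():

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void usage(void)
{
	fprintf(stderr, "usage: objtool orc <gen|dump> file.o\n");
	exit(129);
}

int main(int argc, char **argv)
{
	argc--; argv++;

	if (argc <= 0)	/* without this, argv[0] below is NULL */
		usage();

	if (!strncmp(argv[0], "gen", 3))
		puts("would generate ORC unwind data");
	else if (!strcmp(argv[0], "dump"))
		puts("would dump ORC unwind data");
	else
		usage();
	return 0;
}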
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Neuling mikey@neuling.org
commit 191eccb1580939fb0d47deb405b82a85b0379070 upstream.
A new hypervisor call has been defined to communicate various characteristics of the CPU to guests. Add definitions for the hcall number, flags and a wrapper function.
Signed-off-by: Michael Neuling mikey@neuling.org
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/include/asm/hvcall.h         | 17 +++++++++++++++++
 arch/powerpc/include/asm/plpar_wrappers.h | 14 ++++++++++++++
 2 files changed, 31 insertions(+)

--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -241,6 +241,7 @@
 #define H_GET_HCA_INFO          0x1B8
 #define H_GET_PERF_COUNT        0x1BC
 #define H_MANAGE_TRACE          0x1C0
+#define H_GET_CPU_CHARACTERISTICS 0x1C8
 #define H_FREE_LOGICAL_LAN_BUFFER 0x1D4
 #define H_QUERY_INT_STATE       0x1E4
 #define H_POLL_PENDING          0x1D8
@@ -330,6 +331,17 @@
 #define H_SIGNAL_SYS_RESET_ALL_OTHERS		-2
 /* >= 0 values are CPU number */

+/* H_GET_CPU_CHARACTERISTICS return values */
+#define H_CPU_CHAR_SPEC_BAR_ORI31	(1ull << 63) // IBM bit 0
+#define H_CPU_CHAR_BCCTRL_SERIALISED	(1ull << 62) // IBM bit 1
+#define H_CPU_CHAR_L1D_FLUSH_ORI30	(1ull << 61) // IBM bit 2
+#define H_CPU_CHAR_L1D_FLUSH_TRIG2	(1ull << 60) // IBM bit 3
+#define H_CPU_CHAR_L1D_THREAD_PRIV	(1ull << 59) // IBM bit 4
+
+#define H_CPU_BEHAV_FAVOUR_SECURITY	(1ull << 63) // IBM bit 0
+#define H_CPU_BEHAV_L1D_FLUSH_PR	(1ull << 62) // IBM bit 1
+#define H_CPU_BEHAV_BNDS_CHK_SPEC_BAR	(1ull << 61) // IBM bit 2
+
 /* Flag values used in H_REGISTER_PROC_TBL hcall */
 #define PROC_TABLE_OP_MASK	0x18
 #define PROC_TABLE_DEREG	0x10
@@ -436,6 +448,11 @@ static inline unsigned int get_longbusy_
 	}
 }

+struct h_cpu_char_result {
+	u64 character;
+	u64 behaviour;
+};
+
 #endif /* __ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_HVCALL_H */
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -326,4 +326,18 @@ static inline long plapr_signal_sys_rese
 	return plpar_hcall_norets(H_SIGNAL_SYS_RESET, cpu);
 }

+static inline long plpar_get_cpu_characteristics(struct h_cpu_char_result *p)
+{
+	unsigned long retbuf[PLPAR_HCALL_BUFSIZE];
+	long rc;
+
+	rc = plpar_hcall(H_GET_CPU_CHARACTERISTICS, retbuf);
+	if (rc == H_SUCCESS) {
+		p->character = retbuf[0];
+		p->behaviour = retbuf[1];
+	}
+
+	return rc;
+}
+
 #endif /* _ASM_POWERPC_PLPAR_WRAPPERS_H */
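A sketch of how a caller is expected to consume the new wrapper, modeled on the pseries setup code later in this series; only the wrapper, the flag names and H_SUCCESS come from this patch, the pr_info() strings are illustrative:

	struct h_cpu_char_result result;
	long rc;

	rc = plpar_get_cpu_characteristics(&result);
	if (rc == H_SUCCESS) {
		/* "character" advertises which flush instructions exist */
		if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2)
			pr_info("mttrig L1-D flush available\n");
		if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30)
			pr_info("ori L1-D flush available\n");

		/* "behaviour" says whether the kernel should use them */
		if (!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR))
			pr_info("flush on kernel-to-user exit not required\n");
	}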
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Piggin npiggin@gmail.com
commit 50e51c13b3822d14ff6df4279423e4b7b2269bc3 upstream.
The rfid/hrfid ((Hypervisor) Return From Interrupt) instruction is used for switching from the kernel to userspace, and from the hypervisor to the guest kernel. However it can and is also used for other transitions, eg. from real mode kernel code to virtual mode kernel code, and it's not always clear from the code what the destination context is.
To make it clearer when reading the code, add macros which encode the expected destination context.
Signed-off-by: Nicholas Piggin npiggin@gmail.com
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/include/asm/exception-64e.h |  6 ++++++
 arch/powerpc/include/asm/exception-64s.h | 29 +++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

--- a/arch/powerpc/include/asm/exception-64e.h
+++ b/arch/powerpc/include/asm/exception-64e.h
@@ -209,5 +209,11 @@ exc_##label##_book3e:
 	ori	r3,r3,vector_offset@l;		\
 	mtspr	SPRN_IVOR##vector_number,r3;

+#define RFI_TO_KERNEL							\
+	rfi
+
+#define RFI_TO_USER							\
+	rfi
+
 #endif /* _ASM_POWERPC_EXCEPTION_64E_H */
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -69,6 +69,35 @@
  */
 #define EX_R3		EX_DAR

+/* Macros for annotating the expected destination of (h)rfid */
+
+#define RFI_TO_KERNEL							\
+	rfid
+
+#define RFI_TO_USER							\
+	rfid
+
+#define RFI_TO_USER_OR_KERNEL						\
+	rfid
+
+#define RFI_TO_GUEST							\
+	rfid
+
+#define HRFI_TO_KERNEL							\
+	hrfid
+
+#define HRFI_TO_USER							\
+	hrfid
+
+#define HRFI_TO_USER_OR_KERNEL						\
+	hrfid
+
+#define HRFI_TO_GUEST							\
+	hrfid
+
+#define HRFI_TO_UNKNOWN							\
+	hrfid
+
 #ifdef CONFIG_RELOCATABLE
 #define __EXCEPTION_RELON_PROLOG_PSERIES_1(label, h)			\
 	mfspr	r11,SPRN_##h##SRR0;	/* save SRR0 */		\
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Piggin npiggin@gmail.com
commit 222f20f140623ef6033491d0103ee0875fe87d35 upstream.
This commit does simple conversions of rfi/rfid to the new macros that include the expected destination context. By simple we mean cases where there is a single well known destination context, and it's simply a matter of substituting the instruction for the appropriate macro.
Signed-off-by: Nicholas Piggin npiggin@gmail.com
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/include/asm/exception-64s.h |  4 ++--
 arch/powerpc/kernel/entry_64.S           | 14 +++++++++-----
 arch/powerpc/kernel/exceptions-64s.S     | 22 +++++++++++-----------
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  |  7 +++----
 arch/powerpc/kvm/book3s_rmhandlers.S     |  7 +++++--
 arch/powerpc/kvm/book3s_segment.S        |  4 ++--
 6 files changed, 32 insertions(+), 26 deletions(-)

--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -242,7 +242,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 	mtspr	SPRN_##h##SRR0,r12;					\
 	mfspr	r12,SPRN_##h##SRR1;	/* and SRR1 */			\
 	mtspr	SPRN_##h##SRR1,r10;					\
-	h##rfid;							\
+	h##RFI_TO_KERNEL;						\
 	b	.	/* prevent speculative execution */
 #define EXCEPTION_PROLOG_PSERIES_1(label, h)				\
 	__EXCEPTION_PROLOG_PSERIES_1(label, h)
@@ -256,7 +256,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 	mtspr	SPRN_##h##SRR0,r12;					\
 	mfspr	r12,SPRN_##h##SRR1;	/* and SRR1 */			\
 	mtspr	SPRN_##h##SRR1,r10;					\
-	h##rfid;							\
+	h##RFI_TO_KERNEL;						\
 	b	.	/* prevent speculative execution */

 #define EXCEPTION_PROLOG_PSERIES_1_NORI(label, h)			\
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -37,6 +37,11 @@
 #include <asm/tm.h>
 #include <asm/ppc-opcode.h>
 #include <asm/export.h>
+#ifdef CONFIG_PPC_BOOK3S
+#include <asm/exception-64s.h>
+#else
+#include <asm/exception-64e.h>
+#endif

 /*
  * System calls.
@@ -397,8 +402,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	mtmsrd	r10, 1
 	mtspr	SPRN_SRR0, r11
 	mtspr	SPRN_SRR1, r12
-
-	rfid
+	RFI_TO_USER
 	b	.	/* prevent speculative execution */
 #endif
 _ASM_NOKPROBE_SYMBOL(system_call_common);
@@ -1073,7 +1077,7 @@ __enter_rtas:

 	mtspr	SPRN_SRR0,r5
 	mtspr	SPRN_SRR1,r6
-	rfid
+	RFI_TO_KERNEL
 	b	.	/* prevent speculative execution */

 rtas_return_loc:
@@ -1098,7 +1102,7 @@ rtas_return_loc:

 	mtspr	SPRN_SRR0,r3
 	mtspr	SPRN_SRR1,r4
-	rfid
+	RFI_TO_KERNEL
 	b	.	/* prevent speculative execution */
 _ASM_NOKPROBE_SYMBOL(__enter_rtas)
 _ASM_NOKPROBE_SYMBOL(rtas_return_loc)
@@ -1171,7 +1175,7 @@ _GLOBAL(enter_prom)
 	LOAD_REG_IMMEDIATE(r12, MSR_SF | MSR_ISF | MSR_LE)
 	andc	r11,r11,r12
 	mtsrr1	r11
-	rfid
+	RFI_TO_KERNEL
 #endif /* CONFIG_PPC_BOOK3E */

 1:	/* Return from OF */
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -254,7 +254,7 @@ BEGIN_FTR_SECTION
 	LOAD_HANDLER(r12, machine_check_handle_early)
 1:	mtspr	SPRN_SRR0,r12
 	mtspr	SPRN_SRR1,r11
-	rfid
+	RFI_TO_KERNEL
 	b	.	/* prevent speculative execution */
 2:
 	/* Stack overflow. Stay on emergency stack and panic.
@@ -443,7 +443,7 @@ EXC_COMMON_BEGIN(machine_check_handle_ea
 	li	r3,MSR_ME
 	andc	r10,r10,r3		/* Turn off MSR_ME */
 	mtspr	SPRN_SRR1,r10
-	rfid
+	RFI_TO_KERNEL
 	b	.
 2:
 	/*
@@ -461,7 +461,7 @@ EXC_COMMON_BEGIN(machine_check_handle_ea
 	 */
 	bl	machine_check_queue_event
 	MACHINE_CHECK_HANDLER_WINDUP
-	rfid
+	RFI_TO_USER_OR_KERNEL
 9:
 	/* Deliver the machine check to host kernel in V mode. */
 	MACHINE_CHECK_HANDLER_WINDUP
@@ -649,7 +649,7 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_R
 	mtspr	SPRN_SRR0,r10
 	ld	r10,PACAKMSR(r13)
 	mtspr	SPRN_SRR1,r10
-	rfid
+	RFI_TO_KERNEL
 	b	.

 8:	std	r3,PACA_EXSLB+EX_DAR(r13)
@@ -660,7 +660,7 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_R
 	mtspr	SPRN_SRR0,r10
 	ld	r10,PACAKMSR(r13)
 	mtspr	SPRN_SRR1,r10
-	rfid
+	RFI_TO_KERNEL
 	b	.

 EXC_COMMON_BEGIN(unrecov_slb)
@@ -905,7 +905,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
 	mtspr	SPRN_SRR0,r10 ;					\
 	ld	r10,PACAKMSR(r13) ;				\
 	mtspr	SPRN_SRR1,r10 ;					\
-	rfid ;							\
+	RFI_TO_KERNEL ;						\
 	b	. ;	/* prevent speculative execution */

 #define SYSCALL_FASTENDIAN					\
@@ -914,7 +914,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
 	xori	r12,r12,MSR_LE ;				\
 	mtspr	SPRN_SRR1,r12 ;					\
 	mr	r13,r9 ;					\
-	rfid ;		/* return to userspace */		\
+	RFI_TO_USER ;	/* return to userspace */		\
 	b	. ;	/* prevent speculative execution */

 #if defined(CONFIG_RELOCATABLE)
@@ -1299,7 +1299,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	ld	r11,PACA_EXGEN+EX_R11(r13)
 	ld	r12,PACA_EXGEN+EX_R12(r13)
 	ld	r13,PACA_EXGEN+EX_R13(r13)
-	HRFID
+	HRFI_TO_UNKNOWN
 	b	.
 #endif

@@ -1403,7 +1403,7 @@ masked_##_H##interrupt:					\
 	ld	r10,PACA_EXGEN+EX_R10(r13);		\
 	ld	r11,PACA_EXGEN+EX_R11(r13);		\
 	/* returns to kernel where r13 must be set up, so don't restore it */ \
-	##_H##rfid;					\
+	##_H##RFI_TO_KERNEL;				\
 	b	.;					\
 	MASKED_DEC_HANDLER(_H)

@@ -1426,7 +1426,7 @@ TRAMP_REAL_BEGIN(kvmppc_skip_interrupt)
 	addi	r13, r13, 4
 	mtspr	SPRN_SRR0, r13
 	GET_SCRATCH0(r13)
-	rfid
+	RFI_TO_KERNEL
 	b	.

 TRAMP_REAL_BEGIN(kvmppc_skip_Hinterrupt)
@@ -1438,7 +1438,7 @@ TRAMP_REAL_BEGIN(kvmppc_skip_Hinterrupt)
 	addi	r13, r13, 4
 	mtspr	SPRN_HSRR0, r13
 	GET_SCRATCH0(r13)
-	hrfid
+	HRFI_TO_KERNEL
 	b	.
 #endif

--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -78,7 +78,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline)
 	mtmsrd	r0,1		/* clear RI in MSR */
 	mtsrr0	r5
 	mtsrr1	r6
-	RFI
+	RFI_TO_KERNEL

 kvmppc_call_hv_entry:
 	ld	r4, HSTATE_KVM_VCPU(r13)
@@ -187,7 +187,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
 	mtmsrd	r6, 1		/* Clear RI in MSR */
 	mtsrr0	r8
 	mtsrr1	r7
-	RFI
+	RFI_TO_KERNEL

 /* Virtual-mode return */
 .Lvirt_return:
@@ -1131,8 +1131,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)

 	ld	r0, VCPU_GPR(R0)(r4)
 	ld	r4, VCPU_GPR(R4)(r4)
-
-	hrfid
+	HRFI_TO_GUEST
 	b	.

 secondary_too_late:
--- a/arch/powerpc/kvm/book3s_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_rmhandlers.S
@@ -46,6 +46,9 @@

 #define FUNC(name)		name

+#define RFI_TO_KERNEL	RFI
+#define RFI_TO_GUEST	RFI
+
 .macro INTERRUPT_TRAMPOLINE intno

 .global kvmppc_trampoline_\intno
@@ -141,7 +144,7 @@ kvmppc_handler_skip_ins:
 	GET_SCRATCH0(r13)

 	/* And get back into the code */
-	RFI
+	RFI_TO_KERNEL
 #endif

 /*
@@ -164,6 +167,6 @@ _GLOBAL_TOC(kvmppc_entry_trampoline)
 	ori	r5, r5, MSR_EE
 	mtsrr0	r7
 	mtsrr1	r6
-	RFI
+	RFI_TO_KERNEL

 #include "book3s_segment.S"
--- a/arch/powerpc/kvm/book3s_segment.S
+++ b/arch/powerpc/kvm/book3s_segment.S
@@ -156,7 +156,7 @@ no_dcbz32_on:
 	PPC_LL	r9, SVCPU_R9(r3)
 	PPC_LL	r3, (SVCPU_R3)(r3)

-	RFI
+	RFI_TO_GUEST
 kvmppc_handler_trampoline_enter_end:

@@ -407,5 +407,5 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
 	cmpwi	r12, BOOK3S_INTERRUPT_DOORBELL
 	beqa	BOOK3S_INTERRUPT_DOORBELL

-	RFI
+	RFI_TO_KERNEL
 kvmppc_handler_trampoline_exit_end:
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Piggin npiggin@gmail.com
commit b8e90cb7bc04a509e821e82ab6ed7a8ef11ba333 upstream.
In the syscall exit path we may be returning to user or kernel context. We already have a test for that, because we conditionally restore r13. So use that existing test and branch, and bifurcate the return based on that.
Signed-off-by: Nicholas Piggin npiggin@gmail.com
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/kernel/entry_64.S | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -267,13 +267,23 @@ BEGIN_FTR_SECTION
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)

 	ld	r13,GPR13(r1)	/* only restore r13 if returning to usermode */
+	ld	r2,GPR2(r1)
+	ld	r1,GPR1(r1)
+	mtlr	r4
+	mtcr	r5
+	mtspr	SPRN_SRR0,r7
+	mtspr	SPRN_SRR1,r8
+	RFI_TO_USER
+	b	.	/* prevent speculative execution */
+
+	/* exit to kernel */
 1:	ld	r2,GPR2(r1)
 	ld	r1,GPR1(r1)
 	mtlr	r4
 	mtcr	r5
 	mtspr	SPRN_SRR0,r7
 	mtspr	SPRN_SRR1,r8
-	RFI
+	RFI_TO_KERNEL
 	b	.	/* prevent speculative execution */

 .Lsyscall_error:
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Piggin npiggin@gmail.com
commit a08f828cf47e6c605af21d2cdec68f84e799c318 upstream.
Similar to the syscall return path, in fast_exception_return we may be returning to user or kernel context. We already have a test for that, because we conditionally restore r13. So use that existing test and branch, and bifurcate the return based on that.
Signed-off-by: Nicholas Piggin npiggin@gmail.com
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/kernel/entry_64.S | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -892,7 +892,7 @@ BEGIN_FTR_SECTION
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	ACCOUNT_CPU_USER_EXIT(r13, r2, r4)
 	REST_GPR(13, r1)
-1:
+
 	mtspr	SPRN_SRR1,r3

 	ld	r2,_CCR(r1)
@@ -905,8 +905,22 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	ld	r3,GPR3(r1)
 	ld	r4,GPR4(r1)
 	ld	r1,GPR1(r1)
+	RFI_TO_USER
+	b	.	/* prevent speculative execution */

-	rfid
+1:	mtspr	SPRN_SRR1,r3
+
+	ld	r2,_CCR(r1)
+	mtcrf	0xFF,r2
+	ld	r2,_NIP(r1)
+	mtspr	SPRN_SRR0,r2
+
+	ld	r0,GPR0(r1)
+	ld	r2,GPR2(r1)
+	ld	r3,GPR3(r1)
+	ld	r4,GPR4(r1)
+	ld	r1,GPR1(r1)
+	RFI_TO_KERNEL
 	b	.	/* prevent speculative execution */

 #endif /* CONFIG_PPC_BOOK3E */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Piggin npiggin@gmail.com
commit c7305645eb0c1621351cfc104038831ae87c0053 upstream.
In the SLB miss handler we may be returning to user or kernel. We need to add a check early on and save the result in the cr4 register, and then we bifurcate the return path based on that.
Signed-off-by: Nicholas Piggin npiggin@gmail.com
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/kernel/exceptions-64s.S | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -596,6 +596,9 @@ EXC_COMMON_BEGIN(slb_miss_common)
 	stw	r9,PACA_EXSLB+EX_CCR(r13)	/* save CR in exc. frame */
 	std	r10,PACA_EXSLB+EX_LR(r13)	/* save LR */

+	andi.	r9,r11,MSR_PR	// Check for exception from userspace
+	cmpdi	cr4,r9,MSR_PR	// And save the result in CR4 for later
+
 	/*
 	 * Test MSR_RI before calling slb_allocate_realmode, because the
 	 * MSR in r11 gets clobbered. However we still want to allocate
@@ -622,9 +625,32 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_R

 	/* All done -- return from exception. */

+	bne	cr4,1f		/* returning to kernel */
+
+.machine push
+.machine "power4"
+	mtcrf	0x80,r9
+	mtcrf	0x08,r9		/* MSR[PR] indication is in cr4 */
+	mtcrf	0x04,r9		/* MSR[RI] indication is in cr5 */
+	mtcrf	0x02,r9		/* I/D indication is in cr6 */
+	mtcrf	0x01,r9		/* slb_allocate uses cr0 and cr7 */
+.machine pop
+
+	RESTORE_CTR(r9, PACA_EXSLB)
+	RESTORE_PPR_PACA(PACA_EXSLB, r9)
+	mr	r3,r12
+	ld	r9,PACA_EXSLB+EX_R9(r13)
+	ld	r10,PACA_EXSLB+EX_R10(r13)
+	ld	r11,PACA_EXSLB+EX_R11(r13)
+	ld	r12,PACA_EXSLB+EX_R12(r13)
+	ld	r13,PACA_EXSLB+EX_R13(r13)
+	RFI_TO_USER
+	b	.	/* prevent speculative execution */
+1:
 .machine push
 .machine "power4"
 	mtcrf	0x80,r9
+	mtcrf	0x08,r9		/* MSR[PR] indication is in cr4 */
 	mtcrf	0x04,r9		/* MSR[RI] indication is in cr5 */
 	mtcrf	0x02,r9		/* I/D indication is in cr6 */
 	mtcrf	0x01,r9		/* slb_allocate uses cr0 and cr7 */
@@ -638,9 +664,10 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_R
 	ld	r11,PACA_EXSLB+EX_R11(r13)
 	ld	r12,PACA_EXSLB+EX_R12(r13)
 	ld	r13,PACA_EXSLB+EX_R13(r13)
-	rfid
+	RFI_TO_KERNEL
 	b	.	/* prevent speculative execution */

+
 2:	std	r3,PACA_EXSLB+EX_DAR(r13)
 	mr	r3,r12
 	mfspr	r11,SPRN_SRR0
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Ellerman mpe@ellerman.id.au
commit aa8a5e0062ac940f7659394f4817c948dc8c0667 upstream.
On some CPUs we can prevent the Meltdown vulnerability by flushing the L1-D cache on exit from kernel to user mode, and from hypervisor to guest.
This is known to be the case on at least Power7, Power8 and Power9. At this time we do not know the status of the vulnerability on other CPUs such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale CPUs. As more information comes to light we can enable this, or other mechanisms on those CPUs.
The vulnerability occurs when the load of an architecturally inaccessible memory region (eg. userspace load of kernel memory) is speculatively executed to the point where its result can influence the address of a subsequent speculatively executed load.
In order for that to happen, the first load must hit in the L1, because before the load is sent to the L2 the permission check is performed. Therefore if no kernel addresses hit in the L1 the vulnerability can not occur. We can ensure that is the case by flushing the L1 whenever we return to userspace. Similarly for hypervisor vs guest.
In order to flush the L1-D cache on exit, we add a section of nops at each (h)rfi location that returns to a lower privileged context, and patch that with some sequence. Newer firmwares are able to advertise to us that there is a special nop instruction that flushes the L1-D. If we do not see that advertised, we fall back to doing a displacement flush in software.
For guest kernels we support migration between some CPU versions, and different CPUs may use different flush instructions. So that we are prepared to migrate to a machine with a different flush instruction activated, we may have to patch more than one flush instruction at boot if the hypervisor tells us to.
In the end this patch is mostly the work of Nicholas Piggin and Michael Ellerman. However a cast of thousands contributed to analysis of the issue, earlier versions of the patch, back ports testing etc. Many thanks to all of them.
Tested-by: Jon Masters jcm@redhat.com
Signed-off-by: Nicholas Piggin npiggin@gmail.com
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/include/asm/exception-64s.h  | 40 +++++++++++---
 arch/powerpc/include/asm/feature-fixups.h | 13 ++++
 arch/powerpc/include/asm/paca.h           | 10 +++
 arch/powerpc/include/asm/setup.h          | 13 ++++
 arch/powerpc/kernel/asm-offsets.c         |  5 +
 arch/powerpc/kernel/exceptions-64s.S      | 84 ++++++++++++++++++++++++++++++
 arch/powerpc/kernel/setup_64.c            | 79 ++++++++++++++++++++++++++++
 arch/powerpc/kernel/vmlinux.lds.S         |  9 +++
 arch/powerpc/lib/feature-fixups.c         | 41 ++++++++++++++
 9 files changed, 286 insertions(+), 8 deletions(-)

--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -69,34 +69,58 @@
  */
 #define EX_R3		EX_DAR

-/* Macros for annotating the expected destination of (h)rfid */
+/*
+ * Macros for annotating the expected destination of (h)rfid
+ *
+ * The nop instructions allow us to insert one or more instructions to flush the
+ * L1-D cache when returning to userspace or a guest.
+ */
+#define RFI_FLUSH_SLOT							\
+	RFI_FLUSH_FIXUP_SECTION;					\
+	nop;								\
+	nop;								\
+	nop

 #define RFI_TO_KERNEL							\
 	rfid

 #define RFI_TO_USER							\
-	rfid
+	RFI_FLUSH_SLOT;							\
+	rfid;								\
+	b	rfi_flush_fallback

 #define RFI_TO_USER_OR_KERNEL						\
-	rfid
+	RFI_FLUSH_SLOT;							\
+	rfid;								\
+	b	rfi_flush_fallback

 #define RFI_TO_GUEST							\
-	rfid
+	RFI_FLUSH_SLOT;							\
+	rfid;								\
+	b	rfi_flush_fallback

 #define HRFI_TO_KERNEL							\
 	hrfid

 #define HRFI_TO_USER							\
-	hrfid
+	RFI_FLUSH_SLOT;							\
+	hrfid;								\
+	b	hrfi_flush_fallback

 #define HRFI_TO_USER_OR_KERNEL						\
-	hrfid
+	RFI_FLUSH_SLOT;							\
+	hrfid;								\
+	b	hrfi_flush_fallback

 #define HRFI_TO_GUEST							\
-	hrfid
+	RFI_FLUSH_SLOT;							\
+	hrfid;								\
+	b	hrfi_flush_fallback

 #define HRFI_TO_UNKNOWN							\
-	hrfid
+	RFI_FLUSH_SLOT;							\
+	hrfid;								\
+	b	hrfi_flush_fallback

 #ifdef CONFIG_RELOCATABLE
 #define __EXCEPTION_RELON_PROLOG_PSERIES_1(label, h)			\
--- a/arch/powerpc/include/asm/feature-fixups.h
+++ b/arch/powerpc/include/asm/feature-fixups.h
@@ -187,7 +187,20 @@ label##3:					       	\
 	FTR_ENTRY_OFFSET label##1b-label##3b;		\
 	.popsection;

+#define RFI_FLUSH_FIXUP_SECTION				\
+951:							\
+	.pushsection __rfi_flush_fixup,"a";		\
+	.align 2;					\
+952:							\
+	FTR_ENTRY_OFFSET 951b-952b;			\
+	.popsection;
+
+
 #ifndef __ASSEMBLY__
+#include <linux/types.h>
+
+extern long __start___rfi_flush_fixup, __stop___rfi_flush_fixup;
+
 void apply_feature_fixups(void);
 void setup_feature_keys(void);
 #endif
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -231,6 +231,16 @@ struct paca_struct {
 	struct sibling_subcore_state *sibling_subcore_state;
 #endif
 #endif
+#ifdef CONFIG_PPC_BOOK3S_64
+	/*
+	 * rfi fallback flush must be in its own cacheline to prevent
+	 * other paca data leaking into the L1d
+	 */
+	u64 exrfi[EX_SIZE] __aligned(0x80);
+	void *rfi_flush_fallback_area;
+	u64 l1d_flush_congruence;
+	u64 l1d_flush_sets;
+#endif
 };

 extern void copy_mm_to_paca(struct mm_struct *mm);
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -39,6 +39,19 @@ static inline void pseries_big_endian_ex
 static inline void pseries_little_endian_exceptions(void) {}
 #endif /* CONFIG_PPC_PSERIES */

+void rfi_flush_enable(bool enable);
+
+/* These are bit flags */
+enum l1d_flush_type {
+	L1D_FLUSH_NONE		= 0x1,
+	L1D_FLUSH_FALLBACK	= 0x2,
+	L1D_FLUSH_ORI		= 0x4,
+	L1D_FLUSH_MTTRIG	= 0x8,
+};
+
+void __init setup_rfi_flush(enum l1d_flush_type, bool enable);
+void do_rfi_flush_fixups(enum l1d_flush_type types);
+
 #endif /* !__ASSEMBLY__ */

 #endif	/* _ASM_POWERPC_SETUP_H */
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -237,6 +237,11 @@ int main(void)
 	OFFSET(PACA_NMI_EMERG_SP, paca_struct, nmi_emergency_sp);
 	OFFSET(PACA_IN_MCE, paca_struct, in_mce);
 	OFFSET(PACA_IN_NMI, paca_struct, in_nmi);
+	OFFSET(PACA_RFI_FLUSH_FALLBACK_AREA, paca_struct, rfi_flush_fallback_area);
+	OFFSET(PACA_EXRFI, paca_struct, exrfi);
+	OFFSET(PACA_L1D_FLUSH_CONGRUENCE, paca_struct, l1d_flush_congruence);
+	OFFSET(PACA_L1D_FLUSH_SETS, paca_struct, l1d_flush_sets);
+
 #endif
 	OFFSET(PACAHWCPUID, paca_struct, hw_cpu_id);
 	OFFSET(PACAKEXECSTATE, paca_struct, kexec_state);
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1434,6 +1434,90 @@ masked_##_H##interrupt:			\
 	b	.;					\
 	MASKED_DEC_HANDLER(_H)

+TRAMP_REAL_BEGIN(rfi_flush_fallback)
+	SET_SCRATCH0(r13);
+	GET_PACA(r13);
+	std	r9,PACA_EXRFI+EX_R9(r13)
+	std	r10,PACA_EXRFI+EX_R10(r13)
+	std	r11,PACA_EXRFI+EX_R11(r13)
+	std	r12,PACA_EXRFI+EX_R12(r13)
+	std	r8,PACA_EXRFI+EX_R13(r13)
+	mfctr	r9
+	ld	r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
+	ld	r11,PACA_L1D_FLUSH_SETS(r13)
+	ld	r12,PACA_L1D_FLUSH_CONGRUENCE(r13)
+	/*
+	 * The load adresses are at staggered offsets within cachelines,
+	 * which suits some pipelines better (on others it should not
+	 * hurt).
+	 */
+	addi	r12,r12,8
+	mtctr	r11
+	DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
+
+	/* order ld/st prior to dcbt stop all streams with flushing */
+	sync
+1:	li	r8,0
+	.rept	8 /* 8-way set associative */
+	ldx	r11,r10,r8
+	add	r8,r8,r12
+	xor	r11,r11,r11	// Ensure r11 is 0 even if fallback area is not
+	add	r8,r8,r11	// Add 0, this creates a dependency on the ldx
+	.endr
+	addi	r10,r10,128 /* 128 byte cache line */
+	bdnz	1b
+
+	mtctr	r9
+	ld	r9,PACA_EXRFI+EX_R9(r13)
+	ld	r10,PACA_EXRFI+EX_R10(r13)
+	ld	r11,PACA_EXRFI+EX_R11(r13)
+	ld	r12,PACA_EXRFI+EX_R12(r13)
+	ld	r8,PACA_EXRFI+EX_R13(r13)
+	GET_SCRATCH0(r13);
+	rfid
+
+TRAMP_REAL_BEGIN(hrfi_flush_fallback)
+	SET_SCRATCH0(r13);
+	GET_PACA(r13);
+	std	r9,PACA_EXRFI+EX_R9(r13)
+	std	r10,PACA_EXRFI+EX_R10(r13)
+	std	r11,PACA_EXRFI+EX_R11(r13)
+	std	r12,PACA_EXRFI+EX_R12(r13)
+	std	r8,PACA_EXRFI+EX_R13(r13)
+	mfctr	r9
+	ld	r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
+	ld	r11,PACA_L1D_FLUSH_SETS(r13)
+	ld	r12,PACA_L1D_FLUSH_CONGRUENCE(r13)
+	/*
+	 * The load adresses are at staggered offsets within cachelines,
+	 * which suits some pipelines better (on others it should not
+	 * hurt).
+	 */
+	addi	r12,r12,8
+	mtctr	r11
+	DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
+
+	/* order ld/st prior to dcbt stop all streams with flushing */
+	sync
+1:	li	r8,0
+	.rept	8 /* 8-way set associative */
+	ldx	r11,r10,r8
+	add	r8,r8,r12
+	xor	r11,r11,r11	// Ensure r11 is 0 even if fallback area is not
+	add	r8,r8,r11	// Add 0, this creates a dependency on the ldx
+	.endr
+	addi	r10,r10,128 /* 128 byte cache line */
+	bdnz	1b
+
+	mtctr	r9
+	ld	r9,PACA_EXRFI+EX_R9(r13)
+	ld	r10,PACA_EXRFI+EX_R10(r13)
+	ld	r11,PACA_EXRFI+EX_R11(r13)
+	ld	r12,PACA_EXRFI+EX_R12(r13)
+	ld	r8,PACA_EXRFI+EX_R13(r13)
+	GET_SCRATCH0(r13);
+	hrfid
+
 /*
  * Real mode exceptions actually use this too, but alternate
  * instruction code patches (which end up in the common .text area)
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -784,3 +784,82 @@ static int __init disable_hardlockup_det
 	return 0;
 }
 early_initcall(disable_hardlockup_detector);
+
+#ifdef CONFIG_PPC_BOOK3S_64
+static enum l1d_flush_type enabled_flush_types;
+static void *l1d_flush_fallback_area;
+bool rfi_flush;
+
+static void do_nothing(void *unused)
+{
+	/*
+	 * We don't need to do the flush explicitly, just enter+exit kernel is
+	 * sufficient, the RFI exit handlers will do the right thing.
+	 */
+}
+
+void rfi_flush_enable(bool enable)
+{
+	if (rfi_flush == enable)
+		return;
+
+	if (enable) {
+		do_rfi_flush_fixups(enabled_flush_types);
+		on_each_cpu(do_nothing, NULL, 1);
+	} else
+		do_rfi_flush_fixups(L1D_FLUSH_NONE);
+
+	rfi_flush = enable;
+}
+
+static void init_fallback_flush(void)
+{
+	u64 l1d_size, limit;
+	int cpu;
+
+	l1d_size = ppc64_caches.l1d.size;
+	limit = min(safe_stack_limit(), ppc64_rma_size);
+
+	/*
+	 * Align to L1d size, and size it at 2x L1d size, to catch possible
+	 * hardware prefetch runoff. We don't have a recipe for load patterns to
+	 * reliably avoid the prefetcher.
+	 */
+	l1d_flush_fallback_area = __va(memblock_alloc_base(l1d_size * 2, l1d_size, limit));
+	memset(l1d_flush_fallback_area, 0, l1d_size * 2);
+
+	for_each_possible_cpu(cpu) {
+		/*
+		 * The fallback flush is currently coded for 8-way
+		 * associativity. Different associativity is possible, but it
+		 * will be treated as 8-way and may not evict the lines as
+		 * effectively.
+		 *
+		 * 128 byte lines are mandatory.
+		 */
+		u64 c = l1d_size / 8;
+
+		paca[cpu].rfi_flush_fallback_area = l1d_flush_fallback_area;
+		paca[cpu].l1d_flush_congruence = c;
+		paca[cpu].l1d_flush_sets = c / 128;
+	}
+}
+
+void __init setup_rfi_flush(enum l1d_flush_type types, bool enable)
+{
+	if (types & L1D_FLUSH_FALLBACK) {
+		pr_info("rfi-flush: Using fallback displacement flush\n");
+		init_fallback_flush();
+	}
+
+	if (types & L1D_FLUSH_ORI)
+		pr_info("rfi-flush: Using ori type flush\n");
+
+	if (types & L1D_FLUSH_MTTRIG)
+		pr_info("rfi-flush: Using mttrig type flush\n");
+
+	enabled_flush_types = types;
+
+	rfi_flush_enable(enable);
+}
+#endif /* CONFIG_PPC_BOOK3S_64 */
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -132,6 +132,15 @@ SECTIONS
 	/* Read-only data */
 	RO_DATA(PAGE_SIZE)

+#ifdef CONFIG_PPC64
+	. = ALIGN(8);
+	__rfi_flush_fixup : AT(ADDR(__rfi_flush_fixup) - LOAD_OFFSET) {
+		__start___rfi_flush_fixup = .;
+		*(__rfi_flush_fixup)
+		__stop___rfi_flush_fixup = .;
+	}
+#endif
+
 	EXCEPTION_TABLE(0)

 	NOTES :kernel :notes
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -116,6 +116,47 @@ void do_feature_fixups(unsigned long val
 	}
 }

+#ifdef CONFIG_PPC_BOOK3S_64
+void do_rfi_flush_fixups(enum l1d_flush_type types)
+{
+	unsigned int instrs[3], *dest;
+	long *start, *end;
+	int i;
+
+	start = PTRRELOC(&__start___rfi_flush_fixup),
+	end = PTRRELOC(&__stop___rfi_flush_fixup);
+
+	instrs[0] = 0x60000000; /* nop */
+	instrs[1] = 0x60000000; /* nop */
+	instrs[2] = 0x60000000; /* nop */
+
+	if (types & L1D_FLUSH_FALLBACK)
+		/* b .+16 to fallback flush */
+		instrs[0] = 0x48000010;
+
+	i = 0;
+	if (types & L1D_FLUSH_ORI) {
+		instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */
+		instrs[i++] = 0x63de0000; /* ori 30,30,0 L1d flush*/
+	}
+
+	if (types & L1D_FLUSH_MTTRIG)
+		instrs[i++] = 0x7c12dba6; /* mtspr TRIG2,r0 (SPR #882) */
+
+	for (i = 0; start < end; start++, i++) {
+		dest = (void *)start + *start;
+
+		pr_devel("patching dest %lx\n", (unsigned long)dest);
+
+		patch_instruction(dest, instrs[0]);
+		patch_instruction(dest + 1, instrs[1]);
+		patch_instruction(dest + 2, instrs[2]);
+	}
+
+	printk(KERN_DEBUG "rfi-flush: patched %d locations\n", i);
+}
+#endif /* CONFIG_PPC_BOOK3S_64 */
+
 void do_lwsync_fixups(unsigned long value, void *fixup_start, void *fixup_end)
 {
 	long *start, *end;
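A hedged user-space C sketch of the displacement-flush idea behind the fallback: load from every cache line of a buffer sized and aligned so the walk evicts whatever the L1-D held. The constants and the simple sequential stride are illustrative only; the real trampolines above run in assembly, stagger their loads within each line, and stop prefetch streams first:

#include <stddef.h>

#define L1D_SIZE	(32 * 1024)	/* illustrative; the kernel reads ppc64_caches.l1d.size */
#define LINE_SIZE	128		/* the fallback assumes 128-byte lines */

/* 2x L1d, L1d-aligned, mirroring init_fallback_flush() */
static char flush_area[2 * L1D_SIZE] __attribute__((aligned(L1D_SIZE)));

static void displacement_flush(void)
{
	volatile const char *p = flush_area;
	size_t off;

	/* One load per cache line across 2x the L1d size displaces any
	 * previously cached lines, including ones holding kernel data. */
	for (off = 0; off < sizeof(flush_area); off += LINE_SIZE)
		(void)p[off];
}

int main(void)
{
	displacement_flush();
	return 0;
}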
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Ellerman mpe@ellerman.id.au
commit bc9c9304a45480797e13a8e1df96ffcf44fb62fe upstream.
Because there may be some performance overhead of the RFI flush, add kernel command line options to disable it.
We add a sensibly named 'no_rfi_flush' option, but we also hijack the x86 option 'nopti'. The RFI flush is not the same as KPTI, but if we see 'nopti' we can guess that the user is trying to avoid any overhead of Meltdown mitigations, and it means we don't have to educate everyone about a different command line option.
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/kernel/setup_64.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -788,8 +788,29 @@ early_initcall(disable_hardlockup_detect
 #ifdef CONFIG_PPC_BOOK3S_64
 static enum l1d_flush_type enabled_flush_types;
 static void *l1d_flush_fallback_area;
+static bool no_rfi_flush;
 bool rfi_flush;

+static int __init handle_no_rfi_flush(char *p)
+{
+	pr_info("rfi-flush: disabled on command line.");
+	no_rfi_flush = true;
+	return 0;
+}
+early_param("no_rfi_flush", handle_no_rfi_flush);
+
+/*
+ * The RFI flush is not KPTI, but because users will see doco that says to use
+ * nopti we hijack that option here to also disable the RFI flush.
+ */
+static int __init handle_no_pti(char *p)
+{
+	pr_info("rfi-flush: disabling due to 'nopti' on command line.\n");
+	handle_no_rfi_flush(NULL);
+	return 0;
+}
+early_param("nopti", handle_no_pti);
+
 static void do_nothing(void *unused)
 {
 	/*
@@ -860,6 +881,7 @@ void __init setup_rfi_flush(enum l1d_flu

 	enabled_flush_types = types;

-	rfi_flush_enable(enable);
+	if (!no_rfi_flush)
+		rfi_flush_enable(enable);
 }
 #endif /* CONFIG_PPC_BOOK3S_64 */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Neuling mikey@neuling.org
commit 8989d56878a7735dfdb234707a2fee6faf631085 upstream.
A new hypervisor call is available which tells the guest settings related to the RFI flush. Use it to query the appropriate flush instruction(s), and whether the flush is required.
Signed-off-by: Michael Neuling mikey@neuling.org
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/platforms/pseries/setup.c | 35 ++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -459,6 +459,39 @@ static void __init find_and_init_phbs(vo
 	of_pci_check_probe_only();
 }

+static void pseries_setup_rfi_flush(void)
+{
+	struct h_cpu_char_result result;
+	enum l1d_flush_type types;
+	bool enable;
+	long rc;
+
+	/* Enable by default */
+	enable = true;
+
+	rc = plpar_get_cpu_characteristics(&result);
+	if (rc == H_SUCCESS) {
+		types = L1D_FLUSH_NONE;
+
+		if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2)
+			types |= L1D_FLUSH_MTTRIG;
+		if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30)
+			types |= L1D_FLUSH_ORI;
+
+		/* Use fallback if nothing set in hcall */
+		if (types == L1D_FLUSH_NONE)
+			types = L1D_FLUSH_FALLBACK;
+
+		if (!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR))
+			enable = false;
+	} else {
+		/* Default to fallback if case hcall is not available */
+		types = L1D_FLUSH_FALLBACK;
+	}
+
+	setup_rfi_flush(types, enable);
+}
+
 static void __init pSeries_setup_arch(void)
 {
 	set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);
@@ -476,6 +509,8 @@ static void __init pSeries_setup_arch(vo

 	fwnmi_init();

+	pseries_setup_rfi_flush();
+
 	/* By default, only probe PCI (can be overridden by rtas_pci) */
 	pci_add_flags(PCI_PROBE_ONLY);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oliver O'Halloran oohall@gmail.com
commit 6e032b350cd1fdb830f18f8320ef0e13b4e24094 upstream.
New device-tree properties are available which tell the hypervisor settings related to the RFI flush. Use them to determine the appropriate flush instruction to use, and whether the flush is required.
Signed-off-by: Oliver O'Halloran oohall@gmail.com
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/platforms/powernv/setup.c | 49 ++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -36,13 +36,62 @@
 #include <asm/opal.h>
 #include <asm/kexec.h>
 #include <asm/smp.h>
+#include <asm/setup.h>

 #include "powernv.h"

+static void pnv_setup_rfi_flush(void)
+{
+	struct device_node *np, *fw_features;
+	enum l1d_flush_type type;
+	int enable;
+
+	/* Default to fallback in case fw-features are not available */
+	type = L1D_FLUSH_FALLBACK;
+	enable = 1;
+
+	np = of_find_node_by_name(NULL, "ibm,opal");
+	fw_features = of_get_child_by_name(np, "fw-features");
+	of_node_put(np);
+
+	if (fw_features) {
+		np = of_get_child_by_name(fw_features, "inst-l1d-flush-trig2");
+		if (np && of_property_read_bool(np, "enabled"))
+			type = L1D_FLUSH_MTTRIG;
+
+		of_node_put(np);
+
+		np = of_get_child_by_name(fw_features, "inst-l1d-flush-ori30,30,0");
+		if (np && of_property_read_bool(np, "enabled"))
+			type = L1D_FLUSH_ORI;
+
+		of_node_put(np);
+
+		/* Enable unless firmware says NOT to */
+		enable = 2;
+		np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-hv-1-to-0");
+		if (np && of_property_read_bool(np, "disabled"))
+			enable--;
+
+		of_node_put(np);
+
+		np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-pr-0-to-1");
+		if (np && of_property_read_bool(np, "disabled"))
+			enable--;
+
+		of_node_put(np);
+		of_node_put(fw_features);
+	}
+
+	setup_rfi_flush(type, enable > 0);
+}
+
 static void __init pnv_setup_arch(void)
 {
 	set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);

+	pnv_setup_rfi_flush();
+
 	/* Initialize SMP */
 	pnv_smp_init();
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
commit c1e2f0eaf015fb7076d51a339011f2383e6dd389 upstream.
Julia reported futex state corruption in the following scenario:
waiter                          waker                           stealer (prio > waiter)

futex(WAIT_REQUEUE_PI, uaddr, uaddr2,
      timeout=[N ms])
   futex_wait_requeue_pi()
      futex_wait_queue_me()
         freezable_schedule()
         <scheduled out>
                                futex(LOCK_PI, uaddr2)
                                futex(CMP_REQUEUE_PI, uaddr,
                                      uaddr2, 1, 0)
                                   /* requeues waiter to uaddr2 */
                                futex(UNLOCK_PI, uaddr2)
                                   wake_futex_pi()
                                      cmp_futex_value_locked(uaddr2, waiter)
                                      wake_up_q()
           <woken by waker>
           <hrtimer_wakeup() fires,
            clears sleeper->task>
                                                                futex(LOCK_PI, uaddr2)
                                                                   __rt_mutex_start_proxy_lock()
                                                                      try_to_take_rt_mutex() /* steals lock */
                                                                         rt_mutex_set_owner(lock, stealer)
                                                                   <preempted>
         <scheduled in>
         rt_mutex_wait_proxy_lock()
            __rt_mutex_slowlock()
               try_to_take_rt_mutex() /* fails, lock held by stealer */
               if (timeout && !timeout->task)
                  return -ETIMEDOUT;
            fixup_owner()
               /* lock wasn't acquired, so,
                  fixup_pi_state_owner skipped */

   return -ETIMEDOUT;

   /* At this point, we've returned -ETIMEDOUT to userspace, but the
    * futex word shows waiter to be the owner, and the pi_mutex has
    * stealer as the owner */

   futex_lock(LOCK_PI, uaddr2)
     -> bails with EDEADLK, futex word says we're owner.
And suggested that what commit:
73d786bd043e ("futex: Rework inconsistent rt_mutex/futex_q state")
removes from fixup_owner() looks to be just what is needed. And indeed it is -- I completely missed that requeue_pi could also result in this case. So we need to restore that, except that subsequent patches, like commit:
16ffa12d7425 ("futex: Pull rt_mutex_futex_unlock() out from under hb->lock")
changed all the locking rules. Even without that, the sequence:
-	if (rt_mutex_futex_trylock(&q->pi_state->pi_mutex)) {
-		locked = 1;
-		goto out;
-	}
-
-	raw_spin_lock_irq(&q->pi_state->pi_mutex.wait_lock);
-	owner = rt_mutex_owner(&q->pi_state->pi_mutex);
-	if (!owner)
-		owner = rt_mutex_next_owner(&q->pi_state->pi_mutex);
-	raw_spin_unlock_irq(&q->pi_state->pi_mutex.wait_lock);
-	ret = fixup_pi_state_owner(uaddr, q, owner);
already suggests there were races; otherwise we'd never have to look at next_owner.
So instead of doing 3 consecutive wait_lock sections with who knows what races, we do it all in a single section. Additionally, the usage of pi_state->owner in fixup_owner() was only safe because only the rt_mutex owner would modify it, which this additional case wrecks.
Luckily the values can only change away from, and not to, the value we're testing; this means we can do a speculative test and double-check once we have the wait_lock.
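The speculative-test-then-recheck shape described here, as a minimal generic C sketch (plain pthreads, not the futex code): the unlocked read can only err toward taking the lock, never toward missing a needed fixup, because the field only changes away from the value being tested:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static pthread_mutex_t wait_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic int owner_tid;	/* authoritative updates happen under wait_lock */

static void maybe_fixup(int my_tid)
{
	/* Speculative test without wait_lock. */
	if (atomic_load(&owner_tid) != my_tid)
		return;

	pthread_mutex_lock(&wait_lock);
	if (atomic_load(&owner_tid) == my_tid) {
		/* Double-checked under the lock: do the real fixup. */
		printf("tid %d: fixing up owner\n", my_tid);
	}
	pthread_mutex_unlock(&wait_lock);
}

int main(void)
{
	atomic_store(&owner_tid, 1);
	maybe_fixup(1);	/* takes the lock and fixes up */
	maybe_fixup(2);	/* filtered out by the speculative test */
	return 0;
}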
Fixes: 73d786bd043e ("futex: Rework inconsistent rt_mutex/futex_q state")
Reported-by: Julia Cartwright julia@ni.com
Reported-by: Gratian Crisan gratian.crisan@ni.com
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Tested-by: Julia Cartwright julia@ni.com
Tested-by: Gratian Crisan gratian.crisan@ni.com
Cc: Darren Hart dvhart@infradead.org
Link: https://lkml.kernel.org/r/20171208124939.7livp7no2ov65rrc@hirez.programming....
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/futex.c | 83 ++++++++++++++++++++++++++++++++-------- kernel/locking/rtmutex.c | 26 +++++++++--- kernel/locking/rtmutex_common.h | 1 3 files changed, 87 insertions(+), 23 deletions(-)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2294,21 +2294,17 @@ static void unqueue_me_pi(struct futex_q
 	spin_unlock(q->lock_ptr);
 }

-/*
- * Fixup the pi_state owner with the new owner.
- *
- * Must be called with hash bucket lock held and mm->sem held for non
- * private futexes.
- */
 static int fixup_pi_state_owner(u32 __user *uaddr, struct futex_q *q,
-				struct task_struct *newowner)
+				struct task_struct *argowner)
 {
-	u32 newtid = task_pid_vnr(newowner) | FUTEX_WAITERS;
 	struct futex_pi_state *pi_state = q->pi_state;
 	u32 uval, uninitialized_var(curval), newval;
-	struct task_struct *oldowner;
+	struct task_struct *oldowner, *newowner;
+	u32 newtid;
 	int ret;

+	lockdep_assert_held(q->lock_ptr);
+
 	raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);

 	oldowner = pi_state->owner;
@@ -2317,11 +2313,17 @@ static int fixup_pi_state_owner(u32 __us
 		newtid |= FUTEX_OWNER_DIED;

 	/*
-	 * We are here either because we stole the rtmutex from the
-	 * previous highest priority waiter or we are the highest priority
-	 * waiter but have failed to get the rtmutex the first time.
+	 * We are here because either:
+	 *
+	 *  - we stole the lock and pi_state->owner needs updating to reflect
+	 *    that (@argowner == current),
+	 *
+	 * or:
 	 *
-	 * We have to replace the newowner TID in the user space variable.
+	 *  - someone stole our lock and we need to fix things to point to the
+	 *    new owner (@argowner == NULL).
+	 *
+	 * Either way, we have to replace the TID in the user space variable.
 	 * This must be atomic as we have to preserve the owner died bit here.
 	 *
 	 * Note: We write the user space value _before_ changing the pi_state
@@ -2334,6 +2336,42 @@ static int fixup_pi_state_owner(u32 __us
 	 * in the PID check in lookup_pi_state.
 	 */
 retry:
+	if (!argowner) {
+		if (oldowner != current) {
+			/*
+			 * We raced against a concurrent self; things are
+			 * already fixed up. Nothing to do.
+			 */
+			ret = 0;
+			goto out_unlock;
+		}
+
+		if (__rt_mutex_futex_trylock(&pi_state->pi_mutex)) {
+			/* We got the lock after all, nothing to fix. */
+			ret = 0;
+			goto out_unlock;
+		}
+
+		/*
+		 * Since we just failed the trylock; there must be an owner.
+		 */
+		newowner = rt_mutex_owner(&pi_state->pi_mutex);
+		BUG_ON(!newowner);
+	} else {
+		WARN_ON_ONCE(argowner != current);
+		if (oldowner == current) {
+			/*
+			 * We raced against a concurrent self; things are
+			 * already fixed up. Nothing to do.
+			 */
+			ret = 0;
+			goto out_unlock;
+		}
+		newowner = argowner;
+	}
+
+	newtid = task_pid_vnr(newowner) | FUTEX_WAITERS;
+
 	if (get_futex_value_locked(&uval, uaddr))
 		goto handle_fault;

@@ -2434,15 +2472,28 @@ static int fixup_owner(u32 __user *uaddr
 		 * Got the lock. We might not be the anticipated owner if we
 		 * did a lock-steal - fix up the PI-state in that case:
 		 *
-		 * We can safely read pi_state->owner without holding wait_lock
-		 * because we now own the rt_mutex, only the owner will attempt
-		 * to change it.
+		 * Speculative pi_state->owner read (we don't hold wait_lock);
+		 * since we own the lock pi_state->owner == current is the
+		 * stable state, anything else needs more attention.
 		 */
 		if (q->pi_state->owner != current)
 			ret = fixup_pi_state_owner(uaddr, q, current);
 		goto out;
 	}

+	/*
+	 * If we didn't get the lock; check if anybody stole it from us. In
+	 * that case, we need to fix up the uval to point to them instead of
+	 * us, otherwise bad things happen. [10]
+	 *
+	 * Another speculative read; pi_state->owner == current is unstable
+	 * but needs our attention.
+	 */
+	if (q->pi_state->owner == current) {
+		ret = fixup_pi_state_owner(uaddr, q, NULL);
+		goto out;
+	}
+
 	/*
 	 * Paranoia check. If we did not take the lock, then we should not be
 	 * the owner of the rt_mutex.
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1290,6 +1290,19 @@ rt_mutex_slowlock(struct rt_mutex *lock,
 	return ret;
 }

+static inline int __rt_mutex_slowtrylock(struct rt_mutex *lock)
+{
+	int ret = try_to_take_rt_mutex(lock, current, NULL);
+
+	/*
+	 * try_to_take_rt_mutex() sets the lock waiters bit
+	 * unconditionally. Clean this up.
+	 */
+	fixup_rt_mutex_waiters(lock);
+
+	return ret;
+}
+
 /*
  * Slow path try-lock function:
  */
@@ -1312,13 +1325,7 @@ static inline int rt_mutex_slowtrylock(s
 	 */
 	raw_spin_lock_irqsave(&lock->wait_lock, flags);

-	ret = try_to_take_rt_mutex(lock, current, NULL);
-
-	/*
-	 * try_to_take_rt_mutex() sets the lock waiters bit
-	 * unconditionally. Clean this up.
-	 */
-	fixup_rt_mutex_waiters(lock);
+	ret = __rt_mutex_slowtrylock(lock);

 	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);

@@ -1505,6 +1512,11 @@ int __sched rt_mutex_futex_trylock(struc
 	return rt_mutex_slowtrylock(lock);
 }

+int __sched __rt_mutex_futex_trylock(struct rt_mutex *lock)
+{
+	return __rt_mutex_slowtrylock(lock);
+}
+
 /**
  * rt_mutex_timed_lock - lock a rt_mutex interruptible
  *			the timeout structure is provided
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -148,6 +148,7 @@ extern bool rt_mutex_cleanup_proxy_lock(
 				 struct rt_mutex_waiter *waiter);

 extern int rt_mutex_futex_trylock(struct rt_mutex *l);
+extern int __rt_mutex_futex_trylock(struct rt_mutex *l);

 extern void rt_mutex_futex_unlock(struct rt_mutex *lock);
 extern bool __rt_mutex_futex_unlock(struct rt_mutex *lock,
Hi Greg,
On Mon, Jan 22, 2018 at 9:44 AM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
4.14-stable review patch. If anyone has any objections, please let me know.
From: Peter Zijlstra peterz@infradead.org
commit c1e2f0eaf015fb7076d51a339011f2383e6dd389 upstream.
May be a bit premature, given the fishy use of newtid, cfr. my comment in https://lkml.org/lkml/2018/1/22/274
Gr{oetje,eeting}s,
Geert
-- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Mon, Jan 22, 2018 at 10:48:55AM +0100, Geert Uytterhoeven wrote:
Hi Greg,
On Mon, Jan 22, 2018 at 9:44 AM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
4.14-stable review patch. If anyone has any objections, please let me know.
From: Peter Zijlstra peterz@infradead.org
commit c1e2f0eaf015fb7076d51a339011f2383e6dd389 upstream.
May be a bit premature, given the fishy use of newtid, cfr. my comment in https://lkml.org/lkml/2018/1/22/274
Not really a bug, but I will be glad to take any future patches that end up in Linus's tree to resolve your issue with obsolete compilers like this :)
thanks,
greg k-h
Hi Greg,
On Mon, Jan 22, 2018 at 10:53 AM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Jan 22, 2018 at 10:48:55AM +0100, Geert Uytterhoeven wrote:
On Mon, Jan 22, 2018 at 9:44 AM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
4.14-stable review patch. If anyone has any objections, please let me know.
From: Peter Zijlstra peterz@infradead.org
commit c1e2f0eaf015fb7076d51a339011f2383e6dd389 upstream.
May be a bit premature, given the fishy use of newtid, cfr. my comment in https://lkml.org/lkml/2018/1/22/274
Not really a bug, but I will be glad to take any future patches that end up in Linus's tree to resolve your issue with obsolete compilers like this :)
I'm not worried about the old compiler. I'm worried about a real bug related to the no longer used assignment.
Gr{oetje,eeting}s,
Geert
-- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Li Jinyue lijinyue@huawei.com
commit fbe0e839d1e22d88810f3ee3e2f1479be4c0aa4a upstream.
UBSAN reports signed integer overflow in kernel/futex.c:
UBSAN: Undefined behaviour in kernel/futex.c:2041:18
signed integer overflow:
0 - -2147483648 cannot be represented in type 'int'
Add a sanity check to catch negative values of nr_wake and nr_requeue.
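For reference, this stand-alone user-space snippet reproduces the arithmetic UBSAN complained about (an illustration only, not kernel code):

#include <limits.h>
#include <stdio.h>

int main(void)
{
	int nr_requeue = INT_MIN;	/* attacker-chosen syscall argument */

	/* 0 - (-2147483648) does not fit in an int: this is exactly the
	 * signed overflow reported above.  Build with
	 * -fsanitize=undefined to see the same diagnostic. */
	int x = 0 - nr_requeue;

	printf("%d\n", x);
	return 0;
}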
Signed-off-by: Li Jinyue lijinyue@huawei.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: peterz@infradead.org Cc: dvhart@infradead.org Link: https://lkml.kernel.org/r/1513242294-31786-1-git-send-email-lijinyue@huawei.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 kernel/futex.c |    3 +++
 1 file changed, 3 insertions(+)
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1878,6 +1878,9 @@ static int futex_requeue(u32 __user *uad
 	struct futex_q *this, *next;
 	DEFINE_WAKE_Q(wake_q);

+	if (nr_wake < 0 || nr_requeue < 0)
+		return -EINVAL;
+
 	/*
 	 * When PI not supported: return -ENOSYS if requeue_pi is true,
 	 * consequently the compiler knows requeue_pi is always false past
On 01/22/2018, 09:44 AM, Greg Kroah-Hartman wrote:
4.14-stable review patch. If anyone has any objections, please let me know.
From: Li Jinyue lijinyue@huawei.com
commit fbe0e839d1e22d88810f3ee3e2f1479be4c0aa4a upstream.
UBSAN reports signed integer overflow in kernel/futex.c:
UBSAN: Undefined behaviour in kernel/futex.c:2041:18
signed integer overflow:
0 - -2147483648 cannot be represented in type 'int'
Add a sanity check to catch negative values of nr_wake and nr_requeue.
Signed-off-by: Li Jinyue lijinyue@huawei.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: peterz@infradead.org Cc: dvhart@infradead.org Link: https://lkml.kernel.org/r/1513242294-31786-1-git-send-email-lijinyue@huawei.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
kernel/futex.c |    3 +++
1 file changed, 3 insertions(+)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1878,6 +1878,9 @@ static int futex_requeue(u32 __user *uad
 	struct futex_q *this, *next;
 	DEFINE_WAKE_Q(wake_q);

+	if (nr_wake < 0 || nr_requeue < 0)
+		return -EINVAL;
This breaks strace's test suite on 4.14.15 (and is present in upstream obviously too): futex(0x7ff568b44ffc, 0x3, 0xfacefeed, 0xbadda7a0ca7b100d, 0x7ff568b44ffc, 0x9caffee1) = -1: Invalid argument
strace uses weird values in the testkit to pass down to futex, as can be seen. I think that, like in:

commit e78c38f6bdd900b2ad9ac9df8eff58b745dc5b3c
Author: Jiri Slaby jslaby@suse.cz
Date:   Mon Oct 23 13:41:51 2017 +0200
futex: futex_wake_op, do not fail on invalid op
something similar should be done here too. Maybe:

	if (nr_wake < 0)
		nr_wake = 0;
	if (nr_requeue < 0)
		nr_requeue = 0;

?
Maybe also with some pr_info_ratelimited like in the above commit?
thanks,
On Thu, 25 Jan 2018, Jiri Slaby wrote:
On 01/22/2018, 09:44 AM, Greg Kroah-Hartman wrote:
+	if (nr_wake < 0 || nr_requeue < 0)
+		return -EINVAL;
This breaks strace's test suite on 4.14.15 (and is present in upstream obviously too): futex(0x7ff568b44ffc, 0x3, 0xfacefeed, 0xbadda7a0ca7b100d, 0x7ff568b44ffc, 0x9caffee1) = -1: Invalid argument
And why the hell is strace expecting this to be valid?
Thanks,
tglx
On 01/25/2018, 03:03 PM, Thomas Gleixner wrote:
On Thu, 25 Jan 2018, Jiri Slaby wrote:
On 01/22/2018, 09:44 AM, Greg Kroah-Hartman wrote:
+	if (nr_wake < 0 || nr_requeue < 0)
+		return -EINVAL;
This breaks strace's test suite on 4.14.15 (and is present in upstream obviously too): futex(0x7ff568b44ffc, 0x3, 0xfacefeed, 0xbadda7a0ca7b100d, 0x7ff568b44ffc, 0x9caffee1) = -1: Invalid argument
And why the hell is strace expecting this to be valid?
You ought to ask somebody else, I was confused the very same way:
My FIX: https://github.com/strace/strace/pull/16/commits/777587ea509481666274df88671...
Their NACK: https://github.com/strace/strace/pull/16#issuecomment-341614984
thanks,
On Thu, 25 Jan 2018, Jiri Slaby wrote:
On 01/25/2018, 03:03 PM, Thomas Gleixner wrote:
On Thu, 25 Jan 2018, Jiri Slaby wrote:
On 01/22/2018, 09:44 AM, Greg Kroah-Hartman wrote:
+	if (nr_wake < 0 || nr_requeue < 0)
+		return -EINVAL;
This breaks strace's test suite on 4.14.15 (and is present in upstream obviously too): futex(0x7ff568b44ffc, 0x3, 0xfacefeed, 0xbadda7a0ca7b100d, 0x7ff568b44ffc, 0x9caffee1) = -1: Invalid argument
And why the hell is strace expecting this to be valid?
You ought to ask somebody else, I was confused the very same way:
My FIX: https://github.com/strace/strace/pull/16/commits/777587ea509481666274df88671...
Their NACK: https://github.com/strace/strace/pull/16#issuecomment-341614984
https://github.com/strace/strace/commit/79d10dfc20985225e4ea044d3875c4cea090...
Update futex test in accordance with kernel's v4.15-rc7-202-gfbe0e83
* futex.c (VALP, VALP_PR, VAL2P, VAL2P_PR): New macro definitions.
(main): Allow EINVAL on *REQUEUE* checks with VAL/VAL2 with the higher
bit being set; check that the existing behaviour is preserved with
VALP/VAL2P where the higher bit is unset.
So what's the problem?
Thanks,
tglx
On 01/25/2018, 03:30 PM, Thomas Gleixner wrote:
So what's the problem?
The problem I see is that every stable kernel now requires an updated strace, with their commit from yesterday, to build correctly. In particular, the new stable kernels cause rpm build failures of strace in all our distros (based on those stable kernels). Sure, we can patch strace in every distro on every nth kernel update, but that's simply impractical. The kernel should not break userspace, right?
BTW why was the patch applied to stable? We actually do pass -fno-strict-overflow.
thanks,
On Thu, Jan 25, 2018 at 03:47:32PM +0100, Jiri Slaby wrote:
On 01/25/2018, 03:30 PM, Thomas Gleixner wrote:
So what's the problem?
The problem I see is that every stable kernel now requires an updated strace, with their commit from yesterday, to build correctly. In particular, the new stable kernels cause rpm build failures of strace in all our distros (based on those stable kernels). Sure, we can patch strace in every distro on every nth kernel update, but that's simply impractical. The kernel should not break userspace, right?
Well, when userspace is doing something stupid... :)
BTW why was the patch applied to stable? We actually do pass -fno-strict-overflow.
The same reason it was applied upstream, it fixes a reported issue.
Does that mean that all UBSAN overflow error reports are not valid because of how we build the kernel?
thanks,
greg k-h
On 01/25/2018, 04:12 PM, Greg Kroah-Hartman wrote:
On Thu, Jan 25, 2018 at 03:47:32PM +0100, Jiri Slaby wrote:
On 01/25/2018, 03:30 PM, Thomas Gleixner wrote:
So what's the problem?
The problem I see is that every stable kernel now requires an updated strace, with their commit from yesterday, to build correctly. In particular, the new stable kernels cause rpm build failures of strace in all our distros (based on those stable kernels). Sure, we can patch strace in every distro on every nth kernel update, but that's simply impractical. The kernel should not break userspace, right?
Well, when userspace is doing something stupid... :)
No doubt... But does that mean we no longer maintain the "no userspace breakage even if it is stupid" rule?
BTW why was the patch applied to stable? We actually do pass -fno-strict-overflow.
The same reason it was applied upstream, it fixes a reported issue.
Does that mean that all UBSAN overflow error reports are not valid because of how we build the kernel?
IMO yes, because with the option, signed overflow is not undefined.
In the long term, it would be nice to get rid of *all* signed integer overflows and kill the compiler option from Makefile. Therefore the fixes are indeed very valid in upstream.
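As a quick illustration of what the flag changes (a user-space sketch; -fno-strict-overflow stops gcc from optimizing on the assumption that signed overflow cannot happen, much like -fwrapv):

#include <limits.h>
#include <stdio.h>

int main(void)
{
	int x = INT_MAX;

	/* Undefined behaviour in ISO C.  With -fno-strict-overflow (or
	 * -fwrapv) the compiler may not assume this never happens, and
	 * in practice the value wraps to INT_MIN in two's complement. */
	x = x + 1;

	printf("%d\n", x);	/* prints -2147483648 with the flag */
	return 0;
}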
thanks,
On Thu, Jan 25, 2018 at 04:21:51PM +0100, Jiri Slaby wrote:
The same reason it was applied upstream, it fixes a reported issue.
Does that mean that all UBSAN overflow error reports are not valid because of how we build the kernel?
IMO yes, because with the option, signed overflow is not undefined.
In the long term, it would be nice to get rid of *all* signed integer overflows and kill the compiler option from Makefile. Therefore the fixes are indeed very valid in upstream.
I actually think the option is unconditionally good. Undefined behaviour in a language is bad. Sadly C has lots of it, but any reduction we can have we must take.
On Thu, Jan 25, 2018 at 04:21:51PM +0100, Jiri Slaby wrote:
On 01/25/2018, 04:12 PM, Greg Kroah-Hartman wrote:
On Thu, Jan 25, 2018 at 03:47:32PM +0100, Jiri Slaby wrote:
On 01/25/2018, 03:30 PM, Thomas Gleixner wrote:
So what's the problem?
The problem I see is that every stable kernel now requires an updated strace, with their commit from yesterday, to build correctly. In particular, the new stable kernels cause rpm build failures of strace in all our distros (based on those stable kernels). Sure, we can patch strace in every distro on every nth kernel update, but that's simply impractical. The kernel should not break userspace, right?
Well, when userspace is doing something stupid... :)
No doubt... But does that mean we no longer maintain the "no userspace breakage even if it is stupid" rule?
One of the reasons we have been adding these earlier input validation checks to futex has been to mitigate security exploits taking advantage of the complex nature of the system call. Granted we should have done this initially, but if we avoid some of these nasty exploits (and the real harm they enable), then yeah, this is worth fixing userspace which is relying on undefined behavior.
I'd still like to find out why various distros are sending garbage to uaddr2 for network setup (but that's another topic).
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit b3defb791b26ea0683a93a4f49c77ec45ec96f10 upstream.
The ALSA sequencer ioctls have no protection against racy calls, so concurrent operations may interfere with each other. As reported recently, for example, concurrent calls setting the client pool combined with write calls may lead to either an unkillable deadlock or a use-after-free (UAF).
As a slightly big-hammer solution, this patch introduces a mutex to make each ioctl exclusive. Although this may reduce the performance of parallel ioctl calls, such parallelism is usually not needed for sequencer usage, so the impact should be negligible.
Reported-by: Luo Quan a4651386@163.com Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 sound/core/seq/seq_clientmgr.c |    3 +++
 sound/core/seq/seq_clientmgr.h |    1 +
 2 files changed, 4 insertions(+)
--- a/sound/core/seq/seq_clientmgr.c
+++ b/sound/core/seq/seq_clientmgr.c
@@ -221,6 +221,7 @@ static struct snd_seq_client *seq_create
 	rwlock_init(&client->ports_lock);
 	mutex_init(&client->ports_mutex);
 	INIT_LIST_HEAD(&client->ports_list_head);
+	mutex_init(&client->ioctl_mutex);

 	/* find free slot in the client table */
 	spin_lock_irqsave(&clients_lock, flags);
@@ -2126,7 +2127,9 @@ static long snd_seq_ioctl(struct file *f
 			return -EFAULT;
 	}

+	mutex_lock(&client->ioctl_mutex);
 	err = handler->func(client, &buf);
+	mutex_unlock(&client->ioctl_mutex);
 	if (err >= 0) {
 		/* Some commands includes a bug in 'dir' field. */
 		if (handler->cmd == SNDRV_SEQ_IOCTL_SET_QUEUE_CLIENT ||
--- a/sound/core/seq/seq_clientmgr.h
+++ b/sound/core/seq/seq_clientmgr.h
@@ -61,6 +61,7 @@ struct snd_seq_client {
 	struct list_head ports_list_head;
 	rwlock_t ports_lock;
 	struct mutex ports_mutex;
+	struct mutex ioctl_mutex;
 	int convert32;		/* convert 32->64bit */
/* output pool */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit 23b19b7b50fe1867da8d431eea9cd3e4b6328c2c upstream.
muldiv32() contains a snd_BUG_ON() (which morphs into WARN_ON() with the debug option) for checking the case of 0 / 0. This would be helpful if that could happen only as a logic error; however, since the hw refine is performed with whatever data set userspace provides, inconsistent values that trigger such a condition can easily be passed. Actually, syzbot caught this by passing a zeroed old hw_params ioctl.
So having snd_BUG_ON() there is simply superfluous and rather harmful, causing unnecessary confusion. Let's get rid of it.
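For context, here is a rough user-space model of what muldiv32() computes once the check is gone (an approximation of the kernel helper, not the exact source):

#include <stdint.h>
#include <limits.h>

static unsigned int muldiv32_model(unsigned int a, unsigned int b,
				   unsigned int c, unsigned int *r)
{
	uint64_t n = (uint64_t)a * b;

	if (c == 0) {
		/* Covers 0 / 0 too: saturate quietly instead of warning,
		 * since userspace can feed us c == 0 legitimately. */
		*r = 0;
		return UINT_MAX;
	}
	*r = n % c;
	n /= c;
	if (n >= UINT_MAX) {	/* quotient does not fit: saturate */
		*r = 0;
		return UINT_MAX;
	}
	return n;
}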
Reported-by: syzbot+7e6ee55011deeebce15d@syzkaller.appspotmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 sound/core/pcm_lib.c |    1 -
 1 file changed, 1 deletion(-)
--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -560,7 +560,6 @@ static inline unsigned int muldiv32(unsi
 {
 	u_int64_t n = (u_int64_t) a * b;
 	if (c == 0) {
-		snd_BUG_ON(!n);
 		*r = 0;
 		return UINT_MAX;
 	}
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit e4c9fd10eb21376f44723c40ad12395089251c28 upstream.
There is another Dell XPS 13 variant (SSID 1028:082a) that requires the existing fixup for reducing the headphone noise. This patch adds the quirk entry for that.
BugLink: http://lkml.kernel.org/r/CAHXyb9ZCZJzVisuBARa+UORcjRERV8yokez=DP1_5O5isTz0ZA... Reported-and-tested-by: Francisco G. frangio.1@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 sound/pci/hda/patch_realtek.c |    1 +
 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -6173,6 +6173,7 @@ static const struct snd_pci_quirk alc269
 	SND_PCI_QUIRK(0x1028, 0x075b, "Dell XPS 13 9360", ALC256_FIXUP_DELL_XPS_13_HEADPHONE_NOISE),
 	SND_PCI_QUIRK(0x1028, 0x075d, "Dell AIO", ALC298_FIXUP_SPK_VOLUME),
 	SND_PCI_QUIRK(0x1028, 0x0798, "Dell Inspiron 17 7000 Gaming", ALC256_FIXUP_DELL_INSPIRON_7559_SUBWOOFER),
+	SND_PCI_QUIRK(0x1028, 0x082a, "Dell XPS 13 9360", ALC256_FIXUP_DELL_XPS_13_HEADPHONE_NOISE),
 	SND_PCI_QUIRK(0x1028, 0x164a, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
 	SND_PCI_QUIRK(0x1028, 0x164b, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
 	SND_PCI_QUIRK(0x103c, 0x1586, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC2),
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit 031f335cda879450095873003abb03ae8ed3b74a upstream.
iMac 14,1 requires the same quirk as iMac 12,2, using GPIO 2 and 3 for headphone and speaker output amps. Add the codec SSID quirk entry (106b:0600) accordingly.
BugLink: http://lkml.kernel.org/r/CAEw6Zyteav09VGHRfD5QwsfuWv5a43r0tFBNbfcHXoNrxVz7ew... Reported-by: Freaky freaky2000@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 sound/pci/hda/patch_cirrus.c |    1 +
 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -408,6 +408,7 @@ static const struct snd_pci_quirk cs420x
 	/*SND_PCI_QUIRK(0x8086, 0x7270, "IMac 27 Inch", CS420X_IMAC27),*/

 	/* codec SSID */
+	SND_PCI_QUIRK(0x106b, 0x0600, "iMac 14,1", CS420X_IMAC27_122),
 	SND_PCI_QUIRK(0x106b, 0x1c00, "MacBookPro 8,1", CS420X_MBP81),
 	SND_PCI_QUIRK(0x106b, 0x2000, "iMac 12,2", CS420X_IMAC27_122),
 	SND_PCI_QUIRK(0x106b, 0x2800, "MacBookPro 10,1", CS420X_MBP101),
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@oracle.com
commit 57194fa763bfa1a0908f30d4c77835beaa118fcb upstream.
In the original code, we set "fd->uctxt" to NULL and then dereference it, which will cause an Oops.
Fixes: f2a3bc00a03c ("IB/hfi1: Protect context array set/clear with spinlock") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Reviewed-by: Michael J. Ruhl michael.j.ruhl@intel.com Signed-off-by: Doug Ledford dledford@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/infiniband/hw/hfi1/file_ops.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -881,11 +881,11 @@ static int complete_subctxt(struct hfi1_
 	}

 	if (ret) {
-		hfi1_rcd_put(fd->uctxt);
-		fd->uctxt = NULL;
 		spin_lock_irqsave(&fd->dd->uctxt_lock, flags);
 		__clear_bit(fd->subctxt, fd->uctxt->in_use_ctxts);
 		spin_unlock_irqrestore(&fd->dd->uctxt_lock, flags);
+		hfi1_rcd_put(fd->uctxt);
+		fd->uctxt = NULL;
 	}
return ret;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leon Romanovsky leonro@mellanox.com
commit ae59c3f0b6cfd472fed96e50548a799b8971d876 upstream.
The rdma_ah_find_type() accesses the port array based on an index controlled by userspace. The existing bounds check is after the first use of the index, so userspace can generate an out-of-bounds access, as shown by the KASAN report below.
==================================================================
BUG: KASAN: slab-out-of-bounds in to_rdma_ah_attr+0xa8/0x3b0
Read of size 4 at addr ffff880019ae2268 by task ibv_rc_pingpong/409

CPU: 0 PID: 409 Comm: ibv_rc_pingpong Not tainted 4.15.0-rc2-00031-gb60a3faf5b83-dirty #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
 dump_stack+0xe9/0x18f
 print_address_description+0xa2/0x350
 kasan_report+0x3a5/0x400
 to_rdma_ah_attr+0xa8/0x3b0
 mlx5_ib_query_qp+0xd35/0x1330
 ib_query_qp+0x8a/0xb0
 ib_uverbs_query_qp+0x237/0x7f0
 ib_uverbs_write+0x617/0xd80
 __vfs_write+0xf7/0x500
 vfs_write+0x149/0x310
 SyS_write+0xca/0x190
 entry_SYSCALL_64_fastpath+0x18/0x85
RIP: 0033:0x7fe9c7a275a0
RSP: 002b:00007ffee5498738 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fe9c7ce4b00 RCX: 00007fe9c7a275a0
RDX: 0000000000000018 RSI: 00007ffee5498800 RDI: 0000000000000003
RBP: 000055d0c8d3f010 R08: 00007ffee5498800 R09: 0000000000000018
R10: 00000000000000ba R11: 0000000000000246 R12: 0000000000008000
R13: 0000000000004fb0 R14: 000055d0c8d3f050 R15: 00007ffee5498560

Allocated by task 1:
 __kmalloc+0x3f9/0x430
 alloc_mad_private+0x25/0x50
 ib_mad_post_receive_mads+0x204/0xa60
 ib_mad_init_device+0xa59/0x1020
 ib_register_device+0x83a/0xbc0
 mlx5_ib_add+0x50e/0x5c0
 mlx5_add_device+0x142/0x410
 mlx5_register_interface+0x18f/0x210
 mlx5_ib_init+0x56/0x63
 do_one_initcall+0x15b/0x270
 kernel_init_freeable+0x2d8/0x3d0
 kernel_init+0x14/0x190
 ret_from_fork+0x24/0x30

Freed by task 0:
(stack is not available)

The buggy address belongs to the object at ffff880019ae2000
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 104 bytes to the right of
 512-byte region [ffff880019ae2000, ffff880019ae2200)
The buggy address belongs to the page:
page:000000005d674e18 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0
flags: 0x4000000000008100(slab|head)
raw: 4000000000008100 0000000000000000 0000000000000000 00000001000c000c
raw: dead000000000100 dead000000000200 ffff88001a402000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff880019ae2100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff880019ae2180: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc
>ffff880019ae2200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                   ^
 ffff880019ae2280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff880019ae2300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
Disabling lock debugging due to kernel taint
Fixes: 44c58487d51a ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types") Signed-off-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/infiniband/hw/mlx5/qp.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4303,12 +4303,11 @@ static void to_rdma_ah_attr(struct mlx5_

 	memset(ah_attr, 0, sizeof(*ah_attr));

-	ah_attr->type = rdma_ah_find_type(&ibdev->ib_dev, path->port);
-	rdma_ah_set_port_num(ah_attr, path->port);
-	if (rdma_ah_get_port_num(ah_attr) == 0 ||
-	    rdma_ah_get_port_num(ah_attr) > MLX5_CAP_GEN(dev, num_ports))
+	if (!path->port || path->port > MLX5_CAP_GEN(dev, num_ports))
 		return;

+	ah_attr->type = rdma_ah_find_type(&ibdev->ib_dev, path->port);
+	rdma_ah_set_port_num(ah_attr, path->port);
+
 	rdma_ah_set_sl(ah_attr, path->dci_cfi_prio_sl & 0xf);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit ed4bbf7910b28ce3c691aef28d245585eaabda06 upstream.
When the timer base is checked for expired timers then the deferrable base must be checked as well. This was missed when making the deferrable base independent of base::nohz_active.
Fixes: ced6d5c11d3e ("timers: Use deferrable base independent of base::nohz_active") Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: Anna-Maria Gleixner anna-maria@linutronix.de Cc: Frederic Weisbecker fweisbec@gmail.com Cc: Peter Zijlstra peterz@infradead.org Cc: Sebastian Siewior bigeasy@linutronix.de Cc: Paul McKenney paulmck@linux.vnet.ibm.com Cc: rt@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 kernel/time/timer.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1656,7 +1656,7 @@ void run_local_timers(void)
 	hrtimer_run_queues();
 	/* Raise the softirq only if required. */
 	if (time_before(jiffies, base->clk)) {
-		if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->nohz_active)
+		if (!IS_ENABLED(CONFIG_NO_HZ_COMMON))
 			return;
 		/* CPU is awake, so check the deferrable base. */
 		base++;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 06b335cb51af018d5feeff5dd4fd53847ddb675a upstream.
If a message sent to a PF_KEY socket ended with one of the extensions that takes a 'struct sadb_address' but there were not enough bytes remaining in the message for the ->sa_family member of the 'struct sockaddr' which is supposed to follow, then verify_address_len() read past the end of the message, into uninitialized memory. Fix it by returning -EINVAL in this case.
This bug was found using syzkaller with KMSAN.
Reproducer:
#include <linux/pfkeyv2.h>
#include <sys/socket.h>
#include <unistd.h>

int main()
{
	int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
	char buf[24] = { 0 };
	struct sadb_msg *msg = (void *)buf;
	struct sadb_address *addr = (void *)(msg + 1);

	msg->sadb_msg_version = PF_KEY_V2;
	msg->sadb_msg_type = SADB_DELETE;
	msg->sadb_msg_len = 3;
	addr->sadb_address_len = 1;
	addr->sadb_address_exttype = SADB_EXT_ADDRESS_SRC;

	write(sock, buf, 24);
}
Reported-by: Alexander Potapenko glider@google.com Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/key/af_key.c |    5 +++++
 1 file changed, 5 insertions(+)
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -401,6 +401,11 @@ static int verify_address_len(const void
 #endif
 	int len;

+	if (sp->sadb_address_len <
+	    DIV_ROUND_UP(sizeof(*sp) + offsetofend(typeof(*addr), sa_family),
+			 sizeof(uint64_t)))
+		return -EINVAL;
+
 	switch (addr->sa_family) {
 	case AF_INET:
 		len = DIV_ROUND_UP(sizeof(*sp) + sizeof(*sin), sizeof(uint64_t));
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 4e765b4972af7b07adcb1feb16e7a525ce1f6b28 upstream.
If a message sent to a PF_KEY socket ended with an incomplete extension header (fewer than 4 bytes remaining), then parse_exthdrs() read past the end of the message, into uninitialized memory. Fix it by returning -EINVAL in this case.
Reproducer:
#include <linux/pfkeyv2.h>
#include <sys/socket.h>
#include <unistd.h>

int main()
{
	int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
	char buf[17] = { 0 };
	struct sadb_msg *msg = (void *)buf;

	msg->sadb_msg_version = PF_KEY_V2;
	msg->sadb_msg_type = SADB_DELETE;
	msg->sadb_msg_len = 2;

	write(sock, buf, 17);
}
Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/key/af_key.c |    3 +++
 1 file changed, 3 insertions(+)
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -516,6 +516,9 @@ static int parse_exthdrs(struct sk_buff
 		uint16_t ext_type;
 		int ext_len;

+		if (len < sizeof(*ehdr))
+			return -EINVAL;
+
 		ext_len  = ehdr->sadb_ext_len;
 		ext_len *= sizeof(uint64_t);
 		ext_type = ehdr->sadb_ext_type;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sagi Grimberg sagi@grimberg.me
commit cd52cb26e7ead5093635e98e07e221e4df482d34 upstream.
In case we fail to establish the connection, we must drain our pre-posted login receive work request before safely continuing with connection teardown.
Fixes: a060b5629ab0 ("IB/core: generic RDMA READ/WRITE API") Reported-by: Amrani, Ram Ram.Amrani@cavium.com Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Doug Ledford dledford@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/infiniband/ulp/isert/ib_isert.c |    1 +
 1 file changed, 1 insertion(+)
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -741,6 +741,7 @@ isert_connect_error(struct rdma_cm_id *c
 {
 	struct isert_conn *isert_conn = cma_id->qp->qp_context;

+	ib_drain_qp(isert_conn->qp);
 	list_del_init(&isert_conn->node);
 	isert_conn->cm_id = NULL;
 	isert_put_conn(isert_conn);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josh Snyder joshs@netflix.com
commit c96f5471ce7d2aefd0dda560cc23f08ab00bc65d upstream.
Before commit:
e33a9bba85a8 ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler")
delayacct_blkio_end() was called after context-switching into the task which completed I/O.
This resulted in double counting: the task would account a delay both waiting for I/O and for time spent in the runqueue.
With e33a9bba85a8, delayacct_blkio_end() is called by try_to_wake_up(). In ttwu, we have not yet context-switched. This is more correct, in that the delay accounting ends when the I/O is complete.
But delayacct_blkio_end() relies on 'get_current()', and we have not yet context-switched into the task whose I/O completed. This results in the wrong task having its delay accounting statistics updated.
Instead of doing that, pass the task_struct being woken to delayacct_blkio_end(), so that it can update the statistics of the correct task.
Signed-off-by: Josh Snyder joshs@netflix.com Acked-by: Tejun Heo tj@kernel.org Acked-by: Balbir Singh bsingharora@gmail.com Cc: Brendan Gregg bgregg@netflix.com Cc: Jens Axboe axboe@kernel.dk Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: linux-block@vger.kernel.org Fixes: e33a9bba85a8 ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler") Link: http://lkml.kernel.org/r/1513613712-571-1-git-send-email-joshs@netflix.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 include/linux/delayacct.h |    8 ++++----
 kernel/delayacct.c        |   42 ++++++++++++++++++++++++++----------------
 kernel/sched/core.c       |    6 +++---
 3 files changed, 33 insertions(+), 23 deletions(-)
--- a/include/linux/delayacct.h
+++ b/include/linux/delayacct.h
@@ -71,7 +71,7 @@ extern void delayacct_init(void);
 extern void __delayacct_tsk_init(struct task_struct *);
 extern void __delayacct_tsk_exit(struct task_struct *);
 extern void __delayacct_blkio_start(void);
-extern void __delayacct_blkio_end(void);
+extern void __delayacct_blkio_end(struct task_struct *);
 extern int __delayacct_add_tsk(struct taskstats *, struct task_struct *);
 extern __u64 __delayacct_blkio_ticks(struct task_struct *);
 extern void __delayacct_freepages_start(void);
@@ -122,10 +122,10 @@ static inline void delayacct_blkio_start
 		__delayacct_blkio_start();
 }

-static inline void delayacct_blkio_end(void)
+static inline void delayacct_blkio_end(struct task_struct *p)
 {
 	if (current->delays)
-		__delayacct_blkio_end();
+		__delayacct_blkio_end(p);
 	delayacct_clear_flag(DELAYACCT_PF_BLKIO);
 }

@@ -169,7 +169,7 @@ static inline void delayacct_tsk_free(st
 {}
 static inline void delayacct_blkio_start(void)
 {}
-static inline void delayacct_blkio_end(void)
+static inline void delayacct_blkio_end(struct task_struct *p)
 {}
 static inline int delayacct_add_tsk(struct taskstats *d,
 				    struct task_struct *tsk)
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -51,16 +51,16 @@ void __delayacct_tsk_init(struct task_st
  * Finish delay accounting for a statistic using its timestamps (@start),
  * accumalator (@total) and @count
  */
-static void delayacct_end(u64 *start, u64 *total, u32 *count)
+static void delayacct_end(spinlock_t *lock, u64 *start, u64 *total, u32 *count)
 {
 	s64 ns = ktime_get_ns() - *start;
 	unsigned long flags;

 	if (ns > 0) {
-		spin_lock_irqsave(&current->delays->lock, flags);
+		spin_lock_irqsave(lock, flags);
 		*total += ns;
 		(*count)++;
-		spin_unlock_irqrestore(&current->delays->lock, flags);
+		spin_unlock_irqrestore(lock, flags);
 	}
 }

@@ -69,17 +69,25 @@ void __delayacct_blkio_start(void)
 	current->delays->blkio_start = ktime_get_ns();
 }

-void __delayacct_blkio_end(void)
+/*
+ * We cannot rely on the `current` macro, as we haven't yet switched back to
+ * the process being woken.
+ */
+void __delayacct_blkio_end(struct task_struct *p)
 {
-	if (current->delays->flags & DELAYACCT_PF_SWAPIN)
-		/* Swapin block I/O */
-		delayacct_end(&current->delays->blkio_start,
-			&current->delays->swapin_delay,
-			&current->delays->swapin_count);
-	else	/* Other block I/O */
-		delayacct_end(&current->delays->blkio_start,
-			&current->delays->blkio_delay,
-			&current->delays->blkio_count);
+	struct task_delay_info *delays = p->delays;
+	u64 *total;
+	u32 *count;
+
+	if (p->delays->flags & DELAYACCT_PF_SWAPIN) {
+		total = &delays->swapin_delay;
+		count = &delays->swapin_count;
+	} else {
+		total = &delays->blkio_delay;
+		count = &delays->blkio_count;
+	}
+
+	delayacct_end(&delays->lock, &delays->blkio_start, total, count);
 }

 int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
@@ -153,8 +161,10 @@ void __delayacct_freepages_start(void)

 void __delayacct_freepages_end(void)
 {
-	delayacct_end(&current->delays->freepages_start,
-			&current->delays->freepages_delay,
-			&current->delays->freepages_count);
+	delayacct_end(
+		&current->delays->lock,
+		&current->delays->freepages_start,
+		&current->delays->freepages_delay,
+		&current->delays->freepages_count);
 }

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2046,7 +2046,7 @@ try_to_wake_up(struct task_struct *p, un
 	p->state = TASK_WAKING;

 	if (p->in_iowait) {
-		delayacct_blkio_end();
+		delayacct_blkio_end(p);
 		atomic_dec(&task_rq(p)->nr_iowait);
 	}

@@ -2059,7 +2059,7 @@ try_to_wake_up(struct task_struct *p, un
 #else /* CONFIG_SMP */

 	if (p->in_iowait) {
-		delayacct_blkio_end();
+		delayacct_blkio_end(p);
 		atomic_dec(&task_rq(p)->nr_iowait);
 	}

@@ -2112,7 +2112,7 @@ static void try_to_wake_up_local(struct

 	if (!task_on_rq_queued(p)) {
 		if (p->in_iowait) {
-			delayacct_blkio_end();
+			delayacct_blkio_end(p);
 			atomic_dec(&rq->nr_iowait);
 		}
 		ttwu_activate(rq, p, ENQUEUE_WAKEUP | ENQUEUE_NOCLOCK);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josh Poimboeuf jpoimboe@redhat.com
commit 2a0098d70640dda192a79966c14d449e7a34d675 upstream.
Objtool segfaults when the gold linker is used with CONFIG_MODVERSIONS=y and CONFIG_UNWINDER_ORC=y.
With CONFIG_MODVERSIONS=y, the .o file gets passed to the linker before being passed to objtool. The gold linker seems to strip unused ELF symbols by default, which confuses objtool and causes the seg fault when it's trying to generate ORC metadata.
Objtool should really be running immediately after GCC anyway, without a linker call in between. Change the makefile ordering so that objtool is called before the linker.
Reported-and-tested-by: Markus M4rkusXXL@web.de Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Link: http://lkml.kernel.org/r/355f04da33581f4a3bf82e5b512973624a1e23a2.1516025651... Signed-off-by: Ingo Molnar mingo@kernel.org Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 scripts/Makefile.build |   14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -270,12 +270,18 @@ else
 objtool_args += $(call cc-ifversion, -lt, 0405, --no-unreachable)
 endif

+ifdef CONFIG_MODVERSIONS
+objtool_o = $(@D)/.tmp_$(@F)
+else
+objtool_o = $(@)
+endif
+
 # 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory
 # 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a file
 # 'OBJECT_FILES_NON_STANDARD_foo.o := 'n': override directory skip for a file
 cmd_objtool = $(if $(patsubst y%,, \
 	$(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n), \
-	$(__objtool_obj) $(objtool_args) "$(@)";)
+	$(__objtool_obj) $(objtool_args) "$(objtool_o)";)
 objtool_obj = $(if $(patsubst y%,, \
 	$(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n), \
 	$(__objtool_obj))
@@ -291,15 +297,15 @@ objtool_dep = $(objtool_obj) \
 define rule_cc_o_c
 	$(call echo-cmd,checksrc) $(cmd_checksrc)			  \
 	$(call cmd_and_fixdep,cc_o_c)					  \
-	$(cmd_modversions_c)						  \
 	$(call echo-cmd,objtool) $(cmd_objtool)				  \
+	$(cmd_modversions_c)						  \
 	$(call echo-cmd,record_mcount) $(cmd_record_mcount)
 endef

 define rule_as_o_S
 	$(call cmd_and_fixdep,as_o_S)					  \
-	$(cmd_modversions_S)						  \
-	$(call echo-cmd,objtool) $(cmd_objtool)
+	$(call echo-cmd,objtool) $(cmd_objtool)				  \
+	$(cmd_modversions_S)
 endef
# List module undefined symbols (or empty line if not enabled)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrey Ryabinin aryabinin@virtuozzo.com
commit 0d39e2669d7b0fefd2d8f9e7868ae669b364d9ba upstream.
Currently KASAN doesn't panic in case it doesn't have enough memory to boot. Instead, it crashes in some random place:
kernel BUG at arch/x86/mm/physaddr.c:27!
RIP: 0010:__phys_addr+0x268/0x276
Call Trace:
 kasan_populate_shadow+0x3f2/0x497
 kasan_init+0x12e/0x2b2
 setup_arch+0x2825/0x2a2c
 start_kernel+0xc8/0x15f4
 x86_64_start_reservations+0x2a/0x2c
 x86_64_start_kernel+0x72/0x75
 secondary_startup_64+0xa5/0xb0
Use memblock_virt_alloc_try_nid() for allocations without failure fallback. It will panic with an out of memory message.
Reported-by: kernel test robot xiaolong.ye@intel.com Signed-off-by: Andrey Ryabinin aryabinin@virtuozzo.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Dmitry Vyukov dvyukov@google.com Cc: kasan-dev@googlegroups.com Cc: Alexander Potapenko glider@google.com Cc: lkp@01.org Link: https://lkml.kernel.org/r/20180110153602.18919-1-aryabinin@virtuozzo.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/mm/kasan_init_64.c |   24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -21,10 +21,14 @@ extern struct range pfn_mapped[E820_MAX_

 static p4d_t tmp_p4d_table[PTRS_PER_P4D] __initdata __aligned(PAGE_SIZE);

-static __init void *early_alloc(size_t size, int nid)
+static __init void *early_alloc(size_t size, int nid, bool panic)
 {
-	return memblock_virt_alloc_try_nid_nopanic(size, size,
-		__pa(MAX_DMA_ADDRESS), BOOTMEM_ALLOC_ACCESSIBLE, nid);
+	if (panic)
+		return memblock_virt_alloc_try_nid(size, size,
+			__pa(MAX_DMA_ADDRESS), BOOTMEM_ALLOC_ACCESSIBLE, nid);
+	else
+		return memblock_virt_alloc_try_nid_nopanic(size, size,
+			__pa(MAX_DMA_ADDRESS), BOOTMEM_ALLOC_ACCESSIBLE, nid);
 }

 static void __init kasan_populate_pmd(pmd_t *pmd, unsigned long addr,
@@ -38,14 +42,14 @@ static void __init kasan_populate_pmd(pm
 	if (boot_cpu_has(X86_FEATURE_PSE) &&
 	    ((end - addr) == PMD_SIZE) &&
 	    IS_ALIGNED(addr, PMD_SIZE)) {
-		p = early_alloc(PMD_SIZE, nid);
+		p = early_alloc(PMD_SIZE, nid, false);
 		if (p && pmd_set_huge(pmd, __pa(p), PAGE_KERNEL))
 			return;
 		else if (p)
 			memblock_free(__pa(p), PMD_SIZE);
 	}

-	p = early_alloc(PAGE_SIZE, nid);
+	p = early_alloc(PAGE_SIZE, nid, true);
 	pmd_populate_kernel(&init_mm, pmd, p);
 }

@@ -57,7 +61,7 @@ static void __init kasan_populate_pmd(pm
 		if (!pte_none(*pte))
 			continue;

-		p = early_alloc(PAGE_SIZE, nid);
+		p = early_alloc(PAGE_SIZE, nid, true);
 		entry = pfn_pte(PFN_DOWN(__pa(p)), PAGE_KERNEL);
 		set_pte_at(&init_mm, addr, pte, entry);
 	} while (pte++, addr += PAGE_SIZE, addr != end);
@@ -75,14 +79,14 @@ static void __init kasan_populate_pud(pu
 	if (boot_cpu_has(X86_FEATURE_GBPAGES) &&
 	    ((end - addr) == PUD_SIZE) &&
 	    IS_ALIGNED(addr, PUD_SIZE)) {
-		p = early_alloc(PUD_SIZE, nid);
+		p = early_alloc(PUD_SIZE, nid, false);
 		if (p && pud_set_huge(pud, __pa(p), PAGE_KERNEL))
 			return;
 		else if (p)
 			memblock_free(__pa(p), PUD_SIZE);
 	}

-	p = early_alloc(PAGE_SIZE, nid);
+	p = early_alloc(PAGE_SIZE, nid, true);
 	pud_populate(&init_mm, pud, p);
 }

@@ -101,7 +105,7 @@ static void __init kasan_populate_p4d(p4
 	unsigned long next;

 	if (p4d_none(*p4d)) {
-		void *p = early_alloc(PAGE_SIZE, nid);
+		void *p = early_alloc(PAGE_SIZE, nid, true);

 		p4d_populate(&init_mm, p4d, p);
 	}
@@ -122,7 +126,7 @@ static void __init kasan_populate_pgd(pg
 	unsigned long next;

 	if (pgd_none(*pgd)) {
-		p = early_alloc(PAGE_SIZE, nid);
+		p = early_alloc(PAGE_SIZE, nid, true);
 		pgd_populate(&init_mm, pgd, p);
 	}
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit c995efd5a740d9cbafbf58bde4973e8b50b4d761 upstream.
On context switch from a shallow call stack to a deeper one, as the CPU does 'ret' up the deeper side it may encounter RSB entries (predictions for where the 'ret' goes to) which were populated in userspace.
This is problematic if neither SMEP nor KPTI (the latter of which marks userspace pages as NX for the kernel) are active, as malicious code in userspace may then be executed speculatively.
Overwrite the CPU's return prediction stack with calls which are predicted to return to an infinite loop, to "capture" speculation if this happens. This is required both for retpoline, and also in conjunction with IBRS for !SMEP && !KPTI.
On Skylake+ the problem is slightly different, and an *underflow* of the RSB may cause errant branch predictions to occur. So there it's not so much overwrite, as *filling* the RSB to attempt to prevent it getting empty. This is only a partial solution for Skylake+ since there are many other conditions which may result in the RSB becoming empty. The full solution on Skylake+ is to use IBRS, which will prevent the problem even when the RSB becomes empty. With IBRS, the RSB-stuffing will not be required on context switch.
[ tglx: Added missing vendor check and slighty massaged comments and changelog ]
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Arjan van de Ven arjan@linux.intel.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jiri Kosina jikos@kernel.org Cc: Andy Lutomirski luto@amacapital.net Cc: Dave Hansen dave.hansen@intel.com Cc: Kees Cook keescook@google.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Paul Turner pjt@google.com Link: https://lkml.kernel.org/r/1515779365-9032-1-git-send-email-dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/entry/entry_32.S          |   11 +++++++++++
 arch/x86/entry/entry_64.S          |   11 +++++++++++
 arch/x86/include/asm/cpufeatures.h |    1 +
 arch/x86/kernel/cpu/bugs.c         |   36 ++++++++++++++++++++++++++++++++++++
 4 files changed, 59 insertions(+)
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -244,6 +244,17 @@ ENTRY(__switch_to_asm)
 	movl	%ebx, PER_CPU_VAR(stack_canary)+stack_canary_offset
 #endif

+#ifdef CONFIG_RETPOLINE
+	/*
+	 * When switching from a shallower to a deeper call stack
+	 * the RSB may either underflow or use entries populated
+	 * with userspace addresses. On CPUs where those concerns
+	 * exist, overwrite the RSB with entries which capture
+	 * speculative execution to prevent attack.
+	 */
+	FILL_RETURN_BUFFER %ebx, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
+#endif
+
 	/* restore callee-saved registers */
 	popl	%esi
 	popl	%edi
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -487,6 +487,17 @@ ENTRY(__switch_to_asm)
 	movq	%rbx, PER_CPU_VAR(irq_stack_union)+stack_canary_offset
 #endif

+#ifdef CONFIG_RETPOLINE
+	/*
+	 * When switching from a shallower to a deeper call stack
+	 * the RSB may either underflow or use entries populated
+	 * with userspace addresses. On CPUs where those concerns
+	 * exist, overwrite the RSB with entries which capture
+	 * speculative execution to prevent attack.
+	 */
+	FILL_RETURN_BUFFER %r12, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
+#endif
+
 	/* restore callee-saved registers */
 	popq	%r15
 	popq	%r14
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -211,6 +211,7 @@
 #define X86_FEATURE_AVX512_4FMAPS ( 7*32+17) /* AVX-512 Multiply Accumulation Single precision */

 #define X86_FEATURE_MBA         ( 7*32+18) /* Memory Bandwidth Allocation */
+#define X86_FEATURE_RSB_CTXSW   ( 7*32+19) /* Fill RSB on context switches */

 /* Virtualization flags: Linux defined, word 8 */
 #define X86_FEATURE_TPR_SHADOW  ( 8*32+ 0) /* Intel TPR Shadow */
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -23,6 +23,7 @@
 #include <asm/alternative.h>
 #include <asm/pgtable.h>
 #include <asm/set_memory.h>
+#include <asm/intel-family.h>

 static void __init spectre_v2_select_mitigation(void);

@@ -155,6 +156,23 @@ disable:
 	return SPECTRE_V2_CMD_NONE;
 }

+/* Check for Skylake-like CPUs (for RSB handling) */
+static bool __init is_skylake_era(void)
+{
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+	    boot_cpu_data.x86 == 6) {
+		switch (boot_cpu_data.x86_model) {
+		case INTEL_FAM6_SKYLAKE_MOBILE:
+		case INTEL_FAM6_SKYLAKE_DESKTOP:
+		case INTEL_FAM6_SKYLAKE_X:
+		case INTEL_FAM6_KABYLAKE_MOBILE:
+		case INTEL_FAM6_KABYLAKE_DESKTOP:
+			return true;
+		}
+	}
+	return false;
+}
+
 static void __init spectre_v2_select_mitigation(void)
 {
 	enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
@@ -213,6 +231,24 @@ retpoline_auto:

 	spectre_v2_enabled = mode;
 	pr_info("%s\n", spectre_v2_strings[mode]);
+
+	/*
+	 * If neither SMEP or KPTI are available, there is a risk of
+	 * hitting userspace addresses in the RSB after a context switch
+	 * from a shallow call stack to a deeper one. To prevent this fill
+	 * the entire RSB, even when using IBRS.
+	 *
+	 * Skylake era CPUs have a separate issue with *underflow* of the
+	 * RSB, when they will predict 'ret' targets from the generic BTB.
+	 * The proper mitigation for this is IBRS. If IBRS is not supported
+	 * or deactivated in favour of retpolines the RSB fill on context
+	 * switch is required.
+	 */
+	if ((!boot_cpu_has(X86_FEATURE_PTI) &&
+	     !boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) {
+		setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
+		pr_info("Filling RSB on context switch\n");
+	}
 }
#undef pr_fmt
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit 28d437d550e1e39f805d99f9f8ac399c778827b7 upstream.
The PAUSE instruction is currently used in the retpoline and RSB filling macros as a speculation trap. The use of PAUSE was originally suggested because it showed a very, very small difference in the amount of cycles/time used to execute the retpoline as compared to LFENCE. On AMD, the PAUSE instruction is not a serializing instruction, so the pause/jmp loop will use excess power as it is speculated over waiting for return to mispredict to the correct target.
The RSB filling macro is applicable to AMD, and, if software is unable to verify that LFENCE is serializing on AMD (possible when running under a hypervisor), the generic retpoline support will be used and, so, is also applicable to AMD. Keep the current usage of PAUSE for Intel, but add an LFENCE instruction to the speculation trap for AMD.
The same sequence has been adopted by GCC for the GCC generated retpolines.
Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@alien8.de Acked-by: David Woodhouse dwmw@amazon.co.uk Acked-by: Arjan van de Ven arjan@linux.intel.com Cc: Rik van Riel riel@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Paul Turner pjt@google.com Cc: Peter Zijlstra peterz@infradead.org Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Jiri Kosina jikos@kernel.org Cc: Dave Hansen dave.hansen@intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Dan Williams dan.j.williams@intel.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Cc: Kees Cook keescook@google.com Link: https://lkml.kernel.org/r/20180113232730.31060.36287.stgit@tlendack-t1.amdof... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/include/asm/nospec-branch.h |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -11,7 +11,7 @@
  * Fill the CPU return stack buffer.
  *
  * Each entry in the RSB, if used for a speculative 'ret', contains an
- * infinite 'pause; jmp' loop to capture speculative execution.
+ * infinite 'pause; lfence; jmp' loop to capture speculative execution.
  *
  * This is required in various cases for retpoline and IBRS-based
  * mitigations for the Spectre variant 2 vulnerability. Sometimes to
@@ -38,11 +38,13 @@
 	call	772f;				\
 773:	/* speculation trap */			\
 	pause;					\
+	lfence;					\
 	jmp	773b;				\
 772:						\
 	call	774f;				\
 775:	/* speculation trap */			\
 	pause;					\
+	lfence;					\
 	jmp	775b;				\
 774:						\
 	dec	reg;				\
@@ -73,6 +75,7 @@
 	call	.Ldo_rop_\@
 .Lspec_trap_\@:
 	pause
+	lfence
 	jmp	.Lspec_trap_\@
 .Ldo_rop_\@:
 	mov	\reg, (%_ASM_SP)
@@ -165,6 +168,7 @@
 	"       .align 16\n"				\
 	"901:	call   903f;\n"				\
 	"902:	pause;\n"				\
+	"	lfence;\n"				\
 	"	jmp    902b;\n"				\
 	"	.align 16\n"				\
 	"903:	addl   $4, %%esp;\n"			\
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josh Poimboeuf jpoimboe@redhat.com
commit 385d11b152c4eb638eeb769edcb3249533bb9a00 upstream.
If a nonexistent file is supplied to objtool, it complains with a non-helpful error:
open: No such file or directory
Improve it to:
objtool: Can't open 'foo': No such file or directory
Reported-by: Markus M4rkusXXL@web.de Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/406a3d00a21225eee2819844048e17f68523ccf6.1516025651... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 tools/objtool/elf.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -26,6 +26,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
+#include <errno.h>

 #include "elf.h"
 #include "warn.h"
@@ -358,7 +359,8 @@ struct elf *elf_open(const char *name, i

 	elf->fd = open(name, flags);
 	if (elf->fd == -1) {
-		perror("open");
+		fprintf(stderr, "objtool: Can't open '%s': %s\n",
+			name, strerror(errno));
 		goto err;
 	}
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andi Kleen ak@linux.intel.com
commit 6cfb521ac0d5b97470883ff9b7facae264b7ab12 upstream.
Add a marker for retpoline to the module VERMAGIC. This catches the case when a non RETPOLINE compiled module gets loaded into a retpoline kernel, making it insecure.
It doesn't handle the case when retpoline has been runtime disabled. Even in this case the match of the retcompile status will be enforced. This implies that even with retpoline run time disabled all modules loaded need to be recompiled.
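The mechanism itself is plain compile-time string concatenation; here is a toy user-space illustration (the version string below is made up):

#include <stdio.h>

#ifdef RETPOLINE
#define MODULE_VERMAGIC_RETPOLINE "retpoline "
#else
#define MODULE_VERMAGIC_RETPOLINE ""
#endif

int main(void)
{
	/* Adjacent string literals are concatenated at compile time, so a
	 * module built with -DRETPOLINE carries the extra marker and will
	 * refuse to load into a kernel whose vermagic lacks it. */
	printf("%s\n", "4.14.15 " "SMP mod_unload " MODULE_VERMAGIC_RETPOLINE);
	return 0;
}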
Signed-off-by: Andi Kleen ak@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Acked-by: David Woodhouse dwmw@amazon.co.uk Cc: rusty@rustcorp.com.au Cc: arjan.van.de.ven@intel.com Cc: jeyu@kernel.org Cc: torvalds@linux-foundation.org Link: https://lkml.kernel.org/r/20180116205228.4890-1-andi@firstfloor.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 include/linux/vermagic.h |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/include/linux/vermagic.h
+++ b/include/linux/vermagic.h
@@ -31,11 +31,17 @@
 #else
 #define MODULE_RANDSTRUCT_PLUGIN
 #endif
+#ifdef RETPOLINE
+#define MODULE_VERMAGIC_RETPOLINE "retpoline "
+#else
+#define MODULE_VERMAGIC_RETPOLINE ""
+#endif

 #define VERMAGIC_STRING 						\
 	UTS_RELEASE " "							\
 	MODULE_VERMAGIC_SMP MODULE_VERMAGIC_PREEMPT 			\
 	MODULE_VERMAGIC_MODULE_UNLOAD MODULE_VERMAGIC_MODVERSIONS	\
 	MODULE_ARCH_VERMAGIC						\
-	MODULE_RANDSTRUCT_PLUGIN
+	MODULE_RANDSTRUCT_PLUGIN					\
+	MODULE_VERMAGIC_RETPOLINE
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit d47924417319e3b6a728c0b690f183e75bc2a702 upstream.
intel_rdt_offline_cpu() -> domain_remove_cpu() frees memory first and then proceeds to access it.
BUG: KASAN: use-after-free in find_first_bit+0x1f/0x80
Read of size 8 at addr ffff883ff7c1e780 by task cpuhp/31/195
 find_first_bit+0x1f/0x80
 has_busy_rmid+0x47/0x70
 intel_rdt_offline_cpu+0x4b4/0x510
Freed by task 195:
 kfree+0x94/0x1a0
 intel_rdt_offline_cpu+0x17d/0x510
Do the teardown first and then free memory.
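The ordering rule generalizes beyond RDT. A minimal userspace sketch of the pattern, with all names hypothetical:

  #include <stdio.h>
  #include <stdlib.h>

  struct dom {
      int *rmid_busy_llc;  /* stands in for the bitmap has_busy_rmid() scans */
  };

  /* Stand-in for the limbo/overflow teardown that may still read the
   * domain's buffers. */
  static void teardown(struct dom *d)
  {
      printf("cancel workers, last scan of %p\n", (void *)d->rmid_busy_llc);
  }

  static void domain_remove(struct dom *d)
  {
      teardown(d);             /* must run while the memory is live ... */
      free(d->rmid_busy_llc);  /* ... only then is freeing safe */
      free(d);
  }

  int main(void)
  {
      struct dom *d = calloc(1, sizeof(*d));

      d->rmid_busy_llc = calloc(8, sizeof(int));
      domain_remove(d);
      return 0;
  }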
Fixes: 24247aeeabe9 ("x86/intel_rdt/cqm: Improve limbo list processing") Reported-by: Joseph Salisbury joseph.salisbury@canonical.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: Ravi Shankar ravi.v.shankar@intel.com Cc: Peter Zilstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Vikas Shivappa vikas.shivappa@linux.intel.com Cc: Andi Kleen ak@linux.intel.com Cc: "Roderick W. Smith" rod.smith@canonical.com Cc: 1733662@bugs.launchpad.net Cc: Fenghua Yu fenghua.yu@intel.com Cc: Tony Luck tony.luck@intel.com Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801161957510.2366@nanos Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/cpu/intel_rdt.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/arch/x86/kernel/cpu/intel_rdt.c +++ b/arch/x86/kernel/cpu/intel_rdt.c @@ -525,10 +525,6 @@ static void domain_remove_cpu(int cpu, s */ if (static_branch_unlikely(&rdt_mon_enable_key)) rmdir_mondata_subdir_allrdtgrp(r, d->id); - kfree(d->ctrl_val); - kfree(d->rmid_busy_llc); - kfree(d->mbm_total); - kfree(d->mbm_local); list_del(&d->list); if (is_mbm_enabled()) cancel_delayed_work(&d->mbm_over); @@ -545,6 +541,10 @@ static void domain_remove_cpu(int cpu, s cancel_delayed_work(&d->cqm_limbo); }
+ kfree(d->ctrl_val); + kfree(d->rmid_busy_llc); + kfree(d->mbm_total); + kfree(d->mbm_local); kfree(d); return; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric W. Biederman ebiederm@xmission.com
commit beacd6f7ed5e2915959442245b3b2480c2e37490 upstream.
SEGV_PKUERR is a signal-specific si_code which happens to have the same numeric value as several others: BUS_MCEERR_AR, ILL_ILLTRP, FPE_FLTOVF, TRAP_HWBKPT, CLD_TRAPPED, POLL_ERR, SEGV_THREAD_ID. As such it is not safe to just test the si_code; the signal number must also be tested to prevent a false positive in fill_sig_info_pkey.
This error was found by inspection, and BUS_MCEERR_AR appears to be a real candidate for confusion. So pass in si_signo and check for SIGSEGV to verify that it is actually a SEGV_PKUERR.
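A small userspace demonstration of the ambiguity (the fallback #defines mirror the uapi values; is_pkey_fault() is a hypothetical stand-in for the fixed check):

  #include <signal.h>
  #include <stdio.h>

  #ifndef SEGV_PKUERR
  #define SEGV_PKUERR 4    /* per include/uapi/asm-generic/siginfo.h */
  #endif
  #ifndef BUS_MCEERR_AR
  #define BUS_MCEERR_AR 4
  #endif

  /* The fixed check: a pkey fault is SIGSEGV *and* SEGV_PKUERR. */
  static int is_pkey_fault(int si_signo, int si_code)
  {
      return si_signo == SIGSEGV && si_code == SEGV_PKUERR;
  }

  int main(void)
  {
      printf("SEGV_PKUERR == BUS_MCEERR_AR: %d\n",
             SEGV_PKUERR == BUS_MCEERR_AR);         /* prints 1 */
      /* si_code alone cannot tell a machine check from a pkey fault: */
      printf("SIGBUS/BUS_MCEERR_AR misread as pkey fault? %d\n",
             is_pkey_fault(SIGBUS, BUS_MCEERR_AR)); /* 0: signo check rejects it */
      return 0;
  }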
Fixes: 019132ff3daf ("x86/mm/pkeys: Fill in pkey field in siginfo") Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: linux-arch@vger.kernel.org Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Oleg Nesterov oleg@redhat.com Cc: Al Viro viro@zeniv.linux.org.uk Link: https://lkml.kernel.org/r/20180112203135.4669-2-ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/mm/fault.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -173,14 +173,15 @@ is_prefetch(struct pt_regs *regs, unsign * 6. T1 : reaches here, sees vma_pkey(vma)=5, when we really * faulted on a pte with its pkey=4. */ -static void fill_sig_info_pkey(int si_code, siginfo_t *info, u32 *pkey) +static void fill_sig_info_pkey(int si_signo, int si_code, siginfo_t *info, + u32 *pkey) { /* This is effectively an #ifdef */ if (!boot_cpu_has(X86_FEATURE_OSPKE)) return;
/* Fault not from Protection Keys: nothing to do */ - if (si_code != SEGV_PKUERR) + if ((si_code != SEGV_PKUERR) || (si_signo != SIGSEGV)) return; /* * force_sig_info_fault() is called from a number of @@ -219,7 +220,7 @@ force_sig_info_fault(int si_signo, int s lsb = PAGE_SHIFT; info.si_addr_lsb = lsb;
- fill_sig_info_pkey(si_code, &info, pkey); + fill_sig_info_pkey(si_signo, si_code, &info, pkey);
force_sig_info(si_signo, &info, tsk); }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andi Kleen ak@linux.intel.com
commit 327867faa4d66628fcd92a843adb3345736a5313 upstream.
const variables must use __initconst, not __initdata.
Fix this up for the IDT tables, which got it consistently wrong.
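A userspace emulation of the distinction (the section name matches what the kernel's __initconst annotation expands to; everything else here is illustrative):

  #include <stdio.h>

  /* __initconst places const objects in a read-only init section;
   * __initdata would place them in the writable .init.data section,
   * and combining that with "const" is what triggers section
   * attribute conflicts on some compilers. */
  #define __initconst __attribute__((section(".init.rodata")))

  struct idt_data { int vector; };

  static const struct idt_data early_idts[] __initconst = {
      { 1 },    /* X86_TRAP_DB */
      { 3 },    /* X86_TRAP_BP */
  };

  int main(void)
  {
      printf("%zu entries\n", sizeof(early_idts) / sizeof(early_idts[0]));
      return 0;
  }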
Fixes: 16bc18d895ce ("x86/idt: Move 32-bit idt_descr to C code") Signed-off-by: Andi Kleen ak@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/20171222001821.2157-7-andi@firstfloor.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/idt.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/arch/x86/kernel/idt.c +++ b/arch/x86/kernel/idt.c @@ -56,7 +56,7 @@ struct idt_data { * Early traps running on the DEFAULT_STACK because the other interrupt * stacks work only after cpu_init(). */ -static const __initdata struct idt_data early_idts[] = { +static const __initconst struct idt_data early_idts[] = { INTG(X86_TRAP_DB, debug), SYSG(X86_TRAP_BP, int3), #ifdef CONFIG_X86_32 @@ -70,7 +70,7 @@ static const __initdata struct idt_data * the traps which use them are reinitialized with IST after cpu_init() has * set up TSS. */ -static const __initdata struct idt_data def_idts[] = { +static const __initconst struct idt_data def_idts[] = { INTG(X86_TRAP_DE, divide_error), INTG(X86_TRAP_NMI, nmi), INTG(X86_TRAP_BR, bounds), @@ -108,7 +108,7 @@ static const __initdata struct idt_data /* * The APIC and SMP idt entries */ -static const __initdata struct idt_data apic_idts[] = { +static const __initconst struct idt_data apic_idts[] = { #ifdef CONFIG_SMP INTG(RESCHEDULE_VECTOR, reschedule_interrupt), INTG(CALL_FUNCTION_VECTOR, call_function_interrupt), @@ -150,7 +150,7 @@ static const __initdata struct idt_data * Early traps running on the DEFAULT_STACK because the other interrupt * stacks work only after cpu_init(). */ -static const __initdata struct idt_data early_pf_idts[] = { +static const __initconst struct idt_data early_pf_idts[] = { INTG(X86_TRAP_PF, page_fault), };
@@ -158,7 +158,7 @@ static const __initdata struct idt_data * Override for the debug_idt. Same as the default, but with interrupt * stack set to DEFAULT_STACK (0). Required for NMI trap handling. */ -static const __initdata struct idt_data dbg_idts[] = { +static const __initconst struct idt_data dbg_idts[] = { INTG(X86_TRAP_DB, debug), INTG(X86_TRAP_BP, int3), }; @@ -180,7 +180,7 @@ gate_desc debug_idt_table[IDT_ENTRIES] _ * The exceptions which use Interrupt stacks. They are setup after * cpu_init() when the TSS has been initialized. */ -static const __initdata struct idt_data ist_idts[] = { +static const __initconst struct idt_data ist_idts[] = { ISTG(X86_TRAP_DB, debug, DEBUG_STACK), ISTG(X86_TRAP_NMI, nmi, NMI_STACK), SISTG(X86_TRAP_BP, int3, DEBUG_STACK),
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Len Brown len.brown@intel.com
commit da4ae6c4a0b8dee5a5377a385545d2250fa8cddb upstream.
If the crystal frequency cannot be determined via CPUID(15).crystal_khz or the built-in table, then native_calibrate_tsc() will still set the X86_FEATURE_TSC_KNOWN_FREQ flag, which prevents the refined TSC calibration.
As a consequence such systems use cpu_khz for the TSC frequency which is incorrect when cpu_khz != tsc_khz resulting in time drift.
Return early when the crystal frequency cannot be retrieved without setting the X86_FEATURE_TSC_KNOWN_FREQ flag. This ensures that the refined TSC calibration is invoked.
[ tglx: Steam-blastered changelog. Sigh ]
Fixes: 4ca4df0b7eb0 ("x86/tsc: Mark TSC frequency determined by CPUID as known") Signed-off-by: Len Brown len.brown@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: peterz@infradead.org Cc: Bin Gao bin.gao@intel.com Link: https://lkml.kernel.org/r/0fe2503aa7d7fc69137141fc705541a78101d2b9.151392041... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/tsc.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -612,6 +612,8 @@ unsigned long native_calibrate_tsc(void) } }
+ if (crystal_khz == 0) + return 0; /* * TSC frequency determined by CPUID is a "hardware reported" * frequency and is the most accurate one so far we have. This
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Len Brown len.brown@intel.com
commit b511203093489eb1829cb4de86e8214752205ac6 upstream.
The INTEL_FAM6_SKYLAKE_X hardcoded crystal_khz value of 25MHZ is problematic:
- SKX workstations (with same model # as server variants) use a 24 MHz crystal. This results in a -4.0% time drift rate on SKX workstations.
- SKX servers subject the crystal to an EMI reduction circuit that reduces its actual frequency by (approximately) -0.25%. This results in -1 second per 10 minute time drift as compared to network time.
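A quick sanity check on those numbers: timekeeping scaled for an assumed 25 MHz crystal but driven by a real 24 MHz one runs at 24/25 of the true rate, i.e. (24 - 25) / 25 = -4.0%, matching the workstation drift. Likewise, losing 1 second every 600 seconds is a -0.17% rate, on the order of the roughly -0.25% EMI-reduced server crystal frequency.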
This issue can also trigger a timer and power problem, on configurations that use the LAPIC timer (versus the TSC deadline timer). Clock ticks scheduled with the LAPIC timer arrive a few usec before the time they are expected (according to the slow TSC). This causes Linux to poll-idle, when it should be in an idle power saving state. The idle and clock code do not gracefully recover from this error, sometimes resulting in significant polling and measurable power impact.
Stop using native_calibrate_tsc() for INTEL_FAM6_SKYLAKE_X. native_calibrate_tsc() will return 0, boot will run with tsc_khz = cpu_khz, and the TSC refined calibration will update tsc_khz to correct for the difference.
[ tglx: Sanitized change log ]
Fixes: 6baf3d61821f ("x86/tsc: Add additional Intel CPU models to the crystal quirk list") Signed-off-by: Len Brown len.brown@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: peterz@infradead.org Cc: Prarit Bhargava prarit@redhat.com Link: https://lkml.kernel.org/r/ff6dcea166e8ff8f2f6a03c17beab2cb436aa779.151392041... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/tsc.c | 1 - 1 file changed, 1 deletion(-)
--- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -602,7 +602,6 @@ unsigned long native_calibrate_tsc(void) case INTEL_FAM6_KABYLAKE_DESKTOP: crystal_khz = 24000; /* 24.0 MHz */ break; - case INTEL_FAM6_SKYLAKE_X: case INTEL_FAM6_ATOM_DENVERTON: crystal_khz = 25000; /* 25.0 MHz */ break;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Lawrence joe.lawrence@redhat.com
commit d3f14c485867cfb2e0c48aa88c41d0ef4bf5209c upstream.
round_pipe_size() contains a right-bit-shift expression which may overflow, which would cause undefined results in a subsequent roundup_pow_of_two() call.
static inline unsigned int round_pipe_size(unsigned int size)
{
	unsigned long nr_pages;

	nr_pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
	return roundup_pow_of_two(nr_pages) << PAGE_SHIFT;
}
PAGE_SIZE is defined as (1UL << PAGE_SHIFT), so:

- 4 bytes wide on 32-bit (0 to 0xffffffff)
- 8 bytes wide on 64-bit (0 to 0xffffffffffffffff)
That means that in the 32-bit round_pipe_size(), nr_pages may overflow to 0:
size=0x00000000  nr_pages=0x0
size=0x00000001  nr_pages=0x1
size=0xfffff000  nr_pages=0xfffff
size=0xfffff001  nr_pages=0x0      << !
size=0xffffffff  nr_pages=0x0      << !
This is bad because roundup_pow_of_two(n) is undefined when n == 0!
64-bit is not a problem, as the unsigned int size is 4 bytes wide (as on 32-bit) and the larger, 8-byte wide unsigned long is sufficient to handle the largest value of the bit shift expression:
size=0xffffffff nr_pages=100000
Modify round_pipe_size() to return 0 if nr_pages == 0 and update its callers to handle this accordingly.
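The overflow is easy to reproduce in a standalone program; this sketch forces 32-bit arithmetic to mimic a 32-bit kernel's unsigned long (a PAGE_SHIFT of 12 is assumed):

  #include <stdint.h>
  #include <stdio.h>

  #define PAGE_SHIFT   12
  /* UINT32_C keeps the addition in 32-bit arithmetic, like
   * "unsigned long" on a 32-bit kernel. */
  #define PAGE_SIZE_32 (UINT32_C(1) << PAGE_SHIFT)

  int main(void)
  {
      uint32_t sizes[] = { 0x00000001, 0xfffff000, 0xfffff001, 0xffffffff };
      unsigned int i;

      for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
          uint32_t nr_pages = (sizes[i] + PAGE_SIZE_32 - 1) >> PAGE_SHIFT;

          printf("size=0x%08x nr_pages=0x%x\n", sizes[i], nr_pages);
      }
      return 0;    /* the last two lines print nr_pages=0x0 */
  }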
Link: http://lkml.kernel.org/r/1507658689-11669-3-git-send-email-joe.lawrence@redh... Signed-off-by: Joe Lawrence joe.lawrence@redhat.com Reported-by: Mikulas Patocka mpatocka@redhat.com Reviewed-by: Mikulas Patocka mpatocka@redhat.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: Jens Axboe axboe@kernel.dk Cc: Michael Kerrisk mtk.manpages@gmail.com Cc: Randy Dunlap rdunlap@infradead.org Cc: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Dong Jinguang dongjinguang@huawei.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/pipe.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
--- a/fs/pipe.c +++ b/fs/pipe.c @@ -1018,13 +1018,19 @@ const struct file_operations pipefifo_fo
/* * Currently we rely on the pipe array holding a power-of-2 number - * of pages. + * of pages. Returns 0 on error. */ static inline unsigned int round_pipe_size(unsigned int size) { unsigned long nr_pages;
+ if (size < pipe_min_size) + size = pipe_min_size; + nr_pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT; + if (nr_pages == 0) + return 0; + return roundup_pow_of_two(nr_pages) << PAGE_SHIFT; }
@@ -1040,6 +1046,8 @@ static long pipe_set_size(struct pipe_in long ret = 0;
size = round_pipe_size(arg); + if (size == 0) + return -EINVAL; nr_pages = size >> PAGE_SHIFT;
if (!nr_pages) @@ -1123,13 +1131,18 @@ out_revert_acct: int pipe_proc_fn(struct ctl_table *table, int write, void __user *buf, size_t *lenp, loff_t *ppos) { + unsigned int rounded_pipe_max_size; int ret;
ret = proc_douintvec_minmax(table, write, buf, lenp, ppos); if (ret < 0 || !write) return ret;
- pipe_max_size = round_pipe_size(pipe_max_size); + rounded_pipe_max_size = round_pipe_size(pipe_max_size); + if (rounded_pipe_max_size == 0) + return -EINVAL; + + pipe_max_size = rounded_pipe_max_size; return ret; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit 45d55e7bac4028af93f5fa324e69958a0b868e96 upstream.
Keith reported the following warning:
WARNING: CPU: 28 PID: 1420 at kernel/irq/matrix.c:222 irq_matrix_remove_managed+0x10f/0x120
 x86_vector_free_irqs+0xa1/0x180
 x86_vector_alloc_irqs+0x1e4/0x3a0
 msi_domain_alloc+0x62/0x130
The reason for this is that if the vector allocation fails, the error handling code tries to free the failed vector as well, which causes the above imbalance warning to trigger.
Adjust the error path to handle this correctly.
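Stripped of the irqdomain specifics, the fixed error path follows the usual partial-unwind idiom; a hypothetical sketch:

  #include <stdio.h>

  static int setup(int i)
  {
      return i == 3 ? -1 : 0;    /* simulate failure at entry 3 */
  }

  static void teardown(int i)
  {
      printf("teardown %d\n", i);
  }

  static int alloc_irqs(int nr)
  {
      int i, err = 0;

      for (i = 0; i < nr; i++) {
          err = setup(i);
          if (err) {
              /* undo the partial state of entry i here, in place
               * (the patch clears chip_data and frees it) ... */
              goto error;
          }
      }
      return 0;

  error:
      while (i--)    /* ... so only the i fully set-up entries unwind */
          teardown(i);
      return err;
  }

  int main(void)
  {
      return alloc_irqs(5) ? 1 : 0;
  }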
Fixes: b5dc8e6c21e7 ("x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors") Reported-by: Keith Busch keith.busch@intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Keith Busch keith.busch@intel.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801161217300.1823@nanos Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/apic/vector.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -369,8 +369,11 @@ static int x86_vector_alloc_irqs(struct irq_data->hwirq = virq + i; err = assign_irq_vector_policy(virq + i, node, data, info, irq_data); - if (err) + if (err) { + irq_data->chip_data = NULL; + free_apic_chip_data(data); goto error; + } /* * If the apic destination mode is physical, then the * effective affinity is restricted to a single target @@ -383,7 +386,7 @@ static int x86_vector_alloc_irqs(struct return 0;
error: - x86_vector_free_irqs(domain, virq, i + 1); + x86_vector_free_irqs(domain, virq, i); return err; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit 1303880179e67c59e801429b7e5d0f6b21137d99 upstream.
Clean up the use of PUSH and POP, and the points at which registers are saved, in the __enc_copy() assembly function in order to improve the readability of the code.
Move parameter register saving into general purpose registers earlier in the code and move all the pushes to the beginning of the function with corresponding pops at the end.
This is done in preparation for subsequent fixes.
Tested-by: Gabriel Craciunescu nix.or.die@gmail.com Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Reviewed-by: Borislav Petkov bp@suse.de Cc: Borislav Petkov bp@alien8.de Cc: Brijesh Singh brijesh.singh@amd.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20180110192556.6026.74187.stgit@tlendack-t1.amdoffi... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/mm/mem_encrypt_boot.S | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
--- a/arch/x86/mm/mem_encrypt_boot.S +++ b/arch/x86/mm/mem_encrypt_boot.S @@ -103,20 +103,19 @@ ENTRY(__enc_copy) orq $X86_CR4_PGE, %rdx mov %rdx, %cr4
+ push %r15 + + movq %rcx, %r9 /* Save kernel length */ + movq %rdi, %r10 /* Save encrypted kernel address */ + movq %rsi, %r11 /* Save decrypted kernel address */ + /* Set the PAT register PA5 entry to write-protect */ - push %rcx movl $MSR_IA32_CR_PAT, %ecx rdmsr - push %rdx /* Save original PAT value */ + mov %rdx, %r15 /* Save original PAT value */ andl $0xffff00ff, %edx /* Clear PA5 */ orl $0x00000500, %edx /* Set PA5 to WP */ wrmsr - pop %rdx /* RDX contains original PAT value */ - pop %rcx - - movq %rcx, %r9 /* Save kernel length */ - movq %rdi, %r10 /* Save encrypted kernel address */ - movq %rsi, %r11 /* Save decrypted kernel address */
wbinvd /* Invalidate any cache entries */
@@ -138,12 +137,13 @@ ENTRY(__enc_copy) jnz 1b /* Kernel length not zero? */
/* Restore PAT register */ - push %rdx /* Save original PAT value */ movl $MSR_IA32_CR_PAT, %ecx rdmsr - pop %rdx /* Restore original PAT value */ + mov %r15, %rdx /* Restore original PAT value */ wrmsr
+ pop %r15 + ret .L__enc_copy_end: ENDPROC(__enc_copy)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4 upstream.
In preparation for follow-on patches, combine the PGD mapping parameters into a struct to reduce the number of function arguments and allow for direct updating of the next pagetable mapping area pointer.
Tested-by: Gabriel Craciunescu nix.or.die@gmail.com Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Reviewed-by: Borislav Petkov bp@suse.de Cc: Borislav Petkov bp@alien8.de Cc: Brijesh Singh brijesh.singh@amd.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20180110192605.6026.96206.stgit@tlendack-t1.amdoffi... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/mm/mem_encrypt.c | 92 +++++++++++++++++++++++----------------------- 1 file changed, 47 insertions(+), 45 deletions(-)
--- a/arch/x86/mm/mem_encrypt.c +++ b/arch/x86/mm/mem_encrypt.c @@ -213,6 +213,14 @@ void swiotlb_set_mem_attributes(void *va set_memory_decrypted((unsigned long)vaddr, size >> PAGE_SHIFT); }
+struct sme_populate_pgd_data { + void *pgtable_area; + pgd_t *pgd; + + pmdval_t pmd_val; + unsigned long vaddr; +}; + static void __init sme_clear_pgd(pgd_t *pgd_base, unsigned long start, unsigned long end) { @@ -235,15 +243,14 @@ static void __init sme_clear_pgd(pgd_t * #define PUD_FLAGS _KERNPG_TABLE_NOENC #define PMD_FLAGS (__PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL)
-static void __init *sme_populate_pgd(pgd_t *pgd_base, void *pgtable_area, - unsigned long vaddr, pmdval_t pmd_val) +static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd) { pgd_t *pgd_p; p4d_t *p4d_p; pud_t *pud_p; pmd_t *pmd_p;
- pgd_p = pgd_base + pgd_index(vaddr); + pgd_p = ppd->pgd + pgd_index(ppd->vaddr); if (native_pgd_val(*pgd_p)) { if (IS_ENABLED(CONFIG_X86_5LEVEL)) p4d_p = (p4d_t *)(native_pgd_val(*pgd_p) & ~PTE_FLAGS_MASK); @@ -253,15 +260,15 @@ static void __init *sme_populate_pgd(pgd pgd_t pgd;
if (IS_ENABLED(CONFIG_X86_5LEVEL)) { - p4d_p = pgtable_area; + p4d_p = ppd->pgtable_area; memset(p4d_p, 0, sizeof(*p4d_p) * PTRS_PER_P4D); - pgtable_area += sizeof(*p4d_p) * PTRS_PER_P4D; + ppd->pgtable_area += sizeof(*p4d_p) * PTRS_PER_P4D;
pgd = native_make_pgd((pgdval_t)p4d_p + PGD_FLAGS); } else { - pud_p = pgtable_area; + pud_p = ppd->pgtable_area; memset(pud_p, 0, sizeof(*pud_p) * PTRS_PER_PUD); - pgtable_area += sizeof(*pud_p) * PTRS_PER_PUD; + ppd->pgtable_area += sizeof(*pud_p) * PTRS_PER_PUD;
pgd = native_make_pgd((pgdval_t)pud_p + PGD_FLAGS); } @@ -269,44 +276,41 @@ static void __init *sme_populate_pgd(pgd }
if (IS_ENABLED(CONFIG_X86_5LEVEL)) { - p4d_p += p4d_index(vaddr); + p4d_p += p4d_index(ppd->vaddr); if (native_p4d_val(*p4d_p)) { pud_p = (pud_t *)(native_p4d_val(*p4d_p) & ~PTE_FLAGS_MASK); } else { p4d_t p4d;
- pud_p = pgtable_area; + pud_p = ppd->pgtable_area; memset(pud_p, 0, sizeof(*pud_p) * PTRS_PER_PUD); - pgtable_area += sizeof(*pud_p) * PTRS_PER_PUD; + ppd->pgtable_area += sizeof(*pud_p) * PTRS_PER_PUD;
p4d = native_make_p4d((pudval_t)pud_p + P4D_FLAGS); native_set_p4d(p4d_p, p4d); } }
- pud_p += pud_index(vaddr); + pud_p += pud_index(ppd->vaddr); if (native_pud_val(*pud_p)) { if (native_pud_val(*pud_p) & _PAGE_PSE) - goto out; + return;
pmd_p = (pmd_t *)(native_pud_val(*pud_p) & ~PTE_FLAGS_MASK); } else { pud_t pud;
- pmd_p = pgtable_area; + pmd_p = ppd->pgtable_area; memset(pmd_p, 0, sizeof(*pmd_p) * PTRS_PER_PMD); - pgtable_area += sizeof(*pmd_p) * PTRS_PER_PMD; + ppd->pgtable_area += sizeof(*pmd_p) * PTRS_PER_PMD;
pud = native_make_pud((pmdval_t)pmd_p + PUD_FLAGS); native_set_pud(pud_p, pud); }
- pmd_p += pmd_index(vaddr); + pmd_p += pmd_index(ppd->vaddr); if (!native_pmd_val(*pmd_p) || !(native_pmd_val(*pmd_p) & _PAGE_PSE)) - native_set_pmd(pmd_p, native_make_pmd(pmd_val)); - -out: - return pgtable_area; + native_set_pmd(pmd_p, native_make_pmd(ppd->pmd_val)); }
static unsigned long __init sme_pgtable_calc(unsigned long len) @@ -364,11 +368,10 @@ void __init sme_encrypt_kernel(void) unsigned long workarea_start, workarea_end, workarea_len; unsigned long execute_start, execute_end, execute_len; unsigned long kernel_start, kernel_end, kernel_len; + struct sme_populate_pgd_data ppd; unsigned long pgtable_area_len; unsigned long paddr, pmd_flags; unsigned long decrypted_base; - void *pgtable_area; - pgd_t *pgd;
if (!sme_active()) return; @@ -432,18 +435,18 @@ void __init sme_encrypt_kernel(void) * pagetables and when the new encrypted and decrypted kernel * mappings are populated. */ - pgtable_area = (void *)execute_end; + ppd.pgtable_area = (void *)execute_end;
/* * Make sure the current pagetable structure has entries for * addressing the workarea. */ - pgd = (pgd_t *)native_read_cr3_pa(); + ppd.pgd = (pgd_t *)native_read_cr3_pa(); paddr = workarea_start; while (paddr < workarea_end) { - pgtable_area = sme_populate_pgd(pgd, pgtable_area, - paddr, - paddr + PMD_FLAGS); + ppd.pmd_val = paddr + PMD_FLAGS; + ppd.vaddr = paddr; + sme_populate_pgd_large(&ppd);
paddr += PMD_PAGE_SIZE; } @@ -457,17 +460,17 @@ void __init sme_encrypt_kernel(void) * populated with new PUDs and PMDs as the encrypted and decrypted * kernel mappings are created. */ - pgd = pgtable_area; - memset(pgd, 0, sizeof(*pgd) * PTRS_PER_PGD); - pgtable_area += sizeof(*pgd) * PTRS_PER_PGD; + ppd.pgd = ppd.pgtable_area; + memset(ppd.pgd, 0, sizeof(pgd_t) * PTRS_PER_PGD); + ppd.pgtable_area += sizeof(pgd_t) * PTRS_PER_PGD;
/* Add encrypted kernel (identity) mappings */ pmd_flags = PMD_FLAGS | _PAGE_ENC; paddr = kernel_start; while (paddr < kernel_end) { - pgtable_area = sme_populate_pgd(pgd, pgtable_area, - paddr, - paddr + pmd_flags); + ppd.pmd_val = paddr + pmd_flags; + ppd.vaddr = paddr; + sme_populate_pgd_large(&ppd);
paddr += PMD_PAGE_SIZE; } @@ -485,9 +488,9 @@ void __init sme_encrypt_kernel(void) pmd_flags = (PMD_FLAGS & ~_PAGE_CACHE_MASK) | (_PAGE_PAT | _PAGE_PWT); paddr = kernel_start; while (paddr < kernel_end) { - pgtable_area = sme_populate_pgd(pgd, pgtable_area, - paddr + decrypted_base, - paddr + pmd_flags); + ppd.pmd_val = paddr + pmd_flags; + ppd.vaddr = paddr + decrypted_base; + sme_populate_pgd_large(&ppd);
paddr += PMD_PAGE_SIZE; } @@ -495,30 +498,29 @@ void __init sme_encrypt_kernel(void) /* Add decrypted workarea mappings to both kernel mappings */ paddr = workarea_start; while (paddr < workarea_end) { - pgtable_area = sme_populate_pgd(pgd, pgtable_area, - paddr, - paddr + PMD_FLAGS); - - pgtable_area = sme_populate_pgd(pgd, pgtable_area, - paddr + decrypted_base, - paddr + PMD_FLAGS); + ppd.pmd_val = paddr + PMD_FLAGS; + ppd.vaddr = paddr; + sme_populate_pgd_large(&ppd); + + ppd.vaddr = paddr + decrypted_base; + sme_populate_pgd_large(&ppd);
paddr += PMD_PAGE_SIZE; }
/* Perform the encryption */ sme_encrypt_execute(kernel_start, kernel_start + decrypted_base, - kernel_len, workarea_start, (unsigned long)pgd); + kernel_len, workarea_start, (unsigned long)ppd.pgd);
/* * At this point we are running encrypted. Remove the mappings for * the decrypted areas - all that is needed for this is to remove * the PGD entry/entries. */ - sme_clear_pgd(pgd, kernel_start + decrypted_base, + sme_clear_pgd(ppd.pgd, kernel_start + decrypted_base, kernel_end + decrypted_base);
- sme_clear_pgd(pgd, workarea_start + decrypted_base, + sme_clear_pgd(ppd.pgd, workarea_start + decrypted_base, workarea_end + decrypted_base);
/* Flush the TLB - no globals so cr3 is enough */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit 2b5d00b6c2cdd94f6d6a494a6f6c0c0fc7b8e711 upstream.
In preparation for encrypting more than just the kernel during early boot processing, centralize the use of the PMD flag settings based on the type of mapping desired. When 4KB aligned encryption is added, this will allow either PTE flags or large page PMD flags to be used without requiring the caller to adjust.
Tested-by: Gabriel Craciunescu nix.or.die@gmail.com Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Reviewed-by: Borislav Petkov bp@suse.de Cc: Borislav Petkov bp@alien8.de Cc: Brijesh Singh brijesh.singh@amd.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20180110192615.6026.14767.stgit@tlendack-t1.amdoffi... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/mm/mem_encrypt.c | 137 ++++++++++++++++++++++++++-------------------- 1 file changed, 79 insertions(+), 58 deletions(-)
--- a/arch/x86/mm/mem_encrypt.c +++ b/arch/x86/mm/mem_encrypt.c @@ -217,31 +217,39 @@ struct sme_populate_pgd_data { void *pgtable_area; pgd_t *pgd;
- pmdval_t pmd_val; + pmdval_t pmd_flags; + unsigned long paddr; + unsigned long vaddr; + unsigned long vaddr_end; };
-static void __init sme_clear_pgd(pgd_t *pgd_base, unsigned long start, - unsigned long end) +static void __init sme_clear_pgd(struct sme_populate_pgd_data *ppd) { unsigned long pgd_start, pgd_end, pgd_size; pgd_t *pgd_p;
- pgd_start = start & PGDIR_MASK; - pgd_end = end & PGDIR_MASK; + pgd_start = ppd->vaddr & PGDIR_MASK; + pgd_end = ppd->vaddr_end & PGDIR_MASK;
- pgd_size = (((pgd_end - pgd_start) / PGDIR_SIZE) + 1); - pgd_size *= sizeof(pgd_t); + pgd_size = (((pgd_end - pgd_start) / PGDIR_SIZE) + 1) * sizeof(pgd_t);
- pgd_p = pgd_base + pgd_index(start); + pgd_p = ppd->pgd + pgd_index(ppd->vaddr);
memset(pgd_p, 0, pgd_size); }
-#define PGD_FLAGS _KERNPG_TABLE_NOENC -#define P4D_FLAGS _KERNPG_TABLE_NOENC -#define PUD_FLAGS _KERNPG_TABLE_NOENC -#define PMD_FLAGS (__PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL) +#define PGD_FLAGS _KERNPG_TABLE_NOENC +#define P4D_FLAGS _KERNPG_TABLE_NOENC +#define PUD_FLAGS _KERNPG_TABLE_NOENC + +#define PMD_FLAGS_LARGE (__PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL) + +#define PMD_FLAGS_DEC PMD_FLAGS_LARGE +#define PMD_FLAGS_DEC_WP ((PMD_FLAGS_DEC & ~_PAGE_CACHE_MASK) | \ + (_PAGE_PAT | _PAGE_PWT)) + +#define PMD_FLAGS_ENC (PMD_FLAGS_LARGE | _PAGE_ENC)
static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd) { @@ -310,7 +318,35 @@ static void __init sme_populate_pgd_larg
pmd_p += pmd_index(ppd->vaddr); if (!native_pmd_val(*pmd_p) || !(native_pmd_val(*pmd_p) & _PAGE_PSE)) - native_set_pmd(pmd_p, native_make_pmd(ppd->pmd_val)); + native_set_pmd(pmd_p, native_make_pmd(ppd->paddr | ppd->pmd_flags)); +} + +static void __init __sme_map_range(struct sme_populate_pgd_data *ppd, + pmdval_t pmd_flags) +{ + ppd->pmd_flags = pmd_flags; + + while (ppd->vaddr < ppd->vaddr_end) { + sme_populate_pgd_large(ppd); + + ppd->vaddr += PMD_PAGE_SIZE; + ppd->paddr += PMD_PAGE_SIZE; + } +} + +static void __init sme_map_range_encrypted(struct sme_populate_pgd_data *ppd) +{ + __sme_map_range(ppd, PMD_FLAGS_ENC); +} + +static void __init sme_map_range_decrypted(struct sme_populate_pgd_data *ppd) +{ + __sme_map_range(ppd, PMD_FLAGS_DEC); +} + +static void __init sme_map_range_decrypted_wp(struct sme_populate_pgd_data *ppd) +{ + __sme_map_range(ppd, PMD_FLAGS_DEC_WP); }
static unsigned long __init sme_pgtable_calc(unsigned long len) @@ -370,7 +406,6 @@ void __init sme_encrypt_kernel(void) unsigned long kernel_start, kernel_end, kernel_len; struct sme_populate_pgd_data ppd; unsigned long pgtable_area_len; - unsigned long paddr, pmd_flags; unsigned long decrypted_base;
if (!sme_active()) @@ -442,14 +477,10 @@ void __init sme_encrypt_kernel(void) * addressing the workarea. */ ppd.pgd = (pgd_t *)native_read_cr3_pa(); - paddr = workarea_start; - while (paddr < workarea_end) { - ppd.pmd_val = paddr + PMD_FLAGS; - ppd.vaddr = paddr; - sme_populate_pgd_large(&ppd); - - paddr += PMD_PAGE_SIZE; - } + ppd.paddr = workarea_start; + ppd.vaddr = workarea_start; + ppd.vaddr_end = workarea_end; + sme_map_range_decrypted(&ppd);
/* Flush the TLB - no globals so cr3 is enough */ native_write_cr3(__native_read_cr3()); @@ -464,17 +495,6 @@ void __init sme_encrypt_kernel(void) memset(ppd.pgd, 0, sizeof(pgd_t) * PTRS_PER_PGD); ppd.pgtable_area += sizeof(pgd_t) * PTRS_PER_PGD;
- /* Add encrypted kernel (identity) mappings */ - pmd_flags = PMD_FLAGS | _PAGE_ENC; - paddr = kernel_start; - while (paddr < kernel_end) { - ppd.pmd_val = paddr + pmd_flags; - ppd.vaddr = paddr; - sme_populate_pgd_large(&ppd); - - paddr += PMD_PAGE_SIZE; - } - /* * A different PGD index/entry must be used to get different * pagetable entries for the decrypted mapping. Choose the next @@ -484,29 +504,28 @@ void __init sme_encrypt_kernel(void) decrypted_base = (pgd_index(workarea_end) + 1) & (PTRS_PER_PGD - 1); decrypted_base <<= PGDIR_SHIFT;
- /* Add decrypted, write-protected kernel (non-identity) mappings */ - pmd_flags = (PMD_FLAGS & ~_PAGE_CACHE_MASK) | (_PAGE_PAT | _PAGE_PWT); - paddr = kernel_start; - while (paddr < kernel_end) { - ppd.pmd_val = paddr + pmd_flags; - ppd.vaddr = paddr + decrypted_base; - sme_populate_pgd_large(&ppd); + /* Add encrypted kernel (identity) mappings */ + ppd.paddr = kernel_start; + ppd.vaddr = kernel_start; + ppd.vaddr_end = kernel_end; + sme_map_range_encrypted(&ppd);
- paddr += PMD_PAGE_SIZE; - } + /* Add decrypted, write-protected kernel (non-identity) mappings */ + ppd.paddr = kernel_start; + ppd.vaddr = kernel_start + decrypted_base; + ppd.vaddr_end = kernel_end + decrypted_base; + sme_map_range_decrypted_wp(&ppd);
/* Add decrypted workarea mappings to both kernel mappings */ - paddr = workarea_start; - while (paddr < workarea_end) { - ppd.pmd_val = paddr + PMD_FLAGS; - ppd.vaddr = paddr; - sme_populate_pgd_large(&ppd); - - ppd.vaddr = paddr + decrypted_base; - sme_populate_pgd_large(&ppd); - - paddr += PMD_PAGE_SIZE; - } + ppd.paddr = workarea_start; + ppd.vaddr = workarea_start; + ppd.vaddr_end = workarea_end; + sme_map_range_decrypted(&ppd); + + ppd.paddr = workarea_start; + ppd.vaddr = workarea_start + decrypted_base; + ppd.vaddr_end = workarea_end + decrypted_base; + sme_map_range_decrypted(&ppd);
/* Perform the encryption */ sme_encrypt_execute(kernel_start, kernel_start + decrypted_base, @@ -517,11 +536,13 @@ void __init sme_encrypt_kernel(void) * the decrypted areas - all that is needed for this is to remove * the PGD entry/entries. */ - sme_clear_pgd(ppd.pgd, kernel_start + decrypted_base, - kernel_end + decrypted_base); - - sme_clear_pgd(ppd.pgd, workarea_start + decrypted_base, - workarea_end + decrypted_base); + ppd.vaddr = kernel_start + decrypted_base; + ppd.vaddr_end = kernel_end + decrypted_base; + sme_clear_pgd(&ppd); + + ppd.vaddr = workarea_start + decrypted_base; + ppd.vaddr_end = workarea_end + decrypted_base; + sme_clear_pgd(&ppd);
/* Flush the TLB - no globals so cr3 is enough */ native_write_cr3(__native_read_cr3());
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit cc5f01e28d6c60f274fd1e33b245f679f79f543c upstream.
In preparation for encrypting more than just the kernel, the encryption support in sme_encrypt_kernel() needs to support 4KB page aligned encryption instead of just 2MB large page aligned encryption.
Update the routines that populate the PGD to support non-2MB aligned addresses. This is done by creating PTE page tables for the start and end portion of the address range that fall outside of the 2MB alignment. This results in, at most, two extra pages to hold the PTE entries for each mapping of a range.
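The resulting range split can be sketched independently of the pagetable details (ALIGN_UP and the print helpers are illustrative; only the head/middle/tail decomposition mirrors the patch):

  #include <stdio.h>

  #define PMD_SIZE       (1UL << 21)    /* 2MB */
  #define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

  static void map_ptes(unsigned long s, unsigned long e)
  {
      if (s < e)
          printf("PTE map %#lx - %#lx\n", s, e);
  }

  static void map_pmds(unsigned long s, unsigned long e)
  {
      if (s < e)
          printf("PMD map %#lx - %#lx\n", s, e);
  }

  /* PTE-map the unaligned head and tail, 2MB-PMD-map the middle. */
  static void map_range(unsigned long start, unsigned long end)
  {
      unsigned long head_end = ALIGN_UP(start, PMD_SIZE);
      unsigned long mid_end = end & ~(PMD_SIZE - 1);

      if (head_end > end)
          head_end = end;        /* range fits within one 2MB page */
      if (mid_end < head_end)
          mid_end = head_end;

      map_ptes(start, head_end); /* at most one PTE page */
      map_pmds(head_end, mid_end);
      map_ptes(mid_end, end);    /* at most one PTE page */
  }

  int main(void)
  {
      map_range(0x201000, 0x603000);    /* unaligned head and tail */
      return 0;
  }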
Tested-by: Gabriel Craciunescu nix.or.die@gmail.com Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Reviewed-by: Borislav Petkov bp@suse.de Cc: Borislav Petkov bp@alien8.de Cc: Brijesh Singh brijesh.singh@amd.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20180110192626.6026.75387.stgit@tlendack-t1.amdoffi... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/mm/mem_encrypt.c | 123 +++++++++++++++++++++++++++++++++++------ arch/x86/mm/mem_encrypt_boot.S | 20 ++++-- 2 files changed, 121 insertions(+), 22 deletions(-)
--- a/arch/x86/mm/mem_encrypt.c +++ b/arch/x86/mm/mem_encrypt.c @@ -218,6 +218,7 @@ struct sme_populate_pgd_data { pgd_t *pgd;
pmdval_t pmd_flags; + pteval_t pte_flags; unsigned long paddr;
unsigned long vaddr; @@ -242,6 +243,7 @@ static void __init sme_clear_pgd(struct #define PGD_FLAGS _KERNPG_TABLE_NOENC #define P4D_FLAGS _KERNPG_TABLE_NOENC #define PUD_FLAGS _KERNPG_TABLE_NOENC +#define PMD_FLAGS _KERNPG_TABLE_NOENC
#define PMD_FLAGS_LARGE (__PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL)
@@ -251,7 +253,15 @@ static void __init sme_clear_pgd(struct
#define PMD_FLAGS_ENC (PMD_FLAGS_LARGE | _PAGE_ENC)
-static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd) +#define PTE_FLAGS (__PAGE_KERNEL_EXEC & ~_PAGE_GLOBAL) + +#define PTE_FLAGS_DEC PTE_FLAGS +#define PTE_FLAGS_DEC_WP ((PTE_FLAGS_DEC & ~_PAGE_CACHE_MASK) | \ + (_PAGE_PAT | _PAGE_PWT)) + +#define PTE_FLAGS_ENC (PTE_FLAGS | _PAGE_ENC) + +static pmd_t __init *sme_prepare_pgd(struct sme_populate_pgd_data *ppd) { pgd_t *pgd_p; p4d_t *p4d_p; @@ -302,7 +312,7 @@ static void __init sme_populate_pgd_larg pud_p += pud_index(ppd->vaddr); if (native_pud_val(*pud_p)) { if (native_pud_val(*pud_p) & _PAGE_PSE) - return; + return NULL;
pmd_p = (pmd_t *)(native_pud_val(*pud_p) & ~PTE_FLAGS_MASK); } else { @@ -316,16 +326,55 @@ static void __init sme_populate_pgd_larg native_set_pud(pud_p, pud); }
+ return pmd_p; +} + +static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd) +{ + pmd_t *pmd_p; + + pmd_p = sme_prepare_pgd(ppd); + if (!pmd_p) + return; + pmd_p += pmd_index(ppd->vaddr); if (!native_pmd_val(*pmd_p) || !(native_pmd_val(*pmd_p) & _PAGE_PSE)) native_set_pmd(pmd_p, native_make_pmd(ppd->paddr | ppd->pmd_flags)); }
-static void __init __sme_map_range(struct sme_populate_pgd_data *ppd, - pmdval_t pmd_flags) +static void __init sme_populate_pgd(struct sme_populate_pgd_data *ppd) { - ppd->pmd_flags = pmd_flags; + pmd_t *pmd_p; + pte_t *pte_p;
+ pmd_p = sme_prepare_pgd(ppd); + if (!pmd_p) + return; + + pmd_p += pmd_index(ppd->vaddr); + if (native_pmd_val(*pmd_p)) { + if (native_pmd_val(*pmd_p) & _PAGE_PSE) + return; + + pte_p = (pte_t *)(native_pmd_val(*pmd_p) & ~PTE_FLAGS_MASK); + } else { + pmd_t pmd; + + pte_p = ppd->pgtable_area; + memset(pte_p, 0, sizeof(*pte_p) * PTRS_PER_PTE); + ppd->pgtable_area += sizeof(*pte_p) * PTRS_PER_PTE; + + pmd = native_make_pmd((pteval_t)pte_p + PMD_FLAGS); + native_set_pmd(pmd_p, pmd); + } + + pte_p += pte_index(ppd->vaddr); + if (!native_pte_val(*pte_p)) + native_set_pte(pte_p, native_make_pte(ppd->paddr | ppd->pte_flags)); +} + +static void __init __sme_map_range_pmd(struct sme_populate_pgd_data *ppd) +{ while (ppd->vaddr < ppd->vaddr_end) { sme_populate_pgd_large(ppd);
@@ -334,33 +383,71 @@ static void __init __sme_map_range(struc } }
+static void __init __sme_map_range_pte(struct sme_populate_pgd_data *ppd) +{ + while (ppd->vaddr < ppd->vaddr_end) { + sme_populate_pgd(ppd); + + ppd->vaddr += PAGE_SIZE; + ppd->paddr += PAGE_SIZE; + } +} + +static void __init __sme_map_range(struct sme_populate_pgd_data *ppd, + pmdval_t pmd_flags, pteval_t pte_flags) +{ + unsigned long vaddr_end; + + ppd->pmd_flags = pmd_flags; + ppd->pte_flags = pte_flags; + + /* Save original end value since we modify the struct value */ + vaddr_end = ppd->vaddr_end; + + /* If start is not 2MB aligned, create PTE entries */ + ppd->vaddr_end = ALIGN(ppd->vaddr, PMD_PAGE_SIZE); + __sme_map_range_pte(ppd); + + /* Create PMD entries */ + ppd->vaddr_end = vaddr_end & PMD_PAGE_MASK; + __sme_map_range_pmd(ppd); + + /* If end is not 2MB aligned, create PTE entries */ + ppd->vaddr_end = vaddr_end; + __sme_map_range_pte(ppd); +} + static void __init sme_map_range_encrypted(struct sme_populate_pgd_data *ppd) { - __sme_map_range(ppd, PMD_FLAGS_ENC); + __sme_map_range(ppd, PMD_FLAGS_ENC, PTE_FLAGS_ENC); }
static void __init sme_map_range_decrypted(struct sme_populate_pgd_data *ppd) { - __sme_map_range(ppd, PMD_FLAGS_DEC); + __sme_map_range(ppd, PMD_FLAGS_DEC, PTE_FLAGS_DEC); }
static void __init sme_map_range_decrypted_wp(struct sme_populate_pgd_data *ppd) { - __sme_map_range(ppd, PMD_FLAGS_DEC_WP); + __sme_map_range(ppd, PMD_FLAGS_DEC_WP, PTE_FLAGS_DEC_WP); }
static unsigned long __init sme_pgtable_calc(unsigned long len) { - unsigned long p4d_size, pud_size, pmd_size; + unsigned long p4d_size, pud_size, pmd_size, pte_size; unsigned long total;
/* * Perform a relatively simplistic calculation of the pagetable - * entries that are needed. That mappings will be covered by 2MB - * PMD entries so we can conservatively calculate the required + * entries that are needed. Those mappings will be covered mostly + * by 2MB PMD entries so we can conservatively calculate the required * number of P4D, PUD and PMD structures needed to perform the - * mappings. Incrementing the count for each covers the case where - * the addresses cross entries. + * mappings. For mappings that are not 2MB aligned, PTE mappings + * would be needed for the start and end portion of the address range + * that fall outside of the 2MB alignment. This results in, at most, + * two extra pages to hold PTE entries for each range that is mapped. + * Incrementing the count for each covers the case where the addresses + * cross entries. */ if (IS_ENABLED(CONFIG_X86_5LEVEL)) { p4d_size = (ALIGN(len, PGDIR_SIZE) / PGDIR_SIZE) + 1; @@ -374,8 +461,9 @@ static unsigned long __init sme_pgtable_ } pmd_size = (ALIGN(len, PUD_SIZE) / PUD_SIZE) + 1; pmd_size *= sizeof(pmd_t) * PTRS_PER_PMD; + pte_size = 2 * sizeof(pte_t) * PTRS_PER_PTE;
- total = p4d_size + pud_size + pmd_size; + total = p4d_size + pud_size + pmd_size + pte_size;
/* * Now calculate the added pagetable structures needed to populate @@ -458,10 +546,13 @@ void __init sme_encrypt_kernel(void)
/* * The total workarea includes the executable encryption area and - * the pagetable area. + * the pagetable area. The start of the workarea is already 2MB + * aligned, align the end of the workarea on a 2MB boundary so that + * we don't try to create/allocate PTE entries from the workarea + * before it is mapped. */ workarea_len = execute_len + pgtable_area_len; - workarea_end = workarea_start + workarea_len; + workarea_end = ALIGN(workarea_start + workarea_len, PMD_PAGE_SIZE);
/* * Set the address to the start of where newly created pagetable --- a/arch/x86/mm/mem_encrypt_boot.S +++ b/arch/x86/mm/mem_encrypt_boot.S @@ -104,6 +104,7 @@ ENTRY(__enc_copy) mov %rdx, %cr4
push %r15 + push %r12
movq %rcx, %r9 /* Save kernel length */ movq %rdi, %r10 /* Save encrypted kernel address */ @@ -119,21 +120,27 @@ ENTRY(__enc_copy)
wbinvd /* Invalidate any cache entries */
- /* Copy/encrypt 2MB at a time */ + /* Copy/encrypt up to 2MB at a time */ + movq $PMD_PAGE_SIZE, %r12 1: + cmpq %r12, %r9 + jnb 2f + movq %r9, %r12 + +2: movq %r11, %rsi /* Source - decrypted kernel */ movq %r8, %rdi /* Dest - intermediate copy buffer */ - movq $PMD_PAGE_SIZE, %rcx /* 2MB length */ + movq %r12, %rcx rep movsb
movq %r8, %rsi /* Source - intermediate copy buffer */ movq %r10, %rdi /* Dest - encrypted kernel */ - movq $PMD_PAGE_SIZE, %rcx /* 2MB length */ + movq %r12, %rcx rep movsb
- addq $PMD_PAGE_SIZE, %r11 - addq $PMD_PAGE_SIZE, %r10 - subq $PMD_PAGE_SIZE, %r9 /* Kernel length decrement */ + addq %r12, %r11 + addq %r12, %r10 + subq %r12, %r9 /* Kernel length decrement */ jnz 1b /* Kernel length not zero? */
/* Restore PAT register */ @@ -142,6 +149,7 @@ ENTRY(__enc_copy) mov %r15, %rdx /* Restore original PAT value */ wrmsr
+ pop %r12 pop %r15
ret
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tero Kristo t-kristo@ti.com
commit 3c4d296e58a23687f2076d8ad531e6ae2b725846 upstream.
MMC3 hwmod data is missing the module_offs definition. MMC3 belongs under core, so add CORE_MOD for it.
Signed-off-by: Tero Kristo t-kristo@ti.com Signed-off-by: Tony Lindgren tony@atomide.com Cc: Adam Ford aford173@gmail.com Fixes: 6c0afb503937 ("clk: ti: convert to use proper register definition for all accesses") Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/mach-omap2/omap_hwmod_3xxx_data.c | 1 + 1 file changed, 1 insertion(+)
--- a/arch/arm/mach-omap2/omap_hwmod_3xxx_data.c +++ b/arch/arm/mach-omap2/omap_hwmod_3xxx_data.c @@ -1656,6 +1656,7 @@ static struct omap_hwmod omap3xxx_mmc3_h .main_clk = "mmchs3_fck", .prcm = { .omap2 = { + .module_offs = CORE_MOD, .prcm_reg_id = 1, .module_bit = OMAP3430_EN_MMC3_SHIFT, .idlest_reg_id = 1,
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit 107cd2532181b96c549e8f224cdcca8631c3076b upstream.
Currently the BSP microcode update code examines the initrd very early in the boot process. If SME is active, the initrd is treated as being encrypted but it has not been encrypted (in place) yet. Update the early boot code that encrypts the kernel to also encrypt the initrd so that early BSP microcode updates work.
Tested-by: Gabriel Craciunescu nix.or.die@gmail.com Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Reviewed-by: Borislav Petkov bp@suse.de Cc: Borislav Petkov bp@alien8.de Cc: Brijesh Singh brijesh.singh@amd.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/20180110192634.6026.10452.stgit@tlendack-t1.amdoffi... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/mem_encrypt.h | 4 +- arch/x86/kernel/head64.c | 4 +- arch/x86/kernel/setup.c | 8 ---- arch/x86/mm/mem_encrypt.c | 66 ++++++++++++++++++++++++++++++++----- arch/x86/mm/mem_encrypt_boot.S | 46 ++++++++++++------------- 5 files changed, 85 insertions(+), 43 deletions(-)
--- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -39,7 +39,7 @@ void __init sme_unmap_bootdata(char *rea
void __init sme_early_init(void);
-void __init sme_encrypt_kernel(void); +void __init sme_encrypt_kernel(struct boot_params *bp); void __init sme_enable(struct boot_params *bp);
/* Architecture __weak replacement functions */ @@ -61,7 +61,7 @@ static inline void __init sme_unmap_boot
static inline void __init sme_early_init(void) { }
-static inline void __init sme_encrypt_kernel(void) { } +static inline void __init sme_encrypt_kernel(struct boot_params *bp) { } static inline void __init sme_enable(struct boot_params *bp) { }
#endif /* CONFIG_AMD_MEM_ENCRYPT */ --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -157,8 +157,8 @@ unsigned long __head __startup_64(unsign p = fixup_pointer(&phys_base, physaddr); *p += load_delta - sme_get_me_mask();
- /* Encrypt the kernel (if SME is active) */ - sme_encrypt_kernel(); + /* Encrypt the kernel and related (if SME is active) */ + sme_encrypt_kernel(bp);
/* * Return the SME encryption mask (if SME is active) to be used as a --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -376,14 +376,6 @@ static void __init reserve_initrd(void) !ramdisk_image || !ramdisk_size) return; /* No initrd provided by bootloader */
- /* - * If SME is active, this memory will be marked encrypted by the - * kernel when it is accessed (including relocation). However, the - * ramdisk image was loaded decrypted by the bootloader, so make - * sure that it is encrypted before accessing it. - */ - sme_early_encrypt(ramdisk_image, ramdisk_end - ramdisk_image); - initrd_start = 0;
mapped_size = memblock_mem_size(max_pfn_mapped); --- a/arch/x86/mm/mem_encrypt.c +++ b/arch/x86/mm/mem_encrypt.c @@ -487,11 +487,12 @@ static unsigned long __init sme_pgtable_ return total; }
-void __init sme_encrypt_kernel(void) +void __init sme_encrypt_kernel(struct boot_params *bp) { unsigned long workarea_start, workarea_end, workarea_len; unsigned long execute_start, execute_end, execute_len; unsigned long kernel_start, kernel_end, kernel_len; + unsigned long initrd_start, initrd_end, initrd_len; struct sme_populate_pgd_data ppd; unsigned long pgtable_area_len; unsigned long decrypted_base; @@ -500,14 +501,15 @@ void __init sme_encrypt_kernel(void) return;
/* - * Prepare for encrypting the kernel by building new pagetables with - * the necessary attributes needed to encrypt the kernel in place. + * Prepare for encrypting the kernel and initrd by building new + * pagetables with the necessary attributes needed to encrypt the + * kernel in place. * * One range of virtual addresses will map the memory occupied - * by the kernel as encrypted. + * by the kernel and initrd as encrypted. * * Another range of virtual addresses will map the memory occupied - * by the kernel as decrypted and write-protected. + * by the kernel and initrd as decrypted and write-protected. * * The use of write-protect attribute will prevent any of the * memory from being cached. @@ -518,6 +520,20 @@ void __init sme_encrypt_kernel(void) kernel_end = ALIGN(__pa_symbol(_end), PMD_PAGE_SIZE); kernel_len = kernel_end - kernel_start;
+ initrd_start = 0; + initrd_end = 0; + initrd_len = 0; +#ifdef CONFIG_BLK_DEV_INITRD + initrd_len = (unsigned long)bp->hdr.ramdisk_size | + ((unsigned long)bp->ext_ramdisk_size << 32); + if (initrd_len) { + initrd_start = (unsigned long)bp->hdr.ramdisk_image | + ((unsigned long)bp->ext_ramdisk_image << 32); + initrd_end = PAGE_ALIGN(initrd_start + initrd_len); + initrd_len = initrd_end - initrd_start; + } +#endif + /* Set the encryption workarea to be immediately after the kernel */ workarea_start = kernel_end;
@@ -540,6 +556,8 @@ void __init sme_encrypt_kernel(void) */ pgtable_area_len = sizeof(pgd_t) * PTRS_PER_PGD; pgtable_area_len += sme_pgtable_calc(execute_end - kernel_start) * 2; + if (initrd_len) + pgtable_area_len += sme_pgtable_calc(initrd_len) * 2;
/* PUDs and PMDs needed in the current pagetables for the workarea */ pgtable_area_len += sme_pgtable_calc(execute_len + pgtable_area_len); @@ -578,9 +596,9 @@ void __init sme_encrypt_kernel(void)
/* * A new pagetable structure is being built to allow for the kernel - * to be encrypted. It starts with an empty PGD that will then be - * populated with new PUDs and PMDs as the encrypted and decrypted - * kernel mappings are created. + * and initrd to be encrypted. It starts with an empty PGD that will + * then be populated with new PUDs and PMDs as the encrypted and + * decrypted kernel mappings are created. */ ppd.pgd = ppd.pgtable_area; memset(ppd.pgd, 0, sizeof(pgd_t) * PTRS_PER_PGD); @@ -593,6 +611,12 @@ void __init sme_encrypt_kernel(void) * the base of the mapping. */ decrypted_base = (pgd_index(workarea_end) + 1) & (PTRS_PER_PGD - 1); + if (initrd_len) { + unsigned long check_base; + + check_base = (pgd_index(initrd_end) + 1) & (PTRS_PER_PGD - 1); + decrypted_base = max(decrypted_base, check_base); + } decrypted_base <<= PGDIR_SHIFT;
/* Add encrypted kernel (identity) mappings */ @@ -607,6 +631,21 @@ void __init sme_encrypt_kernel(void) ppd.vaddr_end = kernel_end + decrypted_base; sme_map_range_decrypted_wp(&ppd);
+ if (initrd_len) { + /* Add encrypted initrd (identity) mappings */ + ppd.paddr = initrd_start; + ppd.vaddr = initrd_start; + ppd.vaddr_end = initrd_end; + sme_map_range_encrypted(&ppd); + /* + * Add decrypted, write-protected initrd (non-identity) mappings + */ + ppd.paddr = initrd_start; + ppd.vaddr = initrd_start + decrypted_base; + ppd.vaddr_end = initrd_end + decrypted_base; + sme_map_range_decrypted_wp(&ppd); + } + /* Add decrypted workarea mappings to both kernel mappings */ ppd.paddr = workarea_start; ppd.vaddr = workarea_start; @@ -622,6 +661,11 @@ void __init sme_encrypt_kernel(void) sme_encrypt_execute(kernel_start, kernel_start + decrypted_base, kernel_len, workarea_start, (unsigned long)ppd.pgd);
+ if (initrd_len) + sme_encrypt_execute(initrd_start, initrd_start + decrypted_base, + initrd_len, workarea_start, + (unsigned long)ppd.pgd); + /* * At this point we are running encrypted. Remove the mappings for * the decrypted areas - all that is needed for this is to remove @@ -631,6 +675,12 @@ void __init sme_encrypt_kernel(void) ppd.vaddr_end = kernel_end + decrypted_base; sme_clear_pgd(&ppd);
+ if (initrd_len) { + ppd.vaddr = initrd_start + decrypted_base; + ppd.vaddr_end = initrd_end + decrypted_base; + sme_clear_pgd(&ppd); + } + ppd.vaddr = workarea_start + decrypted_base; ppd.vaddr_end = workarea_end + decrypted_base; sme_clear_pgd(&ppd); --- a/arch/x86/mm/mem_encrypt_boot.S +++ b/arch/x86/mm/mem_encrypt_boot.S @@ -22,9 +22,9 @@ ENTRY(sme_encrypt_execute)
/* * Entry parameters: - * RDI - virtual address for the encrypted kernel mapping - * RSI - virtual address for the decrypted kernel mapping - * RDX - length of kernel + * RDI - virtual address for the encrypted mapping + * RSI - virtual address for the decrypted mapping + * RDX - length to encrypt * RCX - virtual address of the encryption workarea, including: * - stack page (PAGE_SIZE) * - encryption routine page (PAGE_SIZE) @@ -41,9 +41,9 @@ ENTRY(sme_encrypt_execute) addq $PAGE_SIZE, %rax /* Workarea encryption routine */
push %r12 - movq %rdi, %r10 /* Encrypted kernel */ - movq %rsi, %r11 /* Decrypted kernel */ - movq %rdx, %r12 /* Kernel length */ + movq %rdi, %r10 /* Encrypted area */ + movq %rsi, %r11 /* Decrypted area */ + movq %rdx, %r12 /* Area length */
/* Copy encryption routine into the workarea */ movq %rax, %rdi /* Workarea encryption routine */ @@ -52,10 +52,10 @@ ENTRY(sme_encrypt_execute) rep movsb
/* Setup registers for call */ - movq %r10, %rdi /* Encrypted kernel */ - movq %r11, %rsi /* Decrypted kernel */ + movq %r10, %rdi /* Encrypted area */ + movq %r11, %rsi /* Decrypted area */ movq %r8, %rdx /* Pagetables used for encryption */ - movq %r12, %rcx /* Kernel length */ + movq %r12, %rcx /* Area length */ movq %rax, %r8 /* Workarea encryption routine */ addq $PAGE_SIZE, %r8 /* Workarea intermediate copy buffer */
@@ -71,7 +71,7 @@ ENDPROC(sme_encrypt_execute)
ENTRY(__enc_copy) /* - * Routine used to encrypt kernel. + * Routine used to encrypt memory in place. * This routine must be run outside of the kernel proper since * the kernel will be encrypted during the process. So this * routine is defined here and then copied to an area outside @@ -79,19 +79,19 @@ ENTRY(__enc_copy) * during execution. * * On entry the registers must be: - * RDI - virtual address for the encrypted kernel mapping - * RSI - virtual address for the decrypted kernel mapping + * RDI - virtual address for the encrypted mapping + * RSI - virtual address for the decrypted mapping * RDX - address of the pagetables to use for encryption - * RCX - length of kernel + * RCX - length of area * R8 - intermediate copy buffer * * RAX - points to this routine * - * The kernel will be encrypted by copying from the non-encrypted - * kernel space to an intermediate buffer and then copying from the - * intermediate buffer back to the encrypted kernel space. The physical - * addresses of the two kernel space mappings are the same which - * results in the kernel being encrypted "in place". + * The area will be encrypted by copying from the non-encrypted + * memory space to an intermediate buffer and then copying from the + * intermediate buffer back to the encrypted memory space. The physical + * addresses of the two mappings are the same which results in the area + * being encrypted "in place". */ /* Enable the new page tables */ mov %rdx, %cr3 @@ -106,9 +106,9 @@ ENTRY(__enc_copy) push %r15 push %r12
- movq %rcx, %r9 /* Save kernel length */ - movq %rdi, %r10 /* Save encrypted kernel address */ - movq %rsi, %r11 /* Save decrypted kernel address */ + movq %rcx, %r9 /* Save area length */ + movq %rdi, %r10 /* Save encrypted area address */ + movq %rsi, %r11 /* Save decrypted area address */
/* Set the PAT register PA5 entry to write-protect */ movl $MSR_IA32_CR_PAT, %ecx @@ -128,13 +128,13 @@ ENTRY(__enc_copy) movq %r9, %r12
2: - movq %r11, %rsi /* Source - decrypted kernel */ + movq %r11, %rsi /* Source - decrypted area */ movq %r8, %rdi /* Dest - intermediate copy buffer */ movq %r12, %rcx rep movsb
movq %r8, %rsi /* Source - intermediate copy buffer */ - movq %r10, %rdi /* Dest - encrypted kernel */ + movq %r10, %rdi /* Dest - encrypted area */ movq %r12, %rcx rep movsb
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nir Perry nirperry@gmail.com
commit 4d94e776bd29670f01befa27e12df784fa05fa2e upstream.
The fix for handling two-finger scroll (4a646580f793 - "Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad") introduced a minor "typo" that broke how multi-touch events are decoded on some ALPS touchpads. For example, tapping with three fingers can no longer be used to emulate middle-mouse-button (the kernel doesn't recognize this as the proper event, and doesn't report it correctly to userspace). This affects touchpads that use the SS4 "plus" protocol variant, like those found on Dell E7270 & E7470 laptops (tested on E7270).
First, the code in alps_decode_ss4_v2() for case SS4_PACKET_ID_MULTI used inconsistent indices into "f->mt[]". You can see 0 & 1 are used for the "if" part but 2 & 3 are used for the "else" part.
Second, in the previous patch, new macros were introduced to decode X coordinates specific to the SS4 "plus" variant, but the macro defining the maximum X value wasn't changed accordingly. The macros that decode X values for the "plus" variant are effectively shifted right by 1 bit, but the max wasn't shifted too. This caused the driver to incorrectly handle the "no data" cases, which also interfered with how multi-touch was handled.
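To put numbers on the second point (derived from the macro values in the hunk below): since the "plus" X decode drops one bit, the "no data" sentinels must be halved as well, 8160 >> 1 = 4080 and 8176 >> 1 = 4088, which is exactly what the new SS4_PLUS_MFPACKET_NO_AX and SS4_PLUS_MFPACKET_NO_AX_BL values are.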
Fixes: 4a646580f793 ("Input: ALPS - fix two-finger scroll breakage...") Signed-off-by: Nir Perry nirperry@gmail.com Reviewed-by: Masaki Ota masaki.ota@jp.alps.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/input/mouse/alps.c | 23 +++++++++++++---------- drivers/input/mouse/alps.h | 10 ++++++---- 2 files changed, 19 insertions(+), 14 deletions(-)
--- a/drivers/input/mouse/alps.c
+++ b/drivers/input/mouse/alps.c
@@ -1250,29 +1250,32 @@ static int alps_decode_ss4_v2(struct alp
     case SS4_PACKET_ID_MULTI:
         if (priv->flags & ALPS_BUTTONPAD) {
             if (IS_SS4PLUS_DEV(priv->dev_id)) {
-                f->mt[0].x = SS4_PLUS_BTL_MF_X_V2(p, 0);
-                f->mt[1].x = SS4_PLUS_BTL_MF_X_V2(p, 1);
+                f->mt[2].x = SS4_PLUS_BTL_MF_X_V2(p, 0);
+                f->mt[3].x = SS4_PLUS_BTL_MF_X_V2(p, 1);
+                no_data_x = SS4_PLUS_MFPACKET_NO_AX_BL;
             } else {
                 f->mt[2].x = SS4_BTL_MF_X_V2(p, 0);
                 f->mt[3].x = SS4_BTL_MF_X_V2(p, 1);
+                no_data_x = SS4_MFPACKET_NO_AX_BL;
             }
+            no_data_y = SS4_MFPACKET_NO_AY_BL;

             f->mt[2].y = SS4_BTL_MF_Y_V2(p, 0);
             f->mt[3].y = SS4_BTL_MF_Y_V2(p, 1);
-            no_data_x = SS4_MFPACKET_NO_AX_BL;
-            no_data_y = SS4_MFPACKET_NO_AY_BL;
         } else {
             if (IS_SS4PLUS_DEV(priv->dev_id)) {
-                f->mt[0].x = SS4_PLUS_STD_MF_X_V2(p, 0);
-                f->mt[1].x = SS4_PLUS_STD_MF_X_V2(p, 1);
+                f->mt[2].x = SS4_PLUS_STD_MF_X_V2(p, 0);
+                f->mt[3].x = SS4_PLUS_STD_MF_X_V2(p, 1);
+                no_data_x = SS4_PLUS_MFPACKET_NO_AX;
             } else {
-                f->mt[0].x = SS4_STD_MF_X_V2(p, 0);
-                f->mt[1].x = SS4_STD_MF_X_V2(p, 1);
+                f->mt[2].x = SS4_STD_MF_X_V2(p, 0);
+                f->mt[3].x = SS4_STD_MF_X_V2(p, 1);
+                no_data_x = SS4_MFPACKET_NO_AX;
             }
+            no_data_y = SS4_MFPACKET_NO_AY;
+
             f->mt[2].y = SS4_STD_MF_Y_V2(p, 0);
             f->mt[3].y = SS4_STD_MF_Y_V2(p, 1);
-            no_data_x = SS4_MFPACKET_NO_AX;
-            no_data_y = SS4_MFPACKET_NO_AY;
         }

         f->first_mp = 0;
--- a/drivers/input/mouse/alps.h
+++ b/drivers/input/mouse/alps.h
@@ -141,10 +141,12 @@ enum SS4_PACKET_ID {
 #define SS4_TS_Z_V2(_b)        (s8)(_b[4] & 0x7F)

-#define SS4_MFPACKET_NO_AX     8160    /* X-Coordinate value */
-#define SS4_MFPACKET_NO_AY     4080    /* Y-Coordinate value */
-#define SS4_MFPACKET_NO_AX_BL  8176    /* Buttonless X-Coordinate value */
-#define SS4_MFPACKET_NO_AY_BL  4088    /* Buttonless Y-Coordinate value */
+#define SS4_MFPACKET_NO_AX     8160    /* X-Coordinate value */
+#define SS4_MFPACKET_NO_AY     4080    /* Y-Coordinate value */
+#define SS4_MFPACKET_NO_AX_BL  8176    /* Buttonless X-Coord value */
+#define SS4_MFPACKET_NO_AY_BL  4088    /* Buttonless Y-Coord value */
+#define SS4_PLUS_MFPACKET_NO_AX    4080    /* SS4 PLUS, X */
+#define SS4_PLUS_MFPACKET_NO_AX_BL 4088    /* Buttonless SS4 PLUS, X */

 /*
  * enum V7_PACKET_ID - defines the packet type for V7
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nick Desaulniers nick.desaulniers@gmail.com
commit 55edde9fff1ae4114c893c572e641620c76c9c21 upstream.
KASAN found a use-after-free caused by a dangling pointer: as the report below shows, rmi_f11_attention() accesses drvdata->attn_data.data, which was freed in rmi_irq_fn().
[ 311.424062] BUG: KASAN: use-after-free in rmi_f11_attention+0x526/0x5e0 [rmi_core]
[ 311.424067] Read of size 27 at addr ffff88041fd610db by task irq/131-i2c_hid/1162
[ 311.424075] CPU: 0 PID: 1162 Comm: irq/131-i2c_hid Not tainted 4.15.0-rc8+ #2
[ 311.424076] Hardware name: Razer Blade Stealth/Razer, BIOS 6.05 01/26/2017
[ 311.424078] Call Trace:
[ 311.424086]  dump_stack+0xae/0x12d
[ 311.424090]  ? _atomic_dec_and_lock+0x103/0x103
[ 311.424094]  ? show_regs_print_info+0xa/0xa
[ 311.424099]  ? input_handle_event+0x10b/0x810
[ 311.424104]  print_address_description+0x65/0x229
[ 311.424108]  kasan_report.cold.5+0xa7/0x281
[ 311.424117]  rmi_f11_attention+0x526/0x5e0 [rmi_core]
[ 311.424123]  ? memcpy+0x1f/0x50
[ 311.424132]  ? rmi_f11_attention+0x526/0x5e0 [rmi_core]
[ 311.424143]  ? rmi_f11_probe+0x1e20/0x1e20 [rmi_core]
[ 311.424153]  ? rmi_process_interrupt_requests+0x220/0x2a0 [rmi_core]
[ 311.424163]  ? rmi_irq_fn+0x22c/0x270 [rmi_core]
[ 311.424173]  ? rmi_process_interrupt_requests+0x2a0/0x2a0 [rmi_core]
[ 311.424177]  ? free_irq+0xa0/0xa0
[ 311.424180]  ? irq_finalize_oneshot.part.39+0xeb/0x180
[ 311.424190]  ? rmi_process_interrupt_requests+0x2a0/0x2a0 [rmi_core]
[ 311.424193]  ? irq_thread_fn+0x3d/0x80
[ 311.424197]  ? irq_finalize_oneshot.part.39+0x180/0x180
[ 311.424200]  ? irq_thread+0x21d/0x290
[ 311.424203]  ? irq_thread_check_affinity+0x170/0x170
[ 311.424207]  ? remove_wait_queue+0x150/0x150
[ 311.424212]  ? kasan_unpoison_shadow+0x30/0x40
[ 311.424214]  ? __init_waitqueue_head+0xa0/0xd0
[ 311.424218]  ? task_non_contending.cold.55+0x18/0x18
[ 311.424221]  ? irq_forced_thread_fn+0xa0/0xa0
[ 311.424226]  ? irq_thread_check_affinity+0x170/0x170
[ 311.424230]  ? kthread+0x19e/0x1c0
[ 311.424233]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 311.424237]  ? ret_from_fork+0x32/0x40
[ 311.424244] Allocated by task 899:
[ 311.424249]  kasan_kmalloc+0xbf/0xe0
[ 311.424252]  __kmalloc_track_caller+0xd9/0x1f0
[ 311.424255]  kmemdup+0x17/0x40
[ 311.424264]  rmi_set_attn_data+0xa4/0x1b0 [rmi_core]
[ 311.424269]  rmi_raw_event+0x10b/0x1f0 [hid_rmi]
[ 311.424278]  hid_input_report+0x1a8/0x2c0 [hid]
[ 311.424283]  i2c_hid_irq+0x146/0x1d0 [i2c_hid]
[ 311.424286]  irq_thread_fn+0x3d/0x80
[ 311.424288]  irq_thread+0x21d/0x290
[ 311.424291]  kthread+0x19e/0x1c0
[ 311.424293]  ret_from_fork+0x32/0x40
[ 311.424296] Freed by task 1162:
[ 311.424300]  kasan_slab_free+0x71/0xc0
[ 311.424303]  kfree+0x90/0x190
[ 311.424311]  rmi_irq_fn+0x1b2/0x270 [rmi_core]
[ 311.424319]  rmi_irq_fn+0x257/0x270 [rmi_core]
[ 311.424322]  irq_thread_fn+0x3d/0x80
[ 311.424324]  irq_thread+0x21d/0x290
[ 311.424327]  kthread+0x19e/0x1c0
[ 311.424330]  ret_from_fork+0x32/0x40
[ 311.424334] The buggy address belongs to the object at ffff88041fd610c0 which belongs to the cache kmalloc-64 of size 64
[ 311.424340] The buggy address is located 27 bytes inside of 64-byte region [ffff88041fd610c0, ffff88041fd61100)
[ 311.424344] The buggy address belongs to the page:
[ 311.424348] page:ffffea00107f5840 count:1 mapcount:0 mapping: (null) index:0x0
[ 311.424353] flags: 0x17ffffc0000100(slab)
[ 311.424358] raw: 0017ffffc0000100 0000000000000000 0000000000000000 00000001802a002a
[ 311.424363] raw: dead000000000100 dead000000000200 ffff8804228036c0 0000000000000000
[ 311.424366] page dumped because: kasan: bad access detected
[ 311.424369] Memory state around the buggy address:
[ 311.424373]  ffff88041fd60f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 311.424377]  ffff88041fd61000: fb fb fb fb fb fb fb fb fc fc fc fc fb fb fb fb
[ 311.424381] >ffff88041fd61080: fb fb fb fb fc fc fc fc fb fb fb fb fb fb fb fb
[ 311.424384]                                                        ^
[ 311.424387]  ffff88041fd61100: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
[ 311.424391]  ffff88041fd61180: fb fb fb fb fb fb fb fb fc fc fc fc fb fb fb fb
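The fix below is the classic free-then-clear pattern: rmi_irq_fn() can run again (it re-invokes itself while the attention FIFO is non-empty), so the freed pointer must be cleared before the next pass. A toy userspace model of the idiom (names here are illustrative, not the driver's):

#include <stdlib.h>

struct attn { void *data; };

static void handle(struct attn *a, int queued)
{
    if (a->data) {
        /* ... consume a->data ... */
        free(a->data);
        a->data = NULL;    /* the fix: no dangling pointer on re-entry */
    }
    if (queued)
        handle(a, queued - 1);    /* models rmi_irq_fn() calling itself */
}

int main(void)
{
    struct attn a = { .data = malloc(16) };

    handle(&a, 1);    /* the second pass now sees NULL, not freed memory */
    return 0;
}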
Signed-off-by: Nick Desaulniers nick.desaulniers@gmail.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/input/rmi4/rmi_driver.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/input/rmi4/rmi_driver.c
+++ b/drivers/input/rmi4/rmi_driver.c
@@ -230,8 +230,10 @@ static irqreturn_t rmi_irq_fn(int irq, v
         rmi_dbg(RMI_DEBUG_CORE, &rmi_dev->dev,
             "Failed to process interrupt request: %d\n", ret);

-    if (count)
+    if (count) {
         kfree(attn_data.data);
+        attn_data.data = NULL;
+    }

     if (!kfifo_is_empty(&drvdata->attn_fifo))
         return rmi_irq_fn(irq, dev_id);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 906bf7daa0618d0ef39f4872ca42218c29a3631f upstream.
Fix child-node lookup during probe, which ended up searching the whole device tree depth-first starting at the parent rather than just matching on its children.
To make things worse, the parent node was prematurely freed, while the child node was leaked.
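For reference, a minimal sketch of the corrected lookup pattern (kernel context assumed; the function name is made up for illustration):

#include <linux/of.h>

/* parent: the device's own node; returns 0 or a negative errno */
static int lookup_touch_child(struct device_node *parent)
{
    struct device_node *child;

    /* of_get_child_by_name() matches direct children only and takes a
     * reference on the result; of_find_node_by_name() instead searches
     * the whole tree from the given node and puts that node's reference,
     * which is exactly the bug described above. */
    child = of_get_child_by_name(parent, "touch");
    if (!child)
        return -EINVAL;

    /* ... read properties from child ... */

    of_node_put(child);    /* drop the reference taken by the lookup */
    return 0;
}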
Fixes: 2e57d56747e6 ("mfd: 88pm860x: Device tree support") Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/input/touchscreen/88pm860x-ts.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
--- a/drivers/input/touchscreen/88pm860x-ts.c
+++ b/drivers/input/touchscreen/88pm860x-ts.c
@@ -126,7 +126,7 @@ static int pm860x_touch_dt_init(struct p
     int data, n, ret;
     if (!np)
         return -ENODEV;
-    np = of_find_node_by_name(np, "touch");
+    np = of_get_child_by_name(np, "touch");
     if (!np) {
         dev_err(&pdev->dev, "Can't find touch node\n");
         return -EINVAL;
@@ -144,13 +144,13 @@ static int pm860x_touch_dt_init(struct p
     if (data) {
         ret = pm860x_reg_write(i2c, PM8607_GPADC_MISC1, data);
         if (ret < 0)
-            return -EINVAL;
+            goto err_put_node;
     }
     /* set tsi prebias time */
     if (!of_property_read_u32(np, "marvell,88pm860x-tsi-prebias", &data)) {
         ret = pm860x_reg_write(i2c, PM8607_TSI_PREBIAS, data);
         if (ret < 0)
-            return -EINVAL;
+            goto err_put_node;
     }
     /* set prebias & prechg time of pen detect */
     data = 0;
@@ -161,10 +161,18 @@ static int pm860x_touch_dt_init(struct p
     if (data) {
         ret = pm860x_reg_write(i2c, PM8607_PD_PREBIAS, data);
         if (ret < 0)
-            return -EINVAL;
+            goto err_put_node;
     }
     of_property_read_u32(np, "marvell,88pm860x-resistor-X", res_x);
+
+    of_node_put(np);
+    return 0;
+
+err_put_node:
+    of_node_put(np);
+
+    return -EINVAL;
 }
 #else
 #define pm860x_touch_dt_init(x, y, z)    (-1)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit dcaf12a8b0bbdbfcfa2be8dff2c4948d9844b4ad upstream.
Fix child-node lookup during probe, which ended up searching the whole device tree depth-first starting at the parent rather than just matching on its children.
Later sanity checks on node properties (which would likely be missing) should prevent this from causing much trouble however, especially as the original premature free of the parent node has already been fixed separately (but that "fix" was apparently never backported to stable).
Fixes: e7ec014a47e4 ("Input: twl6040-vibra - update for device tree support") Fixes: c52c545ead97 ("Input: twl6040-vibra - fix DT node memory management") Signed-off-by: Johan Hovold johan@kernel.org Acked-by: Peter Ujfalusi peter.ujfalusi@ti.com Tested-by: H. Nikolaus Schaller hns@goldelico.com (on Pyra OMAP5 hardware) Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/input/misc/twl6040-vibra.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/input/misc/twl6040-vibra.c
+++ b/drivers/input/misc/twl6040-vibra.c
@@ -248,8 +248,7 @@ static int twl6040_vibra_probe(struct pl
     int vddvibr_uV = 0;
     int error;

-    of_node_get(twl6040_core_dev->of_node);
-    twl6040_core_node = of_find_node_by_name(twl6040_core_dev->of_node,
+    twl6040_core_node = of_get_child_by_name(twl6040_core_dev->of_node,
                          "vibra");
     if (!twl6040_core_node) {
         dev_err(&pdev->dev, "parent of node is missing?\n");
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 5b189201993ab03001a398de731045bfea90c689 upstream.
A helper that was supposed to look up a child node by name was using the wrong of-helper and ended up prematurely freeing the parent of-node while searching the whole device tree depth-first starting at the parent node.
Fixes: 64b9e4d803b1 ("input: twl4030-vibra: Support for DT booted kernel") Fixes: e661d0a04462 ("Input: twl4030-vibra - fix ERROR: Bad of_node_put() warning") Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/input/misc/twl4030-vibra.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/input/misc/twl4030-vibra.c
+++ b/drivers/input/misc/twl4030-vibra.c
@@ -178,12 +178,14 @@ static SIMPLE_DEV_PM_OPS(twl4030_vibra_p
              twl4030_vibra_suspend, twl4030_vibra_resume);

 static bool twl4030_vibra_check_coexist(struct twl4030_vibra_data *pdata,
-                  struct device_node *node)
+                  struct device_node *parent)
 {
+    struct device_node *node;
+
     if (pdata && pdata->coexist)
         return true;

-    node = of_find_node_by_name(node, "codec");
+    node = of_get_child_by_name(parent, "codec");
     if (node) {
         of_node_put(node);
         return true;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit 1ebe1eaf2f02784921759992ae1fde1a9bec8fd0 upstream.
Since enums do not get converted by the TRACE_EVENT macro into their values, the event format displays the enum name and not the value. This breaks tools like perf and trace-cmd that need to interpret the raw binary data. To solve this, an enum map was created to convert these enums into their actual numbers on boot up. This is done by TRACE_EVENTS() adding a TRACE_DEFINE_ENUM() macro.
Some enums were not being converted. This was caused by an optimization that had a bug in it.
All calls get checked against this enum map to see if they should be converted or not, and it compares the call's system to the system that the enum map was created under. If they match, then the call is processed.
To cut down on the number of iterations needed to find the maps with a matching system, since calls and maps are grouped by system, when a match is made, the index into the map array is saved, so that the next call, if it belongs to the same system as the previous call, could start right at that array index and not have to scan all the previous arrays.
The problem was that the saved index was used as the variable to know if this is a call in a new system or not. If the index was zero, it was assumed that the call is in a new system and would keep incrementing the saved index until it found a matching system. The issue arises when the first matching system is at index zero. The next map, if it belonged to the same system, would then think it was the first match and increment the index to one. If the next call belonged to the same system, it would begin its search of the maps off by one, and miss the first enum that should be converted. This left a single enum not converted properly.
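A stripped-down standalone model of that bookkeeping (illustrative, not the kernel code) shows the off-by-one: using last_i == 0 as the "first match" test misfires exactly when the matching group starts at index 0:

#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *map_sys[] = { "sched", "sched", "irq" };
    int last_i = 0;
    int call, i;

    for (call = 0; call < 2; call++) {      /* two calls, both in "sched" */
        printf("call %d applies maps:", call);
        for (i = last_i; i < 3; i++) {
            if (strcmp(map_sys[i], "sched") != 0)
                continue;
            if (!last_i)            /* buggy: still true when i == 0 */
                last_i = i;         /* stays 0, then jumps to 1 */
            printf(" %d", i);
        }
        printf("\n");
    }
    return 0;    /* prints "0 1" for call 0 but only "1" for call 1 */
}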
Also add a comment to describe exactly what that index was for. It took me a bit too long to figure out what I was thinking when debugging this issue.
Link: http://lkml.kernel.org/r/717BE572-2070-4C1E-9902-9F2E0FEDA4F8@oracle.com
Fixes: 0c564a538aa93 ("tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values") Reported-by: Chuck Lever chuck.lever@oracle.com Tested-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/trace/trace_events.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2213,6 +2213,7 @@ void trace_event_eval_update(struct trac
 {
     struct trace_event_call *call, *p;
     const char *last_system = NULL;
+    bool first = false;
     int last_i;
     int i;

@@ -2220,15 +2221,28 @@ void trace_event_eval_update(struct trac
     list_for_each_entry_safe(call, p, &ftrace_events, list) {
         /* events are usually grouped together with systems */
         if (!last_system || call->class->system != last_system) {
+            first = true;
             last_i = 0;
             last_system = call->class->system;
         }

+        /*
+         * Since calls are grouped by systems, the likelihood that the
+         * next call in the iteration belongs to the same system as the
+         * previous call is high. As an optimization, we skip searching
+         * for a map[] that matches the call's system if the last call
+         * was from the same system. That's what last_i is for. If the
+         * call has the same system as the previous call, then last_i
+         * will be the index of the first map[] that has a matching
+         * system.
+         */
         for (i = last_i; i < len; i++) {
             if (call->class->system == map[i]->system) {
                 /* Save the first system if need be */
-                if (!last_i)
+                if (first) {
                     last_i = i;
+                    first = false;
+                }
                 update_event_printk(call, map[i]);
             }
         }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
commit b7563e2796f8b23c98afcfea7363194227fa089d upstream.
Stefan Wahren reports a problem with a warning fix that was merged for v4.15: we had lots of device nodes with a 'phys' property pointing to a device node that is not compliant with the binding documented in Documentation/devicetree/bindings/phy/phy-bindings.txt
This generally works because USB HCD drivers that support both the generic phy subsystem and the older usb-phy subsystem ignore most errors from phy_get() and related calls and then use the usb-phy driver instead.
However, it turns out that making the usb-nop-xceiv device compatible with the generic-phy binding changes the phy_get() return code from -EINVAL to -EPROBE_DEFER, and the dwc2 usb controller driver for bcm2835 now returns -EPROBE_DEFER from its probe function rather than ignoring the failure, breaking all USB support on raspberry-pi when CONFIG_GENERIC_PHY is enabled. The same code is used in the dwc3 driver and the usb_add_hcd() function, so a reasonable assumption would be that many other platforms are affected as well.
I have reviewed all the related patches and concluded that "usb-nop-xceiv" is the only USB phy that is affected by the change, and since it is by far the most commonly referenced phy, all the other USB phy drivers appear to be used in ways that are either safe in DT (they don't use the 'phys' property), or in the driver (they already ignore -EPROBE_DEFER from generic-phy when usb-phy is available).
To work around the problem, this adds a special case to _of_phy_get() so we ignore any PHY node that is compatible with "usb-nop-xceiv", as we know that this can never load no matter how much we defer. In the future, we might implement a generic-phy driver for "usb-nop-xceiv" and then remove this workaround.
Since we generally want older kernels to also work with the fixed devicetree files, it would be good to backport the patch into stable kernels as well (3.13+ are possibly affected), even though they don't contain any of the patches that may have caused regressions.
Fixes: 014d6da6cb25 ("ARM: dts: bcm283x: Fix DTC warnings about missing phy-cells")
Fixes: c5bbf358b790 ("arm: dts: nspire: Add missing #phy-cells to usb-nop-xceiv")
Fixes: 44e5dced2ef6 ("arm: dts: marvell: Add missing #phy-cells to usb-nop-xceiv")
Fixes: f568f6f554b8 ("ARM: dts: omap: Add missing #phy-cells to usb-nop-xceiv")
Fixes: d745d5f277bf ("ARM: dts: imx51-zii-rdu1: Add missing #phy-cells to usb-nop-xceiv")
Fixes: 915fbe59cbf2 ("ARM: dts: imx: Add missing #phy-cells to usb-nop-xceiv")
Link: https://marc.info/?l=linux-usb&m=151518314314753&w=2
Link: https://patchwork.kernel.org/patch/10158145/
Cc: Felipe Balbi balbi@kernel.org
Cc: Eric Anholt eric@anholt.net
Tested-by: Stefan Wahren stefan.wahren@i2se.com
Acked-by: Rob Herring robh@kernel.org
Tested-by: Hans Verkuil hans.verkuil@cisco.com
Acked-by: Kishon Vijay Abraham I kishon@ti.com
Signed-off-by: Arnd Bergmann arnd@arndb.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/phy/phy-core.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/phy/phy-core.c
+++ b/drivers/phy/phy-core.c
@@ -395,6 +395,10 @@ static struct phy *_of_phy_get(struct de
     if (ret)
         return ERR_PTR(-ENODEV);

+    /* This phy type handled by the usb-phy subsystem for now */
+    if (of_device_is_compatible(args.np, "usb-nop-xceiv"))
+        return ERR_PTR(-ENODEV);
+
     mutex_lock(&phy_provider_mutex);
     phy_provider = of_phy_provider_lookup(args.np);
     if (IS_ERR(phy_provider) || !try_module_get(phy_provider->owner)) {
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gregory CLEMENT gregory.clement@free-electrons.com
commit e3af9f7c6ece29fdb7fe0aeb83ac5d3077a06edb upstream.
On the CP modules found on Armada 7K/8K, many IP blocks actually also need a "functional" clock (from the bus). This patch adds them, which fixes some issues that hang the kernel:

If the Ethernet and sdhci drivers are built as modules and sdhci is loaded first, then the kernel hangs.
Fixes: bb16ea1742c8 ("mmc: sdhci-xenon: Fix clock resource by adding an optional bus clock") Reported-by: Riku Voipio riku.voipio@linaro.org Signed-off-by: Gregory CLEMENT gregory.clement@free-electrons.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi | 13 ++++++++-----
 arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi  |  9 ++++++---
 2 files changed, 14 insertions(+), 8 deletions(-)
--- a/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi
@@ -63,8 +63,10 @@
             cpm_ethernet: ethernet@0 {
                 compatible = "marvell,armada-7k-pp22";
                 reg = <0x0 0x100000>, <0x129000 0xb000>;
-                clocks = <&cpm_clk 1 3>, <&cpm_clk 1 9>, <&cpm_clk 1 5>;
-                clock-names = "pp_clk", "gop_clk", "mg_clk";
+                clocks = <&cpm_clk 1 3>, <&cpm_clk 1 9>,
+                         <&cpm_clk 1 5>, <&cpm_clk 1 18>;
+                clock-names = "pp_clk", "gop_clk",
+                              "mg_clk","axi_clk";
                 marvell,system-controller = <&cpm_syscon0>;
                 status = "disabled";
                 dma-coherent;
@@ -114,7 +116,8 @@
                 #size-cells = <0>;
                 compatible = "marvell,orion-mdio";
                 reg = <0x12a200 0x10>;
-                clocks = <&cpm_clk 1 9>, <&cpm_clk 1 5>;
+                clocks = <&cpm_clk 1 9>, <&cpm_clk 1 5>,
+                         <&cpm_clk 1 6>, <&cpm_clk 1 18>;
                 status = "disabled";
             };
@@ -295,8 +298,8 @@
                 compatible = "marvell,armada-cp110-sdhci";
                 reg = <0x780000 0x300>;
                 interrupts = <ICU_GRP_NSR 27 IRQ_TYPE_LEVEL_HIGH>;
-                clock-names = "core";
-                clocks = <&cpm_clk 1 4>;
+                clock-names = "core","axi";
+                clocks = <&cpm_clk 1 4>, <&cpm_clk 1 18>;
                 dma-coherent;
                 status = "disabled";
             };
--- a/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi
@@ -63,8 +63,10 @@
             cps_ethernet: ethernet@0 {
                 compatible = "marvell,armada-7k-pp22";
                 reg = <0x0 0x100000>, <0x129000 0xb000>;
-                clocks = <&cps_clk 1 3>, <&cps_clk 1 9>, <&cps_clk 1 5>;
-                clock-names = "pp_clk", "gop_clk", "mg_clk";
+                clocks = <&cps_clk 1 3>, <&cps_clk 1 9>,
+                         <&cps_clk 1 5>, <&cps_clk 1 18>;
+                clock-names = "pp_clk", "gop_clk",
+                              "mg_clk", "axi_clk";
                 marvell,system-controller = <&cps_syscon0>;
                 status = "disabled";
                 dma-coherent;
@@ -114,7 +116,8 @@
                 #size-cells = <0>;
                 compatible = "marvell,orion-mdio";
                 reg = <0x12a200 0x10>;
-                clocks = <&cps_clk 1 9>, <&cps_clk 1 5>;
+                clocks = <&cps_clk 1 9>, <&cps_clk 1 5>,
+                         <&cps_clk 1 6>, <&cps_clk 1 18>;
                 status = "disabled";
             };
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maxime Ripard maxime.ripard@free-electrons.com
commit c13e7f313da33d1488355440f1a10feb1897480a upstream.
The DRM driver most notably, but also (for now) out-of-tree drivers like the VPU or GPU ones, are quite big consumers of large, contiguous memory buffers. However, sunxi_defconfig doesn't enable CMA, which would mitigate that, so those drivers are almost unusable.

Enable it to make sure they somewhat work.
Signed-off-by: Maxime Ripard maxime.ripard@free-electrons.com Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/configs/sunxi_defconfig | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/arm/configs/sunxi_defconfig
+++ b/arch/arm/configs/sunxi_defconfig
@@ -10,6 +10,7 @@ CONFIG_SMP=y
 CONFIG_NR_CPUS=8
 CONFIG_AEABI=y
 CONFIG_HIGHMEM=y
+CONFIG_CMA=y
 CONFIG_ARM_APPENDED_DTB=y
 CONFIG_ARM_ATAG_DTB_COMPAT=y
 CONFIG_CPU_FREQ=y
@@ -33,6 +34,7 @@ CONFIG_CAN_SUN4I=y
 # CONFIG_WIRELESS is not set
 CONFIG_DEVTMPFS=y
 CONFIG_DEVTMPFS_MOUNT=y
+CONFIG_DMA_CMA=y
 CONFIG_BLK_DEV_SD=y
 CONFIG_ATA=y
 CONFIG_AHCI_SUNXI=y
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Petazzoni thomas.petazzoni@free-electrons.com
commit 56aeb07c914a616ab84357d34f8414a69b140cdf upstream.
MPP7 is currently muxed as "gpio", but this function doesn't exist for MPP7; only "gpo" is available. This causes the following error:
kirkwood-pinctrl f1010000.pin-controller: unsupported function gpio on pin mpp7
pinctrl core: failed to register map default (6): invalid type given
kirkwood-pinctrl f1010000.pin-controller: error claiming hogs: -22
kirkwood-pinctrl f1010000.pin-controller: could not claim hogs: -22
kirkwood-pinctrl f1010000.pin-controller: unable to register pinctrl driver
kirkwood-pinctrl: probe of f1010000.pin-controller failed with error -22
So the pinctrl driver is not probed, all device drivers (including the UART driver) do a -EPROBE_DEFER, and therefore the system doesn't really boot (well, it boots, but with no UART, and no devices that require pin-muxing).
Back when the Device Tree file for this board was introduced, the definition was already wrong. The pinctrl driver has also always described this MPP7 function as "gpo". However, between Linux 4.10 and 4.11, a hog pin failing to be muxed was turned from a simple warning into a hard error that caused the entire pinctrl driver probe to bail out. This is probably the result of commit 6118714275f0a ("pinctrl: core: Fix pinctrl_register_and_init() with pinctrl_enable()").
This commit fixes the Device Tree to use the proper "gpo" function for MPP7, which restores booting on the OpenBlocks A7, broken since Linux 4.11.
Fixes: f24b56cbcd9d ("ARM: kirkwood: add support for OpenBlocks A7 platform") Signed-off-by: Thomas Petazzoni thomas.petazzoni@free-electrons.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Gregory CLEMENT gregory.clement@free-electrons.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/boot/dts/kirkwood-openblocks_a7.dts | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/arch/arm/boot/dts/kirkwood-openblocks_a7.dts
+++ b/arch/arm/boot/dts/kirkwood-openblocks_a7.dts
@@ -53,7 +53,8 @@
     };

     pinctrl: pin-controller@10000 {
-        pinctrl-0 = <&pmx_dip_switches &pmx_gpio_header>;
+        pinctrl-0 = <&pmx_dip_switches &pmx_gpio_header
+                 &pmx_gpio_header_gpo>;
         pinctrl-names = "default";

         pmx_uart0: pmx-uart0 {
@@ -85,11 +86,16 @@
          * ground.
          */
         pmx_gpio_header: pmx-gpio-header {
-            marvell,pins = "mpp17", "mpp7", "mpp29", "mpp28",
+            marvell,pins = "mpp17", "mpp29", "mpp28",
                        "mpp35", "mpp34", "mpp40";
             marvell,function = "gpio";
         };

+        pmx_gpio_header_gpo: pxm-gpio-header-gpo {
+            marvell,pins = "mpp7";
+            marvell,function = "gpo";
+        };
+
         pmx_gpio_init: pmx-init {
             marvell,pins = "mpp38";
             marvell,function = "gpio";
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephane Grosjean s.grosjean@peak-system.com
commit d8a243af1a68395e07ac85384a2740d4134c67f4 upstream.
In some rare conditions when running one PEAK USB-FD interface over a non-high-speed USB controller, one useless USB fragment might be sent. This patch fixes the way a USB command is fragmented when its length is greater than 64 bytes and the underlying USB controller is not a high-speed one.
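For illustration, with the old arithmetic a 512-byte command on a full-speed link was sent as n = 1 + 512/64 = 9 fragments, one more than needed. A standalone sketch of the corrected loop's accounting (constants illustrative):

#include <stdio.h>

#define PCAN_UFD_LOSPD_PKT_SIZE 64    /* low/full-speed fragment size */

int main(void)
{
    int cmd_len = 512;    /* command larger than one fragment */
    int packet_len = cmd_len > PCAN_UFD_LOSPD_PKT_SIZE ?
             PCAN_UFD_LOSPD_PKT_SIZE : cmd_len;
    int fragments = 0;

    do {    /* mirrors the fixed do/while loop below */
        fragments++;
        cmd_len -= packet_len;
        if (cmd_len < PCAN_UFD_LOSPD_PKT_SIZE)
            packet_len = cmd_len;    /* last (possibly zero) chunk */
    } while (packet_len > 0);

    printf("%d fragments\n", fragments);    /* 8, not the old 9 */
    return 0;
}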
Signed-off-by: Stephane Grosjean s.grosjean@peak-system.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/can/usb/peak_usb/pcan_usb_fd.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
--- a/drivers/net/can/usb/peak_usb/pcan_usb_fd.c
+++ b/drivers/net/can/usb/peak_usb/pcan_usb_fd.c
@@ -184,7 +184,7 @@ static int pcan_usb_fd_send_cmd(struct p
     void *cmd_head = pcan_usb_fd_cmd_buffer(dev);
     int err = 0;
     u8 *packet_ptr;
-    int i, n = 1, packet_len;
+    int packet_len;
     ptrdiff_t cmd_len;

     /* usb device unregistered? */
@@ -201,17 +201,13 @@ static int pcan_usb_fd_send_cmd(struct p
     }

     packet_ptr = cmd_head;
+    packet_len = cmd_len;

     /* firmware is not able to re-assemble 512 bytes buffer in full-speed */
-    if ((dev->udev->speed != USB_SPEED_HIGH) &&
-        (cmd_len > PCAN_UFD_LOSPD_PKT_SIZE)) {
-        packet_len = PCAN_UFD_LOSPD_PKT_SIZE;
-        n += cmd_len / packet_len;
-    } else {
-        packet_len = cmd_len;
-    }
+    if (unlikely(dev->udev->speed != USB_SPEED_HIGH))
+        packet_len = min(packet_len, PCAN_UFD_LOSPD_PKT_SIZE);

-    for (i = 0; i < n; i++) {
+    do {
         err = usb_bulk_msg(dev->udev,
                    usb_sndbulkpipe(dev->udev, PCAN_USBPRO_EP_CMDOUT),
@@ -224,7 +220,12 @@ static int pcan_usb_fd_send_cmd(struct p
         }

         packet_ptr += packet_len;
-    }
+        cmd_len -= packet_len;
+
+        if (cmd_len < PCAN_UFD_LOSPD_PKT_SIZE)
+            packet_len = cmd_len;
+
+    } while (packet_len > 0);

     return err;
 }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
commit 8cb68751c115d176ec851ca56ecfbb411568c9e8 upstream.
If an invalid CAN frame is received, from a driver or from a tun interface, a Kernel warning is generated.
This patch replaces the WARN_ONCE by a simple pr_warn_once, so that a kernel booted with panic_on_warn does not panic. A printk seems to be more appropriate here.
Reported-by: syzbot+4386709c0c1284dca827@syzkaller.appspotmail.com Suggested-by: Dmitry Vyukov dvyukov@google.com Acked-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/can/af_can.c | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-)
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -721,20 +721,16 @@ static int can_rcv(struct sk_buff *skb,
 {
     struct canfd_frame *cfd = (struct canfd_frame *)skb->data;

-    if (WARN_ONCE(dev->type != ARPHRD_CAN ||
-              skb->len != CAN_MTU ||
-              cfd->len > CAN_MAX_DLEN,
-              "PF_CAN: dropped non conform CAN skbuf: "
-              "dev type %d, len %d, datalen %d\n",
-              dev->type, skb->len, cfd->len))
-        goto drop;
+    if (unlikely(dev->type != ARPHRD_CAN || skb->len != CAN_MTU ||
+             cfd->len > CAN_MAX_DLEN)) {
+        pr_warn_once("PF_CAN: dropped non conform CAN skbuf: dev type %d, len %d, datalen %d\n",
+                 dev->type, skb->len, cfd->len);
+        kfree_skb(skb);
+        return NET_RX_DROP;
+    }

     can_receive(skb, dev);
     return NET_RX_SUCCESS;
-
-drop:
-    kfree_skb(skb);
-    return NET_RX_DROP;
 }

 static int canfd_rcv(struct sk_buff *skb, struct net_device *dev,
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
commit d4689846881d160a4d12a514e991a740bcb5d65a upstream.
If an invalid CANFD frame is received, from a driver or from a tun interface, a Kernel warning is generated.
This patch replaces the WARN_ONCE by a simple pr_warn_once, so that a kernel booted with panic_on_warn does not panic. A printk seems to be more appropriate here.
Reported-by: syzbot+e3b775f40babeff6e68b@syzkaller.appspotmail.com Suggested-by: Dmitry Vyukov dvyukov@google.com Acked-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/can/af_can.c | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-)
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -738,20 +738,16 @@ static int canfd_rcv(struct sk_buff *skb
 {
     struct canfd_frame *cfd = (struct canfd_frame *)skb->data;

-    if (WARN_ONCE(dev->type != ARPHRD_CAN ||
-              skb->len != CANFD_MTU ||
-              cfd->len > CANFD_MAX_DLEN,
-              "PF_CAN: dropped non conform CAN FD skbuf: "
-              "dev type %d, len %d, datalen %d\n",
-              dev->type, skb->len, cfd->len))
-        goto drop;
+    if (unlikely(dev->type != ARPHRD_CAN || skb->len != CANFD_MTU ||
+             cfd->len > CANFD_MAX_DLEN)) {
+        pr_warn_once("PF_CAN: dropped non conform CAN FD skbuf: dev type %d, len %d, datalen %d\n",
+                 dev->type, skb->len, cfd->len);
+        kfree_skb(skb);
+        return NET_RX_DROP;
+    }

     can_receive(skb, dev);
     return NET_RX_SUCCESS;
-
-drop:
-    kfree_skb(skb);
-    return NET_RX_DROP;
 }
/*
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeremy Compostella jeremy.compostella@intel.com
commit 89c6efa61f5709327ecfa24bff18e57a4e80c7fa upstream.
On a I2C_SMBUS_I2C_BLOCK_DATA read request, if data->block[0] is greater than I2C_SMBUS_BLOCK_MAX + 1, the underlying I2C driver writes data out of the msgbuf1 array boundary.
It is possible from a user application to run into that issue by calling the I2C_SMBUS ioctl with data.block[0] greater than I2C_SMBUS_BLOCK_MAX + 1.
This patch makes the code compliant with Documentation/i2c/dev-interface by raising an error when the requested size is larger than 32 bytes.
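A hedged userspace sketch of the failing request (the device path and chip address are placeholders): an I2C_SMBUS_I2C_BLOCK_DATA read whose block[0] exceeds I2C_SMBUS_BLOCK_MAX now fails with EINVAL instead of corrupting the kernel stack:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/i2c.h>
#include <linux/i2c-dev.h>

int main(void)
{
    union i2c_smbus_data data;
    struct i2c_smbus_ioctl_data args = {
        .read_write = I2C_SMBUS_READ,
        .command    = 0,
        .size       = I2C_SMBUS_I2C_BLOCK_DATA,
        .data       = &data,
    };
    int fd = open("/dev/i2c-0", O_RDWR);    /* placeholder adapter */

    if (fd < 0 || ioctl(fd, I2C_SLAVE, 0x50) < 0)    /* placeholder address */
        return 1;

    data.block[0] = 255;    /* > I2C_SMBUS_BLOCK_MAX (32) */
    if (ioctl(fd, I2C_SMBUS, &args) < 0)
        perror("I2C_SMBUS");    /* expected with the fix: EINVAL */
    return 0;
}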
Call Trace:
 [<ffffffff8139f695>] dump_stack+0x67/0x92
 [<ffffffff811802a4>] panic+0xc5/0x1eb
 [<ffffffff810ecb5f>] ? vprintk_default+0x1f/0x30
 [<ffffffff817456d3>] ? i2cdev_ioctl_smbus+0x303/0x320
 [<ffffffff8109a68b>] __stack_chk_fail+0x1b/0x20
 [<ffffffff817456d3>] i2cdev_ioctl_smbus+0x303/0x320
 [<ffffffff81745aed>] i2cdev_ioctl+0x4d/0x1e0
 [<ffffffff811f761a>] do_vfs_ioctl+0x2ba/0x490
 [<ffffffff81336e43>] ? security_file_ioctl+0x43/0x60
 [<ffffffff811f7869>] SyS_ioctl+0x79/0x90
 [<ffffffff81a22e97>] entry_SYSCALL_64_fastpath+0x12/0x6a
Signed-off-by: Jeremy Compostella jeremy.compostella@intel.com Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/i2c/i2c-core-smbus.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
--- a/drivers/i2c/i2c-core-smbus.c
+++ b/drivers/i2c/i2c-core-smbus.c
@@ -396,16 +396,17 @@ static s32 i2c_smbus_xfer_emulated(struc
            the underlying bus driver */
         break;
     case I2C_SMBUS_I2C_BLOCK_DATA:
+        if (data->block[0] > I2C_SMBUS_BLOCK_MAX) {
+            dev_err(&adapter->dev, "Invalid block %s size %d\n",
+                read_write == I2C_SMBUS_READ ? "read" : "write",
+                data->block[0]);
+            return -EINVAL;
+        }
+
         if (read_write == I2C_SMBUS_READ) {
             msg[1].len = data->block[0];
         } else {
             msg[0].len = data->block[0] + 1;
-            if (msg[0].len > I2C_SMBUS_BLOCK_MAX + 1) {
-                dev_err(&adapter->dev,
-                    "Invalid block write size %d\n",
-                    data->block[0]);
-                return -EINVAL;
-            }
             for (i = 1; i <= data->block[0]; i++)
                 msgbuf0[i] = data->block[i];
         }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xi Kangjie imxikangjie@gmail.com
commit 883d50f56d263f70fd73c0d96b09eb36c34e9305 upstream.
Since kernel 4.9, the thread_info has been moved into task_struct and no longer lives at the bottom of the kernel stack.
See commits c65eacbe290b ("sched/core: Allow putting thread_info into task_struct") and 15f4eae70d36 ("x86: Move thread_info into task_struct").
Before fix:

(gdb) set $current = $lx_current()
(gdb) p $lx_thread_info($current)
$1 = {flags = 1470918301}
(gdb) p $current.thread_info
$2 = {flags = 2147483648}

After fix:

(gdb) p $lx_thread_info($current)
$1 = {flags = 2147483648}
(gdb) p $current.thread_info
$2 = {flags = 2147483648}
Link: http://lkml.kernel.org/r/20180118210159.17223-1-imxikangjie@gmail.com Fixes: 15f4eae70d36 ("x86: Move thread_info into task_struct") Signed-off-by: Xi Kangjie imxikangjie@gmail.com Acked-by: Jan Kiszka jan.kiszka@siemens.com Acked-by: Kieran Bingham kbingham@kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- scripts/gdb/linux/tasks.py | 2 ++ 1 file changed, 2 insertions(+)
--- a/scripts/gdb/linux/tasks.py
+++ b/scripts/gdb/linux/tasks.py
@@ -96,6 +96,8 @@ def get_thread_info(task):
         thread_info_addr = task.address + ia64_task_size
         thread_info = thread_info_addr.cast(thread_info_ptr_type)
     else:
+        if task.type.fields()[0].type == thread_info_type.get_type():
+            return task['thread_info']
         thread_info = task['stack'].cast(thread_info_ptr_type)
     return thread_info.dereference()
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Dobriyan adobriyan@gmail.com
commit 8bb2ee192e482c5d500df9f2b1b26a560bd3026f upstream.
do_task_stat() accesses the IP and SP of a task without bumping the reference count of its stack (which became an entity with an independent lifetime at some point).
Steps to reproduce:
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    setrlimit(RLIMIT_CORE, &(struct rlimit){});

    while (1) {
        char buf[64];
        char buf2[4096];
        pid_t pid;
        int fd;

        pid = fork();
        if (pid == 0) {
            *(volatile int *)0 = 0;
        }

        snprintf(buf, sizeof(buf), "/proc/%u/stat", pid);
        fd = open(buf, O_RDONLY);
        read(fd, buf2, sizeof(buf2));
        close(fd);

        waitpid(pid, NULL, 0);
    }
    return 0;
}
BUG: unable to handle kernel paging request at 0000000000003fd8
IP: do_task_stat+0x8b4/0xaf0
PGD 800000003d73e067 P4D 800000003d73e067 PUD 3d558067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 1417 Comm: a.out Not tainted 4.15.0-rc8-dirty #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014
RIP: 0010:do_task_stat+0x8b4/0xaf0
Call Trace:
 proc_single_show+0x43/0x70
 seq_read+0xe6/0x3b0
 __vfs_read+0x1e/0x120
 vfs_read+0x84/0x110
 SyS_read+0x3d/0xa0
 entry_SYSCALL_64_fastpath+0x13/0x6c
RIP: 0033:0x7f4d7928cba0
RSP: 002b:00007ffddb245158 EFLAGS: 00000246
Code: 03 b7 a0 01 00 00 4c 8b 4c 24 70 4c 8b 44 24 78 4c 89 74 24 18 e9 91 f9 ff ff f6 45 4d 02 0f 84 fd f7 ff ff 48 8b 45 40 48 89 ef <48> 8b 80 d8 3f 00 00 48 89 44 24 20 e8 9b 97 eb ff 48 89 44 24
RIP: do_task_stat+0x8b4/0xaf0 RSP: ffffc90000607cc8
CR2: 0000000000003fd8
John Ogness said: for my tests I added an else case to verify that the race is hit and correctly mitigated.
Link: http://lkml.kernel.org/r/20180116175054.GA11513@avx2 Signed-off-by: Alexey Dobriyan adobriyan@gmail.com Reported-by: "Kohli, Gaurav" gkohli@codeaurora.org Tested-by: John Ogness john.ogness@linutronix.de Cc: Peter Zijlstra a.p.zijlstra@chello.nl Cc: Ingo Molnar mingo@elte.hu Cc: Oleg Nesterov oleg@redhat.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/proc/array.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -424,8 +424,11 @@ static int do_task_stat(struct seq_file
          * safe because the task has stopped executing permanently.
          */
         if (permitted && (task->flags & PF_DUMPCORE)) {
-            eip = KSTK_EIP(task);
-            esp = KSTK_ESP(task);
+            if (try_get_task_stack(task)) {
+                eip = KSTK_EIP(task);
+                esp = KSTK_ESP(task);
+                put_task_stack(task);
+            }
         }
     }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xinyu Lin xinyu0123@gmail.com
commit db5ff909798ef0099004ad50a0ff5fde92426fd1 upstream.
LITEON EP1 has the same timeout issues as CX1 series devices.
Revert max_sectors to the value of 1024.
Fixes: e0edc8c54646 ("libata: apply MAX_SEC_1024 to all CX1-JB*-HP devices") Signed-off-by: Xinyu Lin xinyu0123@gmail.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/ata/libata-core.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4439,6 +4439,7 @@ static const struct ata_blacklist_entry
      * https://bugzilla.kernel.org/show_bug.cgi?id=121671
      */
     { "LITEON CX1-JB*-HP",  NULL,       ATA_HORKAGE_MAX_SEC_1024 },
+    { "LITEON EP1-*",       NULL,       ATA_HORKAGE_MAX_SEC_1024 },

     /* Devices we expect to fail diagnostics */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hannes Reinecke hare@suse.de
commit c9f926000fe3b84135a81602a9f7e63a6a7898e2 upstream.
Handling CD-ROM devices from libsas is decidedly odd, as libata relies on SCSI EH to be started to figure out that no medium is present. So we cannot do asynchronous aborts for SATA devices.
Fixes: 909657615d9 ("scsi: libsas: allow async aborts") Signed-off-by: Hannes Reinecke hare@suse.com Reviewed-by: Christoph Hellwig hch@lst.de Tested-by: Yves-Alexis Perez corsac@debian.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/libsas/sas_scsi_host.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -486,15 +486,28 @@ static int sas_queue_reset(struct domain

 int sas_eh_abort_handler(struct scsi_cmnd *cmd)
 {
-    int res;
+    int res = TMF_RESP_FUNC_FAILED;
     struct sas_task *task = TO_SAS_TASK(cmd);
     struct Scsi_Host *host = cmd->device->host;
+    struct domain_device *dev = cmd_to_domain_dev(cmd);
     struct sas_internal *i = to_sas_internal(host->transportt);
+    unsigned long flags;

     if (!i->dft->lldd_abort_task)
         return FAILED;

-    res = i->dft->lldd_abort_task(task);
+    spin_lock_irqsave(host->host_lock, flags);
+    /* We cannot do async aborts for SATA devices */
+    if (dev_is_sata(dev) && !host->host_eh_scheduled) {
+        spin_unlock_irqrestore(host->host_lock, flags);
+        return FAILED;
+    }
+    spin_unlock_irqrestore(host->host_lock, flags);
+
+    if (task)
+        res = i->dft->lldd_abort_task(task);
+    else
+        SAS_DPRINTK("no task to abort\n");
+
     if (res == TMF_RESP_FUNC_SUCC || res == TMF_RESP_FUNC_COMPLETE)
         return SUCCESS;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sergey Senozhatsky sergey.senozhatsky.work@gmail.com
commit 62635ea8c18f0f62df4cc58379e4f1d33afd5801 upstream.
show_workqueue_state() can print out a lot of messages while being in atomic context, e.g. sysrq-t -> show_workqueue_state(). If the console device is slow it may end up triggering NMI hard lockup watchdog.
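The pattern applied below is the usual one for long dump loops; a minimal hedged sketch (generic names, kernel context assumed):

#include <linux/nmi.h>
#include <linux/printk.h>

static void dump_many_items(int nr_items)
{
    int i;

    for (i = 0; i < nr_items; i++) {
        pr_info("item %d ...\n", i);    /* slow if the console is slow */
        touch_nmi_watchdog();           /* keep the hard-lockup detector quiet */
    }
}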
Signed-off-by: Sergey Senozhatsky sergey.senozhatsky@gmail.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/workqueue.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -48,6 +48,7 @@
 #include <linux/nodemask.h>
 #include <linux/moduleparam.h>
 #include <linux/uaccess.h>
+#include <linux/nmi.h>

 #include "workqueue_internal.h"

@@ -4479,6 +4480,12 @@ void show_workqueue_state(void)
             if (pwq->nr_active || !list_empty(&pwq->delayed_works))
                 show_pwq(pwq);
             spin_unlock_irqrestore(&pwq->pool->lock, flags);
+            /*
+             * We could be printing a lot from atomic context, e.g.
+             * sysrq-t -> show_workqueue_state(). Avoid triggering
+             * hard lockup.
+             */
+            touch_nmi_watchdog();
         }
     }

@@ -4506,6 +4513,12 @@ void show_workqueue_state(void)
         pr_cont("\n");
     next_pool:
         spin_unlock_irqrestore(&pool->lock, flags);
+        /*
+         * We could be printing a lot from atomic context, e.g.
+         * sysrq-t -> show_workqueue_state(). Avoid triggering
+         * hard lockup.
+         */
+        touch_nmi_watchdog();
     }

     rcu_read_unlock_sched();
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rob Clark rclark@redhat.com
commit 8a510a5c75261ba0ec39155326982aa786541e29 upstream.
It looks like 'struct vmw_connector_state' is used in all cases, but only in stdu connectors was atomic_{duplicate,destroy}_state() properly subclassed, leading to writes beyond the end of the allocated connector state block and all sorts of fun memory-corruption-related crashes.
Fixes: d7721ca71126 ("drm/vmwgfx: Connector atomic state") Signed-off-by: Rob Clark rclark@redhat.com Reviewed-by: Thomas Hellstrom thellstrom@vmware.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c  | 4 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
@@ -266,8 +266,8 @@ static const struct drm_connector_funcs
     .set_property = vmw_du_connector_set_property,
     .destroy = vmw_ldu_connector_destroy,
     .reset = vmw_du_connector_reset,
-    .atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
-    .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
+    .atomic_duplicate_state = vmw_du_connector_duplicate_state,
+    .atomic_destroy_state = vmw_du_connector_destroy_state,
     .atomic_set_property = vmw_du_connector_atomic_set_property,
     .atomic_get_property = vmw_du_connector_atomic_get_property,
 };
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
@@ -420,8 +420,8 @@ static const struct drm_connector_funcs
     .set_property = vmw_du_connector_set_property,
     .destroy = vmw_sou_connector_destroy,
     .reset = vmw_du_connector_reset,
-    .atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
-    .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
+    .atomic_duplicate_state = vmw_du_connector_duplicate_state,
+    .atomic_destroy_state = vmw_du_connector_destroy_state,
     .atomic_set_property = vmw_du_connector_atomic_set_property,
     .atomic_get_property = vmw_du_connector_atomic_get_property,
 };
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Thornber thornber@redhat.com
commit bc68d0a43560e950850fc69b58f0f8254b28f6d6 upstream.
When inserting a new key/value pair into a btree we walk down the spine of btree nodes performing the following 2 operations:
i) making space for a new entry; ii) adjusting the first key entry if the new key is lower than any key in the node.

If the _root_ node is full, the function btree_split_beneath() allocates 2 new nodes and redistributes the root node's entries between them. The root node is left with 2 entries corresponding to the 2 new nodes.

btree_split_beneath() then adjusts the spine to point to one of the two new children. This means the first key is never adjusted if the new key was lower, i.e. operation (ii) gets missed out. This can result in the new key being 'lost' for a period, until another low-valued key is inserted that uncovers it.
This is a serious bug, and quite hard to make trigger in normal use. A reproducing test case ("thin create devices-in-reverse-order") is available as part of the thin-provisioning-tools project: https://github.com/jthornber/thin-provisioning-tools/blob/master/functional-...
Fix the issue by changing btree_split_beneath() so it no longer adjusts the spine. Instead it unlocks both the new nodes, and lets the main loop in btree_insert_raw() relock the appropriate one and make any necessary adjustments.
Reported-by: Monty Pavel monty_pavel@sina.com Signed-off-by: Joe Thornber thornber@redhat.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/persistent-data/dm-btree.c | 19 ++----------------- 1 file changed, 2 insertions(+), 17 deletions(-)
--- a/drivers/md/persistent-data/dm-btree.c
+++ b/drivers/md/persistent-data/dm-btree.c
@@ -683,23 +683,8 @@ static int btree_split_beneath(struct sh
     pn->keys[1] = rn->keys[0];
     memcpy_disk(value_ptr(pn, 1), &val, sizeof(__le64));

-    /*
-     * rejig the spine. This is ugly, since it knows too
-     * much about the spine
-     */
-    if (s->nodes[0] != new_parent) {
-        unlock_block(s->info, s->nodes[0]);
-        s->nodes[0] = new_parent;
-    }
-    if (key < le64_to_cpu(rn->keys[0])) {
-        unlock_block(s->info, right);
-        s->nodes[1] = left;
-    } else {
-        unlock_block(s->info, left);
-        s->nodes[1] = right;
-    }
-    s->count = 2;
-
+    unlock_block(s->info, left);
+    unlock_block(s->info, right);
     return 0;
 }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dennis Yang dennisyang@qnap.com
commit 490ae017f54e55bde382d45ea24bddfb6d1a0aaf upstream.
For btree removal, there is a corner case in which a single thread could take 6 locks, which is more than THIN_MAX_CONCURRENT_LOCKS (5) and leads to deadlock.

A btree removal might eventually call rebalance_children()->rebalance3() to rebalance the entries of three neighboring child nodes when the shadow_spine has already acquired two write locks. In rebalance3(), it tries to shadow and acquire the write locks of all three child nodes. However, shadowing a child node requires acquiring a read lock of the original child node and a write lock of the new block. Although the read lock will be released after block shadowing, shadowing the third child node in rebalance3() could still take a sixth lock. (2 write locks for the shadow_spine + 2 write locks for the first two child nodes' shadows + 1 write lock for the last child node's shadow + 1 read lock for the last child node)
Signed-off-by: Dennis Yang dennisyang@qnap.com Acked-by: Joe Thornber thornber@redhat.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-thin-metadata.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/md/dm-thin-metadata.c
+++ b/drivers/md/dm-thin-metadata.c
@@ -80,10 +80,14 @@
 #define SECTOR_TO_BLOCK_SHIFT 3

 /*
+ * For btree insert:
  *  3 for btree insert +
  *  2 for btree lookup used within space map
+ * For btree remove:
+ *  2 for shadow spine +
+ *  4 for rebalance 3 child node
  */
-#define THIN_MAX_CONCURRENT_LOCKS 5
+#define THIN_MAX_CONCURRENT_LOCKS 6

 /* This should be plenty */
 #define SPACE_MAP_ROOT_SIZE 128
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka mpatocka@redhat.com
commit 717f4b1c52135f279112df82583e0c77e80f90de upstream.
Some asynchronous cipher implementations may use DMA. The stack may be mapped in the vmalloc area that doesn't support DMA. Therefore, the cipher request and initialization vector shouldn't be on the stack.
Fix this by allocating the request and iv with kmalloc.
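A minimal sketch of the allocation pattern the patch switches to (kernel context assumed; error strings and surrounding logic trimmed; the function name is illustrative):

#include <crypto/skcipher.h>
#include <linux/slab.h>

/* tfm: an already-allocated skcipher; returns 0 or -ENOMEM */
static int crypt_with_heap_buffers(struct crypto_skcipher *tfm)
{
    struct skcipher_request *req;
    unsigned char *iv;
    int r = -ENOMEM;

    /* heap allocations are safe to hand to DMA-capable cipher drivers,
     * unlike on-stack buffers that may live in vmalloc space */
    req = skcipher_request_alloc(tfm, GFP_KERNEL);
    iv = kmalloc(crypto_skcipher_ivsize(tfm), GFP_KERNEL);
    if (!req || !iv)
        goto out;

    /* ... skcipher_request_set_crypt(req, src, dst, len, iv); ... */
    r = 0;
out:
    kfree(iv);
    skcipher_request_free(req);
    return r;
}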
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-integrity.c | 49 ++++++++++++++++++++++++++++++++++------------ 1 file changed, 37 insertions(+), 12 deletions(-)
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -2558,7 +2558,8 @@ static int create_journal(struct dm_inte
     int r = 0;
     unsigned i;
     __u64 journal_pages, journal_desc_size, journal_tree_size;
-    unsigned char *crypt_data = NULL;
+    unsigned char *crypt_data = NULL, *crypt_iv = NULL;
+    struct skcipher_request *req = NULL;

     ic->commit_ids[0] = cpu_to_le64(0x1111111111111111ULL);
     ic->commit_ids[1] = cpu_to_le64(0x2222222222222222ULL);
@@ -2616,9 +2617,20 @@ static int create_journal(struct dm_inte
         if (blocksize == 1) {
             struct scatterlist *sg;
-            SKCIPHER_REQUEST_ON_STACK(req, ic->journal_crypt);
-            unsigned char iv[ivsize];
-            skcipher_request_set_tfm(req, ic->journal_crypt);
+
+            req = skcipher_request_alloc(ic->journal_crypt, GFP_KERNEL);
+            if (!req) {
+                *error = "Could not allocate crypt request";
+                r = -ENOMEM;
+                goto bad;
+            }
+
+            crypt_iv = kmalloc(ivsize, GFP_KERNEL);
+            if (!crypt_iv) {
+                *error = "Could not allocate iv";
+                r = -ENOMEM;
+                goto bad;
+            }

             ic->journal_xor = dm_integrity_alloc_page_list(ic);
             if (!ic->journal_xor) {
@@ -2640,9 +2652,9 @@ static int create_journal(struct dm_inte
                 sg_set_buf(&sg[i], va, PAGE_SIZE);
             }
             sg_set_buf(&sg[i], &ic->commit_ids, sizeof ic->commit_ids);
-            memset(iv, 0x00, ivsize);
+            memset(crypt_iv, 0x00, ivsize);

-            skcipher_request_set_crypt(req, sg, sg, PAGE_SIZE * ic->journal_pages + sizeof ic->commit_ids, iv);
+            skcipher_request_set_crypt(req, sg, sg, PAGE_SIZE * ic->journal_pages + sizeof ic->commit_ids, crypt_iv);
             init_completion(&comp.comp);
             comp.in_flight = (atomic_t)ATOMIC_INIT(1);
             if (do_crypt(true, req, &comp))
@@ -2658,10 +2670,22 @@ static int create_journal(struct dm_inte
             crypto_free_skcipher(ic->journal_crypt);
             ic->journal_crypt = NULL;
         } else {
-            SKCIPHER_REQUEST_ON_STACK(req, ic->journal_crypt);
-            unsigned char iv[ivsize];
             unsigned crypt_len = roundup(ivsize, blocksize);

+            req = skcipher_request_alloc(ic->journal_crypt, GFP_KERNEL);
+            if (!req) {
+                *error = "Could not allocate crypt request";
+                r = -ENOMEM;
+                goto bad;
+            }
+
+            crypt_iv = kmalloc(ivsize, GFP_KERNEL);
+            if (!crypt_iv) {
+                *error = "Could not allocate iv";
+                r = -ENOMEM;
+                goto bad;
+            }
+
             crypt_data = kmalloc(crypt_len, GFP_KERNEL);
             if (!crypt_data) {
                 *error = "Unable to allocate crypt data";
@@ -2669,8 +2693,6 @@ static int create_journal(struct dm_inte
                 goto bad;
             }

-            skcipher_request_set_tfm(req, ic->journal_crypt);
-
             ic->journal_scatterlist = dm_integrity_alloc_journal_scatterlist(ic, ic->journal);
             if (!ic->journal_scatterlist) {
                 *error = "Unable to allocate sg list";
@@ -2694,12 +2716,12 @@ static int create_journal(struct dm_inte
                 struct skcipher_request *section_req;
                 __u32 section_le = cpu_to_le32(i);

-                memset(iv, 0x00, ivsize);
+                memset(crypt_iv, 0x00, ivsize);
                 memset(crypt_data, 0x00, crypt_len);
                 memcpy(crypt_data, &section_le, min((size_t)crypt_len, sizeof(section_le)));

                 sg_init_one(&sg, crypt_data, crypt_len);
-                skcipher_request_set_crypt(req, &sg, &sg, crypt_len, iv);
+                skcipher_request_set_crypt(req, &sg, &sg, crypt_len, crypt_iv);
                 init_completion(&comp.comp);
                 comp.in_flight = (atomic_t)ATOMIC_INIT(1);
                 if (do_crypt(true, req, &comp))
@@ -2757,6 +2779,9 @@ retest_commit_id:
     }
 bad:
     kfree(crypt_data);
+    kfree(crypt_iv);
+    skcipher_request_free(req);
+
     return r;
 }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Milan Broz gmazyland@gmail.com
commit 27c7003697fc2c78f965984aa224ef26cd6b2949 upstream.
If dm-crypt uses an authenticated mode with a separate MAC, there are two concatenated parts of the key structure - the key(s) for encryption and the authentication key.

Add a missing check for the authentication key length: if it is larger than the key actually provided, dm-crypt now properly fails instead of crashing.
Fixes: ef43aa3806 ("dm crypt: add cryptographic data integrity protection (authenticated encryption)") Reported-by: Salah Coronya salahx@yahoo.com Signed-off-by: Milan Broz gmazyland@gmail.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-crypt.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1954,10 +1954,15 @@ static int crypt_setkey(struct crypt_con
     /* Ignore extra keys (which are used for IV etc) */
     subkey_size = crypt_subkey_size(cc);

-    if (crypt_integrity_hmac(cc))
+    if (crypt_integrity_hmac(cc)) {
+        if (subkey_size < cc->key_mac_size)
+            return -EINVAL;
+
         crypt_copy_authenckey(cc->authenc_key, cc->key,
                       subkey_size - cc->key_mac_size,
                       cc->key_mac_size);
+    }
+
     for (i = 0; i < cc->tfms_count; i++) {
         if (crypt_integrity_hmac(cc))
             r = crypto_aead_setkey(cc->cipher_tfm.tfms_aead[i],
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ondrej Kozina okozina@redhat.com
commit dc94902bde1e158cd19c4deab208e5d6eb382a44 upstream.
Loading a key via the kernel keyring service erases the internal key copy immediately after we pass it to the crypto layer. This is wrong because the IV is initialized later and we then use the wrong key for the initialization (instead of the real key there's just a zeroed block).

The bug may cause data corruption if a key is loaded via the kernel keyring service first and the same crypt device is later reactivated using exactly the same key in hexbyte representation, or vice versa. The bug (and fix) affects only ciphers using the following IVs: essiv, lmk and tcw.
Fixes: c538f6ec9f56 ("dm crypt: add ability to use keys from the kernel key retention service") Signed-off-by: Ondrej Kozina okozina@redhat.com Reviewed-by: Milan Broz gmazyland@gmail.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-crypt.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -2058,9 +2058,6 @@ static int crypt_set_keyring_key(struct

     ret = crypt_setkey(cc);

-    /* wipe the kernel key payload copy in each case */
-    memset(cc->key, 0, cc->key_size * sizeof(u8));
-
     if (!ret) {
         set_bit(DM_CRYPT_KEY_VALID, &cc->flags);
         kzfree(cc->key_string);
@@ -2528,6 +2525,10 @@ static int crypt_ctr_cipher(struct dm_ta
         }
     }

+    /* wipe the kernel key payload copy */
+    if (cc->key_string)
+        memset(cc->key, 0, cc->key_size * sizeof(u8));
+
     return ret;
 }

@@ -2966,6 +2967,9 @@ static int crypt_message(struct dm_targe
             return ret;
         if (cc->iv_gen_ops && cc->iv_gen_ops->init)
             ret = cc->iv_gen_ops->init(cc);
+        /* wipe the kernel key payload copy */
+        if (cc->key_string)
+            memset(cc->key, 0, cc->key_size * sizeof(u8));
         return ret;
     }
     if (argc == 2 && !strcasecmp(argv[1], "wipe")) {
@@ -3012,7 +3016,7 @@ static void crypt_io_hints(struct dm_tar

 static struct target_type crypt_target = {
     .name   = "crypt",
-    .version = {1, 18, 0},
+    .version = {1, 18, 1},
     .module = THIS_MODULE,
     .ctr    = crypt_ctr,
     .dtr    = crypt_dtr,
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wei Yongjun weiyongjun1@huawei.com
commit 3cc2e57c4beabcbbaa46e1ac6d77ca8276a4a42d upstream.
Return the error code -ENOMEM from the mempool_create_kmalloc_pool() error handling case instead of 0, as done elsewhere in this function.
Fixes: ef43aa38063a6 ("dm crypt: add cryptographic data integrity protection (authenticated encryption)") Signed-off-by: Wei Yongjun weiyongjun1@huawei.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-crypt.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -2746,6 +2746,7 @@ static int crypt_ctr(struct dm_target *t
                       cc->tag_pool_max_sectors * cc->on_disk_tag_size);
         if (!cc->tag_pool) {
             ti->error = "Cannot allocate integrity tags mempool";
+            ret = -ENOMEM;
             goto bad;
         }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Laura Abbott labbott@redhat.com
commit 91cfc88c66bf8ab95937606569670cf67fa73e09 upstream.
Commit bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") moved some parameters into a structure.
The structure was large enough to trigger the stack protector canary check in sme_encrypt_kernel(), which doesn't work this early in boot, causing reboots.
Mark sme_encrypt_kernel appropriately to not use the canary.
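For reference, the annotation expands to a gcc function attribute; a sketch of the definition already present in compiler-gcc.h at this point (the exact form is an assumption of this note, not part of the patch):

	/* Ask gcc to omit stack-canary setup/checks for this one function;
	 * the canary machinery is not initialized this early in boot, so a
	 * triggered check would falsely fail and reboot the machine. */
	#define __nostackprotector __attribute__((__optimize__("no-stack-protector")))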
Fixes: bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") Signed-off-by: Laura Abbott labbott@redhat.com Cc: Tom Lendacky thomas.lendacky@amd.com Cc: Ingo Molnar mingo@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/mm/mem_encrypt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -487,7 +487,7 @@ static unsigned long __init sme_pgtable_
 	return total;
 }
 
-void __init sme_encrypt_kernel(struct boot_params *bp)
+void __init __nostackprotector sme_encrypt_kernel(struct boot_params *bp)
 {
 	unsigned long workarea_start, workarea_end, workarea_len;
 	unsigned long execute_start, execute_end, execute_len;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lorenzo Pieralisi lorenzo.pieralisi@arm.com
commit 86be89939d11a84800f66e2a283b915b704bf33d upstream.
The conversion of the alpha architecture PCI host bridge legacy IRQ mapping/swizzling to the new PCI host bridge map/swizzle hooks carried out through:
commit 0e4c2eeb758a ("alpha/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooks")
implies that IRQs for devices are now allocated through the pci_assign_irq() function in pci_device_probe(), which is called when a driver matching a device is found, in order to probe the device through the device driver.
Alpha noname platforms required IRQ level programming to be executed in sio_fixup_irq_levels(), which is called in noname_init_pci(), a platform hook called within a subsys_initcall.
In noname_init_pci(), present IRQs are detected through sio_collect_irq_levels(), which checks the struct pci_dev->irq number to detect whether an IRQ has been allocated for the device.
By the time sio_collect_irq_levels() is called, some devices may still not have a matching driver loaded (eg a loadable module), so their IRQ allocation is still pending - which means that sio_collect_irq_levels() does not programme the correct IRQ level for those devices, breaking their IRQ handling when the device driver is actually loaded and the device is probed.
Fix the issue by adding code to the noname map_irq() function (noname_map_irq()) that, whilst mapping/swizzling the IRQ line, also ensures that the correct IRQ level programming is executed at platform level.
Fixes: 0e4c2eeb758a ("alpha/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooks") Reported-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Cc: Bjorn Helgaas bhelgaas@google.com Cc: Richard Henderson rth@twiddle.net Cc: Ivan Kokshaysky ink@jurassic.park.msu.ru Cc: Mikulas Patocka mpatocka@redhat.com Cc: Meelis Roos mroos@linux.ee Signed-off-by: Matt Turner mattst88@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/alpha/kernel/sys_sio.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

--- a/arch/alpha/kernel/sys_sio.c
+++ b/arch/alpha/kernel/sys_sio.c
@@ -102,6 +102,15 @@ sio_pci_route(void)
 				   alpha_mv.sys.sio.route_tab);
 }
 
+static bool sio_pci_dev_irq_needs_level(const struct pci_dev *dev)
+{
+	if ((dev->class >> 16 == PCI_BASE_CLASS_BRIDGE) &&
+	    (dev->class >> 8 != PCI_CLASS_BRIDGE_PCMCIA))
+		return false;
+
+	return true;
+}
+
 static unsigned int __init
 sio_collect_irq_levels(void)
 {
@@ -110,8 +119,7 @@ sio_collect_irq_levels(void)
 
 	/* Iterate through the devices, collecting IRQ levels. */
 	for_each_pci_dev(dev) {
-		if ((dev->class >> 16 == PCI_BASE_CLASS_BRIDGE) &&
-		    (dev->class >> 8 != PCI_CLASS_BRIDGE_PCMCIA))
+		if (!sio_pci_dev_irq_needs_level(dev))
 			continue;
 
 		if (dev->irq)
@@ -120,8 +128,7 @@ sio_collect_irq_levels(void)
 	return level_bits;
 }
 
-static void __init
-sio_fixup_irq_levels(unsigned int level_bits)
+static void __sio_fixup_irq_levels(unsigned int level_bits, bool reset)
 {
 	unsigned int old_level_bits;
 
@@ -139,12 +146,21 @@ sio_fixup_irq_levels(unsigned int level_
 	 */
 	old_level_bits = inb(0x4d0) | (inb(0x4d1) << 8);
 
-	level_bits |= (old_level_bits & 0x71ff);
+	if (reset)
+		old_level_bits &= 0x71ff;
+
+	level_bits |= old_level_bits;
 
 	outb((level_bits >> 0) & 0xff, 0x4d0);
 	outb((level_bits >> 8) & 0xff, 0x4d1);
 }
 
+static inline void
+sio_fixup_irq_levels(unsigned int level_bits)
+{
+	__sio_fixup_irq_levels(level_bits, true);
+}
+
 static inline int
 noname_map_irq(const struct pci_dev *dev, u8 slot, u8 pin)
 {
@@ -181,7 +197,14 @@ noname_map_irq(const struct pci_dev *dev
 	const long min_idsel = 6, max_idsel = 14, irqs_per_slot = 5;
 	int irq = COMMON_TABLE_LOOKUP, tmp;
 	tmp = __kernel_extbl(alpha_mv.sys.sio.route_tab, irq);
-	return irq >= 0 ? tmp : -1;
+
+	irq = irq >= 0 ? tmp : -1;
+
+	/* Fixup IRQ level if an actual IRQ mapping is detected */
+	if (sio_pci_dev_irq_needs_level(dev) && irq >= 0)
+		__sio_fixup_irq_levels(1 << irq, false);
+
+	return irq;
 }
 
 static inline int
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Hogan jhogan@kernel.org
commit c04de7b1ad645b61c141df8ca903ba0cc03a57f7 upstream.
Since commit 68923cdc2eb3 ("MIPS: CM: Add cluster & block args to mips_cm_lock_other()"), mips_smp_send_ipi_mask() has used mips_cm_lock_other_cpu() with each CPU number, rather than mips_cm_lock_other() with the first VPE in each core. Prior to r6, on multicore multithreaded systems such as dual-core dual-thread interAptivs with CPU Idle enabled (e.g. MIPS Creator Ci40), this results in mips_cm_lock_other() repeatedly hitting WARN_ON(vp != 0).
There doesn't appear to be anything fundamentally wrong about passing a non-zero VP/VPE number, even if it is a core's region that is locked into the other region before r6, so remove that particular WARN_ON().
Fixes: 68923cdc2eb3 ("MIPS: CM: Add cluster & block args to mips_cm_lock_other()") Signed-off-by: James Hogan jhogan@kernel.org Reviewed-by: Paul Burton paul.burton@mips.com Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17883/ Signed-off-by: Ralf Baechle ralf@linux-mips.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/mips/kernel/mips-cm.c | 1 -
 1 file changed, 1 deletion(-)

--- a/arch/mips/kernel/mips-cm.c
+++ b/arch/mips/kernel/mips-cm.c
@@ -292,7 +292,6 @@ void mips_cm_lock_other(unsigned int clu
 			  *this_cpu_ptr(&cm_core_lock_flags));
 	} else {
 		WARN_ON(cluster != 0);
-		WARN_ON(vp != 0);
 		WARN_ON(block != CM_GCR_Cx_OTHER_BLOCK_LOCAL);
 
 		/*
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Punit Agrawal punit.agrawal@arm.com
commit c507babf10ead4d5c8cca704539b170752a8ac84 upstream.
KVM only supports PMD hugepages at stage 2 but doesn't actually check that the provided hugepage memory pagesize is PMD_SIZE before populating stage 2 entries.
In cases where the backing hugepage size is smaller than PMD_SIZE (such as when using contiguous hugepages), KVM can end up creating stage 2 mappings that extend beyond the supplied memory.
Fix this by checking the page size of the userspace VMA before creating a PMD hugepage at stage 2.
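A minimal sketch of the distinction (the helper below is hypothetical; vma_kernel_pagesize() and is_vm_hugetlb_page() are the kernel's):

	/* is_vm_hugetlb_page() is also true for hugepages smaller than
	 * PMD_SIZE (e.g. 64K contiguous-PTE hugepages on arm64), where a
	 * PMD-sized stage 2 mapping would overrun the backing page.
	 * Comparing the actual backing page size avoids that. */
	static bool stage2_can_use_pmd_hugepage(struct vm_area_struct *vma)
	{
		return vma_kernel_pagesize(vma) == PMD_SIZE;
	}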
Fixes: 66b3923a1a0f77a ("arm64: hugetlb: add support for PTE contiguous bit") Signed-off-by: Punit Agrawal punit.agrawal@arm.com Cc: Marc Zyngier marc.zyngier@arm.com Reviewed-by: Christoffer Dall christoffer.dall@linaro.org Signed-off-by: Christoffer Dall christoffer.dall@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 virt/kvm/arm/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1310,7 +1310,7 @@ static int user_mem_abort(struct kvm_vcp
 		return -EFAULT;
 	}
 
-	if (is_vm_hugetlb_page(vma) && !logging_active) {
+	if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
 	} else {
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Zyngier marc.zyngier@arm.com
commit acfb3b883f6d6a4b5d27ad7fdded11f6a09ae6dd upstream.
KVM doesn't follow the SMCCC when it comes to unimplemented calls, and injects an UNDEF instead of returning an error. Since firmware calls are now used for security mitigation, they are becoming more common, and the UNDEF is counterproductive.
Instead, let's follow the SMCCC, which states that -1 must be returned to the caller when getting an unknown function number.
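A sketch of what a guest caller can now rely on (the constant name follows the SMCCC spec; whether this tree defines it is an assumption here):

	#define SMCCC_RET_NOT_SUPPORTED	(-1L)

	/* An unknown function ID now yields -1 in x0/w0 instead of an
	 * UNDEF exception, so guests can probe for firmware services
	 * and fall back gracefully. */
	static bool firmware_call_supported(long ret)
	{
		return ret != SMCCC_RET_NOT_SUPPORTED;
	}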
Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Christoffer Dall christoffer.dall@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/arm64/kvm/handle_exit.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -44,7 +44,7 @@ static int handle_hvc(struct kvm_vcpu *v
 
 	ret = kvm_psci_call(vcpu);
 	if (ret < 0) {
-		kvm_inject_undefined(vcpu);
+		vcpu_set_reg(vcpu, 0, ~0UL);
 		return 1;
 	}
 
@@ -53,7 +53,7 @@ static int handle_hvc(struct kvm_vcpu *v
 
 static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
-	kvm_inject_undefined(vcpu);
+	vcpu_set_reg(vcpu, 0, ~0UL);
 	return 1;
 }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit 6f41c34d69eb005e7848716bbcafc979b35037d5 upstream.
The machine check idtentry uses an indirect branch directly from the low level code. This evades the speculation protection.
Replace it by a direct call into C code and issue the indirect call there so the compiler can apply the proper speculation protection.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Borislav Petkov bp@alien8.de Reviewed-by: David Woodhouse dwmw@amazon.co.uk Niced-by: Peter Zijlstra peterz@infradead.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801181626290.1847@nanos Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/entry/entry_64.S        | 2 +-
 arch/x86/include/asm/traps.h     | 1 +
 arch/x86/kernel/cpu/mcheck/mce.c | 5 +++++
 3 files changed, 7 insertions(+), 1 deletion(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1258,7 +1258,7 @@ idtentry async_page_fault do_async_page_
 #endif
 
 #ifdef CONFIG_X86_MCE
-idtentry machine_check has_error_code=0 paranoid=1 do_sym=*machine_check_vector(%rip)
+idtentry machine_check do_mce has_error_code=0 paranoid=1
 #endif
 
 /*
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -88,6 +88,7 @@ dotraplinkage void do_simd_coprocessor_e
 #ifdef CONFIG_X86_32
 dotraplinkage void do_iret_error(struct pt_regs *, long);
 #endif
+dotraplinkage void do_mce(struct pt_regs *, long);
 
 static inline int get_si_code(unsigned long condition)
 {
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1788,6 +1788,11 @@ static void unexpected_machine_check(str
 void (*machine_check_vector)(struct pt_regs *, long error_code) =
 						unexpected_machine_check;
 
+dotraplinkage void do_mce(struct pt_regs *regs, long error_code)
+{
+	machine_check_vector(regs, error_code);
+}
+
 /*
  * Called for each booted CPU to set up machine checks.
  * Must be called with preempt off:
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu mhiramat@kernel.org
commit 736e80a4213e9bbce40a7c050337047128b472ac upstream.
Introduce start/end markers for the __x86_indirect_thunk_* functions. To make this easy, consolidate the .text.__x86.indirect_thunk.* sections into one .text.__x86.indirect_thunk section, put it at the end of the kernel text section, and add __indirect_thunk_start/end markers so that other subsystems (e.g. kprobes) can identify it.
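With the markers in place, a range test is enough for any subsystem to identify thunk code; a minimal sketch (the helper name is hypothetical, but this is essentially what the kprobes patches later in this series rely on):

	extern char __indirect_thunk_start[];
	extern char __indirect_thunk_end[];

	/* True if addr falls inside the consolidated thunk section. */
	static bool addr_is_indirect_thunk(unsigned long addr)
	{
		return addr >= (unsigned long)__indirect_thunk_start &&
		       addr <  (unsigned long)__indirect_thunk_end;
	}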
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: David Woodhouse dwmw@amazon.co.uk Cc: Andi Kleen ak@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Ananth N Mavinakayanahalli ananth@linux.vnet.ibm.com Cc: Arjan van de Ven arjan@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/151629206178.10241.6828804696410044771.stgit@devbo... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/include/asm/nospec-branch.h | 3 +++
 arch/x86/kernel/vmlinux.lds.S        | 6 ++++++
 arch/x86/lib/retpoline.S             | 2 +-
 3 files changed, 10 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -194,6 +194,9 @@ enum spectre_v2_mitigation {
 	SPECTRE_V2_IBRS,
 };
 
+extern char __indirect_thunk_start[];
+extern char __indirect_thunk_end[];
+
 /*
  * On VMEXIT we must ensure that no RSB predictions learned in the guest
  * can be followed in the host, by overwriting the RSB completely. Both
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -124,6 +124,12 @@ SECTIONS
 		ASSERT(. - _entry_trampoline == PAGE_SIZE, "entry trampoline is too big");
 #endif
 
+#ifdef CONFIG_RETPOLINE
+		__indirect_thunk_start = .;
+		*(.text.__x86.indirect_thunk)
+		__indirect_thunk_end = .;
+#endif
+
 		/* End of text section */
 		_etext = .;
 	} :text = 0x9090
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -9,7 +9,7 @@
 #include <asm/nospec-branch.h>
 
 .macro THUNK reg
-	.section .text.__x86.indirect_thunk.\reg
+	.section .text.__x86.indirect_thunk
 
 ENTRY(__x86_indirect_thunk_\reg)
 	CFI_STARTPROC
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu mhiramat@kernel.org
commit c1804a236894ecc942da7dc6c5abe209e56cba93 upstream.
Mark the __x86_indirect_thunk_* functions as blacklisted for kprobes, because those functions can be called from anywhere in the kernel, including from functions that are themselves on the kprobes blacklist.
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: David Woodhouse dwmw@amazon.co.uk Cc: Andi Kleen ak@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Ananth N Mavinakayanahalli ananth@linux.vnet.ibm.com Cc: Arjan van de Ven arjan@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/151629209111.10241.5444852823378068683.stgit@devbo... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/lib/retpoline.S | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -25,7 +25,8 @@ ENDPROC(__x86_indirect_thunk_\reg)
  * than one per register with the correct names. So we do it
  * the simple and nasty way...
  */
-#define EXPORT_THUNK(reg) EXPORT_SYMBOL(__x86_indirect_thunk_ ## reg)
+#define __EXPORT_THUNK(sym) _ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym)
+#define EXPORT_THUNK(reg) __EXPORT_THUNK(__x86_indirect_thunk_ ## reg)
 #define GENERATE_THUNK(reg) THUNK reg ; EXPORT_THUNK(reg)
 
 GENERATE_THUNK(_ASM_AX)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu mhiramat@kernel.org
commit c86a32c09f8ced67971a2310e3b0dda4d1749007 upstream.
Since indirect jump instructions will be replaced by jumps to __x86_indirect_thunk_*, those jmp instructions must be treated as indirect jumps. Since optprobe prohibits optimizing probes in functions which use an indirect jump, it also needs to identify functions which jump to __x86_indirect_thunk_* and disable optimization for them.
Add a check that the jump target address lies between __indirect_thunk_start/end when optimizing a kprobe.
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: David Woodhouse dwmw@amazon.co.uk Cc: Andi Kleen ak@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Ananth N Mavinakayanahalli ananth@linux.vnet.ibm.com Cc: Arjan van de Ven arjan@linux.intel.com Cc: Greg Kroah-Hartman gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/151629212062.10241.6991266100233002273.stgit@devbo... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kernel/kprobes/opt.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -40,6 +40,7 @@
 #include <asm/debugreg.h>
 #include <asm/set_memory.h>
 #include <asm/sections.h>
+#include <asm/nospec-branch.h>
 
 #include "common.h"
 
@@ -205,7 +206,7 @@ static int copy_optimized_instructions(u
 }
 
 /* Check whether insn is indirect jump */
-static int insn_is_indirect_jump(struct insn *insn)
+static int __insn_is_indirect_jump(struct insn *insn)
 {
 	return ((insn->opcode.bytes[0] == 0xff &&
 		(X86_MODRM_REG(insn->modrm.value) & 6) == 4) || /* Jump */
@@ -239,6 +240,26 @@ static int insn_jump_into_range(struct i
 	return (start <= target && target <= start + len);
 }
 
+static int insn_is_indirect_jump(struct insn *insn)
+{
+	int ret = __insn_is_indirect_jump(insn);
+
+#ifdef CONFIG_RETPOLINE
+	/*
+	 * Jump to x86_indirect_thunk_* is treated as an indirect jump.
+	 * Note that even with CONFIG_RETPOLINE=y, the kernel compiled with
+	 * older gcc may use indirect jump. So we add this check instead of
+	 * replace indirect-jump check.
+	 */
+	if (!ret)
+		ret = insn_jump_into_range(insn,
+				(unsigned long)__indirect_thunk_start,
+				(unsigned long)__indirect_thunk_end -
+				(unsigned long)__indirect_thunk_start);
+#endif
+	return ret;
+}
+
 /* Decode whole function to ensure any instructions don't jump into target */
 static int can_optimize(unsigned long paddr)
 {
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: zhenwei.pi zhenwei.pi@youruncloud.com
commit 98f0fceec7f84d80bc053e49e596088573086421 upstream.
In section <2. Runtime Cost>, fix a wrong list index.
Signed-off-by: zhenwei.pi zhenwei.pi@youruncloud.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: dave.hansen@linux.intel.com Link: https://lkml.kernel.org/r/1516237492-27739-1-git-send-email-zhenwei.pi@youru... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 Documentation/x86/pti.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/Documentation/x86/pti.txt
+++ b/Documentation/x86/pti.txt
@@ -78,7 +78,7 @@ this protection comes at a cost:
      non-PTI SYSCALL entry code, so requires mapping fewer
      things into the userspace page tables.  The downside is
      that stacks must be switched at entry time.
-  d. Global pages are disabled for all kernel structures not
+  c. Global pages are disabled for all kernel structures not
      mapped into both kernel and userspace page tables.  This
      feature of the MMU allows different processes to share TLB
      entries mapping the kernel.  Losing the feature means more
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andi Kleen ak@linux.intel.com
commit 3f7d875566d8e79c5e0b2c9a413e91b2c29e0854 upstream.
The generated assembler for the C fill RSB inline asm operations has several issues:
- The C code sets up the loop register, which is then immediately overwritten in __FILL_RETURN_BUFFER with the same value again.
- The C code also passes in the iteration count in another register, which is not used at all.
Remove these two unnecessary operations. Just rely on the single constant passed to the macro for the iterations.
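Reduced to a sketch (asm bodies elided), the constraint change is:

	unsigned long loops = RSB_CLEAR_LOOPS / 2;

	/* Before: %0 arrives pre-loaded and a second input register holds
	 * the count, yet the asm immediately overwrites %0 and never
	 * reads %1 -- both setups are wasted work. */
	asm volatile("..." : "=&r" (loops), ASM_CALL_CONSTRAINT
			   : "r" (loops) : "memory");

	/* After: %0 is a pure scratch output; the iteration count is an
	 * immediate baked into the macro expansion, so no inputs remain. */
	unsigned long scratch;
	asm volatile("..." : "=r" (scratch), ASM_CALL_CONSTRAINT
			   : : "memory");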
Signed-off-by: Andi Kleen ak@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: David Woodhouse dwmw@amazon.co.uk Cc: dave.hansen@intel.com Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: arjan@linux.intel.com Link: https://lkml.kernel.org/r/20180117225328.15414-1-andi@firstfloor.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/include/asm/nospec-branch.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -206,16 +206,17 @@ extern char __indirect_thunk_end[];
 static inline void vmexit_fill_RSB(void)
 {
 #ifdef CONFIG_RETPOLINE
-	unsigned long loops = RSB_CLEAR_LOOPS / 2;
+	unsigned long loops;
 
 	asm volatile (ANNOTATE_NOSPEC_ALTERNATIVE
 		      ALTERNATIVE("jmp 910f",
 				  __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)),
 				  X86_FEATURE_RETPOLINE)
 		      "910:"
-		      : "=&r" (loops), ASM_CALL_CONSTRAINT
-		      : "r" (loops) : "memory" );
+		      : "=r" (loops), ASM_CALL_CONSTRAINT
+		      : : "memory" );
 #endif
 }
+
 #endif /* __ASSEMBLY__ */
 #endif /* __NOSPEC_BRANCH_H__ */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit f23d74f6c66c3697e032550eeef3f640391a3a7d upstream.
Some issues have been reported with the for loop in stop_this_cpu() that issues the 'wbinvd; hlt' sequence. Reverting this sequence to halt() has been shown to resolve the issue.
However, the wbinvd is needed when running with SME. The reason for the wbinvd is to prevent cache flush races between encrypted and non-encrypted entries that have the same physical address. This can occur when kexec'ing from memory encryption active to inactive or vice-versa. The important thing is to avoid memory references outside of kernel text (such as stack usage), so the native_*() functions are needed since these expand as inline asm sequences. So instead of reverting the change, rework the sequence.
Move the wbinvd instruction outside of the for loop as native_wbinvd() and make its execution conditional on X86_FEATURE_SME. In the for loop, change the asm 'wbinvd; hlt' sequence back to a halt sequence but use the native_halt() call.
Fixes: bba4ed011a52 ("x86/mm, kexec: Allow kexec to be used with SME") Reported-by: Dave Young dyoung@redhat.com Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Dave Young dyoung@redhat.com Cc: Juergen Gross jgross@suse.com Cc: Tony Luck tony.luck@intel.com Cc: Yu Chen yu.c.chen@intel.com Cc: Baoquan He bhe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: kexec@lists.infradead.org Cc: ebiederm@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: Rui Zhang rui.zhang@intel.com Cc: Arjan van de Ven arjan@linux.intel.com Cc: Boris Ostrovsky boris.ostrovsky@oracle.com Cc: Dan Williams dan.j.williams@intel.com Link: https://lkml.kernel.org/r/20180117234141.21184.44067.stgit@tlendack-t1.amdof... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kernel/process.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -380,19 +380,24 @@ void stop_this_cpu(void *dummy)
 	disable_local_APIC();
 	mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
 
+	/*
+	 * Use wbinvd on processors that support SME. This provides support
+	 * for performing a successful kexec when going from SME inactive
+	 * to SME active (or vice-versa). The cache must be cleared so that
+	 * if there are entries with the same physical address, both with and
+	 * without the encryption bit, they don't race each other when flushed
+	 * and potentially end up with the wrong entry being committed to
+	 * memory.
+	 */
+	if (boot_cpu_has(X86_FEATURE_SME))
+		native_wbinvd();
 	for (;;) {
 		/*
-		 * Use wbinvd followed by hlt to stop the processor. This
-		 * provides support for kexec on a processor that supports
-		 * SME. With kexec, going from SME inactive to SME active
-		 * requires clearing cache entries so that addresses without
-		 * the encryption bit set don't corrupt the same physical
-		 * address that has the encryption bit set when caches are
-		 * flushed. To achieve this a wbinvd is performed followed by
-		 * a hlt. Even if the processor is not in the kexec/SME
-		 * scenario this only adds a wbinvd to a halting processor.
+		 * Use native_halt() so that memory contents don't change
+		 * (stack usage and variables) after possibly issuing the
+		 * native_wbinvd() above.
 		 */
-		asm volatile("wbinvd; hlt" : : : "memory");
+		native_halt();
 	}
 }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kirill A. Shutemov kirill.shutemov@linux.intel.com
commit 0d665e7b109d512b7cae3ccef6e8654714887844 upstream.
Tetsuo reported random crashes under memory pressure on a 32-bit x86 system and tracked them down to the change that introduced page_vma_mapped_walk().
The root cause of the issue is faulty pointer math in check_pte(). As ->pte may point to an arbitrary page, we have to check that the pages belong to the same section before doing the math; otherwise it may lead to weird results.
It wasn't noticed until now because mem_map[] is virtually contiguous on flatmem and on vmemmap sparsemem: pointer arithmetic just works against all 'struct page' pointers. But with classic sparsemem it doesn't, because each section's memmap is allocated separately, so consecutive pfns crossing two sections might have struct pages at completely unrelated addresses.
Let's restructure the code a bit and replace the pointer arithmetic with operations on pfns.
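Condensed, the transformation looks like this (names per the kernel source; sketch only):

	/* Before: pointer arithmetic on struct page * -- only valid when
	 * mem_map is virtually contiguous (flatmem or vmemmap sparsemem): */
	if (pte_page(*pvmw->pte) - pvmw->page >= hpage_nr_pages(pvmw->page))
		return false;
	if (pte_page(*pvmw->pte) < pvmw->page)
		return false;

	/* After: compare pfns, which stay globally ordered even when
	 * classic sparsemem allocates each section's memmap separately: */
	pfn = pte_pfn(*pvmw->pte);
	if (pfn < page_to_pfn(pvmw->page))
		return false;
	if (pfn - page_to_pfn(pvmw->page) >= hpage_nr_pages(pvmw->page))
		return false;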
Signed-off-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Reported-and-tested-by: Tetsuo Handa penguin-kernel@i-love.sakura.ne.jp Acked-by: Michal Hocko mhocko@suse.com Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()") Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 include/linux/swapops.h | 21 ++++++++++++++++
 mm/page_vma_mapped.c    | 63 ++++++++++++++++++++++++++++--------------------
 2 files changed, 59 insertions(+), 25 deletions(-)

--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -124,6 +124,11 @@ static inline bool is_write_device_priva
 	return unlikely(swp_type(entry) == SWP_DEVICE_WRITE);
 }
 
+static inline unsigned long device_private_entry_to_pfn(swp_entry_t entry)
+{
+	return swp_offset(entry);
+}
+
 static inline struct page *device_private_entry_to_page(swp_entry_t entry)
 {
 	return pfn_to_page(swp_offset(entry));
@@ -154,6 +159,11 @@ static inline bool is_write_device_priva
 	return false;
 }
 
+static inline unsigned long device_private_entry_to_pfn(swp_entry_t entry)
+{
+	return 0;
+}
+
 static inline struct page *device_private_entry_to_page(swp_entry_t entry)
 {
 	return NULL;
@@ -189,6 +199,11 @@ static inline int is_write_migration_ent
 	return unlikely(swp_type(entry) == SWP_MIGRATION_WRITE);
 }
 
+static inline unsigned long migration_entry_to_pfn(swp_entry_t entry)
+{
+	return swp_offset(entry);
+}
+
 static inline struct page *migration_entry_to_page(swp_entry_t entry)
 {
 	struct page *p = pfn_to_page(swp_offset(entry));
@@ -218,6 +233,12 @@ static inline int is_migration_entry(swp
 {
 	return 0;
 }
+
+static inline unsigned long migration_entry_to_pfn(swp_entry_t entry)
+{
+	return 0;
+}
+
 static inline struct page *migration_entry_to_page(swp_entry_t entry)
 {
 	return NULL;
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -30,10 +30,29 @@ static bool map_pte(struct page_vma_mapp
 	return true;
 }
 
+/**
+ * check_pte - check if @pvmw->page is mapped at the @pvmw->pte
+ *
+ * page_vma_mapped_walk() found a place where @pvmw->page is *potentially*
+ * mapped. check_pte() has to validate this.
+ *
+ * @pvmw->pte may point to empty PTE, swap PTE or PTE pointing to arbitrary
+ * page.
+ *
+ * If PVMW_MIGRATION flag is set, returns true if @pvmw->pte contains migration
+ * entry that points to @pvmw->page or any subpage in case of THP.
+ *
+ * If PVMW_MIGRATION flag is not set, returns true if @pvmw->pte points to
+ * @pvmw->page or any subpage in case of THP.
+ *
+ * Otherwise, return false.
+ *
+ */
 static bool check_pte(struct page_vma_mapped_walk *pvmw)
 {
+	unsigned long pfn;
+
 	if (pvmw->flags & PVMW_MIGRATION) {
-#ifdef CONFIG_MIGRATION
 		swp_entry_t entry;
 		if (!is_swap_pte(*pvmw->pte))
 			return false;
@@ -41,37 +60,31 @@ static bool check_pte(struct page_vma_ma
 
 		if (!is_migration_entry(entry))
 			return false;
-		if (migration_entry_to_page(entry) - pvmw->page >=
-				hpage_nr_pages(pvmw->page)) {
-			return false;
-		}
-		if (migration_entry_to_page(entry) < pvmw->page)
-			return false;
-#else
-		WARN_ON_ONCE(1);
-#endif
-	} else {
-		if (is_swap_pte(*pvmw->pte)) {
-			swp_entry_t entry;
-
-			entry = pte_to_swp_entry(*pvmw->pte);
-			if (is_device_private_entry(entry) &&
-			    device_private_entry_to_page(entry) == pvmw->page)
-				return true;
-		}
+		pfn = migration_entry_to_pfn(entry);
+	} else if (is_swap_pte(*pvmw->pte)) {
+		swp_entry_t entry;
 
-		if (!pte_present(*pvmw->pte))
+		/* Handle un-addressable ZONE_DEVICE memory */
+		entry = pte_to_swp_entry(*pvmw->pte);
+		if (!is_device_private_entry(entry))
 			return false;
 
-		/* THP can be referenced by any subpage */
-		if (pte_page(*pvmw->pte) - pvmw->page >=
-				hpage_nr_pages(pvmw->page)) {
-			return false;
-		}
-		if (pte_page(*pvmw->pte) < pvmw->page)
+		pfn = device_private_entry_to_pfn(entry);
+	} else {
+		if (!pte_present(*pvmw->pte))
 			return false;
+
+		pfn = pte_pfn(*pvmw->pte);
 	}
 
+	if (pfn < page_to_pfn(pvmw->page))
+		return false;
+
+	/* THP can be referenced by any subpage */
+	if (pfn - page_to_pfn(pvmw->page) >= hpage_nr_pages(pvmw->page))
+		return false;
+
 	return true;
 }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yan Markman ymarkman@marvell.com
commit e749aca84b10f3987b2ee1f76e0c7d8aacc5653c upstream.
Short fragmented packets may never be sent by the hardware when padding is disabled. This patch stops modifying the GMAC padding bits, leaving them at their reset value (disabled).
Fixes: 3919357fb0bb ("net: mvpp2: initialize the GMAC when using a port") Signed-off-by: Yan Markman ymarkman@marvell.com [Antoine: commit message] Signed-off-by: Antoine Tenart antoine.tenart@free-electrons.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/ethernet/marvell/mvpp2.c | 9 ---------
 1 file changed, 9 deletions(-)

--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -4552,11 +4552,6 @@ static void mvpp2_port_mii_gmac_configur
 		       MVPP22_CTRL4_QSGMII_BYPASS_ACTIVE;
 		val &= ~MVPP22_CTRL4_EXT_PIN_GMII_SEL;
 		writel(val, port->base + MVPP22_GMAC_CTRL_4_REG);
-
-		val = readl(port->base + MVPP2_GMAC_CTRL_2_REG);
-		val |= MVPP2_GMAC_DISABLE_PADDING;
-		val &= ~MVPP2_GMAC_FLOW_CTRL_MASK;
-		writel(val, port->base + MVPP2_GMAC_CTRL_2_REG);
 	} else if (phy_interface_mode_is_rgmii(port->phy_interface)) {
 		val = readl(port->base + MVPP22_GMAC_CTRL_4_REG);
 		val |= MVPP22_CTRL4_EXT_PIN_GMII_SEL |
@@ -4564,10 +4559,6 @@ static void mvpp2_port_mii_gmac_configur
 		       MVPP22_CTRL4_QSGMII_BYPASS_ACTIVE;
 		val &= ~MVPP22_CTRL4_DP_CLK_SEL;
 		writel(val, port->base + MVPP22_GMAC_CTRL_4_REG);
-
-		val = readl(port->base + MVPP2_GMAC_CTRL_2_REG);
-		val &= ~MVPP2_GMAC_DISABLE_PADDING;
-		writel(val, port->base + MVPP2_GMAC_CTRL_2_REG);
 	}
 
 	/* The port is connected to a copper PHY */
On Mon, Jan 22, 2018 at 09:44:40AM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.15 release. There are 89 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Note: This is for v4.14.14-91-gf7e703b.
Build results:
	total: 145 pass: 145 fail: 0
Qemu test results:
	total: 126 pass: 126 fail: 0
Details are available at http://kerneltests.org/builders.
Guenter
On Mon, Jan 22, 2018 at 11:10:20AM -0800, Guenter Roeck wrote:
On Mon, Jan 22, 2018 at 09:44:40AM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.15 release. There are 89 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Note: This is for v4.14.14-91-gf7e703b.
Build results:
	total: 145 pass: 145 fail: 0
Qemu test results:
	total: 126 pass: 126 fail: 0
Details are available at http://kerneltests.org/builders.
Thanks for testing all of these and letting me know.
greg k-h
On 22 January 2018 at 14:14, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.14.15 release. There are 89 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Jan 24 08:39:25 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.15-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm and x86_64.
Summary
------------------------------------------------------------------------
kernel: 4.14.15-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.14.y
git commit: f7e703b5b8e7cef9b95a462a9a2bef85bc40d16a
git describe: v4.14.14-91-gf7e703b5b8e7
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.14-oe/build/v4.14.14-91...
No regressions (compared to build v4.14.14-90-g9554eb9fb784)
Boards, architectures and test suites:
-------------------------------------
hi6220-hikey - arm64
* boot - pass: 20
* kselftest - skip: 16, pass: 46
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - pass: 60
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - skip: 1, pass: 21
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - pass: 14
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - skip: 121, pass: 983
* ltp-timers-tests - pass: 12

juno-r2 - arm64
* boot - pass: 20
* kselftest - skip: 16, pass: 46
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - pass: 60
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - pass: 22
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - pass: 14
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - skip: 121, pass: 987
* ltp-timers-tests - pass: 12

x15 - arm
* boot - pass: 20
* kselftest - skip: 18, pass: 43
* libhugetlbfs - skip: 1, pass: 87
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - pass: 60
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - skip: 2, pass: 20
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - skip: 1, pass: 13
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - skip: 66, pass: 1037
* ltp-timers-tests - pass: 12

x86_64
* boot - pass: 20
* kselftest - fail: 1, skip: 17, pass: 57
* libhugetlbfs - skip: 1, pass: 89
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - skip: 1, pass: 61
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - pass: 22
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - skip: 1, pass: 9
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - skip: 116, pass: 1016
* ltp-timers-tests - pass: 12
Documentation - https://collaborate.linaro.org/display/LKFT/Email+Reports

Tested-by: Naresh Kamboju naresh.kamboju@linaro.org
On 01/22/2018 01:44 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.15 release. There are 89 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Jan 24 08:39:25 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.15-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks,

-- Shuah