This is the start of the stable review cycle for the 4.18.6 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Sep 5 16:56:53 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.18.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.18.6-rc1
Arnd Bergmann arnd@arndb.de x86: kvm: avoid unused variable warning
Jann Horn jannh@google.com x86/dumpstack: Don't dump kernel memory based on usermode RIP
Scott Bauer scott.bauer@intel.com cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status
Vincent Whitchurch vincent.whitchurch@axis.com watchdog: Mark watchdog touch functions as notrace
H. Nikolaus Schaller hns@goldelico.com power: generic-adc-battery: check for duplicate properties copied from iio channels
H. Nikolaus Schaller hns@goldelico.com power: generic-adc-battery: fix out-of-bounds write when copying channel properties
Dan Carpenter dan.carpenter@oracle.com PM / clk: signedness bug in of_pm_clk_add_clks()
Gustavo A. R. Silva gustavo@embeddedor.com clk: npcm7xx: fix memory allocation
Alberto Panizzo alberto@amarulasolutions.com clk: rockchip: fix clk_i2sout parent selection bits on rk3399
Abhishek Sahu absahu@codeaurora.org mtd: rawnand: qcom: wait for desc completion in all BAM channels
Daniel Mack daniel@zonque.org mtd: rawnand: marvell: add suspend and resume hooks
Boris Brezillon boris.brezillon@bootlin.com mtd: rawnand: fsmc: Stop using chip->read_buf()
Boris Brezillon boris.brezillon@bootlin.com mtd: rawnand: hynix: Use ->exec_op() in hynix_nand_reg_write_op()
Mike Christie mchristi@redhat.com iscsi target: fix session creation failure handling
Bart Van Assche bart.vanassche@wdc.com scsi: core: Avoid that SCSI device removal through sysfs triggers a deadlock
Bart Van Assche bart.vanassche@wdc.com scsi: sysfs: Introduce sysfs_{un,}break_active_protection()
Bart Van Assche bart.vanassche@wdc.com scsi: mpt3sas: Fix _transport_smp_handler() error path
Sreekanth Reddy sreekanth.reddy@broadcom.com scsi: mpt3sas: Fix calltrace observed while running IO & reset
Tomas Winkler tomas.winkler@intel.com tpm: separate cmd_ready/go_idle from runtime_pm
Ricardo Schwarzmeier Ricardo.Schwarzmeier@infineon.com tpm: Return the actual size when receiving an unsupported command
Paul Burton paul.burton@mips.com MIPS: lib: Provide MIPS64r6 __multi3() for GCC < 7
Huacai Chen chenhc@lemote.com MIPS: Change definition of cpu_relax() for Loongson-3
Paul Burton paul.burton@mips.com MIPS: Always use -march=<arch>, not -<arch> shortcuts
Matt Redfearn matt.redfearn@mips.com MIPS: memset.S: Fix byte_fixup for MIPSr6
Maciej W. Rozycki macro@mips.com MIPS: Correct the 64-bit DSP accumulator register size
Masami Hiramatsu mhiramat@kernel.org kprobes: Make list and blacklist root user read only
Masami Hiramatsu mhiramat@kernel.org kprobes/arm: Fix %p uses in error messages
Masami Hiramatsu mhiramat@kernel.org kprobes: Replace %p with other pointer types
Masami Hiramatsu mhiramat@kernel.org kprobes: Show blacklist addresses as same as kallsyms does
Philipp Rudo prudo@linux.ibm.com s390/purgatory: Add missing FORCE to Makefile targets
Philipp Rudo prudo@linux.ibm.com s390/purgatory: Fix crash with expoline enabled
Sebastian Ott sebott@linux.ibm.com s390/pci: fix out of bounds access during irq setup
Martin Schwidefsky schwidefsky@de.ibm.com s390/numa: move initial setup of node_to_cpumask_map
Julian Wiedmann jwi@linux.ibm.com s390/qdio: reset old sbal_state flags
Martin Schwidefsky schwidefsky@de.ibm.com s390: fix br_r1_trampoline for machines without exrl
Martin Schwidefsky schwidefsky@de.ibm.com s390/lib: use expoline for all bcr instructions
Gerald Schaefer gerald.schaefer@de.ibm.com s390/mm: fix addressing exception after suspend/resume
Ben Hutchings ben@decadent.org.uk x86: Allow generating user-space headers without a compiler
Jann Horn jannh@google.com x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()
Gustavo A. R. Silva gustavo@embeddedor.com hwmon: (nct6775) Fix potential Spectre v1
Andi Kleen ak@linux.intel.com x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+
Andi Kleen ak@linux.intel.com x86/spectre: Add missing family 6 check to microcode check
Nick Desaulniers ndesaulniers@google.com x86/irqflags: Mark native_restore_fl extern inline
Andy Lutomirski luto@kernel.org x86/nmi: Fix NMI uaccess race against CR3 switching
Samuel Neves sneves@dei.uc.pt x86/vdso: Fix lsl operand order
Himanshu Madhani himanshu.madhani@cavium.com scsi: qla2xxx: Fix stalled relogin
Dan Carpenter dan.carpenter@oracle.com pinctrl: freescale: off by one in imx1_pinconf_group_dbg_show()
Johan Hovold johan@kernel.org soc: qcom: rmtfs-mem: fix memleak in probe error paths
Ajit Pandey ajit.pandey@cirrus.com ASoC: wm_adsp: Correct DSP pointer for preloader control
Gustavo A. R. Silva gustavo@embeddedor.com ASoC: sirf: Fix potential NULL pointer dereference
Takashi Iwai tiwai@suse.de ASoC: zte: Fix incorrect PCM format bit usages
Jerome Brunet jbrunet@baylibre.com ASoC: dpcm: don't merge format from invalid codec dai
Michael Buesch m@bues.ch b43/leds: Ensure NUL-termination of LED name string
Michael Buesch m@bues.ch b43legacy/leds: Ensure NUL-termination of LED name string
Mikulas Patocka mpatocka@redhat.com udl-kms: avoid division
Mikulas Patocka mpatocka@redhat.com udl-kms: fix crash due to uninitialized memory
Mikulas Patocka mpatocka@redhat.com udl-kms: handle allocation failure
Mikulas Patocka mpatocka@redhat.com udl-kms: change down_interruptible to down
Bart Van Assche bart.vanassche@wdc.com lib/vsprintf: Do not handle %pO[^F] as %px
Kirill Tkhai ktkhai@virtuozzo.com fuse: Add missed unlock_page() to fuse_readpages_fill()
Miklos Szeredi mszeredi@redhat.com fuse: Fix oops at process_init_reply()
Miklos Szeredi mszeredi@redhat.com fuse: umount should wait for all requests
Miklos Szeredi mszeredi@redhat.com fuse: fix unlocked access to processing queue
Miklos Szeredi mszeredi@redhat.com fuse: fix double request_end()
Miklos Szeredi mszeredi@redhat.com fuse: fix initial parallel dirops
Andrey Ryabinin aryabinin@virtuozzo.com fuse: Don't access pipe->buffers without pipe_lock()
Thomas Gleixner tglx@xxxxxxxxxxxxx KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled
Paolo Bonzini pbonzini@redhat.com KVM: x86: ensure all MSRs can always be KVM_GET/SET_MSR'd
Rian Hunter rian@alum.mit.edu x86/process: Re-export start_thread()
Andy Lutomirski luto@kernel.org x86/vdso: Fix vDSO build if a retpoline is emitted
Vlastimil Babka vbabka@suse.cz x86/speculation/l1tf: Suggest what to do on systems with too much RAM
Vlastimil Babka vbabka@suse.cz x86/speculation/l1tf: Fix off-by-one error when warning that system has too much RAM
Vlastimil Babka vbabka@suse.cz x86/speculation/l1tf: Fix overflow in l1tf_pfn_limit() on 32bit
Peter Zijlstra peterz@infradead.org mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE
Takashi Iwai tiwai@suse.de platform/x86: ideapad-laptop: Apply no_hw_rfkill to Y20-15IKBM, too
Kees Cook keescook@chromium.org platform/x86: wmi: Do not mix pages and kmalloc
Paulo Zanoni paulo.r.zanoni@intel.com x86/gpu: reserve ICL's graphics stolen memory
Michal Wnukowski wnukowski@google.com nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event
Wang Shilong wshilong@ddn.com ext4: fix race when setting the bitmap corrupted flag
Eric Sandeen sandeen@redhat.com ext4: reset error code in ext4_find_entry in fallback
Arnd Bergmann arnd@arndb.de ext4: sysfs: print ext4_super_block fields as little-endian
Wang Shilong wshilong@ddn.com ext4: use ext4_warning() for sb_getblk failure
Theodore Ts'o tytso@mit.edu ext4: check for NUL characters in extended attribute's name
Prasad Sodagudi psodagud@codeaurora.org stop_machine: Atomically queue and wake stopper threads
Peter Zijlstra peterz@infradead.org stop_machine: Reflow cpu_stop_queue_two_works()
Thomas Richter tmricht@linux.ibm.com perf kvm: Fix subcommands on s390
Claudio Imbrenda imbrenda@linux.vnet.ibm.com s390/kvm: fix deadlock when killed by oom
Punit Agrawal punit.agrawal@arm.com KVM: arm/arm64: Skip updating PTE entry if no change
Punit Agrawal punit.agrawal@arm.com KVM: arm/arm64: Skip updating PMD entry if no change
Christoffer Dall christoffer.dall@arm.com KVM: arm/arm64: Fix lost IRQs from emulated physcial timer when blocked
Christoffer Dall christoffer.dall@arm.com KVM: arm/arm64: Fix potential loss of ptimer interrupts
Huibin Hong huibin.hong@rock-chips.com arm64: dts: rockchip: corrected uart1 clock-names for rk3328
Greg Hackmann ghackmann@android.com arm64: mm: check for upper PAGE_SHIFT bits in pfn_valid()
Suzuki K Poulose suzuki.poulose@arm.com arm64: Handle mismatched cache type
Suzuki K Poulose suzuki.poulose@arm.com arm64: Fix mismatched cache line size detection
Masami Hiramatsu mhiramat@kernel.org kprobes/arm64: Fix %p uses in error messages
Petr Mladek pmladek@suse.com printk/nmi: Prevent deadlock when accessing the main log buffer in NMI
Petr Mladek pmladek@suse.com printk: Create helper function to queue deferred console handling
Petr Mladek pmladek@suse.com printk: Split the code for storing a message into the log buffer
Vivek Gautam vivek.gautam@codeaurora.org iommu/arm-smmu: Error out only if not enough context interrupts
Charles Keepax ckeepax@opensource.cirrus.com regulator: arizona-ldo1: Use correct device to get enable GPIO
Daniel Borkmann daniel@iogearbox.net bpf, arm32: fix stack var offset in jit
Michael Larabel michael@phoronix.com hwmon: (k10temp) 27C Offset needed for Threadripper2
Filipe Manana fdmanana@suse.com Btrfs: send, fix incorrect file layout after hole punching beyond eof
Filipe Manana fdmanana@suse.com Btrfs: fix send failure when root has deleted files still open
Josef Bacik jbacik@fb.com Btrfs: fix btrfs_write_inode vs delayed iput deadlock
Filipe Manana fdmanana@suse.com Btrfs: fix mount failure after fsync due to hard link recreation
Josef Bacik josef@toxicpanda.com btrfs: don't leak ret from do_chunk_alloc
Ethan Lien ethanlien@synology.com btrfs: use correct compare function of dirty_metadata_bytes
Steve French stfrench@microsoft.com smb3: fill in statfs fsid and correct namelen
Steve French stfrench@microsoft.com smb3: don't request leases in symlink creation and query
Steve French stfrench@microsoft.com smb3: Do not send SMB3 SET_INFO if nothing changed
Steve French stfrench@microsoft.com smb3: enumerating snapshots was leaving part of the data off end
Nicholas Mc Guire hofrat@osadl.org cifs: check kmalloc before use
Ronnie Sahlberg lsahlber@redhat.com cifs: use a refcount to protect open/closing the cached file handle
Steve French stfrench@microsoft.com cifs: add missing debug entries for kconfig options
Aurelien Aptel aaptel@suse.com CIFS: fix uninitialized ptr deref in smb2 signing
Ronnie Sahlberg lsahlber@redhat.com cifs: add missing support for ACLs in SMB 3.11
Alexander Usyskin alexander.usyskin@intel.com mei: don't update offset in write
Chuck Lever chuck.lever@oracle.com xprtrdma: Fix disconnect regression
Jason Yan yanaijie@huawei.com scsi: libsas: dynamically allocate and free ata host
Ben Hutchings ben@decadent.org.uk scripts/kernel-doc: Escape all literal braces in regexes
Valdis Kletnieks valdis.kletnieks@vt.edu PATCH scripts/kernel-doc
-------------
Diffstat:
Makefile | 8 +- arch/Kconfig | 3 + arch/arm/net/bpf_jit_32.c | 2 +- arch/arm/probes/kprobes/core.c | 4 +- arch/arm/probes/kprobes/test-core.c | 1 - arch/arm64/boot/dts/rockchip/rk3328.dtsi | 2 +- arch/arm64/include/asm/cache.h | 4 + arch/arm64/include/asm/cpucaps.h | 3 +- arch/arm64/kernel/cpu_errata.c | 23 +++- arch/arm64/kernel/cpufeature.c | 2 +- arch/arm64/kernel/probes/kprobes.c | 2 +- arch/arm64/mm/init.c | 6 +- arch/mips/Makefile | 12 +- arch/mips/include/asm/processor.h | 15 ++- arch/mips/kernel/ptrace.c | 2 +- arch/mips/kernel/ptrace32.c | 2 +- arch/mips/lib/memset.S | 3 +- arch/mips/lib/multi3.c | 6 +- arch/s390/include/asm/qdio.h | 1 - arch/s390/lib/mem.S | 16 ++- arch/s390/mm/fault.c | 2 + arch/s390/mm/page-states.c | 2 +- arch/s390/net/bpf_jit_comp.c | 2 - arch/s390/numa/numa.c | 16 +-- arch/s390/pci/pci.c | 2 + arch/s390/purgatory/Makefile | 7 +- arch/x86/Kconfig | 1 + arch/x86/Makefile | 11 +- arch/x86/entry/vdso/Makefile | 6 +- arch/x86/events/core.c | 2 +- arch/x86/include/asm/irqflags.h | 3 +- arch/x86/include/asm/processor.h | 6 +- arch/x86/include/asm/stacktrace.h | 2 +- arch/x86/include/asm/tlbflush.h | 40 +++++++ arch/x86/include/asm/vgtod.h | 2 +- arch/x86/kernel/cpu/bugs.c | 50 ++++++++- arch/x86/kernel/cpu/common.c | 1 + arch/x86/kernel/cpu/intel.c | 3 + arch/x86/kernel/dumpstack.c | 25 ++++- arch/x86/kernel/early-quirks.c | 18 +++ arch/x86/kernel/process_64.c | 1 + arch/x86/kvm/hyperv.c | 27 +++-- arch/x86/kvm/hyperv.h | 2 +- arch/x86/kvm/svm.c | 8 +- arch/x86/kvm/x86.c | 19 ++-- arch/x86/lib/usercopy.c | 5 + arch/x86/mm/fault.c | 2 +- arch/x86/mm/init.c | 4 +- arch/x86/mm/mmap.c | 2 +- arch/x86/mm/tlb.c | 7 ++ drivers/ata/libata-core.c | 3 + drivers/ata/libata.h | 2 - drivers/base/power/clock_ops.c | 2 +- drivers/cdrom/cdrom.c | 2 +- drivers/char/tpm/tpm-interface.c | 53 +++++++-- drivers/char/tpm/tpm.h | 12 +- drivers/char/tpm/tpm2-space.c | 16 ++- drivers/char/tpm/tpm_crb.c | 101 +++++------------ drivers/clk/clk-npcm7xx.c | 4 +- drivers/clk/rockchip/clk-rk3399.c | 2 +- drivers/gpu/drm/udl/udl_drv.h | 2 +- drivers/gpu/drm/udl/udl_fb.c | 17 +-- drivers/gpu/drm/udl/udl_main.c | 35 +++--- drivers/gpu/drm/udl/udl_transfer.c | 39 +++---- drivers/hwmon/k10temp.c | 2 + drivers/hwmon/nct6775.c | 2 + drivers/iommu/arm-smmu.c | 16 ++- drivers/misc/mei/main.c | 1 - drivers/mtd/nand/raw/fsmc_nand.c | 2 +- drivers/mtd/nand/raw/marvell_nand.c | 73 +++++++++++-- drivers/mtd/nand/raw/nand_hynix.c | 10 ++ drivers/mtd/nand/raw/qcom_nandc.c | 53 ++++++++- drivers/net/wireless/broadcom/b43/leds.c | 2 +- drivers/net/wireless/broadcom/b43legacy/leds.c | 2 +- drivers/nvme/host/pci.c | 8 ++ drivers/pinctrl/freescale/pinctrl-imx1-core.c | 2 +- drivers/platform/x86/ideapad-laptop.c | 4 +- drivers/platform/x86/wmi.c | 9 +- drivers/power/supply/generic-adc-battery.c | 25 +++-- drivers/regulator/arizona-ldo1.c | 27 ++++- drivers/s390/cio/qdio_main.c | 5 +- drivers/scsi/libsas/sas_ata.c | 40 ++++--- drivers/scsi/libsas/sas_discover.c | 2 + drivers/scsi/mpt3sas/mpt3sas_base.c | 1 + drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +- drivers/scsi/mpt3sas/mpt3sas_transport.c | 5 +- drivers/scsi/qla2xxx/qla_init.c | 2 +- drivers/scsi/qla2xxx/qla_iocb.c | 1 + drivers/scsi/scsi_sysfs.c | 20 +++- drivers/soc/qcom/rmtfs_mem.c | 3 +- drivers/target/iscsi/iscsi_target_login.c | 35 +++--- fs/btrfs/disk-io.c | 10 +- fs/btrfs/extent-tree.c | 2 +- fs/btrfs/inode.c | 26 ----- fs/btrfs/send.c | 146 +++++++++++++++++++++++-- fs/btrfs/super.c | 1 - fs/btrfs/tree-log.c | 66 +++++++++++ fs/cifs/cifs_debug.c | 30 +++-- fs/cifs/cifsfs.c | 18 +-- fs/cifs/cifsglob.h | 1 + fs/cifs/inode.c | 2 + fs/cifs/link.c | 4 +- fs/cifs/sess.c | 6 + fs/cifs/smb2inode.c | 6 +- fs/cifs/smb2ops.c | 72 ++++++++++-- fs/cifs/smb2pdu.c | 8 ++ fs/cifs/smb2pdu.h | 11 ++ fs/cifs/smb2proto.h | 1 + fs/cifs/smb2transport.c | 5 +- fs/ext4/balloc.c | 6 +- fs/ext4/ialloc.c | 6 +- fs/ext4/namei.c | 1 + fs/ext4/super.c | 22 ++-- fs/ext4/sysfs.c | 13 ++- fs/ext4/xattr.c | 2 + fs/fuse/dev.c | 39 +++++-- fs/fuse/dir.c | 10 +- fs/fuse/file.c | 1 + fs/fuse/fuse_i.h | 5 +- fs/fuse/inode.c | 37 ++++--- fs/sysfs/file.c | 44 ++++++++ include/drm/i915_drm.h | 4 +- include/linux/libata.h | 2 + include/linux/printk.h | 4 + include/linux/sysfs.h | 14 +++ include/linux/tpm.h | 2 + include/scsi/libsas.h | 2 +- kernel/kprobes.c | 38 ++++--- kernel/printk/internal.h | 9 +- kernel/printk/printk.c | 57 ++++++---- kernel/printk/printk_safe.c | 58 ++++++---- kernel/stop_machine.c | 43 +++++--- kernel/trace/trace.c | 4 +- kernel/watchdog.c | 4 +- kernel/watchdog_hld.c | 2 +- kernel/workqueue.c | 2 +- lib/nmi_backtrace.c | 3 - lib/vsprintf.c | 1 + mm/memory.c | 18 +++ net/sunrpc/xprtrdma/verbs.c | 5 +- scripts/kernel-doc | 20 ++-- sound/soc/codecs/wm_adsp.c | 8 +- sound/soc/sirf/sirf-usp.c | 7 +- sound/soc/soc-pcm.c | 8 ++ sound/soc/zte/zx-tdm.c | 4 +- tools/perf/arch/s390/util/kvm-stat.c | 2 +- virt/kvm/arm/arch_timer.c | 15 ++- virt/kvm/arm/mmu.c | 42 +++++-- 148 files changed, 1438 insertions(+), 580 deletions(-)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Valdis Kletnieks valdis.kletnieks@vt.edu
commit 701b3a3c0ac42630f74a5efba8545d61ac0e3293 upstream.
Fix a warning whinge from Perl introduced by "scripts: kernel-doc: parse next structs/unions"
Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.32), passed through in regex; marked by <-- HERE in m/({ <-- HERE [^{}]*})/ at ./scripts/kernel-doc line 1155. Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.32), passed through in regex; marked by <-- HERE in m/({ <-- HERE )/ at ./scripts/kernel-doc line 1179.
Signed-off-by: Valdis Kletnieks valdis.kletnieks@vt.edu Reviewed-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Jonathan Corbet corbet@lwn.net Cc: Nathan Chancellor natechancellor@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- scripts/kernel-doc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/scripts/kernel-doc +++ b/scripts/kernel-doc @@ -1152,7 +1152,7 @@ sub dump_struct($$) { }
# Ignore other nested elements, like enums - $members =~ s/({[^{}]*})//g; + $members =~ s/({[^{}]*})//g;
create_parameterlist($members, ';', $file, $declaration_name); check_sections($file, $declaration_name, $decl_type, $sectcheck, $struct_actual); @@ -1176,7 +1176,7 @@ sub dump_struct($$) { $declaration .= "\t" x $level; } $declaration .= "\t" . $clause . "\n"; - $level++ if ($clause =~ m/({)/ && !($clause =~m/}/)); + $level++ if ($clause =~ m/({)/ && !($clause =~m/}/)); } output_declaration($declaration_name, 'struct',
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ben Hutchings ben@decadent.org.uk
commit 673bb2dfc36488abfdbbfc2ce2631204eaf682f2 upstream.
Commit 701b3a3c0ac4 ("PATCH scripts/kernel-doc") fixed the two instances of literal braces that Perl 5.28 warns about, but there are still more than it doesn't warn about.
Escape all left braces that are treated as literal characters. Also escape literal right braces, for consistency and to avoid confusing bracket-matching in text editors.
Signed-off-by: Ben Hutchings ben@decadent.org.uk Signed-off-by: Jonathan Corbet corbet@lwn.net Cc: Nathan Chancellor natechancellor@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- scripts/kernel-doc | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
--- a/scripts/kernel-doc +++ b/scripts/kernel-doc @@ -1062,7 +1062,7 @@ sub dump_struct($$) { my $x = shift; my $file = shift;
- if ($x =~ /(struct|union)\s+(\w+)\s*{(.*)}/) { + if ($x =~ /(struct|union)\s+(\w+)\s*{(.*)}/) { my $decl_type = $1; $declaration_name = $2; my $members = $3; @@ -1148,20 +1148,20 @@ sub dump_struct($$) { } } } - $members =~ s/(struct|union)([^{};]+){([^{}]*)}([^{};]*);/$newmember/; + $members =~ s/(struct|union)([^{};]+){([^{}]*)}([^{};]*);/$newmember/; }
# Ignore other nested elements, like enums - $members =~ s/({[^{}]*})//g; + $members =~ s/({[^{}]*})//g;
create_parameterlist($members, ';', $file, $declaration_name); check_sections($file, $declaration_name, $decl_type, $sectcheck, $struct_actual);
# Adjust declaration for better display - $declaration =~ s/([{;])/$1\n/g; - $declaration =~ s/}\s+;/};/g; + $declaration =~ s/([{;])/$1\n/g; + $declaration =~ s/}\s+;/};/g; # Better handle inlined enums - do {} while ($declaration =~ s/(enum\s+{[^}]+),([^\n])/$1,\n$2/); + do {} while ($declaration =~ s/(enum\s+{[^}]+),([^\n])/$1,\n$2/);
my @def_args = split /\n/, $declaration; my $level = 1; @@ -1171,12 +1171,12 @@ sub dump_struct($$) { $clause =~ s/\s+$//; $clause =~ s/\s+/ /; next if (!$clause); - $level-- if ($clause =~ m/(})/ && $level > 1); + $level-- if ($clause =~ m/(})/ && $level > 1); if (!($clause =~ m/^\s*#/)) { $declaration .= "\t" x $level; } $declaration .= "\t" . $clause . "\n"; - $level++ if ($clause =~ m/({)/ && !($clause =~m/}/)); + $level++ if ($clause =~ m/({)/ && !($clause =~m/}/)); } output_declaration($declaration_name, 'struct', @@ -1244,7 +1244,7 @@ sub dump_enum($$) { # strip #define macros inside enums $x =~ s@#\s*((define|ifdef)\s+|endif)[^;]*;@@gos;
- if ($x =~ /enum\s+(\w+)\s*{(.*)}/) { + if ($x =~ /enum\s+(\w+)\s*{(.*)}/) { $declaration_name = $1; my $members = $2; my %_members; @@ -1785,7 +1785,7 @@ sub process_proto_type($$) { }
while (1) { - if ( $x =~ /([^{};]*)([{};])(.*)/ ) { + if ( $x =~ /([^{};]*)([{};])(.*)/ ) { if( length $prototype ) { $prototype .= " " }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Yan yanaijie@huawei.com
commit 2fa4a32613c9182b00e46872755b0662374424a7 upstream.
Commit 2623c7a5f2 ("libata: add refcounting to ata_host") v4.17+ introduced refcounting to ata_host and will increase or decrease the refcount when adding or deleting transport ATA port.
Now the ata host for libsas is embedded in domain_device, and the ->kref member is not initialized. Afer we add ata transport class, ata_host_get() will be called when adding transport ATA port and a warning will be triggered as below:
refcount_t: increment on 0; use-after-free. WARNING: CPU: 2 PID: 103 at lib/refcount.c:153 refcount_inc+0x40/0x48 ...... Call trace: refcount_inc+0x40/0x48 ata_host_get+0x10/0x18 ata_tport_add+0x40/0x120 ata_sas_tport_add+0xc/0x14 sas_ata_init+0x7c/0xc8 sas_discover_domain+0x380/0x53c process_one_work+0x12c/0x288 worker_thread+0x58/0x3f0 kthread+0xfc/0x128 ret_from_fork+0x10/0x18
And also when removing transport ATA port ata_host_put() will be called and another similar warning will be triggered. If the refcount decreased to zero, the ata host will be freed. But this ata host is only part of domain_device, it cannot be freed directly.
So we have to change this embedded static ata host to a dynamically allocated ata host and initialize the ->kref member. To use ata_host_get() and ata_host_put() in libsas, we need to move the declaration of these functions to the public libata.h and export them.
Fixes: b6240a4df018 ("scsi: libsas: add transport class for ATA devices") Signed-off-by: Jason Yan yanaijie@huawei.com CC: John Garry john.garry@huawei.com CC: Taras Kondratiuk takondra@cisco.com CC: Tejun Heo tj@kernel.org Acked-by: Tejun Heo tj@kernel.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/ata/libata-core.c | 3 ++ drivers/ata/libata.h | 2 - drivers/scsi/libsas/sas_ata.c | 40 ++++++++++++++++++++++++------------- drivers/scsi/libsas/sas_discover.c | 2 + include/linux/libata.h | 2 + include/scsi/libsas.h | 2 - 6 files changed, 34 insertions(+), 17 deletions(-)
--- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -6424,6 +6424,7 @@ void ata_host_init(struct ata_host *host host->n_tags = ATA_MAX_QUEUE; host->dev = dev; host->ops = ops; + kref_init(&host->kref); }
void __ata_port_probe(struct ata_port *ap) @@ -7391,3 +7392,5 @@ EXPORT_SYMBOL_GPL(ata_cable_80wire); EXPORT_SYMBOL_GPL(ata_cable_unknown); EXPORT_SYMBOL_GPL(ata_cable_ignore); EXPORT_SYMBOL_GPL(ata_cable_sata); +EXPORT_SYMBOL_GPL(ata_host_get); +EXPORT_SYMBOL_GPL(ata_host_put); \ No newline at end of file --- a/drivers/ata/libata.h +++ b/drivers/ata/libata.h @@ -100,8 +100,6 @@ extern int ata_port_probe(struct ata_por extern void __ata_port_probe(struct ata_port *ap); extern unsigned int ata_read_log_page(struct ata_device *dev, u8 log, u8 page, void *buf, unsigned int sectors); -extern void ata_host_get(struct ata_host *host); -extern void ata_host_put(struct ata_host *host);
#define to_ata_port(d) container_of(d, struct ata_port, tdev)
--- a/drivers/scsi/libsas/sas_ata.c +++ b/drivers/scsi/libsas/sas_ata.c @@ -557,34 +557,46 @@ int sas_ata_init(struct domain_device *f { struct sas_ha_struct *ha = found_dev->port->ha; struct Scsi_Host *shost = ha->core.shost; + struct ata_host *ata_host; struct ata_port *ap; int rc;
- ata_host_init(&found_dev->sata_dev.ata_host, ha->dev, &sas_sata_ops); - ap = ata_sas_port_alloc(&found_dev->sata_dev.ata_host, - &sata_port_info, - shost); + ata_host = kzalloc(sizeof(*ata_host), GFP_KERNEL); + if (!ata_host) { + SAS_DPRINTK("ata host alloc failed.\n"); + return -ENOMEM; + } + + ata_host_init(ata_host, ha->dev, &sas_sata_ops); + + ap = ata_sas_port_alloc(ata_host, &sata_port_info, shost); if (!ap) { SAS_DPRINTK("ata_sas_port_alloc failed.\n"); - return -ENODEV; + rc = -ENODEV; + goto free_host; }
ap->private_data = found_dev; ap->cbl = ATA_CBL_SATA; ap->scsi_host = shost; rc = ata_sas_port_init(ap); - if (rc) { - ata_sas_port_destroy(ap); - return rc; - } - rc = ata_sas_tport_add(found_dev->sata_dev.ata_host.dev, ap); - if (rc) { - ata_sas_port_destroy(ap); - return rc; - } + if (rc) + goto destroy_port; + + rc = ata_sas_tport_add(ata_host->dev, ap); + if (rc) + goto destroy_port; + + found_dev->sata_dev.ata_host = ata_host; found_dev->sata_dev.ap = ap;
return 0; + +destroy_port: + ata_sas_port_destroy(ap); +free_host: + ata_host_put(ata_host); + return rc; }
void sas_ata_task_abort(struct sas_task *task) --- a/drivers/scsi/libsas/sas_discover.c +++ b/drivers/scsi/libsas/sas_discover.c @@ -316,6 +316,8 @@ void sas_free_device(struct kref *kref) if (dev_is_sata(dev) && dev->sata_dev.ap) { ata_sas_tport_delete(dev->sata_dev.ap); ata_sas_port_destroy(dev->sata_dev.ap); + ata_host_put(dev->sata_dev.ata_host); + dev->sata_dev.ata_host = NULL; dev->sata_dev.ap = NULL; }
--- a/include/linux/libata.h +++ b/include/linux/libata.h @@ -1111,6 +1111,8 @@ extern struct ata_host *ata_host_alloc(s extern struct ata_host *ata_host_alloc_pinfo(struct device *dev, const struct ata_port_info * const * ppi, int n_ports); extern int ata_slave_link_init(struct ata_port *ap); +extern void ata_host_get(struct ata_host *host); +extern void ata_host_put(struct ata_host *host); extern int ata_host_start(struct ata_host *host); extern int ata_host_register(struct ata_host *host, struct scsi_host_template *sht); --- a/include/scsi/libsas.h +++ b/include/scsi/libsas.h @@ -161,7 +161,7 @@ struct sata_device { u8 port_no; /* port number, if this is a PM (Port) */
struct ata_port *ap; - struct ata_host ata_host; + struct ata_host *ata_host; struct smp_resp rps_resp ____cacheline_aligned; /* report_phy_sata_resp */ u8 fis[ATA_RESP_FIS_SIZE]; };
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chuck Lever chuck.lever@oracle.com
commit 8d4fb8ff427a23e573c9373b2bb3d1d6e8ea4399 upstream.
I found that injecting disconnects with v4.18-rc resulted in random failures of the multi-threaded git regression test.
The root cause appears to be that, after a reconnect, the RPC/RDMA transport is waking pending RPCs before the transport has posted enough Receive buffers to receive the Replies. If a Reply arrives before enough Receive buffers are posted, the connection is dropped. A few connection drops happen in quick succession as the client and server struggle to regain credit synchronization.
This regression was introduced with commit 7c8d9e7c8863 ("xprtrdma: Move Receive posting to Receive handler"). The client is supposed to post a single Receive when a connection is established because it's not supposed to send more than one RPC Call before it gets a fresh credit grant in the first RPC Reply [RFC 8166, Section 3.3.3].
Unfortunately there appears to be a longstanding bug in the Linux client's credit accounting mechanism. On connect, it simply dumps all pending RPC Calls onto the new connection. It's possible it has done this ever since the RPC/RDMA transport was added to the kernel ten years ago.
Servers have so far been tolerant of this bad behavior. Currently no server implementation ever changes its credit grant over reconnects, and servers always repost enough Receives before connections are fully established.
The Linux client implementation used to post a Receive before each of these Calls. This has covered up the flooding send behavior.
I could try to correct this old bug so that the client sends exactly one RPC Call and waits for a Reply. Since we are so close to the next merge window, I'm going to instead provide a simple patch to post enough Receives before a reconnect completes (based on the number of credits granted to the previous connection).
The spurious disconnects will be gone, but the client will still send multiple RPC Calls immediately after a reconnect.
Addressing the latter problem will wait for a merge window because a) I expect it to be a large change requiring lots of testing, and b) obviously the Linux client has interoperated successfully since day zero while still being broken.
Fixes: 7c8d9e7c8863 ("xprtrdma: Move Receive posting to ... ") Cc: stable@vger.kernel.org # v4.18+ Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sunrpc/xprtrdma/verbs.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/net/sunrpc/xprtrdma/verbs.c +++ b/net/sunrpc/xprtrdma/verbs.c @@ -280,7 +280,6 @@ rpcrdma_conn_upcall(struct rdma_cm_id *i ++xprt->rx_xprt.connect_cookie; connstate = -ECONNABORTED; connected: - xprt->rx_buf.rb_credits = 1; ep->rep_connected = connstate; rpcrdma_conn_func(ep); wake_up_all(&ep->rep_connect_wait); @@ -755,6 +754,7 @@ retry: }
ep->rep_connected = 0; + rpcrdma_post_recvs(r_xprt, true);
rc = rdma_connect(ia->ri_id, &ep->rep_remote_cma); if (rc) { @@ -773,8 +773,6 @@ retry:
dprintk("RPC: %s: connected\n", __func__);
- rpcrdma_post_recvs(r_xprt, true); - out: if (rc) ep->rep_connected = rc; @@ -1171,6 +1169,7 @@ rpcrdma_buffer_create(struct rpcrdma_xpr list_add(&req->rl_list, &buf->rb_send_bufs); }
+ buf->rb_credits = 1; buf->rb_posted_receives = 0; INIT_LIST_HEAD(&buf->rb_recv_bufs);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Usyskin alexander.usyskin@intel.com
commit a103af1b64d74853a5e08ca6c86aeb0e5c6ca4f1 upstream.
MEI enables writes of complete messages only while read can be performed in parts, hence write should not update the file offset to not break interleaving partial reads with writes.
Cc: stable@vger.kernel.org Signed-off-by: Alexander Usyskin alexander.usyskin@intel.com Signed-off-by: Tomas Winkler tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/misc/mei/main.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/misc/mei/main.c +++ b/drivers/misc/mei/main.c @@ -312,7 +312,6 @@ static ssize_t mei_write(struct file *fi } }
- *offset = 0; cb = mei_cl_alloc_cb(cl, length, MEI_FOP_WRITE, file); if (!cb) { rets = -ENOMEM;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ronnie Sahlberg lsahlber@redhat.com
commit c1777df1a5d541cda918ff0450c8adcc8b69c2fd upstream.
We were missing the methods for get_acl and friends for the 3.11 dialect.
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French stfrench@microsoft.com CC: Stable stable@vger.kernel.org Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/smb2ops.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -3366,6 +3366,11 @@ struct smb_version_operations smb311_ope .query_all_EAs = smb2_query_eas, .set_EA = smb2_set_ea, #endif /* CIFS_XATTR */ +#ifdef CONFIG_CIFS_ACL + .get_acl = get_smb2_acl, + .get_acl_by_fid = get_smb2_acl_by_fid, + .set_acl = set_smb2_acl, +#endif /* CIFS_ACL */ .next_header = smb2_next_header, }; #endif /* CIFS_SMB311 */
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aurelien Aptel aaptel@suse.com
commit a5c62f4833c2c8e6e0f35367b99b717b78f5c029 upstream.
server->secmech.sdeschmacsha256 is not properly initialized before smb2_shash_allocate(), set shash after that call.
also fix typo in error message
Fixes: 8de8c4608fe9 ("cifs: Fix validation of signed data in smb2")
Signed-off-by: Aurelien Aptel aaptel@suse.com Reviewed-by: Paulo Alcantara palcantara@suse.com Reported-by: Xiaoli Feng xifeng@redhat.com Signed-off-by: Steve French stfrench@microsoft.com CC: Stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/smb2transport.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/fs/cifs/smb2transport.c +++ b/fs/cifs/smb2transport.c @@ -173,7 +173,7 @@ smb2_calc_signature(struct smb_rqst *rqs struct kvec *iov = rqst->rq_iov; struct smb2_sync_hdr *shdr = (struct smb2_sync_hdr *)iov[0].iov_base; struct cifs_ses *ses; - struct shash_desc *shash = &server->secmech.sdeschmacsha256->shash; + struct shash_desc *shash; struct smb_rqst drqst;
ses = smb2_find_smb_ses(server, shdr->SessionId); @@ -187,7 +187,7 @@ smb2_calc_signature(struct smb_rqst *rqs
rc = smb2_crypto_shash_allocate(server); if (rc) { - cifs_dbg(VFS, "%s: shah256 alloc failed\n", __func__); + cifs_dbg(VFS, "%s: sha256 alloc failed\n", __func__); return rc; }
@@ -198,6 +198,7 @@ smb2_calc_signature(struct smb_rqst *rqs return rc; }
+ shash = &server->secmech.sdeschmacsha256->shash; rc = crypto_shash_init(shash); if (rc) { cifs_dbg(VFS, "%s: Could not init sha256", __func__);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French stfrench@microsoft.com
commit 950132afd59385caf6e2b84e5235d069fa10681d upstream.
/proc/fs/cifs/DebugData displays the features (Kconfig options) used to build cifs.ko but it was missing some, and needed comma separator. These can be useful in debugging certain problems so we know which optional features were enabled in the user's build. Also clarify them, by making them more closely match the corresponding CONFIG_CIFS_* parm.
Old format: Features: dfs fscache posix spnego xattr acl
New format: Features: DFS,FSCACHE,SMB_DIRECT,STATS,DEBUG2,ALLOW_INSECURE_LEGACY,CIFS_POSIX,UPCALL(SPNEGO),XATTR,ACL
Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Reviewed-by: Paulo Alcantara palcantara@suse.de CC: Stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/cifs_debug.c | 30 +++++++++++++++++++++++------- 1 file changed, 23 insertions(+), 7 deletions(-)
--- a/fs/cifs/cifs_debug.c +++ b/fs/cifs/cifs_debug.c @@ -160,25 +160,41 @@ static int cifs_debug_data_proc_show(str seq_printf(m, "CIFS Version %s\n", CIFS_VERSION); seq_printf(m, "Features:"); #ifdef CONFIG_CIFS_DFS_UPCALL - seq_printf(m, " dfs"); + seq_printf(m, " DFS"); #endif #ifdef CONFIG_CIFS_FSCACHE - seq_printf(m, " fscache"); + seq_printf(m, ",FSCACHE"); +#endif +#ifdef CONFIG_CIFS_SMB_DIRECT + seq_printf(m, ",SMB_DIRECT"); +#endif +#ifdef CONFIG_CIFS_STATS2 + seq_printf(m, ",STATS2"); +#elif defined(CONFIG_CIFS_STATS) + seq_printf(m, ",STATS"); +#endif +#ifdef CONFIG_CIFS_DEBUG2 + seq_printf(m, ",DEBUG2"); +#elif defined(CONFIG_CIFS_DEBUG) + seq_printf(m, ",DEBUG"); +#endif +#ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY + seq_printf(m, ",ALLOW_INSECURE_LEGACY"); #endif #ifdef CONFIG_CIFS_WEAK_PW_HASH - seq_printf(m, " lanman"); + seq_printf(m, ",WEAK_PW_HASH"); #endif #ifdef CONFIG_CIFS_POSIX - seq_printf(m, " posix"); + seq_printf(m, ",CIFS_POSIX"); #endif #ifdef CONFIG_CIFS_UPCALL - seq_printf(m, " spnego"); + seq_printf(m, ",UPCALL(SPNEGO)"); #endif #ifdef CONFIG_CIFS_XATTR - seq_printf(m, " xattr"); + seq_printf(m, ",XATTR"); #endif #ifdef CONFIG_CIFS_ACL - seq_printf(m, " acl"); + seq_printf(m, ",ACL"); #endif seq_putc(m, '\n'); seq_printf(m, "Active VFS Requests: %d\n", GlobalTotalActiveXid);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ronnie Sahlberg lsahlber@redhat.com
commit 9da6ec7775d2cd76df53fbf4f1f35f6d490204f5 upstream.
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/cifsglob.h | 1 + fs/cifs/smb2inode.c | 4 +++- fs/cifs/smb2ops.c | 31 ++++++++++++++++++++++++++----- fs/cifs/smb2proto.h | 1 + 4 files changed, 31 insertions(+), 6 deletions(-)
--- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -913,6 +913,7 @@ cap_unix(struct cifs_ses *ses)
struct cached_fid { bool is_valid:1; /* Do we have a useable root fid */ + struct kref refcount; struct cifs_fid *fid; struct mutex fid_mutex; struct cifs_tcon *tcon; --- a/fs/cifs/smb2inode.c +++ b/fs/cifs/smb2inode.c @@ -120,7 +120,9 @@ smb2_open_op_close(const unsigned int xi break; }
- if (use_cached_root_handle == false) + if (use_cached_root_handle) + close_shroot(&tcon->crfid); + else rc = SMB2_close(xid, tcon, fid.persistent_fid, fid.volatile_fid); if (tmprc) rc = tmprc; --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -466,21 +466,36 @@ out: return rc; }
-void -smb2_cached_lease_break(struct work_struct *work) +static void +smb2_close_cached_fid(struct kref *ref) { - struct cached_fid *cfid = container_of(work, - struct cached_fid, lease_break); - mutex_lock(&cfid->fid_mutex); + struct cached_fid *cfid = container_of(ref, struct cached_fid, + refcount); + if (cfid->is_valid) { cifs_dbg(FYI, "clear cached root file handle\n"); SMB2_close(0, cfid->tcon, cfid->fid->persistent_fid, cfid->fid->volatile_fid); cfid->is_valid = false; } +} + +void close_shroot(struct cached_fid *cfid) +{ + mutex_lock(&cfid->fid_mutex); + kref_put(&cfid->refcount, smb2_close_cached_fid); mutex_unlock(&cfid->fid_mutex); }
+void +smb2_cached_lease_break(struct work_struct *work) +{ + struct cached_fid *cfid = container_of(work, + struct cached_fid, lease_break); + + close_shroot(cfid); +} + /* * Open the directory at the root of a share */ @@ -495,6 +510,7 @@ int open_shroot(unsigned int xid, struct if (tcon->crfid.is_valid) { cifs_dbg(FYI, "found a cached root file handle\n"); memcpy(pfid, tcon->crfid.fid, sizeof(struct cifs_fid)); + kref_get(&tcon->crfid.refcount); mutex_unlock(&tcon->crfid.fid_mutex); return 0; } @@ -511,6 +527,8 @@ int open_shroot(unsigned int xid, struct memcpy(tcon->crfid.fid, pfid, sizeof(struct cifs_fid)); tcon->crfid.tcon = tcon; tcon->crfid.is_valid = true; + kref_init(&tcon->crfid.refcount); + kref_get(&tcon->crfid.refcount); } mutex_unlock(&tcon->crfid.fid_mutex); return rc; @@ -552,6 +570,9 @@ smb3_qfs_tcon(const unsigned int xid, st FS_SECTOR_SIZE_INFORMATION); /* SMB3 specific */ if (no_cached_open) SMB2_close(xid, tcon, fid.persistent_fid, fid.volatile_fid); + else + close_shroot(&tcon->crfid); + return; }
--- a/fs/cifs/smb2proto.h +++ b/fs/cifs/smb2proto.h @@ -68,6 +68,7 @@ extern int smb3_handle_read_data(struct
extern int open_shroot(unsigned int xid, struct cifs_tcon *tcon, struct cifs_fid *pfid); +extern void close_shroot(struct cached_fid *cfid); extern void move_smb2_info_to_cifs(FILE_ALL_INFO *dst, struct smb2_file_all_info *src); extern int smb2_query_path_info(const unsigned int xid, struct cifs_tcon *tcon,
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Mc Guire hofrat@osadl.org
commit 126c97f4d0d1b5b956e8b0740c81a2b2a2ae548c upstream.
The kmalloc was not being checked - if it fails issue a warning and return -ENOMEM to the caller.
Signed-off-by: Nicholas Mc Guire hofrat@osadl.org Fixes: b8da344b74c8 ("cifs: dynamic allocation of ntlmssp blob") Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com cc: Stable stable@vger.kernel.org` Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/sess.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/fs/cifs/sess.c +++ b/fs/cifs/sess.c @@ -398,6 +398,12 @@ int build_ntlmssp_auth_blob(unsigned cha goto setup_ntlmv2_ret; } *pbuffer = kmalloc(size_of_ntlmssp_blob(ses), GFP_KERNEL); + if (!*pbuffer) { + rc = -ENOMEM; + cifs_dbg(VFS, "Error %d during NTLMSSP allocation\n", rc); + *buflen = 0; + goto setup_ntlmv2_ret; + } sec_blob = (AUTHENTICATE_MESSAGE *)*pbuffer;
memcpy(sec_blob->Signature, NTLMSSP_SIGNATURE, 8);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French stfrench@microsoft.com
commit e02789a53d71334b067ad72eee5d4e88a0158083 upstream.
When enumerating snapshots, the last few bytes of the final snapshot could be left off since we were miscalculating the length returned (leaving off the sizeof struct SRV_SNAPSHOT_ARRAY) See MS-SMB2 section 2.2.32.2. In addition fixup the length used to allow smaller buffer to be passed in, in order to allow returning the size of the whole snapshot array more easily.
Sample userspace output with a kernel patched with this (mounted to a Windows volume with two snapshots). Before this patch, the second snapshot would be missing a few bytes at the end.
~/cifs-2.6# ~/enum-snapshots /mnt/file press enter to issue the ioctl to retrieve snapshot information ...
size of snapshot array = 102 Num snapshots: 2 Num returned: 2 Array Size: 102
Snapshot 0:@GMT-2018.06.30-19.34.17 Snapshot 1:@GMT-2018.06.30-19.33.37
CC: Stable stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/smb2ops.c | 34 +++++++++++++++++++++++++++------- 1 file changed, 27 insertions(+), 7 deletions(-)
--- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -1374,6 +1374,13 @@ smb3_set_integrity(const unsigned int xi
}
+/* GMT Token is @GMT-YYYY.MM.DD-HH.MM.SS Unicode which is 48 bytes + null */ +#define GMT_TOKEN_SIZE 50 + +/* + * Input buffer contains (empty) struct smb_snapshot array with size filled in + * For output see struct SRV_SNAPSHOT_ARRAY in MS-SMB2 section 2.2.32.2 + */ static int smb3_enum_snapshots(const unsigned int xid, struct cifs_tcon *tcon, struct cifsFileInfo *cfile, void __user *ioc_buf) @@ -1403,14 +1410,27 @@ smb3_enum_snapshots(const unsigned int x kfree(retbuf); return rc; } - if (snapshot_in.snapshot_array_size < sizeof(struct smb_snapshot_array)) { - rc = -ERANGE; - kfree(retbuf); - return rc; - }
- if (ret_data_len > snapshot_in.snapshot_array_size) - ret_data_len = snapshot_in.snapshot_array_size; + /* + * Check for min size, ie not large enough to fit even one GMT + * token (snapshot). On the first ioctl some users may pass in + * smaller size (or zero) to simply get the size of the array + * so the user space caller can allocate sufficient memory + * and retry the ioctl again with larger array size sufficient + * to hold all of the snapshot GMT tokens on the second try. + */ + if (snapshot_in.snapshot_array_size < GMT_TOKEN_SIZE) + ret_data_len = sizeof(struct smb_snapshot_array); + + /* + * We return struct SRV_SNAPSHOT_ARRAY, followed by + * the snapshot array (of 50 byte GMT tokens) each + * representing an available previous version of the data + */ + if (ret_data_len > (snapshot_in.snapshot_array_size + + sizeof(struct smb_snapshot_array))) + ret_data_len = snapshot_in.snapshot_array_size + + sizeof(struct smb_snapshot_array);
if (copy_to_user(ioc_buf, retbuf, ret_data_len)) rc = -EFAULT;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French stfrench@microsoft.com
commit fd09b7d3b352105f08b8e02f7afecf7e816380ef upstream.
An earlier commit had a typo which prevented the optimization from working:
commit 18dd8e1a65dd ("Do not send SMB3 SET_INFO request if nothing is changing")
Thank you to Metze for noticing this. Also clear a reserved field in the FILE_BASIC_INFO struct we send that should be zero (all the other fields in that struct were set or cleared explicitly already in cifs_set_file_info).
Reviewed-by: Pavel Shilovsky pshilov@microsoft.com CC: Stable stable@vger.kernel.org # 4.9.x+ Reported-by: Stefan Metzmacher metze@samba.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/inode.c | 2 ++ fs/cifs/smb2inode.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-)
--- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -1122,6 +1122,8 @@ cifs_set_file_info(struct inode *inode, if (!server->ops->set_file_info) return -ENOSYS;
+ info_buf.Pad = 0; + if (attrs->ia_valid & ATTR_ATIME) { set_time = true; info_buf.LastAccessTime = --- a/fs/cifs/smb2inode.c +++ b/fs/cifs/smb2inode.c @@ -283,7 +283,7 @@ smb2_set_file_info(struct inode *inode, int rc;
if ((buf->CreationTime == 0) && (buf->LastAccessTime == 0) && - (buf->LastWriteTime == 0) && (buf->ChangeTime) && + (buf->LastWriteTime == 0) && (buf->ChangeTime == 0) && (buf->Attributes == 0)) return 0; /* would be a no op, no sense sending this */
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French stfrench@microsoft.com
commit 22783155f4bf956c346a81624ec9258930a6fe06 upstream.
Fixes problem pointed out by Pavel in discussions about commit 729c0c9dd55204f0c9a823ac8a7bfa83d36c7e78
Signed-off-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com CC: Stable stable@vger.kernel.org # 3.18.x+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/link.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/cifs/link.c +++ b/fs/cifs/link.c @@ -396,7 +396,7 @@ smb3_query_mf_symlink(unsigned int xid, struct cifs_io_parms io_parms; int buf_type = CIFS_NO_BUFFER; __le16 *utf16_path; - __u8 oplock = SMB2_OPLOCK_LEVEL_II; + __u8 oplock = SMB2_OPLOCK_LEVEL_NONE; struct smb2_file_all_info *pfile_info = NULL;
oparms.tcon = tcon; @@ -459,7 +459,7 @@ smb3_create_mf_symlink(unsigned int xid, struct cifs_io_parms io_parms; int create_options = CREATE_NOT_DIR; __le16 *utf16_path; - __u8 oplock = SMB2_OPLOCK_LEVEL_EXCLUSIVE; + __u8 oplock = SMB2_OPLOCK_LEVEL_NONE; struct kvec iov[2];
if (backup_cred(cifs_sb))
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French stfrench@microsoft.com
commit 21ba3845b59c733a79ed4fe1c4f3732e7ece9df7 upstream.
Fil in the correct namelen (typically 255 not 4096) in the statfs response and also fill in a reasonably unique fsid (in this case taken from the volume id, and the creation time of the volume).
In the case of the POSIX statfs all fields are now filled in, and in the case of non-POSIX mounts, all fields are filled in which can be.
Signed-off-by: Steve French stfrench@gmail.com CC: Stable stable@vger.kernel.org Reviewed-by: Aurelien Aptel aaptel@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/cifsfs.c | 18 ++++++++++-------- fs/cifs/smb2ops.c | 2 ++ fs/cifs/smb2pdu.c | 8 ++++++++ fs/cifs/smb2pdu.h | 11 +++++++++++ 4 files changed, 31 insertions(+), 8 deletions(-)
--- a/fs/cifs/cifsfs.c +++ b/fs/cifs/cifsfs.c @@ -209,14 +209,16 @@ cifs_statfs(struct dentry *dentry, struc
xid = get_xid();
- /* - * PATH_MAX may be too long - it would presumably be total path, - * but note that some servers (includinng Samba 3) have a shorter - * maximum path. - * - * Instead could get the real value via SMB_QUERY_FS_ATTRIBUTE_INFO. - */ - buf->f_namelen = PATH_MAX; + if (le32_to_cpu(tcon->fsAttrInfo.MaxPathNameComponentLength) > 0) + buf->f_namelen = + le32_to_cpu(tcon->fsAttrInfo.MaxPathNameComponentLength); + else + buf->f_namelen = PATH_MAX; + + buf->f_fsid.val[0] = tcon->vol_serial_number; + /* are using part of create time for more randomness, see man statfs */ + buf->f_fsid.val[1] = (int)le64_to_cpu(tcon->vol_create_time); + buf->f_files = 0; /* undefined */ buf->f_ffree = 0; /* unlimited */
--- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -567,6 +567,8 @@ smb3_qfs_tcon(const unsigned int xid, st SMB2_QFS_attr(xid, tcon, fid.persistent_fid, fid.volatile_fid, FS_DEVICE_INFORMATION); SMB2_QFS_attr(xid, tcon, fid.persistent_fid, fid.volatile_fid, + FS_VOLUME_INFORMATION); + SMB2_QFS_attr(xid, tcon, fid.persistent_fid, fid.volatile_fid, FS_SECTOR_SIZE_INFORMATION); /* SMB3 specific */ if (no_cached_open) SMB2_close(xid, tcon, fid.persistent_fid, fid.volatile_fid); --- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -4046,6 +4046,9 @@ SMB2_QFS_attr(const unsigned int xid, st } else if (level == FS_SECTOR_SIZE_INFORMATION) { max_len = sizeof(struct smb3_fs_ss_info); min_len = sizeof(struct smb3_fs_ss_info); + } else if (level == FS_VOLUME_INFORMATION) { + max_len = sizeof(struct smb3_fs_vol_info) + MAX_VOL_LABEL_LEN; + min_len = sizeof(struct smb3_fs_vol_info); } else { cifs_dbg(FYI, "Invalid qfsinfo level %d\n", level); return -EINVAL; @@ -4090,6 +4093,11 @@ SMB2_QFS_attr(const unsigned int xid, st tcon->ss_flags = le32_to_cpu(ss_info->Flags); tcon->perf_sector_size = le32_to_cpu(ss_info->PhysicalBytesPerSectorForPerf); + } else if (level == FS_VOLUME_INFORMATION) { + struct smb3_fs_vol_info *vol_info = (struct smb3_fs_vol_info *) + (offset + (char *)rsp); + tcon->vol_serial_number = vol_info->VolumeSerialNumber; + tcon->vol_create_time = vol_info->VolumeCreationTime; }
qfsattr_exit: --- a/fs/cifs/smb2pdu.h +++ b/fs/cifs/smb2pdu.h @@ -1248,6 +1248,17 @@ struct smb3_fs_ss_info { __le32 ByteOffsetForPartitionAlignment; } __packed;
+/* volume info struct - see MS-FSCC 2.5.9 */ +#define MAX_VOL_LABEL_LEN 32 +struct smb3_fs_vol_info { + __le64 VolumeCreationTime; + __u32 VolumeSerialNumber; + __le32 VolumeLabelLength; /* includes trailing null */ + __u8 SupportsObjects; /* True if eg like NTFS, supports objects */ + __u8 Reserved; + __u8 VolumeLabel[0]; /* variable len */ +} __packed; + /* partial list of QUERY INFO levels */ #define FILE_DIRECTORY_INFORMATION 1 #define FILE_FULL_DIRECTORY_INFORMATION 2
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ethan Lien ethanlien@synology.com
commit d814a49198eafa6163698bdd93961302f3a877a4 upstream.
We use customized, nodesize batch value to update dirty_metadata_bytes. We should also use batch version of compare function or we will easily goto fast path and get false result from percpu_counter_compare().
Fixes: e2d845211eda ("Btrfs: use percpu counter for dirty metadata count") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Ethan Lien ethanlien@synology.com Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/disk-io.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -961,8 +961,9 @@ static int btree_writepages(struct addre
fs_info = BTRFS_I(mapping->host)->root->fs_info; /* this is a bit racy, but that's ok */ - ret = percpu_counter_compare(&fs_info->dirty_metadata_bytes, - BTRFS_DIRTY_METADATA_THRESH); + ret = __percpu_counter_compare(&fs_info->dirty_metadata_bytes, + BTRFS_DIRTY_METADATA_THRESH, + fs_info->dirty_metadata_batch); if (ret < 0) return 0; } @@ -4150,8 +4151,9 @@ static void __btrfs_btree_balance_dirty( if (flush_delayed) btrfs_balance_delayed_items(fs_info);
- ret = percpu_counter_compare(&fs_info->dirty_metadata_bytes, - BTRFS_DIRTY_METADATA_THRESH); + ret = __percpu_counter_compare(&fs_info->dirty_metadata_bytes, + BTRFS_DIRTY_METADATA_THRESH, + fs_info->dirty_metadata_batch); if (ret > 0) { balance_dirty_pages_ratelimited(fs_info->btree_inode->i_mapping); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik josef@toxicpanda.com
commit 4559b0a71749c442d34f7cfb9e72c9e58db83948 upstream.
If we're trying to make a data reservation and we have to allocate a data chunk we could leak ret == 1, as do_chunk_alloc() will return 1 if it allocated a chunk. Since the end of the function is the success path just return 0.
CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/extent-tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -4358,7 +4358,7 @@ commit_trans: data_sinfo->flags, bytes, 1); spin_unlock(&data_sinfo->lock);
- return ret; + return 0; }
int btrfs_check_data_free_space(struct inode *inode,
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit 0d836392cadd5535f4184d46d901a82eb276ed62 upstream.
If we end up with logging an inode reference item which has the same name but different index from the one we have persisted, we end up failing when replaying the log with an errno value of -EEXIST. The error comes from btrfs_add_link(), which is called from add_inode_ref(), when we are replaying an inode reference item.
Example scenario where this happens:
$ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt
$ touch /mnt/foo $ ln /mnt/foo /mnt/bar
$ sync
# Rename the first hard link (foo) to a new name and rename the second # hard link (bar) to the old name of the first hard link (foo). $ mv /mnt/foo /mnt/qwerty $ mv /mnt/bar /mnt/foo
# Create a new file, in the same parent directory, with the old name of # the second hard link (bar) and fsync this new file. # We do this instead of calling fsync on foo/qwerty because if we did # that the fsync resulted in a full transaction commit, not triggering # the problem. $ touch /mnt/bar $ xfs_io -c "fsync" /mnt/bar
<power fail>
$ mount /dev/sdb /mnt mount: mount /dev/sdb on /mnt failed: File exists
So fix this by checking if a conflicting inode reference exists (same name, same parent but different index), removing it (and the associated dir index entries from the parent inode) if it exists, before attempting to add the new reference.
A test case for fstests follows soon.
CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/tree-log.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+)
--- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -1291,6 +1291,46 @@ again: return ret; }
+static int btrfs_inode_ref_exists(struct inode *inode, struct inode *dir, + const u8 ref_type, const char *name, + const int namelen) +{ + struct btrfs_key key; + struct btrfs_path *path; + const u64 parent_id = btrfs_ino(BTRFS_I(dir)); + int ret; + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + key.objectid = btrfs_ino(BTRFS_I(inode)); + key.type = ref_type; + if (key.type == BTRFS_INODE_REF_KEY) + key.offset = parent_id; + else + key.offset = btrfs_extref_hash(parent_id, name, namelen); + + ret = btrfs_search_slot(NULL, BTRFS_I(inode)->root, &key, path, 0, 0); + if (ret < 0) + goto out; + if (ret > 0) { + ret = 0; + goto out; + } + if (key.type == BTRFS_INODE_EXTREF_KEY) + ret = btrfs_find_name_in_ext_backref(path->nodes[0], + path->slots[0], parent_id, + name, namelen, NULL); + else + ret = btrfs_find_name_in_backref(path->nodes[0], path->slots[0], + name, namelen, NULL); + +out: + btrfs_free_path(path); + return ret; +} + /* * replay one inode back reference item found in the log tree. * eb, slot and key refer to the buffer and key found in the log tree. @@ -1400,6 +1440,32 @@ static noinline int add_inode_ref(struct } }
+ /* + * If a reference item already exists for this inode + * with the same parent and name, but different index, + * drop it and the corresponding directory index entries + * from the parent before adding the new reference item + * and dir index entries, otherwise we would fail with + * -EEXIST returned from btrfs_add_link() below. + */ + ret = btrfs_inode_ref_exists(inode, dir, key->type, + name, namelen); + if (ret > 0) { + ret = btrfs_unlink_inode(trans, root, + BTRFS_I(dir), + BTRFS_I(inode), + name, namelen); + /* + * If we dropped the link count to 0, bump it so + * that later the iput() on the inode will not + * free it. We will fixup the link count later. + */ + if (!ret && inode->i_nlink == 0) + inc_nlink(inode); + } + if (ret < 0) + goto out; + /* insert our name */ ret = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik jbacik@fb.com
commit 3c4276936f6fbe52884b4ea4e6cc120b890a0f9f upstream.
We recently ran into the following deadlock involving btrfs_write_inode():
[ +0.005066] __schedule+0x38e/0x8c0 [ +0.007144] schedule+0x36/0x80 [ +0.006447] bit_wait+0x11/0x60 [ +0.006446] __wait_on_bit+0xbe/0x110 [ +0.007487] ? bit_wait_io+0x60/0x60 [ +0.007319] __inode_wait_for_writeback+0x96/0xc0 [ +0.009568] ? autoremove_wake_function+0x40/0x40 [ +0.009565] inode_wait_for_writeback+0x21/0x30 [ +0.009224] evict+0xb0/0x190 [ +0.006099] iput+0x1a8/0x210 [ +0.006103] btrfs_run_delayed_iputs+0x73/0xc0 [ +0.009047] btrfs_commit_transaction+0x799/0x8c0 [ +0.009567] btrfs_write_inode+0x81/0xb0 [ +0.008008] __writeback_single_inode+0x267/0x320 [ +0.009569] writeback_sb_inodes+0x25b/0x4e0 [ +0.008702] wb_writeback+0x102/0x2d0 [ +0.007487] wb_workfn+0xa4/0x310 [ +0.006794] ? wb_workfn+0xa4/0x310 [ +0.007143] process_one_work+0x150/0x410 [ +0.008179] worker_thread+0x6d/0x520 [ +0.007490] kthread+0x12c/0x160 [ +0.006620] ? put_pwq_unlocked+0x80/0x80 [ +0.008185] ? kthread_park+0xa0/0xa0 [ +0.007484] ? do_syscall_64+0x53/0x150 [ +0.007837] ret_from_fork+0x29/0x40
Writeback calls:
btrfs_write_inode btrfs_commit_transaction btrfs_run_delayed_iputs
If iput() is called on that same inode, evict() will wait for writeback forever.
btrfs_write_inode() was originally added way back in 4730a4bc5bf3 ("btrfs_dirty_inode") to support O_SYNC writes. However, ->write_inode() hasn't been used for O_SYNC since 148f948ba877 ("vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode"), so btrfs_write_inode() is actually unnecessary (and leads to a bunch of unnecessary commits). Get rid of it, which also gets rid of the deadlock.
CC: stable@vger.kernel.org # 3.2+ Signed-off-by: Josef Bacik jbacik@fb.com [Omar: new commit message] Signed-off-by: Omar Sandoval osandov@fb.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/inode.c | 26 -------------------------- fs/btrfs/super.c | 1 - 2 files changed, 27 deletions(-)
--- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -6027,32 +6027,6 @@ err: return ret; }
-int btrfs_write_inode(struct inode *inode, struct writeback_control *wbc) -{ - struct btrfs_root *root = BTRFS_I(inode)->root; - struct btrfs_trans_handle *trans; - int ret = 0; - bool nolock = false; - - if (test_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags)) - return 0; - - if (btrfs_fs_closing(root->fs_info) && - btrfs_is_free_space_inode(BTRFS_I(inode))) - nolock = true; - - if (wbc->sync_mode == WB_SYNC_ALL) { - if (nolock) - trans = btrfs_join_transaction_nolock(root); - else - trans = btrfs_join_transaction(root); - if (IS_ERR(trans)) - return PTR_ERR(trans); - ret = btrfs_commit_transaction(trans); - } - return ret; -} - /* * This is somewhat expensive, updating the tree every time the * inode changes. But, it is most likely to find the inode in cache. --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -2331,7 +2331,6 @@ static const struct super_operations btr .sync_fs = btrfs_sync_fs, .show_options = btrfs_show_options, .show_devname = btrfs_show_devname, - .write_inode = btrfs_write_inode, .alloc_inode = btrfs_alloc_inode, .destroy_inode = btrfs_destroy_inode, .statfs = btrfs_statfs,
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit 46b2f4590aab71d31088a265c86026b1e96c9de4 upstream.
The more common use case of send involves creating a RO snapshot and then use it for a send operation. In this case it's not possible to have inodes in the snapshot that have a link count of zero (inode with an orphan item) since during snapshot creation we do the orphan cleanup. However, other less common use cases for send can end up seeing inodes with a link count of zero and in this case the send operation fails with a ENOENT error because any attempt to generate a path for the inode, with the purpose of creating it or updating it at the receiver, fails since there are no inode reference items. One use case it to use a regular subvolume for a send operation after turning it to RO mode or turning a RW snapshot into RO mode and then using it for a send operation. In both cases, if a file gets all its hard links deleted while there is an open file descriptor before turning the subvolume/snapshot into RO mode, the send operation will encounter an inode with a link count of zero and then fail with errno ENOENT.
Example using a full send with a subvolume:
$ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt
$ btrfs subvolume create /mnt/sv1 $ touch /mnt/sv1/foo $ touch /mnt/sv1/bar
# keep an open file descriptor on file bar $ exec 73</mnt/sv1/bar $ unlink /mnt/sv1/bar
# Turn the subvolume to RO mode and use it for a full send, while # holding the open file descriptor. $ btrfs property set /mnt/sv1 ro true
$ btrfs send -f /tmp/full.send /mnt/sv1 At subvol /mnt/sv1 ERROR: send ioctl failed with -2: No such file or directory
Example using an incremental send with snapshots:
$ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt
$ btrfs subvolume create /mnt/sv1 $ touch /mnt/sv1/foo $ touch /mnt/sv1/bar
$ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap1
$ echo "hello world" >> /mnt/sv1/bar
$ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap2
# Turn the second snapshot to RW mode and delete file foo while # holding an open file descriptor on it. $ btrfs property set /mnt/snap2 ro false $ exec 73</mnt/snap2/foo $ unlink /mnt/snap2/foo
# Set the second snapshot back to RO mode and do an incremental send. $ btrfs property set /mnt/snap2 ro true
$ btrfs send -f /tmp/inc.send -p /mnt/snap1 /mnt/snap2 At subvol /mnt/snap2 ERROR: send ioctl failed with -2: No such file or directory
So fix this by ignoring inodes with a link count of zero if we are either doing a full send or if they do not exist in the parent snapshot (they are new in the send snapshot), and unlink all paths found in the parent snapshot when doing an incremental send (and ignoring all other inode items, such as xattrs and extents).
A test case for fstests follows soon.
CC: stable@vger.kernel.org # 4.4+ Reported-by: Martin Wilck martin.wilck@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/send.c | 137 ++++++++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 129 insertions(+), 8 deletions(-)
--- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -100,6 +100,7 @@ struct send_ctx { u64 cur_inode_rdev; u64 cur_inode_last_extent; u64 cur_inode_next_write_offset; + bool ignore_cur_inode;
u64 send_progress;
@@ -5799,6 +5800,9 @@ static int finish_inode_if_needed(struct int pending_move = 0; int refs_processed = 0;
+ if (sctx->ignore_cur_inode) + return 0; + ret = process_recorded_refs_if_needed(sctx, at_end, &pending_move, &refs_processed); if (ret < 0) @@ -5917,6 +5921,93 @@ out: return ret; }
+struct parent_paths_ctx { + struct list_head *refs; + struct send_ctx *sctx; +}; + +static int record_parent_ref(int num, u64 dir, int index, struct fs_path *name, + void *ctx) +{ + struct parent_paths_ctx *ppctx = ctx; + + return record_ref(ppctx->sctx->parent_root, dir, name, ppctx->sctx, + ppctx->refs); +} + +/* + * Issue unlink operations for all paths of the current inode found in the + * parent snapshot. + */ +static int btrfs_unlink_all_paths(struct send_ctx *sctx) +{ + LIST_HEAD(deleted_refs); + struct btrfs_path *path; + struct btrfs_key key; + struct parent_paths_ctx ctx; + int ret; + + path = alloc_path_for_send(); + if (!path) + return -ENOMEM; + + key.objectid = sctx->cur_ino; + key.type = BTRFS_INODE_REF_KEY; + key.offset = 0; + ret = btrfs_search_slot(NULL, sctx->parent_root, &key, path, 0, 0); + if (ret < 0) + goto out; + + ctx.refs = &deleted_refs; + ctx.sctx = sctx; + + while (true) { + struct extent_buffer *eb = path->nodes[0]; + int slot = path->slots[0]; + + if (slot >= btrfs_header_nritems(eb)) { + ret = btrfs_next_leaf(sctx->parent_root, path); + if (ret < 0) + goto out; + else if (ret > 0) + break; + continue; + } + + btrfs_item_key_to_cpu(eb, &key, slot); + if (key.objectid != sctx->cur_ino) + break; + if (key.type != BTRFS_INODE_REF_KEY && + key.type != BTRFS_INODE_EXTREF_KEY) + break; + + ret = iterate_inode_ref(sctx->parent_root, path, &key, 1, + record_parent_ref, &ctx); + if (ret < 0) + goto out; + + path->slots[0]++; + } + + while (!list_empty(&deleted_refs)) { + struct recorded_ref *ref; + + ref = list_first_entry(&deleted_refs, struct recorded_ref, list); + ret = send_unlink(sctx, ref->full_path); + if (ret < 0) + goto out; + fs_path_free(ref->full_path); + list_del(&ref->list); + kfree(ref); + } + ret = 0; +out: + btrfs_free_path(path); + if (ret) + __free_recorded_refs(&deleted_refs); + return ret; +} + static int changed_inode(struct send_ctx *sctx, enum btrfs_compare_tree_result result) { @@ -5931,6 +6022,7 @@ static int changed_inode(struct send_ctx sctx->cur_inode_new_gen = 0; sctx->cur_inode_last_extent = (u64)-1; sctx->cur_inode_next_write_offset = 0; + sctx->ignore_cur_inode = false;
/* * Set send_progress to current inode. This will tell all get_cur_xxx @@ -5971,6 +6063,33 @@ static int changed_inode(struct send_ctx sctx->cur_inode_new_gen = 1; }
+ /* + * Normally we do not find inodes with a link count of zero (orphans) + * because the most common case is to create a snapshot and use it + * for a send operation. However other less common use cases involve + * using a subvolume and send it after turning it to RO mode just + * after deleting all hard links of a file while holding an open + * file descriptor against it or turning a RO snapshot into RW mode, + * keep an open file descriptor against a file, delete it and then + * turn the snapshot back to RO mode before using it for a send + * operation. So if we find such cases, ignore the inode and all its + * items completely if it's a new inode, or if it's a changed inode + * make sure all its previous paths (from the parent snapshot) are all + * unlinked and all other the inode items are ignored. + */ + if (result == BTRFS_COMPARE_TREE_NEW || + result == BTRFS_COMPARE_TREE_CHANGED) { + u32 nlinks; + + nlinks = btrfs_inode_nlink(sctx->left_path->nodes[0], left_ii); + if (nlinks == 0) { + sctx->ignore_cur_inode = true; + if (result == BTRFS_COMPARE_TREE_CHANGED) + ret = btrfs_unlink_all_paths(sctx); + goto out; + } + } + if (result == BTRFS_COMPARE_TREE_NEW) { sctx->cur_inode_gen = left_gen; sctx->cur_inode_new = 1; @@ -6309,15 +6428,17 @@ static int changed_cb(struct btrfs_path key->objectid == BTRFS_FREE_SPACE_OBJECTID) goto out;
- if (key->type == BTRFS_INODE_ITEM_KEY) + if (key->type == BTRFS_INODE_ITEM_KEY) { ret = changed_inode(sctx, result); - else if (key->type == BTRFS_INODE_REF_KEY || - key->type == BTRFS_INODE_EXTREF_KEY) - ret = changed_ref(sctx, result); - else if (key->type == BTRFS_XATTR_ITEM_KEY) - ret = changed_xattr(sctx, result); - else if (key->type == BTRFS_EXTENT_DATA_KEY) - ret = changed_extent(sctx, result); + } else if (!sctx->ignore_cur_inode) { + if (key->type == BTRFS_INODE_REF_KEY || + key->type == BTRFS_INODE_EXTREF_KEY) + ret = changed_ref(sctx, result); + else if (key->type == BTRFS_XATTR_ITEM_KEY) + ret = changed_xattr(sctx, result); + else if (key->type == BTRFS_EXTENT_DATA_KEY) + ret = changed_extent(sctx, result); + }
out: return ret;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit 22d3151c2c4cb517a309154d1e828a28106508c7 upstream.
When doing an incremental send, if we have a file in the parent snapshot that has prealloc extents beyond EOF and in the send snapshot it got a hole punch that partially covers the prealloc extents, the send stream, when replayed by a receiver, can result in a file that has a size bigger than it should and filled with zeroes past the correct EOF.
For example:
$ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt
$ xfs_io -f -c "falloc -k 0 4M" /mnt/foobar $ xfs_io -c "pwrite -S 0xea 0 1M" /mnt/foobar
$ btrfs subvolume snapshot -r /mnt /mnt/snap1 $ btrfs send -f /tmp/1.send /mnt/snap1
$ xfs_io -c "fpunch 1M 2M" /mnt/foobar
$ btrfs subvolume snapshot -r /mnt /mnt/snap2 $ btrfs send -f /tmp/2.send -p /mnt/snap1 /mnt/snap2
$ stat --format %s /mnt/snap2/foobar 1048576 $ md5sum /mnt/snap2/foobar d31659e82e87798acd4669a1e0a19d4f /mnt/snap2/foobar
$ umount /mnt $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt
$ btrfs receive -f /mnt/1.snap /mnt $ btrfs receive -f /mnt/2.snap /mnt
$ stat --format %s /mnt/snap2/foobar 3145728 # --> should be 1Mb and not 3Mb (which was the end offset of hole # punch operation) $ md5sum /mnt/snap2/foobar 117baf295297c2a995f92da725b0b651 /mnt/snap2/foobar # --> should be d31659e82e87798acd4669a1e0a19d4f as in the original fs
This issue actually happens only since commit ffa7c4296e93 ("Btrfs: send, do not issue unnecessary truncate operations"), but before that commit we were issuing a write operation full of zeroes (to "punch" a hole) which was extending the file size beyond the correct value and then immediately issue a truncate operation to the correct size and undoing the previous write operation. Since the send protocol does not support fallocate, for extent preallocation and hole punching, fix this by not even attempting to send a "hole" (regular write full of zeroes) if it starts at an offset greater then or equals to the file's size. This approach, besides being much more simple then making send issue the truncate operation, adds the benefit of avoiding the useless pair of write of zeroes and truncate operations, saving time and IO at the receiver and reducing the size of the send stream.
A test case for fstests follows soon.
Fixes: ffa7c4296e93 ("Btrfs: send, do not issue unnecessary truncate operations") CC: stable@vger.kernel.org # 4.17+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/send.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -5007,6 +5007,15 @@ static int send_hole(struct send_ctx *sc u64 len; int ret = 0;
+ /* + * A hole that starts at EOF or beyond it. Since we do not yet support + * fallocate (for extent preallocation and hole punching), sending a + * write of zeroes starting at EOF or beyond would later require issuing + * a truncate operation which would undo the write and achieve nothing. + */ + if (offset >= sctx->cur_inode_size) + return 0; + if (sctx->flags & BTRFS_SEND_FLAG_NO_FILE_DATA) return send_update_extent(sctx, offset, end - offset);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Larabel michael@phoronix.com
commit 484a84f25ca7817c3662001316ba7d1e06b74ae2 upstream.
For at least the Threadripper 2950X and Threadripper 2990WX, it's confirmed a 27 degree offset is needed.
Signed-off-by: Michael Larabel michael@phoronix.com Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwmon/k10temp.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/hwmon/k10temp.c +++ b/drivers/hwmon/k10temp.c @@ -105,6 +105,8 @@ static const struct tctl_offset tctl_off { 0x17, "AMD Ryzen Threadripper 1950", 10000 }, { 0x17, "AMD Ryzen Threadripper 1920", 10000 }, { 0x17, "AMD Ryzen Threadripper 1910", 10000 }, + { 0x17, "AMD Ryzen Threadripper 2950X", 27000 }, + { 0x17, "AMD Ryzen Threadripper 2990WX", 27000 }, };
static void read_htcreg_pci(struct pci_dev *pdev, u32 *regval)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Borkmann daniel@iogearbox.net
Commit 38ca93060163 ("bpf, arm32: save 4 bytes of unneeded stack space") messed up STACK_VAR() by 4 bytes presuming it was related to skb scratch buffer space, but it clearly isn't as this refers to the top word in stack, therefore restore it. This fixes a NULL pointer dereference seen during bootup when JIT is enabled and BPF program run in sk_filter_trim_cap() triggered by systemd-udevd.
JIT rework in 1c35ba122d4a ("ARM: net: bpf: use negative numbers for stacked registers") and 96cced4e774a ("ARM: net: bpf: access eBPF scratch space using ARM FP register") removed the affected parts, so only needed in 4.18 stable.
Fixes: 38ca93060163 ("bpf, arm32: save 4 bytes of unneeded stack space") Reported-by: Peter Robinson pbrobinson@gmail.com Reported-by: Marc Haber mh+netdev@zugschlus.de Tested-by: Stefan Wahren stefan.wahren@i2se.com Tested-by: Peter Robinson pbrobinson@gmail.com Cc: Russell King rmk+kernel@armlinux.org.uk Cc: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Alexei Starovoitov ast@kernel.org --- arch/arm/net/bpf_jit_32.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/net/bpf_jit_32.c +++ b/arch/arm/net/bpf_jit_32.c @@ -238,7 +238,7 @@ static void jit_fill_hole(void *area, un #define STACK_SIZE ALIGN(_STACK_SIZE, STACK_ALIGNMENT)
/* Get the offset of eBPF REGISTERs stored on scratch space. */ -#define STACK_VAR(off) (STACK_SIZE - off) +#define STACK_VAR(off) (STACK_SIZE - off - 4)
#if __LINUX_ARM_ARCH__ < 7
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Charles Keepax ckeepax@opensource.cirrus.com
commit a9191579ba1086d91842199263e6fe6bb5eec1ba upstream.
Currently the enable GPIO is being looked up on the regulator device itself but that does not have its own DT node, this causes the lookup to fail and the regulator not to get its GPIO. The DT node is shared across the whole MFD and as such the lookup needs to happen on that parent device. Moving the lookup to the parent device also means devres can no longer be used as the life time would attach to the wrong device.
Additionally, the enable GPIO is active high so we should be passing GPIOD_OUT_LOW to ensure the regulator starts in its off state allowing the driver to enable it when it is ready.
Fixes: e1739e86f0cb ("regulator: arizona-ldo1: Look up a descriptor and pass to the core") Reported-by: Matthias Reichl hias@horus.com Signed-off-by: Charles Keepax ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org From: Matthias Reichl hias@horus.com
--- drivers/regulator/arizona-ldo1.c | 27 ++++++++++++++++++++++++--- 1 file changed, 24 insertions(+), 3 deletions(-)
--- a/drivers/regulator/arizona-ldo1.c +++ b/drivers/regulator/arizona-ldo1.c @@ -36,6 +36,8 @@ struct arizona_ldo1 {
struct regulator_consumer_supply supply; struct regulator_init_data init_data; + + struct gpio_desc *ena_gpiod; };
static int arizona_ldo1_hc_list_voltage(struct regulator_dev *rdev, @@ -253,12 +255,17 @@ static int arizona_ldo1_common_init(stru } }
- /* We assume that high output = regulator off */ - config.ena_gpiod = devm_gpiod_get_optional(&pdev->dev, "wlf,ldoena", - GPIOD_OUT_HIGH); + /* We assume that high output = regulator off + * Don't use devm, since we need to get against the parent device + * so clean up would happen at the wrong time + */ + config.ena_gpiod = gpiod_get_optional(parent_dev, "wlf,ldoena", + GPIOD_OUT_LOW); if (IS_ERR(config.ena_gpiod)) return PTR_ERR(config.ena_gpiod);
+ ldo1->ena_gpiod = config.ena_gpiod; + if (pdata->init_data) config.init_data = pdata->init_data; else @@ -276,6 +283,9 @@ static int arizona_ldo1_common_init(stru of_node_put(config.of_node);
if (IS_ERR(ldo1->regulator)) { + if (config.ena_gpiod) + gpiod_put(config.ena_gpiod); + ret = PTR_ERR(ldo1->regulator); dev_err(&pdev->dev, "Failed to register LDO1 supply: %d\n", ret); @@ -334,8 +344,19 @@ static int arizona_ldo1_probe(struct pla return ret; }
+static int arizona_ldo1_remove(struct platform_device *pdev) +{ + struct arizona_ldo1 *ldo1 = platform_get_drvdata(pdev); + + if (ldo1->ena_gpiod) + gpiod_put(ldo1->ena_gpiod); + + return 0; +} + static struct platform_driver arizona_ldo1_driver = { .probe = arizona_ldo1_probe, + .remove = arizona_ldo1_remove, .driver = { .name = "arizona-ldo1", },
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vivek Gautam vivek.gautam@codeaurora.org
commit d1e20222d5372e951bbb2fd3f6489ec4a6ea9b11 upstream.
Currently we check if the number of context banks is not equal to num_context_interrupts. However, there are booloaders such as, one on sdm845 that reserves few context banks and thus kernel views less than the total available context banks. So, although the hardware definition in device tree would mention the correct number of context interrupts, this number can be greater than the number of context banks visible to smmu in kernel. We should therefore error out only when the number of context banks is greater than the available number of context interrupts.
Signed-off-by: Vivek Gautam vivek.gautam@codeaurora.org Suggested-by: Tomasz Figa tfiga@chromium.org Cc: Robin Murphy robin.murphy@arm.com Cc: Will Deacon will.deacon@arm.com [will: drop useless printk] Signed-off-by: Will Deacon will.deacon@arm.com Cc: Jitendra Bhivare jitendra.bhivare@broadcom.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iommu/arm-smmu.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
--- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -2103,12 +2103,16 @@ static int arm_smmu_device_probe(struct if (err) return err;
- if (smmu->version == ARM_SMMU_V2 && - smmu->num_context_banks != smmu->num_context_irqs) { - dev_err(dev, - "found only %d context interrupt(s) but %d required\n", - smmu->num_context_irqs, smmu->num_context_banks); - return -ENODEV; + if (smmu->version == ARM_SMMU_V2) { + if (smmu->num_context_banks > smmu->num_context_irqs) { + dev_err(dev, + "found only %d context irq(s) but %d required\n", + smmu->num_context_irqs, smmu->num_context_banks); + return -ENODEV; + } + + /* Ignore superfluous interrupts */ + smmu->num_context_irqs = smmu->num_context_banks; }
for (i = 0; i < smmu->num_global_irqs; ++i) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Petr Mladek pmladek@suse.com
commit ba552399954dde1b388f7749fecad5c349216981 upstream.
It is just a preparation step. The patch does not change the existing behavior.
Link: http://lkml.kernel.org/r/20180627140817.27764-2-pmladek@suse.com To: Steven Rostedt rostedt@goodmis.org Cc: Peter Zijlstra peterz@infradead.org Cc: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Cc: Sergey Senozhatsky sergey.senozhatsky.work@gmail.com Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Acked-by: Sergey Senozhatsky sergey.senozhatsky@gmail.com Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/printk/printk.c | 43 ++++++++++++++++++++++++++----------------- 1 file changed, 26 insertions(+), 17 deletions(-)
--- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1824,28 +1824,16 @@ static size_t log_output(int facility, i return log_store(facility, level, lflags, 0, dict, dictlen, text, text_len); }
-asmlinkage int vprintk_emit(int facility, int level, - const char *dict, size_t dictlen, - const char *fmt, va_list args) +/* Must be called under logbuf_lock. */ +int vprintk_store(int facility, int level, + const char *dict, size_t dictlen, + const char *fmt, va_list args) { static char textbuf[LOG_LINE_MAX]; char *text = textbuf; size_t text_len; enum log_flags lflags = 0; - unsigned long flags; - int printed_len; - bool in_sched = false; - - if (level == LOGLEVEL_SCHED) { - level = LOGLEVEL_DEFAULT; - in_sched = true; - } - - boot_delay_msec(level); - printk_delay();
- /* This stops the holder of console_sem just where we want him */ - logbuf_lock_irqsave(flags); /* * The printf needs to come first; we need the syslog * prefix which might be passed-in as a parameter. @@ -1886,8 +1874,29 @@ asmlinkage int vprintk_emit(int facility if (dict) lflags |= LOG_PREFIX|LOG_NEWLINE;
- printed_len = log_output(facility, level, lflags, dict, dictlen, text, text_len); + return log_output(facility, level, lflags, + dict, dictlen, text, text_len); +}
+asmlinkage int vprintk_emit(int facility, int level, + const char *dict, size_t dictlen, + const char *fmt, va_list args) +{ + int printed_len; + bool in_sched = false; + unsigned long flags; + + if (level == LOGLEVEL_SCHED) { + level = LOGLEVEL_DEFAULT; + in_sched = true; + } + + boot_delay_msec(level); + printk_delay(); + + /* This stops the holder of console_sem just where we want him */ + logbuf_lock_irqsave(flags); + printed_len = vprintk_store(facility, level, dict, dictlen, fmt, args); logbuf_unlock_irqrestore(flags);
/* If called from the scheduler, we can not call up(). */
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Petr Mladek pmladek@suse.com
commit a338f84dc196f44b63ba0863d2f34fd9b1613572 upstream.
It is just a preparation step. The patch does not change the existing behavior.
Link: http://lkml.kernel.org/r/20180627140817.27764-3-pmladek@suse.com To: Steven Rostedt rostedt@goodmis.org Cc: Peter Zijlstra peterz@infradead.org Cc: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Cc: Sergey Senozhatsky sergey.senozhatsky.work@gmail.com Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Acked-by: Sergey Senozhatsky sergey.senozhatsky@gmail.com Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/printk/printk.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
--- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2887,16 +2887,20 @@ void wake_up_klogd(void) preempt_enable(); }
-int vprintk_deferred(const char *fmt, va_list args) +void defer_console_output(void) { - int r; - - r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, 0, fmt, args); - preempt_disable(); __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT); irq_work_queue(this_cpu_ptr(&wake_up_klogd_work)); preempt_enable(); +} + +int vprintk_deferred(const char *fmt, va_list args) +{ + int r; + + r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, 0, fmt, args); + defer_console_output();
return r; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Petr Mladek pmladek@suse.com
commit 03fc7f9c99c1e7ae2925d459e8487f1a6f199f79 upstream.
The commit 719f6a7040f1bdaf96 ("printk: Use the main logbuf in NMI when logbuf_lock is available") brought back the possible deadlocks in printk() and NMI.
The check of logbuf_lock is done only in printk_nmi_enter() to prevent mixed output. But another CPU might take the lock later, enter NMI, and:
+ Both NMIs might be serialized by yet another lock, for example, the one in nmi_cpu_backtrace().
+ The other CPU might get stopped in NMI, see smp_send_stop() in panic().
The only safe solution is to use trylock when storing the message into the main log-buffer. It might cause reordering when some lines go to the main lock buffer directly and others are delayed via the per-CPU buffer. It means that it is not useful in general.
This patch replaces the problematic NMI deferred context with NMI direct context. It can be used to mark a code that might produce many messages in NMI and the risk of losing them is more critical than problems with eventual reordering.
The context is then used when dumping trace buffers on oops. It was the primary motivation for the original fix. Also the reordering is even smaller issue there because some traces have their own time stamps.
Finally, nmi_cpu_backtrace() need not longer be serialized because it will always us the per-CPU buffers again.
Fixes: 719f6a7040f1bdaf96 ("printk: Use the main logbuf in NMI when logbuf_lock is available") Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180627142028.11259-1-pmladek@suse.com To: Steven Rostedt rostedt@goodmis.org Cc: Peter Zijlstra peterz@infradead.org Cc: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Cc: Sergey Senozhatsky sergey.senozhatsky.work@gmail.com Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Acked-by: Sergey Senozhatsky sergey.senozhatsky@gmail.com Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/printk.h | 4 +++ kernel/printk/internal.h | 9 ++++++ kernel/printk/printk_safe.c | 58 ++++++++++++++++++++++++++++---------------- kernel/trace/trace.c | 4 ++- lib/nmi_backtrace.c | 3 -- 5 files changed, 52 insertions(+), 26 deletions(-)
--- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -148,9 +148,13 @@ void early_printk(const char *s, ...) { #ifdef CONFIG_PRINTK_NMI extern void printk_nmi_enter(void); extern void printk_nmi_exit(void); +extern void printk_nmi_direct_enter(void); +extern void printk_nmi_direct_exit(void); #else static inline void printk_nmi_enter(void) { } static inline void printk_nmi_exit(void) { } +static inline void printk_nmi_direct_enter(void) { } +static inline void printk_nmi_direct_exit(void) { } #endif /* PRINTK_NMI */
#ifdef CONFIG_PRINTK --- a/kernel/printk/internal.h +++ b/kernel/printk/internal.h @@ -19,11 +19,16 @@ #ifdef CONFIG_PRINTK
#define PRINTK_SAFE_CONTEXT_MASK 0x3fffffff -#define PRINTK_NMI_DEFERRED_CONTEXT_MASK 0x40000000 +#define PRINTK_NMI_DIRECT_CONTEXT_MASK 0x40000000 #define PRINTK_NMI_CONTEXT_MASK 0x80000000
extern raw_spinlock_t logbuf_lock;
+__printf(5, 0) +int vprintk_store(int facility, int level, + const char *dict, size_t dictlen, + const char *fmt, va_list args); + __printf(1, 0) int vprintk_default(const char *fmt, va_list args); __printf(1, 0) int vprintk_deferred(const char *fmt, va_list args); __printf(1, 0) int vprintk_func(const char *fmt, va_list args); @@ -54,6 +59,8 @@ void __printk_safe_exit(void); local_irq_enable(); \ } while (0)
+void defer_console_output(void); + #else
__printf(1, 0) int vprintk_func(const char *fmt, va_list args) { return 0; } --- a/kernel/printk/printk_safe.c +++ b/kernel/printk/printk_safe.c @@ -308,24 +308,33 @@ static __printf(1, 0) int vprintk_nmi(co
void printk_nmi_enter(void) { - /* - * The size of the extra per-CPU buffer is limited. Use it only when - * the main one is locked. If this CPU is not in the safe context, - * the lock must be taken on another CPU and we could wait for it. - */ - if ((this_cpu_read(printk_context) & PRINTK_SAFE_CONTEXT_MASK) && - raw_spin_is_locked(&logbuf_lock)) { - this_cpu_or(printk_context, PRINTK_NMI_CONTEXT_MASK); - } else { - this_cpu_or(printk_context, PRINTK_NMI_DEFERRED_CONTEXT_MASK); - } + this_cpu_or(printk_context, PRINTK_NMI_CONTEXT_MASK); }
void printk_nmi_exit(void) { - this_cpu_and(printk_context, - ~(PRINTK_NMI_CONTEXT_MASK | - PRINTK_NMI_DEFERRED_CONTEXT_MASK)); + this_cpu_and(printk_context, ~PRINTK_NMI_CONTEXT_MASK); +} + +/* + * Marks a code that might produce many messages in NMI context + * and the risk of losing them is more critical than eventual + * reordering. + * + * It has effect only when called in NMI context. Then printk() + * will try to store the messages into the main logbuf directly + * and use the per-CPU buffers only as a fallback when the lock + * is not available. + */ +void printk_nmi_direct_enter(void) +{ + if (this_cpu_read(printk_context) & PRINTK_NMI_CONTEXT_MASK) + this_cpu_or(printk_context, PRINTK_NMI_DIRECT_CONTEXT_MASK); +} + +void printk_nmi_direct_exit(void) +{ + this_cpu_and(printk_context, ~PRINTK_NMI_DIRECT_CONTEXT_MASK); }
#else @@ -363,6 +372,20 @@ void __printk_safe_exit(void)
__printf(1, 0) int vprintk_func(const char *fmt, va_list args) { + /* + * Try to use the main logbuf even in NMI. But avoid calling console + * drivers that might have their own locks. + */ + if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK) && + raw_spin_trylock(&logbuf_lock)) { + int len; + + len = vprintk_store(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args); + raw_spin_unlock(&logbuf_lock); + defer_console_output(); + return len; + } + /* Use extra buffer in NMI when logbuf_lock is taken or in safe mode. */ if (this_cpu_read(printk_context) & PRINTK_NMI_CONTEXT_MASK) return vprintk_nmi(fmt, args); @@ -371,13 +394,6 @@ __printf(1, 0) int vprintk_func(const ch if (this_cpu_read(printk_context) & PRINTK_SAFE_CONTEXT_MASK) return vprintk_safe(fmt, args);
- /* - * Use the main logbuf when logbuf_lock is available in NMI. - * But avoid calling console drivers that might have their own locks. - */ - if (this_cpu_read(printk_context) & PRINTK_NMI_DEFERRED_CONTEXT_MASK) - return vprintk_deferred(fmt, args); - /* No obstacles. */ return vprintk_default(fmt, args); } --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -8288,6 +8288,7 @@ void ftrace_dump(enum ftrace_dump_mode o tracing_off();
local_irq_save(flags); + printk_nmi_direct_enter();
/* Simulate the iterator */ trace_init_global_iter(&iter); @@ -8367,7 +8368,8 @@ void ftrace_dump(enum ftrace_dump_mode o for_each_tracing_cpu(cpu) { atomic_dec(&per_cpu_ptr(iter.trace_buffer->data, cpu)->disabled); } - atomic_dec(&dump_running); + atomic_dec(&dump_running); + printk_nmi_direct_exit(); local_irq_restore(flags); } EXPORT_SYMBOL_GPL(ftrace_dump); --- a/lib/nmi_backtrace.c +++ b/lib/nmi_backtrace.c @@ -87,11 +87,9 @@ void nmi_trigger_cpumask_backtrace(const
bool nmi_cpu_backtrace(struct pt_regs *regs) { - static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED; int cpu = smp_processor_id();
if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) { - arch_spin_lock(&lock); if (regs && cpu_in_idle(instruction_pointer(regs))) { pr_warn("NMI backtrace for cpu %d skipped: idling at %pS\n", cpu, (void *)instruction_pointer(regs)); @@ -102,7 +100,6 @@ bool nmi_cpu_backtrace(struct pt_regs *r else dump_stack(); } - arch_spin_unlock(&lock); cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); return true; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu mhiramat@kernel.org
commit 0722867dcbc28cc9b269b57acd847c7c1aa638d6 upstream.
Fix %p uses in error messages by removing it because those are redundant or meaningless.
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Acked-by: Will Deacon will.deacon@arm.com Cc: Ananth N Mavinakayanahalli ananth@in.ibm.com Cc: Anil S Keshavamurthy anil.s.keshavamurthy@intel.com Cc: Arnd Bergmann arnd@arndb.de Cc: David Howells dhowells@redhat.com Cc: David S . Miller davem@davemloft.net Cc: Heiko Carstens heiko.carstens@de.ibm.com Cc: Jon Medhurst tixy@linaro.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Thomas Richter tmricht@linux.ibm.com Cc: Tobin C . Harding me@tobin.cc Cc: acme@kernel.org Cc: akpm@linux-foundation.org Cc: brueckner@linux.vnet.ibm.com Cc: linux-arch@vger.kernel.org Cc: rostedt@goodmis.org Cc: schwidefsky@de.ibm.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/lkml/152491908405.9916.12425053035317241111.stgit@de... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/kernel/probes/kprobes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/kernel/probes/kprobes.c +++ b/arch/arm64/kernel/probes/kprobes.c @@ -275,7 +275,7 @@ static int __kprobes reenter_kprobe(stru break; case KPROBE_HIT_SS: case KPROBE_REENTER: - pr_warn("Unrecoverable kprobe detected at %p.\n", p->addr); + pr_warn("Unrecoverable kprobe detected.\n"); dump_kprobe(p); BUG(); break;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Suzuki K Poulose suzuki.poulose@arm.com
commit 4c4a39dd5fe2d13e2d2fa5fceb8ef95d19fc389a upstream.
If there is a mismatch in the I/D min line size, we must always use the system wide safe value both in applications and in the kernel, while performing cache operations. However, we have been checking more bits than just the min line sizes, which triggers false negatives. We may need to trap the user accesses in such cases, but not necessarily patch the kernel.
This patch fixes the check to do the right thing as advertised. A new capability will be added to check mismatches in other fields and ensure we trap the CTR accesses.
Fixes: be68a8aaf925 ("arm64: cpufeature: Fix CTR_EL0 field definitions") Cc: stable@vger.kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Reported-by: Will Deacon will.deacon@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/include/asm/cache.h | 4 ++++ arch/arm64/kernel/cpu_errata.c | 6 ++++-- arch/arm64/kernel/cpufeature.c | 2 +- 3 files changed, 9 insertions(+), 3 deletions(-)
--- a/arch/arm64/include/asm/cache.h +++ b/arch/arm64/include/asm/cache.h @@ -21,12 +21,16 @@ #define CTR_L1IP_SHIFT 14 #define CTR_L1IP_MASK 3 #define CTR_DMINLINE_SHIFT 16 +#define CTR_IMINLINE_SHIFT 0 #define CTR_ERG_SHIFT 20 #define CTR_CWG_SHIFT 24 #define CTR_CWG_MASK 15 #define CTR_IDC_SHIFT 28 #define CTR_DIC_SHIFT 29
+#define CTR_CACHE_MINLINE_MASK \ + (0xf << CTR_DMINLINE_SHIFT | 0xf << CTR_IMINLINE_SHIFT) + #define CTR_L1IP(ctr) (((ctr) >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK)
#define ICACHE_POLICY_VPIPT 0 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -68,9 +68,11 @@ static bool has_mismatched_cache_line_size(const struct arm64_cpu_capabilities *entry, int scope) { + u64 mask = CTR_CACHE_MINLINE_MASK; + WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible()); - return (read_cpuid_cachetype() & arm64_ftr_reg_ctrel0.strict_mask) != - (arm64_ftr_reg_ctrel0.sys_val & arm64_ftr_reg_ctrel0.strict_mask); + return (read_cpuid_cachetype() & mask) != + (arm64_ftr_reg_ctrel0.sys_val & mask); }
static void --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -214,7 +214,7 @@ static const struct arm64_ftr_bits ftr_c * If we have differing I-cache policies, report it as the weakest - VIPT. */ ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_EXACT, 14, 2, ICACHE_POLICY_VIPT), /* L1Ip */ - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, 0, 4, 0), /* IminLine */ + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_IMINLINE_SHIFT, 4, 0), ARM64_FTR_END, };
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Suzuki K Poulose suzuki.poulose@arm.com
commit 314d53d297980676011e6fd83dac60db4a01dc70 upstream.
Track mismatches in the cache type register (CTR_EL0), other than the D/I min line sizes and trap user accesses if there are any.
Fixes: be68a8aaf925 ("arm64: cpufeature: Fix CTR_EL0 field definitions") Cc: stable@vger.kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Will Deacon will.deacon@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/include/asm/cpucaps.h | 3 ++- arch/arm64/kernel/cpu_errata.c | 17 ++++++++++++++--- 2 files changed, 16 insertions(+), 4 deletions(-)
--- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -49,7 +49,8 @@ #define ARM64_HAS_CACHE_DIC 28 #define ARM64_HW_DBM 29 #define ARM64_SSBD 30 +#define ARM64_MISMATCHED_CACHE_TYPE 31
-#define ARM64_NCAPS 31 +#define ARM64_NCAPS 32
#endif /* __ASM_CPUCAPS_H */ --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -65,11 +65,15 @@ is_kryo_midr(const struct arm64_cpu_capa }
static bool -has_mismatched_cache_line_size(const struct arm64_cpu_capabilities *entry, - int scope) +has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry, + int scope) { u64 mask = CTR_CACHE_MINLINE_MASK;
+ /* Skip matching the min line sizes for cache type check */ + if (entry->capability == ARM64_MISMATCHED_CACHE_TYPE) + mask ^= arm64_ftr_reg_ctrel0.strict_mask; + WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible()); return (read_cpuid_cachetype() & mask) != (arm64_ftr_reg_ctrel0.sys_val & mask); @@ -615,7 +619,14 @@ const struct arm64_cpu_capabilities arm6 { .desc = "Mismatched cache line size", .capability = ARM64_MISMATCHED_CACHE_LINE_SIZE, - .matches = has_mismatched_cache_line_size, + .matches = has_mismatched_cache_type, + .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM, + .cpu_enable = cpu_enable_trap_ctr_access, + }, + { + .desc = "Mismatched cache type", + .capability = ARM64_MISMATCHED_CACHE_TYPE, + .matches = has_mismatched_cache_type, .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM, .cpu_enable = cpu_enable_trap_ctr_access, },
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Greg Hackmann ghackmann@android.com
commit 5ad356eabc47d26a92140a0c4b20eba471c10de3 upstream.
ARM64's pfn_valid() shifts away the upper PAGE_SHIFT bits of the input before seeing if the PFN is valid. This leads to false positives when some of the upper bits are set, but the lower bits match a valid PFN.
For example, the following userspace code looks up a bogus entry in /proc/kpageflags:
int pagemap = open("/proc/self/pagemap", O_RDONLY); int pageflags = open("/proc/kpageflags", O_RDONLY); uint64_t pfn, val;
lseek64(pagemap, [...], SEEK_SET); read(pagemap, &pfn, sizeof(pfn)); if (pfn & (1UL << 63)) { /* valid PFN */ pfn &= ((1UL << 55) - 1); /* clear flag bits */ pfn |= (1UL << 55); lseek64(pageflags, pfn * sizeof(uint64_t), SEEK_SET); read(pageflags, &val, sizeof(val)); }
On ARM64 this causes the userspace process to crash with SIGSEGV rather than reading (1 << KPF_NOPAGE). kpageflags_read() treats the offset as valid, and stable_page_flags() will try to access an address between the user and kernel address ranges.
Fixes: c1cc1552616d ("arm64: MMU initialisation") Cc: stable@vger.kernel.org Signed-off-by: Greg Hackmann ghackmann@google.com Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/mm/init.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -287,7 +287,11 @@ static void __init zone_sizes_init(unsig #ifdef CONFIG_HAVE_ARCH_PFN_VALID int pfn_valid(unsigned long pfn) { - return memblock_is_map_memory(pfn << PAGE_SHIFT); + phys_addr_t addr = pfn << PAGE_SHIFT; + + if ((addr >> PAGE_SHIFT) != pfn) + return 0; + return memblock_is_map_memory(addr); } EXPORT_SYMBOL(pfn_valid); #endif
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Huibin Hong huibin.hong@rock-chips.com
commit d0414fdd58eb51ffd6528280fd66705123663964 upstream.
Corrected the uart clock-names or the uart driver might fail.
Fixes: 52e02d377a72 ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs") Cc: stable@vger.kernel.org Signed-off-by: Huibin Hong huibin.hong@rock-chips.com Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/boot/dts/rockchip/rk3328.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/boot/dts/rockchip/rk3328.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3328.dtsi @@ -331,7 +331,7 @@ reg = <0x0 0xff120000 0x0 0x100>; interrupts = <GIC_SPI 56 IRQ_TYPE_LEVEL_HIGH>; clocks = <&cru SCLK_UART1>, <&cru PCLK_UART1>; - clock-names = "sclk_uart", "pclk_uart"; + clock-names = "baudclk", "apb_pclk"; dmas = <&dmac 4>, <&dmac 5>; dma-names = "tx", "rx"; pinctrl-names = "default";
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoffer Dall christoffer.dall@arm.com
commit 7afc4ddbf299a13aaf28406783d141a34c6b4f5a upstream.
kvm_timer_update_state() is called when changing the phys timer configuration registers, either via vcpu reset, as a result of a trap from the guest, or when userspace programs the registers.
phys_timer_emulate() is in turn called by kvm_timer_update_state() to either cancel an existing software timer, or program a new software timer, to emulate the behavior of a real phys timer, based on the change in configuration registers.
Unfortunately, the interaction between these two functions left a small race; if the conceptual emulated phys timer should actually fire, but the soft timer hasn't executed its callback yet, we cancel the timer in phys_timer_emulate without injecting an irq. This only happens if the check in kvm_timer_update_state is called before the timer should fire, which is relatively unlikely, but possible.
The solution is to update the state of the phys timer after calling phys_timer_emulate, which will pick up the pending timer state and update the interrupt value.
Note that this leaves the opportunity of raising the interrupt twice, once in the just-programmed soft timer, and once in kvm_timer_update_state. Since this always happens synchronously with the VCPU execution, there is no harm in this, and the guest ever only sees a single timer interrupt.
Cc: Stable stable@vger.kernel.org # 4.15+ Signed-off-by: Christoffer Dall christoffer.dall@arm.com Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- virt/kvm/arm/arch_timer.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/virt/kvm/arm/arch_timer.c +++ b/virt/kvm/arm/arch_timer.c @@ -295,9 +295,9 @@ static void phys_timer_emulate(struct kv struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
/* - * If the timer can fire now we have just raised the IRQ line and we - * don't need to have a soft timer scheduled for the future. If the - * timer cannot fire at all, then we also don't need a soft timer. + * If the timer can fire now, we don't need to have a soft timer + * scheduled for the future. If the timer cannot fire at all, + * then we also don't need a soft timer. */ if (kvm_timer_should_fire(ptimer) || !kvm_timer_irq_can_fire(ptimer)) { soft_timer_cancel(&timer->phys_timer, NULL); @@ -332,10 +332,10 @@ static void kvm_timer_update_state(struc level = kvm_timer_should_fire(vtimer); kvm_timer_update_irq(vcpu, level, vtimer);
+ phys_timer_emulate(vcpu); + if (kvm_timer_should_fire(ptimer) != ptimer->irq.level) kvm_timer_update_irq(vcpu, !ptimer->irq.level, ptimer); - - phys_timer_emulate(vcpu); }
static void vtimer_save_state(struct kvm_vcpu *vcpu)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoffer Dall christoffer.dall@arm.com
commit 245715cbe83ca934af5d20e078fd85175c62995e upstream.
When the VCPU is blocked (for example from WFI) we don't inject the physical timer interrupt if it should fire while the CPU is blocked, but instead we just wake up the VCPU and expect kvm_timer_vcpu_load to take care of injecting the interrupt.
Unfortunately, kvm_timer_vcpu_load() doesn't actually do that, it only has support to schedule a soft timer if the emulated phys timer is expected to fire in the future.
Follow the same pattern as kvm_timer_update_state() and update the irq state after potentially scheduling a soft timer.
Reported-by: Andre Przywara andre.przywara@arm.com Cc: Stable stable@vger.kernel.org # 4.15+ Fixes: bbdd52cfcba29 ("KVM: arm/arm64: Avoid phys timer emulation in vcpu entry/exit") Signed-off-by: Christoffer Dall christoffer.dall@arm.com Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- virt/kvm/arm/arch_timer.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/virt/kvm/arm/arch_timer.c +++ b/virt/kvm/arm/arch_timer.c @@ -487,6 +487,7 @@ void kvm_timer_vcpu_load(struct kvm_vcpu { struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; struct arch_timer_context *vtimer = vcpu_vtimer(vcpu); + struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
if (unlikely(!timer->enabled)) return; @@ -502,6 +503,10 @@ void kvm_timer_vcpu_load(struct kvm_vcpu
/* Set the background timer for the physical timer emulation. */ phys_timer_emulate(vcpu); + + /* If the timer fired while we weren't running, inject it now */ + if (kvm_timer_should_fire(ptimer) != ptimer->irq.level) + kvm_timer_update_irq(vcpu, !ptimer->irq.level, ptimer); }
bool kvm_timer_should_notify_user(struct kvm_vcpu *vcpu)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Punit Agrawal punit.agrawal@arm.com
commit 86658b819cd0a9aa584cd84453ed268a6f013770 upstream.
Contention on updating a PMD entry by a large number of vcpus can lead to duplicate work when handling stage 2 page faults. As the page table update follows the break-before-make requirement of the architecture, it can lead to repeated refaults due to clearing the entry and flushing the tlbs.
This problem is more likely when -
* there are large number of vcpus * the mapping is large block mapping
such as when using PMD hugepages (512MB) with 64k pages.
Fix this by skipping the page table update if there is no change in the entry being updated.
Cc: stable@vger.kernel.org Fixes: ad361f093c1e ("KVM: ARM: Support hugetlbfs backed huge pages") Reviewed-by: Suzuki Poulose suzuki.poulose@arm.com Acked-by: Christoffer Dall christoffer.dall@arm.com Signed-off-by: Punit Agrawal punit.agrawal@arm.com Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- virt/kvm/arm/mmu.c | 38 +++++++++++++++++++++++++++----------- 1 file changed, 27 insertions(+), 11 deletions(-)
--- a/virt/kvm/arm/mmu.c +++ b/virt/kvm/arm/mmu.c @@ -1015,19 +1015,35 @@ static int stage2_set_pmd_huge(struct kv pmd = stage2_get_pmd(kvm, cache, addr); VM_BUG_ON(!pmd);
- /* - * Mapping in huge pages should only happen through a fault. If a - * page is merged into a transparent huge page, the individual - * subpages of that huge page should be unmapped through MMU - * notifiers before we get here. - * - * Merging of CompoundPages is not supported; they should become - * splitting first, unmapped, merged, and mapped back in on-demand. - */ - VM_BUG_ON(pmd_present(*pmd) && pmd_pfn(*pmd) != pmd_pfn(*new_pmd)); - old_pmd = *pmd; if (pmd_present(old_pmd)) { + /* + * Multiple vcpus faulting on the same PMD entry, can + * lead to them sequentially updating the PMD with the + * same value. Following the break-before-make + * (pmd_clear() followed by tlb_flush()) process can + * hinder forward progress due to refaults generated + * on missing translations. + * + * Skip updating the page table if the entry is + * unchanged. + */ + if (pmd_val(old_pmd) == pmd_val(*new_pmd)) + return 0; + + /* + * Mapping in huge pages should only happen through a + * fault. If a page is merged into a transparent huge + * page, the individual subpages of that huge page + * should be unmapped through MMU notifiers before we + * get here. + * + * Merging of CompoundPages is not supported; they + * should become splitting first, unmapped, merged, + * and mapped back in on-demand. + */ + VM_BUG_ON(pmd_pfn(old_pmd) != pmd_pfn(*new_pmd)); + pmd_clear(pmd); kvm_tlb_flush_vmid_ipa(kvm, addr); } else {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Punit Agrawal punit.agrawal@arm.com
commit 976d34e2dab10ece5ea8fe7090b7692913f89084 upstream.
When there is contention on faulting in a particular page table entry at stage 2, the break-before-make requirement of the architecture can lead to additional refaulting due to TLB invalidation.
Avoid this by skipping a page table update if the new value of the PTE matches the previous value.
Cc: stable@vger.kernel.org Fixes: d5d8184d35c9 ("KVM: ARM: Memory virtualization setup") Reviewed-by: Suzuki Poulose suzuki.poulose@arm.com Acked-by: Christoffer Dall christoffer.dall@arm.com Signed-off-by: Punit Agrawal punit.agrawal@arm.com Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- virt/kvm/arm/mmu.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/virt/kvm/arm/mmu.c +++ b/virt/kvm/arm/mmu.c @@ -1118,6 +1118,10 @@ static int stage2_set_pte(struct kvm *kv /* Create 2nd stage page table mapping - Level 3 */ old_pte = *pte; if (pte_present(old_pte)) { + /* Skip page table update if there is no change */ + if (pte_val(old_pte) == pte_val(*new_pte)) + return 0; + kvm_set_pte(pte, __pte(0)); kvm_tlb_flush_vmid_ipa(kvm, addr); } else {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Claudio Imbrenda imbrenda@linux.vnet.ibm.com
commit 306d6c49ac9ded11114cb53b0925da52f2c2ada1 upstream.
When the oom killer kills a userspace process in the page fault handler while in guest context, the fault handler fails to release the mm_sem if the FAULT_FLAG_RETRY_NOWAIT option is set. This leads to a deadlock when tearing down the mm when the process terminates. This bug can only happen when pfault is enabled, so only KVM clients are affected.
The problem arises in the rare cases in which handle_mm_fault does not release the mm_sem. This patch fixes the issue by manually releasing the mm_sem when needed.
Fixes: 24eb3a824c4f3 ("KVM: s390: Add FAULT_FLAG_RETRY_NOWAIT for guest fault") Cc: stable@vger.kernel.org # 3.15+ Signed-off-by: Claudio Imbrenda imbrenda@linux.vnet.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/mm/fault.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/s390/mm/fault.c +++ b/arch/s390/mm/fault.c @@ -502,6 +502,8 @@ retry: /* No reason to continue if interrupted by SIGKILL. */ if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { fault = VM_FAULT_SIGNAL; + if (flags & FAULT_FLAG_RETRY_NOWAIT) + goto out_up; goto out; } if (unlikely(fault & VM_FAULT_ERROR))
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Richter tmricht@linux.ibm.com
commit 8a95c8994509c55abf1e38c0cc037b1205725e21 upstream.
With commit eca0fa28cd0d ("perf record: Provide detailed information on s390 CPU") s390 platform provides detailed type/model/capacity information in the CPU identifier string instead of just "IBM/S390".
This breaks 'perf kvm' support which uses hard coded string IBM/S390 to compare with the CPU identifier string. Fix this by changing the comparison.
Reported-by: Stefan Raspl raspl@linux.ibm.com Signed-off-by: Thomas Richter tmricht@linux.ibm.com Reviewed-by: Hendrik Brueckner brueckner@linux.ibm.com Tested-by: Stefan Raspl raspl@linux.ibm.com Acked-by: Christian Borntraeger borntraeger@de.ibm.com Cc: Heiko Carstens heiko.carstens@de.ibm.com Cc: Martin Schwidefsky schwidefsky@de.ibm.com Cc: stable@vger.kernel.org Fixes: eca0fa28cd0d ("perf record: Provide detailed information on s390 CPU") Link: http://lkml.kernel.org/r/20180712070936.67547-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/arch/s390/util/kvm-stat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/perf/arch/s390/util/kvm-stat.c +++ b/tools/perf/arch/s390/util/kvm-stat.c @@ -102,7 +102,7 @@ const char * const kvm_skip_events[] = {
int cpu_isa_init(struct perf_kvm_stat *kvm, const char *cpuid) { - if (strstr(cpuid, "IBM/S390")) { + if (strstr(cpuid, "IBM")) { kvm->exit_reasons = sie_exit_reasons; kvm->exit_reasons_isa = "SIE"; } else
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
commit b80a2bfce85e1051056d98d04ecb2d0b55cbbc1c upstream.
The code flow in cpu_stop_queue_two_works() is a little arcane; fix this by lifting the preempt_disable() to the top to create more natural nesting wrt the spinlocks and make the wake_up_q() and preempt_enable() unconditional at the end.
Furthermore, enable preemption in the -EDEADLK case, such that we spin-wait with preemption enabled.
Suggested-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: Sebastian Andrzej Siewior bigeasy@linutronix.de Cc: isaacm@codeaurora.org Cc: matt@codeblueprint.co.uk Cc: psodagud@codeaurora.org Cc: gregkh@linuxfoundation.org Cc: pkondeti@codeaurora.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180730112140.GH2494@hirez.programming.kicks-ass.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/stop_machine.c | 41 +++++++++++++++++++++++------------------ 1 file changed, 23 insertions(+), 18 deletions(-)
--- a/kernel/stop_machine.c +++ b/kernel/stop_machine.c @@ -236,13 +236,24 @@ static int cpu_stop_queue_two_works(int struct cpu_stopper *stopper2 = per_cpu_ptr(&cpu_stopper, cpu2); DEFINE_WAKE_Q(wakeq); int err; + retry: + /* + * The waking up of stopper threads has to happen in the same + * scheduling context as the queueing. Otherwise, there is a + * possibility of one of the above stoppers being woken up by another + * CPU, and preempting us. This will cause us to not wake up the other + * stopper forever. + */ + preempt_disable(); raw_spin_lock_irq(&stopper1->lock); raw_spin_lock_nested(&stopper2->lock, SINGLE_DEPTH_NESTING);
- err = -ENOENT; - if (!stopper1->enabled || !stopper2->enabled) + if (!stopper1->enabled || !stopper2->enabled) { + err = -ENOENT; goto unlock; + } + /* * Ensure that if we race with __stop_cpus() the stoppers won't get * queued up in reverse order leading to system deadlock. @@ -253,36 +264,30 @@ retry: * It can be falsely true but it is safe to spin until it is cleared, * queue_stop_cpus_work() does everything under preempt_disable(). */ - err = -EDEADLK; - if (unlikely(stop_cpus_in_progress)) - goto unlock; + if (unlikely(stop_cpus_in_progress)) { + err = -EDEADLK; + goto unlock; + }
err = 0; __cpu_stop_queue_work(stopper1, work1, &wakeq); __cpu_stop_queue_work(stopper2, work2, &wakeq); - /* - * The waking up of stopper threads has to happen - * in the same scheduling context as the queueing. - * Otherwise, there is a possibility of one of the - * above stoppers being woken up by another CPU, - * and preempting us. This will cause us to n ot - * wake up the other stopper forever. - */ - preempt_disable(); + unlock: raw_spin_unlock(&stopper2->lock); raw_spin_unlock_irq(&stopper1->lock);
if (unlikely(err == -EDEADLK)) { + preempt_enable(); + while (stop_cpus_in_progress) cpu_relax(); + goto retry; }
- if (!err) { - wake_up_q(&wakeq); - preempt_enable(); - } + wake_up_q(&wakeq); + preempt_enable();
return err; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Prasad Sodagudi psodagud@codeaurora.org
commit cfd355145c32bb7ccb65fccbe2d67280dc2119e1 upstream.
When cpu_stop_queue_work() releases the lock for the stopper thread that was queued into its wake queue, preemption is enabled, which leads to the following deadlock:
CPU0 CPU1 sched_setaffinity(0, ...) __set_cpus_allowed_ptr() stop_one_cpu(0, ...) stop_two_cpus(0, 1, ...) cpu_stop_queue_work(0, ...) cpu_stop_queue_two_works(0, ..., 1, ...)
-grabs lock for migration/0- -spins with preemption disabled, waiting for migration/0's lock to be released-
-adds work items for migration/0 and queues migration/0 to its wake_q-
-releases lock for migration/0 and preemption is enabled-
-current thread is preempted, and __set_cpus_allowed_ptr has changed the thread's cpu allowed mask to CPU1 only-
-acquires migration/0 and migration/1's locks-
-adds work for migration/0 but does not add migration/0 to wake_q, since it is already in a wake_q-
-adds work for migration/1 and adds migration/1 to its wake_q-
-releases migration/0 and migration/1's locks, wakes migration/1, and enables preemption-
-since migration/1 is requested to run, migration/1 begins to run and waits on migration/0, but migration/0 will never be able to run, since the thread that can wake it is affine to CPU1-
Disable preemption in cpu_stop_queue_work() before queueing works for stopper threads, and queueing the stopper thread in the wake queue, to ensure that the operation of queueing the works and waking the stopper threads is atomic.
Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock") Signed-off-by: Prasad Sodagudi psodagud@codeaurora.org Signed-off-by: Isaac J. Manjarres isaacm@codeaurora.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: peterz@infradead.org Cc: matt@codeblueprint.co.uk Cc: bigeasy@linutronix.de Cc: gregkh@linuxfoundation.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1533329766-4856-1-git-send-email-isaacm@codeaurora... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Co-Developed-by: Isaac J. Manjarres isaacm@codeaurora.org
--- kernel/stop_machine.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/kernel/stop_machine.c +++ b/kernel/stop_machine.c @@ -81,6 +81,7 @@ static bool cpu_stop_queue_work(unsigned unsigned long flags; bool enabled;
+ preempt_disable(); raw_spin_lock_irqsave(&stopper->lock, flags); enabled = stopper->enabled; if (enabled) @@ -90,6 +91,7 @@ static bool cpu_stop_queue_work(unsigned raw_spin_unlock_irqrestore(&stopper->lock, flags);
wake_up_q(&wakeq); + preempt_enable();
return enabled; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Theodore Ts'o tytso@mit.edu
commit 7d95178c77014dbd8dce36ee40bbbc5e6c121ff5 upstream.
Extended attribute names are defined to be NUL-terminated, so the name must not contain a NUL character. This is important because there are places when remove extended attribute, the code uses strlen to determine the length of the entry. That should probably be fixed at some point, but code is currently really messy, so the simplest fix for now is to simply validate that the extended attributes are sane.
https://bugzilla.kernel.org/show_bug.cgi?id=200401
Reported-by: Wen Xu wen.xu@gatech.edu Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/xattr.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -190,6 +190,8 @@ ext4_xattr_check_entries(struct ext4_xat struct ext4_xattr_entry *next = EXT4_XATTR_NEXT(e); if ((void *)next >= end) return -EFSCORRUPTED; + if (strnlen(e->e_name, e->e_name_len) != e->e_name_len) + return -EFSCORRUPTED; e = next; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wang Shilong wshilong@ddn.com
commit 5ef2a69993676a0dfd49bf60ae1323eb8a288366 upstream.
Out of memory should not be considered as critical errors; so replace ext4_error() with ext4_warnig().
Signed-off-by: Wang Shilong wshilong@ddn.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/balloc.c | 6 +++--- fs/ext4/ialloc.c | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-)
--- a/fs/ext4/balloc.c +++ b/fs/ext4/balloc.c @@ -426,9 +426,9 @@ ext4_read_block_bitmap_nowait(struct sup } bh = sb_getblk(sb, bitmap_blk); if (unlikely(!bh)) { - ext4_error(sb, "Cannot get buffer for block bitmap - " - "block_group = %u, block_bitmap = %llu", - block_group, bitmap_blk); + ext4_warning(sb, "Cannot get buffer for block bitmap - " + "block_group = %u, block_bitmap = %llu", + block_group, bitmap_blk); return ERR_PTR(-ENOMEM); }
--- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -138,9 +138,9 @@ ext4_read_inode_bitmap(struct super_bloc } bh = sb_getblk(sb, bitmap_blk); if (unlikely(!bh)) { - ext4_error(sb, "Cannot read inode bitmap - " - "block_group = %u, inode_bitmap = %llu", - block_group, bitmap_blk); + ext4_warning(sb, "Cannot read inode bitmap - " + "block_group = %u, inode_bitmap = %llu", + block_group, bitmap_blk); return ERR_PTR(-ENOMEM); } if (bitmap_uptodate(bh))
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
commit a4d2aadca184ece182418950d45ba4ffc7b652d2 upstream.
While working on extended rand for last_error/first_error timestamps, I noticed that the endianess is wrong; we access the little-endian fields in struct ext4_super_block as native-endian when we print them.
This adds a special case in ext4_attr_show() and ext4_attr_store() to byteswap the superblock fields if needed.
In older kernels, this code was part of super.c, it got moved to sysfs.c in linux-4.4.
Cc: stable@vger.kernel.org Fixes: 52c198c6820f ("ext4: add sysfs entry showing whether the fs contains errors") Reviewed-by: Andreas Dilger adilger@dilger.ca Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/sysfs.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
--- a/fs/ext4/sysfs.c +++ b/fs/ext4/sysfs.c @@ -274,8 +274,12 @@ static ssize_t ext4_attr_show(struct kob case attr_pointer_ui: if (!ptr) return 0; - return snprintf(buf, PAGE_SIZE, "%u\n", - *((unsigned int *) ptr)); + if (a->attr_ptr == ptr_ext4_super_block_offset) + return snprintf(buf, PAGE_SIZE, "%u\n", + le32_to_cpup(ptr)); + else + return snprintf(buf, PAGE_SIZE, "%u\n", + *((unsigned int *) ptr)); case attr_pointer_atomic: if (!ptr) return 0; @@ -308,7 +312,10 @@ static ssize_t ext4_attr_store(struct ko ret = kstrtoul(skip_spaces(buf), 0, &t); if (ret) return ret; - *((unsigned int *) ptr) = t; + if (a->attr_ptr == ptr_ext4_super_block_offset) + *((__le32 *) ptr) = cpu_to_le32(t); + else + *((unsigned int *) ptr) = t; return len; case attr_inode_readahead: return inode_readahead_blks_store(sbi, buf, len);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Sandeen sandeen@redhat.com
commit f39b3f45dbcb0343822cce31ea7636ad66e60bc2 upstream.
When ext4_find_entry() falls back to "searching the old fashioned way" due to a corrupt dx dir, it needs to reset the error code to NULL so that the nonstandard ERR_BAD_DX_DIR code isn't returned to userspace.
https://bugzilla.kernel.org/show_bug.cgi?id=199947
Reported-by: Anatoly Trosinenko anatoly.trosinenko@yandex.com Reviewed-by: Andreas Dilger adilger@dilger.ca Signed-off-by: Eric Sandeen sandeen@redhat.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/namei.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -1398,6 +1398,7 @@ static struct buffer_head * ext4_find_en goto cleanup_and_exit; dxtrace(printk(KERN_DEBUG "ext4_find_entry: dx failed, " "falling back\n")); + ret = NULL; } nblocks = dir->i_size >> EXT4_BLOCK_SIZE_BITS(sb); if (!nblocks) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wang Shilong wshilong@ddn.com
commit 9af0b3d1257756394ebbd06b14937b557e3a756b upstream.
Whenever we hit block or inode bitmap corruptions we set bit and then reduce this block group free inode/clusters counter to expose right available space.
However some of ext4_mark_group_bitmap_corrupted() is called inside group spinlock, some are not, this could make it happen that we double reduce one block group free counters from system.
Always hold group spinlock for it could fix it, but it looks a little heavy, we could use test_and_set_bit() to fix race problems here.
Signed-off-by: Wang Shilong wshilong@ddn.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/super.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
--- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -776,26 +776,26 @@ void ext4_mark_group_bitmap_corrupted(st struct ext4_sb_info *sbi = EXT4_SB(sb); struct ext4_group_info *grp = ext4_get_group_info(sb, group); struct ext4_group_desc *gdp = ext4_get_group_desc(sb, group, NULL); + int ret;
- if ((flags & EXT4_GROUP_INFO_BBITMAP_CORRUPT) && - !EXT4_MB_GRP_BBITMAP_CORRUPT(grp)) { - percpu_counter_sub(&sbi->s_freeclusters_counter, - grp->bb_free); - set_bit(EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT, - &grp->bb_state); + if (flags & EXT4_GROUP_INFO_BBITMAP_CORRUPT) { + ret = ext4_test_and_set_bit(EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT, + &grp->bb_state); + if (!ret) + percpu_counter_sub(&sbi->s_freeclusters_counter, + grp->bb_free); }
- if ((flags & EXT4_GROUP_INFO_IBITMAP_CORRUPT) && - !EXT4_MB_GRP_IBITMAP_CORRUPT(grp)) { - if (gdp) { + if (flags & EXT4_GROUP_INFO_IBITMAP_CORRUPT) { + ret = ext4_test_and_set_bit(EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT, + &grp->bb_state); + if (!ret && gdp) { int count;
count = ext4_free_inodes_count(sb, gdp); percpu_counter_sub(&sbi->s_freeinodes_counter, count); } - set_bit(EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT, - &grp->bb_state); } }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Wnukowski wnukowski@google.com
commit f1ed3df20d2d223e0852cc4ac1f19bba869a7e3c upstream.
In many architectures loads may be reordered with older stores to different locations. In the nvme driver the following two operations could be reordered:
- Write shadow doorbell (dbbuf_db) into memory. - Read EventIdx (dbbuf_ei) from memory.
This can result in a potential race condition between driver and VM host processing requests (if given virtual NVMe controller has a support for shadow doorbell). If that occurs, then the NVMe controller may decide to wait for MMIO doorbell from guest operating system, and guest driver may decide not to issue MMIO doorbell on any of subsequent commands.
This issue is purely timing-dependent one, so there is no easy way to reproduce it. Currently the easiest known approach is to run "Oracle IO Numbers" (orion) that is shipped with Oracle DB:
orion -run advanced -num_large 0 -size_small 8 -type rand -simulate \ concat -write 40 -duration 120 -matrix row -testname nvme_test
Where nvme_test is a .lun file that contains a list of NVMe block devices to run test against. Limiting number of vCPUs assigned to given VM instance seems to increase chances for this bug to occur. On test environment with VM that got 4 NVMe drives and 1 vCPU assigned the virtual NVMe controller hang could be observed within 10-20 minutes. That correspond to about 400-500k IO operations processed (or about 100GB of IO read/writes).
Orion tool was used as a validation and set to run in a loop for 36 hours (equivalent of pushing 550M IO operations). No issues were observed. That suggest that the patch fixes the issue.
Fixes: f9f38e33389c ("nvme: improve performance for virtual NVMe devices") Signed-off-by: Michal Wnukowski wnukowski@google.com Reviewed-by: Keith Busch keith.busch@intel.com Reviewed-by: Sagi Grimberg sagi@grimberg.me [hch: updated changelog and comment a bit] Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/nvme/host/pci.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -316,6 +316,14 @@ static bool nvme_dbbuf_update_and_check_ old_value = *dbbuf_db; *dbbuf_db = value;
+ /* + * Ensure that the doorbell is updated before reading the event + * index from memory. The controller needs to provide similar + * ordering to ensure the envent index is updated before reading + * the doorbell. + */ + mb(); + if (!nvme_dbbuf_need_event(*dbbuf_ei, value, old_value)) return false; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paulo Zanoni paulo.r.zanoni@intel.com
commit db0c8d8b031d2b5960f6407f7f2ca20e97e00605 upstream.
ICL changes the registers and addresses to 64 bits.
I also briefly looked at implementing an u64 version of the PCI config read functions, but I concluded this wouldn't be trivial, so it's not worth doing it for a single user that can't have any racing problems while reading the register in two separate operations.
v2: - Scrub the development (non-public) changelog (Joonas). - Remove the i915.ko bits so this can be easily backported in order to properly avoid stolen memory even on machines without i915.ko (Joonas). - CC stable for the reasons above.
Issue: VIZ-9250 CC: stable@vger.kernel.org Cc: Ingo Molnar mingo@kernel.org Cc: H. Peter Anvin hpa@zytor.com Cc: x86@kernel.org Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Signed-off-by: Paulo Zanoni paulo.r.zanoni@intel.com Fixes: 412310019a20 ("drm/i915/icl: Add initial Icelake definitions.") Reviewed-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com Acked-by: Ingo Molnar mingo@kernel.org Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20180504203252.28048-1-paulo.r... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/early-quirks.c | 18 ++++++++++++++++++ include/drm/i915_drm.h | 4 +++- 2 files changed, 21 insertions(+), 1 deletion(-)
--- a/arch/x86/kernel/early-quirks.c +++ b/arch/x86/kernel/early-quirks.c @@ -338,6 +338,18 @@ static resource_size_t __init gen3_stole return bsm & INTEL_BSM_MASK; }
+static resource_size_t __init gen11_stolen_base(int num, int slot, int func, + resource_size_t stolen_size) +{ + u64 bsm; + + bsm = read_pci_config(num, slot, func, INTEL_GEN11_BSM_DW0); + bsm &= INTEL_BSM_MASK; + bsm |= (u64)read_pci_config(num, slot, func, INTEL_GEN11_BSM_DW1) << 32; + + return bsm; +} + static resource_size_t __init i830_stolen_size(int num, int slot, int func) { u16 gmch_ctrl; @@ -498,6 +510,11 @@ static const struct intel_early_ops chv_ .stolen_size = chv_stolen_size, };
+static const struct intel_early_ops gen11_early_ops __initconst = { + .stolen_base = gen11_stolen_base, + .stolen_size = gen9_stolen_size, +}; + static const struct pci_device_id intel_early_ids[] __initconst = { INTEL_I830_IDS(&i830_early_ops), INTEL_I845G_IDS(&i845_early_ops), @@ -529,6 +546,7 @@ static const struct pci_device_id intel_ INTEL_CFL_IDS(&gen9_early_ops), INTEL_GLK_IDS(&gen9_early_ops), INTEL_CNL_IDS(&gen9_early_ops), + INTEL_ICL_11_IDS(&gen11_early_ops), };
struct resource intel_graphics_stolen_res __ro_after_init = DEFINE_RES_MEM(0, 0); --- a/include/drm/i915_drm.h +++ b/include/drm/i915_drm.h @@ -95,7 +95,9 @@ extern struct resource intel_graphics_st #define I845_TSEG_SIZE_512K (2 << 1) #define I845_TSEG_SIZE_1M (3 << 1)
-#define INTEL_BSM 0x5c +#define INTEL_BSM 0x5c +#define INTEL_GEN11_BSM_DW0 0xc0 +#define INTEL_GEN11_BSM_DW1 0xc4 #define INTEL_BSM_MASK (-(1u << 20))
#endif /* _I915_DRM_H_ */
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit 58e73aa177850babb947555257fd4f79e5275cf1 upstream.
The commit 5d9f40b56630 ("platform/x86: ideapad-laptop: Add Y520-15IKBN to no_hw_rfkill") added the entry for Y20-15IKBN, and it turned out that another variant, Y20-15IKBM, also requires the no_hw_rfkill.
Trim the last letter from the string so that it matches to both Y20-15IKBN and Y20-15IKBM models.
Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=1098626 Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Darren Hart (VMware) dvhart@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/platform/x86/ideapad-laptop.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/platform/x86/ideapad-laptop.c +++ b/drivers/platform/x86/ideapad-laptop.c @@ -1133,10 +1133,10 @@ static const struct dmi_system_id no_hw_ }, }, { - .ident = "Lenovo Legion Y520-15IKBN", + .ident = "Lenovo Legion Y520-15IKB", .matches = { DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"), - DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo Y520-15IKBN"), + DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo Y520-15IKB"), }, }, {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
commit d86564a2f085b79ec046a5cba90188e612352806 upstream.
Jann reported that x86 was missing required TLB invalidates when he hit the !*batch slow path in tlb_remove_table().
This is indeed the case; RCU_TABLE_FREE does not provide TLB (cache) invalidates, the PowerPC-hash where this code originated and the Sparc-hash where this was subsequently used did not need that. ARM which later used this put an explicit TLB invalidate in their __p*_free_tlb() functions, and PowerPC-radix followed that example.
But when we hooked up x86 we failed to consider this. Fix this by (optionally) hooking tlb_remove_table() into the TLB invalidate code.
NOTE: s390 was also needing something like this and might now be able to use the generic code again.
[ Modified to be on top of Nick's cleanups, which simplified this patch now that tlb_flush_mmu_tlbonly() really only flushes the TLB - Linus ]
Fixes: 9e52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)") Reported-by: Jann Horn jannh@google.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Rik van Riel riel@surriel.com Cc: Nicholas Piggin npiggin@gmail.com Cc: David Miller davem@davemloft.net Cc: Will Deacon will.deacon@arm.com Cc: Martin Schwidefsky schwidefsky@de.ibm.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: stable@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/Kconfig | 3 +++ arch/x86/Kconfig | 1 + mm/memory.c | 18 ++++++++++++++++++ 3 files changed, 22 insertions(+)
--- a/arch/Kconfig +++ b/arch/Kconfig @@ -354,6 +354,9 @@ config HAVE_ARCH_JUMP_LABEL config HAVE_RCU_TABLE_FREE bool
+config HAVE_RCU_TABLE_INVALIDATE + bool + config ARCH_HAVE_NMI_SAFE_CMPXCHG bool
--- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -179,6 +179,7 @@ config X86 select HAVE_PERF_REGS select HAVE_PERF_USER_STACK_DUMP select HAVE_RCU_TABLE_FREE + select HAVE_RCU_TABLE_INVALIDATE if HAVE_RCU_TABLE_FREE select HAVE_REGS_AND_STACK_ACCESS_API select HAVE_RELIABLE_STACKTRACE if X86_64 && UNWINDER_FRAME_POINTER && STACK_VALIDATION select HAVE_STACKPROTECTOR if CC_HAS_SANE_STACKPROTECTOR --- a/mm/memory.c +++ b/mm/memory.c @@ -330,6 +330,21 @@ bool __tlb_remove_page_size(struct mmu_g * See the comment near struct mmu_table_batch. */
+/* + * If we want tlb_remove_table() to imply TLB invalidates. + */ +static inline void tlb_table_invalidate(struct mmu_gather *tlb) +{ +#ifdef CONFIG_HAVE_RCU_TABLE_INVALIDATE + /* + * Invalidate page-table caches used by hardware walkers. Then we still + * need to RCU-sched wait while freeing the pages because software + * walkers can still be in-flight. + */ + tlb_flush_mmu_tlbonly(tlb); +#endif +} + static void tlb_remove_table_smp_sync(void *arg) { /* Simply deliver the interrupt */ @@ -366,6 +381,7 @@ void tlb_table_flush(struct mmu_gather * struct mmu_table_batch **batch = &tlb->batch;
if (*batch) { + tlb_table_invalidate(tlb); call_rcu_sched(&(*batch)->rcu, tlb_remove_table_rcu); *batch = NULL; } @@ -387,11 +403,13 @@ void tlb_remove_table(struct mmu_gather if (*batch == NULL) { *batch = (struct mmu_table_batch *)__get_free_page(GFP_NOWAIT | __GFP_NOWARN); if (*batch == NULL) { + tlb_table_invalidate(tlb); tlb_remove_table_one(table); return; } (*batch)->nr = 0; } + (*batch)->tables[(*batch)->nr++] = table; if ((*batch)->nr == MAX_TABLE_BATCH) tlb_table_flush(tlb);
On 3 September 2018 at 22:26, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
4.18-stable review patch. If anyone has any objections, please let me know.
From: Peter Zijlstra peterz@infradead.org
commit d86564a2f085b79ec046a5cba90188e612352806 upstream.
Jann reported that x86 was missing required TLB invalidates when he hit the !*batch slow path in tlb_remove_table().
This is indeed the case; RCU_TABLE_FREE does not provide TLB (cache) invalidates, the PowerPC-hash where this code originated and the Sparc-hash where this was subsequently used did not need that. ARM which later used this put an explicit TLB invalidate in their __p*_free_tlb() functions, and PowerPC-radix followed that example.
But when we hooked up x86 we failed to consider this. Fix this by (optionally) hooking tlb_remove_table() into the TLB invalidate code.
NOTE: s390 was also needing something like this and might now be able to use the generic code again.
[ Modified to be on top of Nick's cleanups, which simplified this patch now that tlb_flush_mmu_tlbonly() really only flushes the TLB - Linus ]
Fixes: 9e52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)") Reported-by: Jann Horn jannh@google.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Rik van Riel riel@surriel.com Cc: Nicholas Piggin npiggin@gmail.com Cc: David Miller davem@davemloft.net Cc: Will Deacon will.deacon@arm.com Cc: Martin Schwidefsky schwidefsky@de.ibm.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: stable@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
arch/Kconfig | 3 +++ arch/x86/Kconfig | 1 + mm/memory.c | 18 ++++++++++++++++++ 3 files changed, 22 insertions(+)
--- a/arch/Kconfig +++ b/arch/Kconfig @@ -354,6 +354,9 @@ config HAVE_ARCH_JUMP_LABEL config HAVE_RCU_TABLE_FREE bool
+config HAVE_RCU_TABLE_INVALIDATE
bool
config ARCH_HAVE_NMI_SAFE_CMPXCHG bool
--- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -179,6 +179,7 @@ config X86 select HAVE_PERF_REGS select HAVE_PERF_USER_STACK_DUMP select HAVE_RCU_TABLE_FREE
select HAVE_RCU_TABLE_INVALIDATE if HAVE_RCU_TABLE_FREE select HAVE_REGS_AND_STACK_ACCESS_API select HAVE_RELIABLE_STACKTRACE if X86_64 && UNWINDER_FRAME_POINTER && STACK_VALIDATION select HAVE_STACKPROTECTOR if CC_HAS_SANE_STACKPROTECTOR
--- a/mm/memory.c +++ b/mm/memory.c @@ -330,6 +330,21 @@ bool __tlb_remove_page_size(struct mmu_g
- See the comment near struct mmu_table_batch.
*/
+/*
- If we want tlb_remove_table() to imply TLB invalidates.
- */
+static inline void tlb_table_invalidate(struct mmu_gather *tlb) +{ +#ifdef CONFIG_HAVE_RCU_TABLE_INVALIDATE
/*
* Invalidate page-table caches used by hardware walkers. Then we still
* need to RCU-sched wait while freeing the pages because software
* walkers can still be in-flight.
*/
tlb_flush_mmu_tlbonly(tlb);
+#endif +}
static void tlb_remove_table_smp_sync(void *arg) { /* Simply deliver the interrupt */ @@ -366,6 +381,7 @@ void tlb_table_flush(struct mmu_gather * struct mmu_table_batch **batch = &tlb->batch;
if (*batch) {
tlb_table_invalidate(tlb); call_rcu_sched(&(*batch)->rcu, tlb_remove_table_rcu); *batch = NULL; }
@@ -387,11 +403,13 @@ void tlb_remove_table(struct mmu_gather if (*batch == NULL) { *batch = (struct mmu_table_batch *)__get_free_page(GFP_NOWAIT | __GFP_NOWARN); if (*batch == NULL) {
tlb_table_invalidate(tlb); tlb_remove_table_one(table); return; } (*batch)->nr = 0; }
(*batch)->tables[(*batch)->nr++] = table; if ((*batch)->nr == MAX_TABLE_BATCH) tlb_table_flush(tlb);
Kernel crashed on x86 device running LTP fcntl34 test case on 4.18.6-rc1, fcntl34.c:58: INFO: waiting for '12' threads
[ 1075.624862] BUG: stack guard page was hit at 0000000079c81098 (stack is 000000002c7d6db4..00000000d386d6df) [ 1075.634606] kernel stack overflow (double-fault): 0000 [#2] SMP PTI [ 1075.640871] CPU: 3 PID: 17735 Comm: fcntl34_64 Tainted: G D W 4.18.6-rc1 #1 [ 1075.648954] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.0b 07/27/2017 [ 1075.656428] RIP: 0010:flush_tlb_func_common.constprop.14+0x29c/0x4d0 [ 1075.662776] Code: 03 1d 70 e3 da 4e 83 c2 01 0f b7 d2 49 0f ab 13 eb b5 0f 1f 44 00 00 e9 70 fe ff ff 65 ff 05 6b 40 db 4e 48 8b 05 bc e8 8f 01 <e8> ff 95 08 00 85 c0 74 0d 80 3d ee c5 8f 01 00 0f 84 4a 01 00 00 [ 1075.681645] RSP: 0018:ffffbd2482cbc000 EFLAGS: 00010083 [ 1075.686863] RAX: 0000000000000000 RBX: ffff98915adf0002 RCX: ffffbd2482cbc010 [ 1075.693986] RDX: 0000000000000803 RSI: 00007f5aae00a000 RDI: ffffbd2482cbc080 [ 1075.701124] RBP: ffffbd2482cbc060 R08: ffffffffb2b86c00 R09: 0000008000000000 [ 1075.708287] R10: 000000000002161a R11: 2008188a00000121 R12: 0000000000000162 [ 1075.715410] R13: 0000000000000003 R14: 00007f5aae00a000 R15: 00007f5aae000000 [ 1075.722536] FS: 00007f5aaeff3740(0000) GS:ffff98916fd80000(0000) knlGS:0000000000000000 [ 1075.730619] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1075.736357] CR2: ffffbd2482cbbff8 CR3: 000000045368c003 CR4: 00000000003606e0 [ 1075.743481] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1075.750606] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1075.757730] Call Trace: [ 1075.760176] flush_tlb_mm_range+0x119/0x130 [ 1075.764358] ? flush_tlb_mm_range+0x119/0x130 [ 1075.768711] tlb_flush_mmu_tlbonly+0x6e/0xd0 [ 1075.772984] ? tlb_flush_mmu_tlbonly+0x6e/0xd0 [ 1075.777428] tlb_table_flush.part.113+0x12/0x30 [ 1075.781954] tlb_flush_mmu_tlbonly+0x4b/0xd0 [ 1075.786224] tlb_table_flush.part.113+0x12/0x30 [ 1075.790749] tlb_flush_mmu_tlbonly+0x4b/0xd0
Full test log link, https://lkft.validation.linaro.org/scheduler/job/404027#L4051
Best regards Naresh Kamboju
On Tue, Sep 04, 2018 at 10:08:13AM +0530, Naresh Kamboju wrote:
On 3 September 2018 at 22:26, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
4.18-stable review patch. If anyone has any objections, please let me know.
From: Peter Zijlstra peterz@infradead.org
commit d86564a2f085b79ec046a5cba90188e612352806 upstream.
Jann reported that x86 was missing required TLB invalidates when he hit the !*batch slow path in tlb_remove_table().
This is indeed the case; RCU_TABLE_FREE does not provide TLB (cache) invalidates, the PowerPC-hash where this code originated and the Sparc-hash where this was subsequently used did not need that. ARM which later used this put an explicit TLB invalidate in their __p*_free_tlb() functions, and PowerPC-radix followed that example.
But when we hooked up x86 we failed to consider this. Fix this by (optionally) hooking tlb_remove_table() into the TLB invalidate code.
NOTE: s390 was also needing something like this and might now be able to use the generic code again.
[ Modified to be on top of Nick's cleanups, which simplified this patch now that tlb_flush_mmu_tlbonly() really only flushes the TLB - Linus ]
Fixes: 9e52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)") Reported-by: Jann Horn jannh@google.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Rik van Riel riel@surriel.com Cc: Nicholas Piggin npiggin@gmail.com Cc: David Miller davem@davemloft.net Cc: Will Deacon will.deacon@arm.com Cc: Martin Schwidefsky schwidefsky@de.ibm.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: stable@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
arch/Kconfig | 3 +++ arch/x86/Kconfig | 1 + mm/memory.c | 18 ++++++++++++++++++ 3 files changed, 22 insertions(+)
--- a/arch/Kconfig +++ b/arch/Kconfig @@ -354,6 +354,9 @@ config HAVE_ARCH_JUMP_LABEL config HAVE_RCU_TABLE_FREE bool
+config HAVE_RCU_TABLE_INVALIDATE
bool
config ARCH_HAVE_NMI_SAFE_CMPXCHG bool
--- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -179,6 +179,7 @@ config X86 select HAVE_PERF_REGS select HAVE_PERF_USER_STACK_DUMP select HAVE_RCU_TABLE_FREE
select HAVE_RCU_TABLE_INVALIDATE if HAVE_RCU_TABLE_FREE select HAVE_REGS_AND_STACK_ACCESS_API select HAVE_RELIABLE_STACKTRACE if X86_64 && UNWINDER_FRAME_POINTER && STACK_VALIDATION select HAVE_STACKPROTECTOR if CC_HAS_SANE_STACKPROTECTOR
--- a/mm/memory.c +++ b/mm/memory.c @@ -330,6 +330,21 @@ bool __tlb_remove_page_size(struct mmu_g
- See the comment near struct mmu_table_batch.
*/
+/*
- If we want tlb_remove_table() to imply TLB invalidates.
- */
+static inline void tlb_table_invalidate(struct mmu_gather *tlb) +{ +#ifdef CONFIG_HAVE_RCU_TABLE_INVALIDATE
/*
* Invalidate page-table caches used by hardware walkers. Then we still
* need to RCU-sched wait while freeing the pages because software
* walkers can still be in-flight.
*/
tlb_flush_mmu_tlbonly(tlb);
+#endif +}
static void tlb_remove_table_smp_sync(void *arg) { /* Simply deliver the interrupt */ @@ -366,6 +381,7 @@ void tlb_table_flush(struct mmu_gather * struct mmu_table_batch **batch = &tlb->batch;
if (*batch) {
tlb_table_invalidate(tlb); call_rcu_sched(&(*batch)->rcu, tlb_remove_table_rcu); *batch = NULL; }
@@ -387,11 +403,13 @@ void tlb_remove_table(struct mmu_gather if (*batch == NULL) { *batch = (struct mmu_table_batch *)__get_free_page(GFP_NOWAIT | __GFP_NOWARN); if (*batch == NULL) {
tlb_table_invalidate(tlb); tlb_remove_table_one(table); return; } (*batch)->nr = 0; }
(*batch)->tables[(*batch)->nr++] = table; if ((*batch)->nr == MAX_TABLE_BATCH) tlb_table_flush(tlb);
Kernel crashed on x86 device running LTP fcntl34 test case on 4.18.6-rc1, fcntl34.c:58: INFO: waiting for '12' threads
[ 1075.624862] BUG: stack guard page was hit at 0000000079c81098 (stack is 000000002c7d6db4..00000000d386d6df) [ 1075.634606] kernel stack overflow (double-fault): 0000 [#2] SMP PTI [ 1075.640871] CPU: 3 PID: 17735 Comm: fcntl34_64 Tainted: G D W 4.18.6-rc1 #1 [ 1075.648954] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.0b 07/27/2017 [ 1075.656428] RIP: 0010:flush_tlb_func_common.constprop.14+0x29c/0x4d0 [ 1075.662776] Code: 03 1d 70 e3 da 4e 83 c2 01 0f b7 d2 49 0f ab 13 eb b5 0f 1f 44 00 00 e9 70 fe ff ff 65 ff 05 6b 40 db 4e 48 8b 05 bc e8 8f 01 <e8> ff 95 08 00 85 c0 74 0d 80 3d ee c5 8f 01 00 0f 84 4a 01 00 00 [ 1075.681645] RSP: 0018:ffffbd2482cbc000 EFLAGS: 00010083 [ 1075.686863] RAX: 0000000000000000 RBX: ffff98915adf0002 RCX: ffffbd2482cbc010 [ 1075.693986] RDX: 0000000000000803 RSI: 00007f5aae00a000 RDI: ffffbd2482cbc080 [ 1075.701124] RBP: ffffbd2482cbc060 R08: ffffffffb2b86c00 R09: 0000008000000000 [ 1075.708287] R10: 000000000002161a R11: 2008188a00000121 R12: 0000000000000162 [ 1075.715410] R13: 0000000000000003 R14: 00007f5aae00a000 R15: 00007f5aae000000 [ 1075.722536] FS: 00007f5aaeff3740(0000) GS:ffff98916fd80000(0000) knlGS:0000000000000000 [ 1075.730619] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1075.736357] CR2: ffffbd2482cbbff8 CR3: 000000045368c003 CR4: 00000000003606e0 [ 1075.743481] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1075.750606] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1075.757730] Call Trace: [ 1075.760176] flush_tlb_mm_range+0x119/0x130 [ 1075.764358] ? flush_tlb_mm_range+0x119/0x130 [ 1075.768711] tlb_flush_mmu_tlbonly+0x6e/0xd0 [ 1075.772984] ? tlb_flush_mmu_tlbonly+0x6e/0xd0 [ 1075.777428] tlb_table_flush.part.113+0x12/0x30 [ 1075.781954] tlb_flush_mmu_tlbonly+0x4b/0xd0 [ 1075.786224] tlb_table_flush.part.113+0x12/0x30 [ 1075.790749] tlb_flush_mmu_tlbonly+0x4b/0xd0
Full test log link, https://lkft.validation.linaro.org/scheduler/job/404027#L4051
Does Linus's tree also crash with this patch applied? Being "bug compatible" is good :)
thanks,
greg k-h
On 04. sep. 2018 07:24, Greg Kroah-Hartman wrote:
Full test log link, https://lkft.validation.linaro.org/scheduler/job/404027#L4051
Does Linus's tree also crash with this patch applied? Being "bug compatible" is good :)
thanks,
greg k-h
I suspect it is because we're missing upstream commit db7ddef301128dad394f1c0f77027f86ee9a4edb mm: move tlb_table_flush to tlb_flush_mmu_free
Have not tested.
On 04. sep. 2018 08:10, Andre Tomt wrote:
On 04. sep. 2018 07:24, Greg Kroah-Hartman wrote:
Full test log link, https://lkft.validation.linaro.org/scheduler/job/404027#L4051
Does Linus's tree also crash with this patch applied? Being "bug compatible" is good :)
thanks,
greg k-h
I suspect it is because we're missing upstream commit db7ddef301128dad394f1c0f77027f86ee9a4edb mm: move tlb_table_flush to tlb_flush_mmu_free
Have not tested.
Tested it now. Adding this commit to my local 4.18.6-rc1 based tree fixes the crashing on my systems.
On 4.18.6-rc1 they would fall over within seconds of getting to the login prompt.
Full test log link, https://lkft.validation.linaro.org/scheduler/job/404027#L4051
Does Linus's tree also crash with this patch applied? Being "bug compatible" is good :)
No. It did not crash on mainline kernel.
- Naresh
thanks,
greg k-h
On Tue, Sep 04, 2018 at 10:08:13AM +0530, Naresh Kamboju wrote:
On 3 September 2018 at 22:26, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
4.18-stable review patch. If anyone has any objections, please let me know.
From: Peter Zijlstra peterz@infradead.org
commit d86564a2f085b79ec046a5cba90188e612352806 upstream.
Jann reported that x86 was missing required TLB invalidates when he hit the !*batch slow path in tlb_remove_table().
This is indeed the case; RCU_TABLE_FREE does not provide TLB (cache) invalidates, the PowerPC-hash where this code originated and the Sparc-hash where this was subsequently used did not need that. ARM which later used this put an explicit TLB invalidate in their __p*_free_tlb() functions, and PowerPC-radix followed that example.
But when we hooked up x86 we failed to consider this. Fix this by (optionally) hooking tlb_remove_table() into the TLB invalidate code.
NOTE: s390 was also needing something like this and might now be able to use the generic code again.
[ Modified to be on top of Nick's cleanups, which simplified this patch now that tlb_flush_mmu_tlbonly() really only flushes the TLB - Linus ]
Fixes: 9e52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)") Reported-by: Jann Horn jannh@google.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Rik van Riel riel@surriel.com Cc: Nicholas Piggin npiggin@gmail.com Cc: David Miller davem@davemloft.net Cc: Will Deacon will.deacon@arm.com Cc: Martin Schwidefsky schwidefsky@de.ibm.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: stable@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
arch/Kconfig | 3 +++ arch/x86/Kconfig | 1 + mm/memory.c | 18 ++++++++++++++++++ 3 files changed, 22 insertions(+)
--- a/arch/Kconfig +++ b/arch/Kconfig @@ -354,6 +354,9 @@ config HAVE_ARCH_JUMP_LABEL config HAVE_RCU_TABLE_FREE bool
+config HAVE_RCU_TABLE_INVALIDATE
bool
config ARCH_HAVE_NMI_SAFE_CMPXCHG bool
--- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -179,6 +179,7 @@ config X86 select HAVE_PERF_REGS select HAVE_PERF_USER_STACK_DUMP select HAVE_RCU_TABLE_FREE
select HAVE_RCU_TABLE_INVALIDATE if HAVE_RCU_TABLE_FREE select HAVE_REGS_AND_STACK_ACCESS_API select HAVE_RELIABLE_STACKTRACE if X86_64 && UNWINDER_FRAME_POINTER && STACK_VALIDATION select HAVE_STACKPROTECTOR if CC_HAS_SANE_STACKPROTECTOR
--- a/mm/memory.c +++ b/mm/memory.c @@ -330,6 +330,21 @@ bool __tlb_remove_page_size(struct mmu_g
- See the comment near struct mmu_table_batch.
*/
+/*
- If we want tlb_remove_table() to imply TLB invalidates.
- */
+static inline void tlb_table_invalidate(struct mmu_gather *tlb) +{ +#ifdef CONFIG_HAVE_RCU_TABLE_INVALIDATE
/*
* Invalidate page-table caches used by hardware walkers. Then we still
* need to RCU-sched wait while freeing the pages because software
* walkers can still be in-flight.
*/
tlb_flush_mmu_tlbonly(tlb);
+#endif +}
static void tlb_remove_table_smp_sync(void *arg) { /* Simply deliver the interrupt */ @@ -366,6 +381,7 @@ void tlb_table_flush(struct mmu_gather * struct mmu_table_batch **batch = &tlb->batch;
if (*batch) {
tlb_table_invalidate(tlb); call_rcu_sched(&(*batch)->rcu, tlb_remove_table_rcu); *batch = NULL; }
@@ -387,11 +403,13 @@ void tlb_remove_table(struct mmu_gather if (*batch == NULL) { *batch = (struct mmu_table_batch *)__get_free_page(GFP_NOWAIT | __GFP_NOWARN); if (*batch == NULL) {
tlb_table_invalidate(tlb); tlb_remove_table_one(table); return; } (*batch)->nr = 0; }
(*batch)->tables[(*batch)->nr++] = table; if ((*batch)->nr == MAX_TABLE_BATCH) tlb_table_flush(tlb);
Kernel crashed on x86 device running LTP fcntl34 test case on 4.18.6-rc1, fcntl34.c:58: INFO: waiting for '12' threads
[ 1075.624862] BUG: stack guard page was hit at 0000000079c81098 (stack is 000000002c7d6db4..00000000d386d6df) [ 1075.634606] kernel stack overflow (double-fault): 0000 [#2] SMP PTI [ 1075.640871] CPU: 3 PID: 17735 Comm: fcntl34_64 Tainted: G D W 4.18.6-rc1 #1 [ 1075.648954] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.0b 07/27/2017 [ 1075.656428] RIP: 0010:flush_tlb_func_common.constprop.14+0x29c/0x4d0 [ 1075.662776] Code: 03 1d 70 e3 da 4e 83 c2 01 0f b7 d2 49 0f ab 13 eb b5 0f 1f 44 00 00 e9 70 fe ff ff 65 ff 05 6b 40 db 4e 48 8b 05 bc e8 8f 01 <e8> ff 95 08 00 85 c0 74 0d 80 3d ee c5 8f 01 00 0f 84 4a 01 00 00 [ 1075.681645] RSP: 0018:ffffbd2482cbc000 EFLAGS: 00010083 [ 1075.686863] RAX: 0000000000000000 RBX: ffff98915adf0002 RCX: ffffbd2482cbc010 [ 1075.693986] RDX: 0000000000000803 RSI: 00007f5aae00a000 RDI: ffffbd2482cbc080 [ 1075.701124] RBP: ffffbd2482cbc060 R08: ffffffffb2b86c00 R09: 0000008000000000 [ 1075.708287] R10: 000000000002161a R11: 2008188a00000121 R12: 0000000000000162 [ 1075.715410] R13: 0000000000000003 R14: 00007f5aae00a000 R15: 00007f5aae000000 [ 1075.722536] FS: 00007f5aaeff3740(0000) GS:ffff98916fd80000(0000) knlGS:0000000000000000 [ 1075.730619] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1075.736357] CR2: ffffbd2482cbbff8 CR3: 000000045368c003 CR4: 00000000003606e0 [ 1075.743481] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1075.750606] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1075.757730] Call Trace: [ 1075.760176] flush_tlb_mm_range+0x119/0x130 [ 1075.764358] ? flush_tlb_mm_range+0x119/0x130 [ 1075.768711] tlb_flush_mmu_tlbonly+0x6e/0xd0 [ 1075.772984] ? tlb_flush_mmu_tlbonly+0x6e/0xd0 [ 1075.777428] tlb_table_flush.part.113+0x12/0x30 [ 1075.781954] tlb_flush_mmu_tlbonly+0x4b/0xd0 [ 1075.786224] tlb_table_flush.part.113+0x12/0x30 [ 1075.790749] tlb_flush_mmu_tlbonly+0x4b/0xd0
Full test log link, https://lkft.validation.linaro.org/scheduler/job/404027#L4051
I have pushed out a -rc2 that should fix this problem.
thanks,
greg k-h
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vlastimil Babka vbabka@suse.cz
commit 9df9516940a61d29aedf4d91b483ca6597e7d480 upstream.
On 32bit PAE kernels on 64bit hardware with enough physical bits, l1tf_pfn_limit() will overflow unsigned long. This in turn affects max_swapfile_size() and can lead to swapon returning -EINVAL. This has been observed in a 32bit guest with 42 bits physical address size, where max_swapfile_size() overflows exactly to 1 << 32, thus zero, and produces the following warning to dmesg:
[ 6.396845] Truncating oversized swap area, only using 0k out of 2047996k
Fix this by using unsigned long long instead.
Fixes: 17dbca119312 ("x86/speculation/l1tf: Add sysfs reporting for l1tf") Fixes: 377eeaa8e11f ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2") Reported-by: Dominique Leuenberger dimstar@suse.de Reported-by: Adrian Schroeter adrian@suse.de Signed-off-by: Vlastimil Babka vbabka@suse.cz Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Andi Kleen ak@linux.intel.com Acked-by: Michal Hocko mhocko@suse.com Cc: "H . Peter Anvin" hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Dave Hansen dave.hansen@intel.com Cc: Michal Hocko mhocko@kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180820095835.5298-1-vbabka@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/processor.h | 4 ++-- arch/x86/mm/init.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-)
--- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -181,9 +181,9 @@ extern const struct seq_operations cpuin
extern void cpu_detect(struct cpuinfo_x86 *c);
-static inline unsigned long l1tf_pfn_limit(void) +static inline unsigned long long l1tf_pfn_limit(void) { - return BIT(boot_cpu_data.x86_phys_bits - 1 - PAGE_SHIFT) - 1; + return BIT_ULL(boot_cpu_data.x86_phys_bits - 1 - PAGE_SHIFT) - 1; }
extern void early_cpu_init(void); --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -923,7 +923,7 @@ unsigned long max_swapfile_size(void)
if (boot_cpu_has_bug(X86_BUG_L1TF)) { /* Limit the swap file size to MAX_PA/2 for L1TF workaround */ - unsigned long l1tf_limit = l1tf_pfn_limit() + 1; + unsigned long long l1tf_limit = l1tf_pfn_limit() + 1; /* * We encode swap offsets also with 3 bits below those for pfn * which makes the usable limit higher. @@ -931,7 +931,7 @@ unsigned long max_swapfile_size(void) #if CONFIG_PGTABLE_LEVELS > 2 l1tf_limit <<= PAGE_SHIFT - SWP_OFFSET_FIRST_BIT; #endif - pages = min_t(unsigned long, l1tf_limit, pages); + pages = min_t(unsigned long long, l1tf_limit, pages); } return pages; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vlastimil Babka vbabka@suse.cz
commit b0a182f875689647b014bc01d36b340217792852 upstream.
Two users have reported [1] that they have an "extremely unlikely" system with more than MAX_PA/2 memory and L1TF mitigation is not effective. In fact it's a CPU with 36bits phys limit (64GB) and 32GB memory, but due to holes in the e820 map, the main region is almost 500MB over the 32GB limit:
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000081effffff] usable
Suggestions to use 'mem=32G' to enable the L1TF mitigation while losing the 500MB revealed, that there's an off-by-one error in the check in l1tf_select_mitigation().
l1tf_pfn_limit() returns the last usable pfn (inclusive) and the range check in the mitigation path does not take this into account.
Instead of amending the range check, make l1tf_pfn_limit() return the first PFN which is over the limit which is less error prone. Adjust the other users accordingly.
[1] https://bugzilla.suse.com/show_bug.cgi?id=1105536
Fixes: 17dbca119312 ("x86/speculation/l1tf: Add sysfs reporting for l1tf") Reported-by: George Anchev studio@anchev.net Reported-by: Christopher Snowhill kode54@gmail.com Signed-off-by: Vlastimil Babka vbabka@suse.cz Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: "H . Peter Anvin" hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Andi Kleen ak@linux.intel.com Cc: Dave Hansen dave.hansen@intel.com Cc: Michal Hocko mhocko@kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180823134418.17008-1-vbabka@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/processor.h | 2 +- arch/x86/mm/init.c | 2 +- arch/x86/mm/mmap.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-)
--- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -183,7 +183,7 @@ extern void cpu_detect(struct cpuinfo_x8
static inline unsigned long long l1tf_pfn_limit(void) { - return BIT_ULL(boot_cpu_data.x86_phys_bits - 1 - PAGE_SHIFT) - 1; + return BIT_ULL(boot_cpu_data.x86_phys_bits - 1 - PAGE_SHIFT); }
extern void early_cpu_init(void); --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -923,7 +923,7 @@ unsigned long max_swapfile_size(void)
if (boot_cpu_has_bug(X86_BUG_L1TF)) { /* Limit the swap file size to MAX_PA/2 for L1TF workaround */ - unsigned long long l1tf_limit = l1tf_pfn_limit() + 1; + unsigned long long l1tf_limit = l1tf_pfn_limit(); /* * We encode swap offsets also with 3 bits below those for pfn * which makes the usable limit higher. --- a/arch/x86/mm/mmap.c +++ b/arch/x86/mm/mmap.c @@ -257,7 +257,7 @@ bool pfn_modify_allowed(unsigned long pf /* If it's real memory always allow */ if (pfn_valid(pfn)) return true; - if (pfn > l1tf_pfn_limit() && !capable(CAP_SYS_ADMIN)) + if (pfn >= l1tf_pfn_limit() && !capable(CAP_SYS_ADMIN)) return false; return true; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vlastimil Babka vbabka@suse.cz
commit 6a012288d6906fee1dbc244050ade1dafe4a9c8d upstream.
Two users have reported [1] that they have an "extremely unlikely" system with more than MAX_PA/2 memory and L1TF mitigation is not effective.
Make the warning more helpful by suggesting the proper mem=X kernel boot parameter to make it effective and a link to the L1TF document to help decide if the mitigation is worth the unusable RAM.
[1] https://bugzilla.suse.com/show_bug.cgi?id=1105536
Suggested-by: Michal Hocko mhocko@suse.com Signed-off-by: Vlastimil Babka vbabka@suse.cz Acked-by: Michal Hocko mhocko@suse.com Cc: "H . Peter Anvin" hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Andi Kleen ak@linux.intel.com Cc: Dave Hansen dave.hansen@intel.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/966571f0-9d7f-43dc-92c6-a10eec7a1254@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/cpu/bugs.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -686,6 +686,10 @@ static void __init l1tf_select_mitigatio half_pa = (u64)l1tf_pfn_limit() << PAGE_SHIFT; if (e820__mapped_any(half_pa, ULLONG_MAX - half_pa, E820_TYPE_RAM)) { pr_warn("System has more than MAX_PA/2 memory. L1TF mitigation not effective.\n"); + pr_info("You may make it effective by booting the kernel with mem=%llu parameter.\n", + half_pa); + pr_info("However, doing so will make a part of your RAM unusable.\n"); + pr_info("Reading https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html might help you decide.\n"); return; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Lutomirski luto@kernel.org
commit 2e549b2ee0e358bc758480e716b881f9cabedb6a upstream.
Currently, if the vDSO ends up containing an indirect branch or call, GCC will emit the "external thunk" style of retpoline, and it will fail to link.
Fix it by building the vDSO with inline retpoline thunks.
I haven't seen any reports of this triggering on an unpatched kernel.
Fixes: commit 76b043848fd2 ("x86/retpoline: Add initial retpoline support") Signed-off-by: Andy Lutomirski luto@kernel.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Matt Rickard matt@softrans.com.au Cc: Borislav Petkov bp@alien8.de Cc: Jason Vas Dias jason.vas.dias@gmail.com Cc: David Woodhouse dwmw2@infradead.org Cc: Peter Zijlstra peterz@infradead.org Cc: Andi Kleen ak@linux.intel.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/c76538cd3afbe19c6246c2d1715bc6a60bd63985.153444838... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 4 ++++ arch/x86/entry/vdso/Makefile | 6 ++++-- 2 files changed, 8 insertions(+), 2 deletions(-)
--- a/Makefile +++ b/Makefile @@ -493,9 +493,13 @@ KBUILD_AFLAGS += $(call cc-option, -no-i endif
RETPOLINE_CFLAGS_GCC := -mindirect-branch=thunk-extern -mindirect-branch-register +RETPOLINE_VDSO_CFLAGS_GCC := -mindirect-branch=thunk-inline -mindirect-branch-register RETPOLINE_CFLAGS_CLANG := -mretpoline-external-thunk +RETPOLINE_VDSO_CFLAGS_CLANG := -mretpoline RETPOLINE_CFLAGS := $(call cc-option,$(RETPOLINE_CFLAGS_GCC),$(call cc-option,$(RETPOLINE_CFLAGS_CLANG))) +RETPOLINE_VDSO_CFLAGS := $(call cc-option,$(RETPOLINE_VDSO_CFLAGS_GCC),$(call cc-option,$(RETPOLINE_VDSO_CFLAGS_CLANG))) export RETPOLINE_CFLAGS +export RETPOLINE_VDSO_CFLAGS
KBUILD_CFLAGS += $(call cc-option,-fno-PIE) KBUILD_AFLAGS += $(call cc-option,-fno-PIE) --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -72,9 +72,9 @@ $(obj)/vdso-image-%.c: $(obj)/vdso%.so.d CFL := $(PROFILING) -mcmodel=small -fPIC -O2 -fasynchronous-unwind-tables -m64 \ $(filter -g%,$(KBUILD_CFLAGS)) $(call cc-option, -fno-stack-protector) \ -fno-omit-frame-pointer -foptimize-sibling-calls \ - -DDISABLE_BRANCH_PROFILING -DBUILD_VDSO + -DDISABLE_BRANCH_PROFILING -DBUILD_VDSO $(RETPOLINE_VDSO_CFLAGS)
-$(vobjs): KBUILD_CFLAGS := $(filter-out $(GCC_PLUGINS_CFLAGS),$(KBUILD_CFLAGS)) $(CFL) +$(vobjs): KBUILD_CFLAGS := $(filter-out $(GCC_PLUGINS_CFLAGS) $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS)) $(CFL)
# # vDSO code runs in userspace and -pg doesn't help with profiling anyway. @@ -138,11 +138,13 @@ KBUILD_CFLAGS_32 := $(filter-out -mcmode KBUILD_CFLAGS_32 := $(filter-out -fno-pic,$(KBUILD_CFLAGS_32)) KBUILD_CFLAGS_32 := $(filter-out -mfentry,$(KBUILD_CFLAGS_32)) KBUILD_CFLAGS_32 := $(filter-out $(GCC_PLUGINS_CFLAGS),$(KBUILD_CFLAGS_32)) +KBUILD_CFLAGS_32 := $(filter-out $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS_32)) KBUILD_CFLAGS_32 += -m32 -msoft-float -mregparm=0 -fpic KBUILD_CFLAGS_32 += $(call cc-option, -fno-stack-protector) KBUILD_CFLAGS_32 += $(call cc-option, -foptimize-sibling-calls) KBUILD_CFLAGS_32 += -fno-omit-frame-pointer KBUILD_CFLAGS_32 += -DDISABLE_BRANCH_PROFILING +KBUILD_CFLAGS_32 += $(RETPOLINE_VDSO_CFLAGS) $(obj)/vdso32.so.dbg: KBUILD_CFLAGS = $(KBUILD_CFLAGS_32)
$(obj)/vdso32.so.dbg: FORCE \
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rian Hunter rian@alum.mit.edu
commit dc76803e57cc86589c4efcb5362918f9b0c0436f upstream.
The consolidation of the start_thread() functions removed the export unintentionally. This breaks binfmt handlers built as a module.
Add it back.
Fixes: e634d8fc792c ("x86-64: merge the standard and compat start_thread() functions") Signed-off-by: Rian Hunter rian@alum.mit.edu Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: "H. Peter Anvin" hpa@zytor.com Cc: Andy Lutomirski luto@kernel.org Cc: Borislav Petkov bpetkov@suse.de Cc: Vitaly Kuznetsov vkuznets@redhat.com Cc: Joerg Roedel jroedel@suse.de Cc: Dmitry Safonov dima@arista.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180819230854.7275-1-rian@alum.mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/process_64.c | 1 + 1 file changed, 1 insertion(+)
--- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -384,6 +384,7 @@ start_thread(struct pt_regs *regs, unsig start_thread_common(regs, new_ip, new_sp, __USER_CS, __USER_DS, 0); } +EXPORT_SYMBOL_GPL(start_thread);
#ifdef CONFIG_COMPAT void compat_start_thread(struct pt_regs *regs, u32 new_ip, u32 new_sp)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Bonzini pbonzini@redhat.com
commit 44883f01fe6ae436a8604c47d8435276fef369b0 upstream.
Some of the MSRs returned by GET_MSR_INDEX_LIST currently cannot be sent back to KVM_GET_MSR and/or KVM_SET_MSR; either they can never be sent back, or you they are only accepted under special conditions. This makes the API a pain to use.
To avoid this pain, this patch makes it so that the result of the get-list ioctl can always be used for host-initiated get and set. Since we don't have a separate way to check for read-only MSRs, this means some Hyper-V MSRs are ignored when written. Arguably they should not even be in the result of GET_MSR_INDEX_LIST, but I am leaving there in case userspace is using the outcome of GET_MSR_INDEX_LIST to derive the support for the corresponding Hyper-V feature.
Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/hyperv.c | 27 ++++++++++++++++++++------- arch/x86/kvm/hyperv.h | 2 +- arch/x86/kvm/x86.c | 15 +++++++++------ 3 files changed, 30 insertions(+), 14 deletions(-)
--- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -235,7 +235,7 @@ static int synic_set_msr(struct kvm_vcpu struct kvm_vcpu *vcpu = synic_to_vcpu(synic); int ret;
- if (!synic->active) + if (!synic->active && !host) return 1;
trace_kvm_hv_synic_set_msr(vcpu->vcpu_id, msr, data, host); @@ -295,11 +295,12 @@ static int synic_set_msr(struct kvm_vcpu return ret; }
-static int synic_get_msr(struct kvm_vcpu_hv_synic *synic, u32 msr, u64 *pdata) +static int synic_get_msr(struct kvm_vcpu_hv_synic *synic, u32 msr, u64 *pdata, + bool host) { int ret;
- if (!synic->active) + if (!synic->active && !host) return 1;
ret = 0; @@ -1014,6 +1015,11 @@ static int kvm_hv_set_msr_pw(struct kvm_ case HV_X64_MSR_TSC_EMULATION_STATUS: hv->hv_tsc_emulation_status = data; break; + case HV_X64_MSR_TIME_REF_COUNT: + /* read-only, but still ignore it if host-initiated */ + if (!host) + return 1; + break; default: vcpu_unimpl(vcpu, "Hyper-V uhandled wrmsr: 0x%x data 0x%llx\n", msr, data); @@ -1101,6 +1107,12 @@ static int kvm_hv_set_msr(struct kvm_vcp return stimer_set_count(vcpu_to_stimer(vcpu, timer_index), data, host); } + case HV_X64_MSR_TSC_FREQUENCY: + case HV_X64_MSR_APIC_FREQUENCY: + /* read-only, but still ignore it if host-initiated */ + if (!host) + return 1; + break; default: vcpu_unimpl(vcpu, "Hyper-V uhandled wrmsr: 0x%x data 0x%llx\n", msr, data); @@ -1156,7 +1168,8 @@ static int kvm_hv_get_msr_pw(struct kvm_ return 0; }
-static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata) +static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, + bool host) { u64 data = 0; struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv; @@ -1183,7 +1196,7 @@ static int kvm_hv_get_msr(struct kvm_vcp case HV_X64_MSR_SIMP: case HV_X64_MSR_EOM: case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15: - return synic_get_msr(vcpu_to_synic(vcpu), msr, pdata); + return synic_get_msr(vcpu_to_synic(vcpu), msr, pdata, host); case HV_X64_MSR_STIMER0_CONFIG: case HV_X64_MSR_STIMER1_CONFIG: case HV_X64_MSR_STIMER2_CONFIG: @@ -1229,7 +1242,7 @@ int kvm_hv_set_msr_common(struct kvm_vcp return kvm_hv_set_msr(vcpu, msr, data, host); }
-int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata) +int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host) { if (kvm_hv_msr_partition_wide(msr)) { int r; @@ -1239,7 +1252,7 @@ int kvm_hv_get_msr_common(struct kvm_vcp mutex_unlock(&vcpu->kvm->arch.hyperv.hv_lock); return r; } else - return kvm_hv_get_msr(vcpu, msr, pdata); + return kvm_hv_get_msr(vcpu, msr, pdata, host); }
static __always_inline int get_sparse_bank_no(u64 valid_bank_mask, int bank_no) --- a/arch/x86/kvm/hyperv.h +++ b/arch/x86/kvm/hyperv.h @@ -48,7 +48,7 @@ static inline struct kvm_vcpu *synic_to_ }
int kvm_hv_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host); -int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata); +int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host);
bool kvm_hv_hypercall_enabled(struct kvm *kvm); int kvm_hv_hypercall(struct kvm_vcpu *vcpu); --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2185,10 +2185,11 @@ static int set_msr_mce(struct kvm_vcpu * vcpu->arch.mcg_status = data; break; case MSR_IA32_MCG_CTL: - if (!(mcg_cap & MCG_CTL_P)) + if (!(mcg_cap & MCG_CTL_P) && + (data || !msr_info->host_initiated)) return 1; if (data != 0 && data != ~(u64)0) - return -1; + return 1; vcpu->arch.mcg_ctl = data; break; default: @@ -2576,7 +2577,7 @@ int kvm_get_msr(struct kvm_vcpu *vcpu, s } EXPORT_SYMBOL_GPL(kvm_get_msr);
-static int get_msr_mce(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata) +static int get_msr_mce(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host) { u64 data; u64 mcg_cap = vcpu->arch.mcg_cap; @@ -2591,7 +2592,7 @@ static int get_msr_mce(struct kvm_vcpu * data = vcpu->arch.mcg_cap; break; case MSR_IA32_MCG_CTL: - if (!(mcg_cap & MCG_CTL_P)) + if (!(mcg_cap & MCG_CTL_P) && !host) return 1; data = vcpu->arch.mcg_ctl; break; @@ -2724,7 +2725,8 @@ int kvm_get_msr_common(struct kvm_vcpu * case MSR_IA32_MCG_CTL: case MSR_IA32_MCG_STATUS: case MSR_IA32_MC0_CTL ... MSR_IA32_MCx_CTL(KVM_MAX_MCE_BANKS) - 1: - return get_msr_mce(vcpu, msr_info->index, &msr_info->data); + return get_msr_mce(vcpu, msr_info->index, &msr_info->data, + msr_info->host_initiated); case MSR_K7_CLK_CTL: /* * Provide expected ramp-up count for K7. All other @@ -2745,7 +2747,8 @@ int kvm_get_msr_common(struct kvm_vcpu * case HV_X64_MSR_TSC_EMULATION_CONTROL: case HV_X64_MSR_TSC_EMULATION_STATUS: return kvm_hv_get_msr_common(vcpu, - msr_info->index, &msr_info->data); + msr_info->index, &msr_info->data, + msr_info->host_initiated); break; case MSR_IA32_BBL_CR_CTL3: /* This legacy MSR exists but isn't fully documented in current
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrey Ryabinin aryabinin@virtuozzo.com
commit a2477b0e67c52f4364a47c3ad70902bc2a61bd4c upstream.
fuse_dev_splice_write() reads pipe->buffers to determine the size of 'bufs' array before taking the pipe_lock(). This is not safe as another thread might change the 'pipe->buffers' between the allocation and taking the pipe_lock(). So we end up with too small 'bufs' array.
Move the bufs allocations inside pipe_lock()/pipe_unlock() to fix this.
Fixes: dd3bb14f44a6 ("fuse: support splice() writing to fuse device") Signed-off-by: Andrey Ryabinin aryabinin@virtuozzo.com Cc: stable@vger.kernel.org # v2.6.35 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/dev.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -1944,12 +1944,15 @@ static ssize_t fuse_dev_splice_write(str if (!fud) return -EPERM;
+ pipe_lock(pipe); + bufs = kmalloc_array(pipe->buffers, sizeof(struct pipe_buffer), GFP_KERNEL); - if (!bufs) + if (!bufs) { + pipe_unlock(pipe); return -ENOMEM; + }
- pipe_lock(pipe); nbuf = 0; rem = 0; for (idx = 0; idx < pipe->nrbufs && rem < len; idx++)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
commit 63576c13bd17848376c8ba4a98f5d5151140c4ac upstream.
If parallel dirops are enabled in FUSE_INIT reply, then first operation may leave fi->mutex held.
Reported-by: syzbot syzbot+3f7b29af1baa9d0a55be@syzkaller.appspotmail.com Fixes: 5c672ab3f0ee ("fuse: serialize dirops by default") Cc: stable@vger.kernel.org # v4.7 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/dir.c | 10 ++++++---- fs/fuse/fuse_i.h | 4 ++-- fs/fuse/inode.c | 14 ++++++++++---- 3 files changed, 18 insertions(+), 10 deletions(-)
--- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -355,11 +355,12 @@ static struct dentry *fuse_lookup(struct struct inode *inode; struct dentry *newent; bool outarg_valid = true; + bool locked;
- fuse_lock_inode(dir); + locked = fuse_lock_inode(dir); err = fuse_lookup_name(dir->i_sb, get_node_id(dir), &entry->d_name, &outarg, &inode); - fuse_unlock_inode(dir); + fuse_unlock_inode(dir, locked); if (err == -ENOENT) { outarg_valid = false; err = 0; @@ -1340,6 +1341,7 @@ static int fuse_readdir(struct file *fil struct fuse_conn *fc = get_fuse_conn(inode); struct fuse_req *req; u64 attr_version = 0; + bool locked;
if (is_bad_inode(inode)) return -EIO; @@ -1367,9 +1369,9 @@ static int fuse_readdir(struct file *fil fuse_read_fill(req, file, ctx->pos, PAGE_SIZE, FUSE_READDIR); } - fuse_lock_inode(inode); + locked = fuse_lock_inode(inode); fuse_request_send(fc, req); - fuse_unlock_inode(inode); + fuse_unlock_inode(inode, locked); nbytes = req->out.args[0].size; err = req->out.h.error; fuse_put_request(fc, req); --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -974,8 +974,8 @@ int fuse_do_setattr(struct dentry *dentr
void fuse_set_initialized(struct fuse_conn *fc);
-void fuse_unlock_inode(struct inode *inode); -void fuse_lock_inode(struct inode *inode); +void fuse_unlock_inode(struct inode *inode, bool locked); +bool fuse_lock_inode(struct inode *inode);
int fuse_setxattr(struct inode *inode, const char *name, const void *value, size_t size, int flags); --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -357,15 +357,21 @@ int fuse_reverse_inval_inode(struct supe return 0; }
-void fuse_lock_inode(struct inode *inode) +bool fuse_lock_inode(struct inode *inode) { - if (!get_fuse_conn(inode)->parallel_dirops) + bool locked = false; + + if (!get_fuse_conn(inode)->parallel_dirops) { mutex_lock(&get_fuse_inode(inode)->mutex); + locked = true; + } + + return locked; }
-void fuse_unlock_inode(struct inode *inode) +void fuse_unlock_inode(struct inode *inode, bool locked) { - if (!get_fuse_conn(inode)->parallel_dirops) + if (locked) mutex_unlock(&get_fuse_inode(inode)->mutex); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
commit 87114373ea507895a62afb10d2910bd9adac35a8 upstream.
Refcounting of request is broken when fuse_abort_conn() is called and request is on the fpq->io list:
- ref is taken too late - then it is not dropped
Fixes: 0d8e84b0432b ("fuse: simplify request abort") Cc: stable@vger.kernel.org # v4.2 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/dev.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -371,7 +371,7 @@ static void request_end(struct fuse_conn struct fuse_iqueue *fiq = &fc->iq;
if (test_and_set_bit(FR_FINISHED, &req->flags)) - return; + goto out_put_req;
spin_lock(&fiq->waitq.lock); list_del_init(&req->intr_entry); @@ -400,6 +400,7 @@ static void request_end(struct fuse_conn wake_up(&req->waitq); if (req->end) req->end(fc, req); +out_put_req: fuse_put_request(fc, req); }
@@ -2108,6 +2109,7 @@ void fuse_abort_conn(struct fuse_conn *f set_bit(FR_ABORTED, &req->flags); if (!test_bit(FR_LOCKED, &req->flags)) { set_bit(FR_PRIVATE, &req->flags); + __fuse_get_request(req); list_move(&req->list, &to_end1); } spin_unlock(&req->waitq.lock); @@ -2134,7 +2136,6 @@ void fuse_abort_conn(struct fuse_conn *f
while (!list_empty(&to_end1)) { req = list_first_entry(&to_end1, struct fuse_req, list); - __fuse_get_request(req); list_del_init(&req->list); request_end(fc, req); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
commit 45ff350bbd9d0f0977ff270a0d427c71520c0c37 upstream.
fuse_dev_release() assumes that it's the only one referencing the fpq->processing list, but that's not true, since fuse_abort_conn() can be doing the same without any serialization between the two.
Fixes: c3696046beb3 ("fuse: separate pqueue for clones") Cc: stable@vger.kernel.org # v4.2 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/dev.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -2153,9 +2153,15 @@ int fuse_dev_release(struct inode *inode if (fud) { struct fuse_conn *fc = fud->fc; struct fuse_pqueue *fpq = &fud->pq; + LIST_HEAD(to_end);
+ spin_lock(&fpq->lock); WARN_ON(!list_empty(&fpq->io)); - end_requests(fc, &fpq->processing); + list_splice_init(&fpq->processing, &to_end); + spin_unlock(&fpq->lock); + + end_requests(fc, &to_end); + /* Are we the last open device? */ if (atomic_dec_and_test(&fc->dev_count)) { WARN_ON(fc->iq.fasync != NULL);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
commit b8f95e5d13f5f0191dcb4b9113113d241636e7cb upstream.
fuse_abort_conn() does not guarantee that all async requests have actually finished aborting (i.e. their ->end() function is called). This could actually result in still used inodes after umount.
Add a helper to wait until all requests are fully done. This is done by looking at the "num_waiting" counter. When this counter drops to zero, we can be sure that no more requests are outstanding.
Fixes: 0d8e84b0432b ("fuse: simplify request abort") Cc: stable@vger.kernel.org # v4.2 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/dev.c | 23 +++++++++++++++++++---- fs/fuse/fuse_i.h | 1 + fs/fuse/inode.c | 2 ++ 3 files changed, 22 insertions(+), 4 deletions(-)
--- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -127,6 +127,16 @@ static bool fuse_block_alloc(struct fuse return !fc->initialized || (for_background && fc->blocked); }
+static void fuse_drop_waiting(struct fuse_conn *fc) +{ + if (fc->connected) { + atomic_dec(&fc->num_waiting); + } else if (atomic_dec_and_test(&fc->num_waiting)) { + /* wake up aborters */ + wake_up_all(&fc->blocked_waitq); + } +} + static struct fuse_req *__fuse_get_req(struct fuse_conn *fc, unsigned npages, bool for_background) { @@ -175,7 +185,7 @@ static struct fuse_req *__fuse_get_req(s return req;
out: - atomic_dec(&fc->num_waiting); + fuse_drop_waiting(fc); return ERR_PTR(err); }
@@ -285,7 +295,7 @@ void fuse_put_request(struct fuse_conn *
if (test_bit(FR_WAITING, &req->flags)) { __clear_bit(FR_WAITING, &req->flags); - atomic_dec(&fc->num_waiting); + fuse_drop_waiting(fc); }
if (req->stolen_file) @@ -371,7 +381,7 @@ static void request_end(struct fuse_conn struct fuse_iqueue *fiq = &fc->iq;
if (test_and_set_bit(FR_FINISHED, &req->flags)) - goto out_put_req; + goto put_request;
spin_lock(&fiq->waitq.lock); list_del_init(&req->intr_entry); @@ -400,7 +410,7 @@ static void request_end(struct fuse_conn wake_up(&req->waitq); if (req->end) req->end(fc, req); -out_put_req: +put_request: fuse_put_request(fc, req); }
@@ -2146,6 +2156,11 @@ void fuse_abort_conn(struct fuse_conn *f } EXPORT_SYMBOL_GPL(fuse_abort_conn);
+void fuse_wait_aborted(struct fuse_conn *fc) +{ + wait_event(fc->blocked_waitq, atomic_read(&fc->num_waiting) == 0); +} + int fuse_dev_release(struct inode *inode, struct file *file) { struct fuse_dev *fud = fuse_get_dev(file); --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -862,6 +862,7 @@ void fuse_request_send_background_locked
/* Abort all requests */ void fuse_abort_conn(struct fuse_conn *fc, bool is_abort); +void fuse_wait_aborted(struct fuse_conn *fc);
/** * Invalidate inode attributes --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -400,6 +400,8 @@ static void fuse_put_super(struct super_ fuse_send_destroy(fc);
fuse_abort_conn(fc, false); + fuse_wait_aborted(fc); + mutex_lock(&fuse_mutex); list_del(&fc->entry); fuse_ctl_remove_conn(fc);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
commit e8f3bd773d22f488724dffb886a1618da85c2966 upstream.
syzbot is hitting NULL pointer dereference at process_init_reply(). This is because deactivate_locked_super() is called before response for initial request is processed.
Fix this by aborting and waiting for all requests (including FUSE_INIT) before resetting fc->sb.
Original patch by Tetsuo Handa penguin-kernel@I-love.SKAURA.ne.jp.
Reported-by: syzbot syzbot+b62f08f4d5857755e3bc@syzkaller.appspotmail.com Fixes: e27c9d3877a0 ("fuse: fuse: add time_gran to INIT_OUT") Cc: stable@vger.kernel.org # v3.19 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/inode.c | 25 +++++++++++-------------- 1 file changed, 11 insertions(+), 14 deletions(-)
--- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -397,11 +397,6 @@ static void fuse_put_super(struct super_ { struct fuse_conn *fc = get_fuse_conn_super(sb);
- fuse_send_destroy(fc); - - fuse_abort_conn(fc, false); - fuse_wait_aborted(fc); - mutex_lock(&fuse_mutex); list_del(&fc->entry); fuse_ctl_remove_conn(fc); @@ -1218,16 +1213,25 @@ static struct dentry *fuse_mount(struct return mount_nodev(fs_type, flags, raw_data, fuse_fill_super); }
-static void fuse_kill_sb_anon(struct super_block *sb) +static void fuse_sb_destroy(struct super_block *sb) { struct fuse_conn *fc = get_fuse_conn_super(sb);
if (fc) { + fuse_send_destroy(fc); + + fuse_abort_conn(fc, false); + fuse_wait_aborted(fc); + down_write(&fc->killsb); fc->sb = NULL; up_write(&fc->killsb); } +}
+static void fuse_kill_sb_anon(struct super_block *sb) +{ + fuse_sb_destroy(sb); kill_anon_super(sb); }
@@ -1250,14 +1254,7 @@ static struct dentry *fuse_mount_blk(str
static void fuse_kill_sb_blk(struct super_block *sb) { - struct fuse_conn *fc = get_fuse_conn_super(sb); - - if (fc) { - down_write(&fc->killsb); - fc->sb = NULL; - up_write(&fc->killsb); - } - + fuse_sb_destroy(sb); kill_block_super(sb); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kirill Tkhai ktkhai@virtuozzo.com
commit 109728ccc5933151c68d1106e4065478a487a323 upstream.
The above error path returns with page unlocked, so this place seems also to behave the same.
Fixes: f8dbdf81821b ("fuse: rework fuse_readpages()") Signed-off-by: Kirill Tkhai ktkhai@virtuozzo.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/file.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -866,6 +866,7 @@ static int fuse_readpages_fill(void *_da }
if (WARN_ON(req->num_pages >= req->max_pages)) { + unlock_page(page); fuse_put_request(fc, req); return -EIO; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bart Van Assche bart.vanassche@wdc.com
commit 554ec508653688c21d9b8024af73a1ffaa0164b9 upstream.
This patch avoids that gcc reports the following when building with W=1:
lib/vsprintf.c:1941:3: warning: this statement may fall through [-Wimplicit-fallthrough=] switch (fmt[1]) { ^~~~~~
Fixes: 7b1924a1d930eb2 ("vsprintf: add printk specifier %px") Link: http://lkml.kernel.org/r/20180806223421.11995-1-bart.vanassche@wdc.com Cc: linux-kernel@vger.kernel.org Cc: Bart Van Assche bart.vanassche@wdc.com Cc: Pantelis Antoniou pantelis.antoniou@konsulko.com Cc: Joe Perches joe@perches.com Cc: Rob Herring robh@kernel.org Cc: v4.15+ stable@vger.kernel.org Signed-off-by: Bart Van Assche bart.vanassche@wdc.com Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- lib/vsprintf.c | 1 + 1 file changed, 1 insertion(+)
--- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -1942,6 +1942,7 @@ char *pointer(const char *fmt, char *buf case 'F': return device_node_string(buf, end, ptr, spec, fmt + 1); } + break; case 'x': return pointer_string(buf, end, ptr, spec); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka mpatocka@redhat.com
commit 8456b99c16d193c4c3b7df305cf431e027f0189c upstream.
If we leave urbs around, it causes not only leak, but also memory corruption. This patch fixes the function udl_free_urb_list, so that it always waits for all urbs that are in progress.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie airlied@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/udl/udl_main.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-)
--- a/drivers/gpu/drm/udl/udl_main.c +++ b/drivers/gpu/drm/udl/udl_main.c @@ -170,18 +170,13 @@ static void udl_free_urb_list(struct drm struct list_head *node; struct urb_node *unode; struct urb *urb; - int ret; unsigned long flags;
DRM_DEBUG("Waiting for completes and freeing all render urbs\n");
/* keep waiting and freeing, until we've got 'em all */ while (count--) { - - /* Getting interrupted means a leak, but ok at shutdown*/ - ret = down_interruptible(&udl->urbs.limit_sem); - if (ret) - break; + down(&udl->urbs.limit_sem);
spin_lock_irqsave(&udl->urbs.lock, flags);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka mpatocka@redhat.com
commit 542bb9788a1f485eb1a2229178f665d8ea166156 upstream.
Allocations larger than PAGE_ALLOC_COSTLY_ORDER are unreliable and they may fail anytime. This patch fixes the udl kms driver so that when a large alloactions fails, it tries to do multiple smaller allocations.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie airlied@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/udl/udl_main.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-)
--- a/drivers/gpu/drm/udl/udl_main.c +++ b/drivers/gpu/drm/udl/udl_main.c @@ -200,17 +200,22 @@ static void udl_free_urb_list(struct drm static int udl_alloc_urb_list(struct drm_device *dev, int count, size_t size) { struct udl_device *udl = dev->dev_private; - int i = 0; struct urb *urb; struct urb_node *unode; char *buf; + size_t wanted_size = count * size;
spin_lock_init(&udl->urbs.lock);
+retry: udl->urbs.size = size; INIT_LIST_HEAD(&udl->urbs.list);
- while (i < count) { + sema_init(&udl->urbs.limit_sem, 0); + udl->urbs.count = 0; + udl->urbs.available = 0; + + while (udl->urbs.count * size < wanted_size) { unode = kzalloc(sizeof(struct urb_node), GFP_KERNEL); if (!unode) break; @@ -226,11 +231,16 @@ static int udl_alloc_urb_list(struct drm } unode->urb = urb;
- buf = usb_alloc_coherent(udl->udev, MAX_TRANSFER, GFP_KERNEL, + buf = usb_alloc_coherent(udl->udev, size, GFP_KERNEL, &urb->transfer_dma); if (!buf) { kfree(unode); usb_free_urb(urb); + if (size > PAGE_SIZE) { + size /= 2; + udl_free_urb_list(dev); + goto retry; + } break; }
@@ -241,16 +251,14 @@ static int udl_alloc_urb_list(struct drm
list_add_tail(&unode->entry, &udl->urbs.list);
- i++; + up(&udl->urbs.limit_sem); + udl->urbs.count++; + udl->urbs.available++; }
- sema_init(&udl->urbs.limit_sem, i); - udl->urbs.count = i; - udl->urbs.available = i; - - DRM_DEBUG("allocated %d %d byte urbs\n", i, (int) size); + DRM_DEBUG("allocated %d %d byte urbs\n", udl->urbs.count, (int) size);
- return i; + return udl->urbs.count; }
struct urb *udl_get_urb(struct drm_device *dev)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka mpatocka@redhat.com
commit 09a00abe3a9941c2715ca83eb88172cd2f54d8fd upstream.
We must use kzalloc when allocating the fb_deferred_io structure. Otherwise, the field first_io is undefined and it causes a crash.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie airlied@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/udl/udl_fb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/udl/udl_fb.c +++ b/drivers/gpu/drm/udl/udl_fb.c @@ -221,7 +221,7 @@ static int udl_fb_open(struct fb_info *i
struct fb_deferred_io *fbdefio;
- fbdefio = kmalloc(sizeof(struct fb_deferred_io), GFP_KERNEL); + fbdefio = kzalloc(sizeof(struct fb_deferred_io), GFP_KERNEL);
if (fbdefio) { fbdefio->delay = DL_DEFIO_WRITE_DELAY;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka mpatocka@redhat.com
commit 91ba11fb7d7ca0a3bbe8a512e65e666e2ec1e889 upstream.
Division is slow, so it shouldn't be done by the pixel generating code. The driver supports only 2 or 4 bytes per pixel, so we can replace division with a shift.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie airlied@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/udl/udl_drv.h | 2 - drivers/gpu/drm/udl/udl_fb.c | 15 ++++++++------ drivers/gpu/drm/udl/udl_transfer.c | 39 ++++++++++++++++++------------------- 3 files changed, 30 insertions(+), 26 deletions(-)
--- a/drivers/gpu/drm/udl/udl_drv.h +++ b/drivers/gpu/drm/udl/udl_drv.h @@ -112,7 +112,7 @@ udl_fb_user_fb_create(struct drm_device struct drm_file *file, const struct drm_mode_fb_cmd2 *mode_cmd);
-int udl_render_hline(struct drm_device *dev, int bpp, struct urb **urb_ptr, +int udl_render_hline(struct drm_device *dev, int log_bpp, struct urb **urb_ptr, const char *front, char **urb_buf_ptr, u32 byte_offset, u32 device_byte_offset, u32 byte_width, int *ident_ptr, int *sent_ptr); --- a/drivers/gpu/drm/udl/udl_fb.c +++ b/drivers/gpu/drm/udl/udl_fb.c @@ -90,7 +90,10 @@ int udl_handle_damage(struct udl_framebu int bytes_identical = 0; struct urb *urb; int aligned_x; - int bpp = fb->base.format->cpp[0]; + int log_bpp; + + BUG_ON(!is_power_of_2(fb->base.format->cpp[0])); + log_bpp = __ffs(fb->base.format->cpp[0]);
if (!fb->active_16) return 0; @@ -125,12 +128,12 @@ int udl_handle_damage(struct udl_framebu
for (i = y; i < y + height ; i++) { const int line_offset = fb->base.pitches[0] * i; - const int byte_offset = line_offset + (x * bpp); - const int dev_byte_offset = (fb->base.width * bpp * i) + (x * bpp); - if (udl_render_hline(dev, bpp, &urb, + const int byte_offset = line_offset + (x << log_bpp); + const int dev_byte_offset = (fb->base.width * i + x) << log_bpp; + if (udl_render_hline(dev, log_bpp, &urb, (char *) fb->obj->vmapping, &cmd, byte_offset, dev_byte_offset, - width * bpp, + width << log_bpp, &bytes_identical, &bytes_sent)) goto error; } @@ -149,7 +152,7 @@ int udl_handle_damage(struct udl_framebu error: atomic_add(bytes_sent, &udl->bytes_sent); atomic_add(bytes_identical, &udl->bytes_identical); - atomic_add(width*height*bpp, &udl->bytes_rendered); + atomic_add((width * height) << log_bpp, &udl->bytes_rendered); end_cycles = get_cycles(); atomic_add(((unsigned int) ((end_cycles - start_cycles) >> 10)), /* Kcycles */ --- a/drivers/gpu/drm/udl/udl_transfer.c +++ b/drivers/gpu/drm/udl/udl_transfer.c @@ -83,12 +83,12 @@ static inline u16 pixel32_to_be16(const ((pixel >> 8) & 0xf800)); }
-static inline u16 get_pixel_val16(const uint8_t *pixel, int bpp) +static inline u16 get_pixel_val16(const uint8_t *pixel, int log_bpp) { - u16 pixel_val16 = 0; - if (bpp == 2) + u16 pixel_val16; + if (log_bpp == 1) pixel_val16 = *(const uint16_t *)pixel; - else if (bpp == 4) + else pixel_val16 = pixel32_to_be16(*(const uint32_t *)pixel); return pixel_val16; } @@ -125,8 +125,9 @@ static void udl_compress_hline16( const u8 *const pixel_end, uint32_t *device_address_ptr, uint8_t **command_buffer_ptr, - const uint8_t *const cmd_buffer_end, int bpp) + const uint8_t *const cmd_buffer_end, int log_bpp) { + const int bpp = 1 << log_bpp; const u8 *pixel = *pixel_start_ptr; uint32_t dev_addr = *device_address_ptr; uint8_t *cmd = *command_buffer_ptr; @@ -153,12 +154,12 @@ static void udl_compress_hline16( raw_pixels_count_byte = cmd++; /* we'll know this later */ raw_pixel_start = pixel;
- cmd_pixel_end = pixel + min3(MAX_CMD_PIXELS + 1UL, - (unsigned long)(pixel_end - pixel) / bpp, - (unsigned long)(cmd_buffer_end - 1 - cmd) / 2) * bpp; + cmd_pixel_end = pixel + (min3(MAX_CMD_PIXELS + 1UL, + (unsigned long)(pixel_end - pixel) >> log_bpp, + (unsigned long)(cmd_buffer_end - 1 - cmd) / 2) << log_bpp);
prefetch_range((void *) pixel, cmd_pixel_end - pixel); - pixel_val16 = get_pixel_val16(pixel, bpp); + pixel_val16 = get_pixel_val16(pixel, log_bpp);
while (pixel < cmd_pixel_end) { const u8 *const start = pixel; @@ -170,7 +171,7 @@ static void udl_compress_hline16( pixel += bpp;
while (pixel < cmd_pixel_end) { - pixel_val16 = get_pixel_val16(pixel, bpp); + pixel_val16 = get_pixel_val16(pixel, log_bpp); if (pixel_val16 != repeating_pixel_val16) break; pixel += bpp; @@ -179,10 +180,10 @@ static void udl_compress_hline16( if (unlikely(pixel > start + bpp)) { /* go back and fill in raw pixel count */ *raw_pixels_count_byte = (((start - - raw_pixel_start) / bpp) + 1) & 0xFF; + raw_pixel_start) >> log_bpp) + 1) & 0xFF;
/* immediately after raw data is repeat byte */ - *cmd++ = (((pixel - start) / bpp) - 1) & 0xFF; + *cmd++ = (((pixel - start) >> log_bpp) - 1) & 0xFF;
/* Then start another raw pixel span */ raw_pixel_start = pixel; @@ -192,14 +193,14 @@ static void udl_compress_hline16(
if (pixel > raw_pixel_start) { /* finalize last RAW span */ - *raw_pixels_count_byte = ((pixel-raw_pixel_start) / bpp) & 0xFF; + *raw_pixels_count_byte = ((pixel - raw_pixel_start) >> log_bpp) & 0xFF; } else { /* undo unused byte */ cmd--; }
- *cmd_pixels_count_byte = ((pixel - cmd_pixel_start) / bpp) & 0xFF; - dev_addr += ((pixel - cmd_pixel_start) / bpp) * 2; + *cmd_pixels_count_byte = ((pixel - cmd_pixel_start) >> log_bpp) & 0xFF; + dev_addr += ((pixel - cmd_pixel_start) >> log_bpp) * 2; }
if (cmd_buffer_end <= MIN_RLX_CMD_BYTES + cmd) { @@ -222,19 +223,19 @@ static void udl_compress_hline16( * (that we can only write to, slowly, and can never read), and (optionally) * our shadow copy that tracks what's been sent to that hardware buffer. */ -int udl_render_hline(struct drm_device *dev, int bpp, struct urb **urb_ptr, +int udl_render_hline(struct drm_device *dev, int log_bpp, struct urb **urb_ptr, const char *front, char **urb_buf_ptr, u32 byte_offset, u32 device_byte_offset, u32 byte_width, int *ident_ptr, int *sent_ptr) { const u8 *line_start, *line_end, *next_pixel; - u32 base16 = 0 + (device_byte_offset / bpp) * 2; + u32 base16 = 0 + (device_byte_offset >> log_bpp) * 2; struct urb *urb = *urb_ptr; u8 *cmd = *urb_buf_ptr; u8 *cmd_end = (u8 *) urb->transfer_buffer + urb->transfer_buffer_length;
- BUG_ON(!(bpp == 2 || bpp == 4)); + BUG_ON(!(log_bpp == 1 || log_bpp == 2));
line_start = (u8 *) (front + byte_offset); next_pixel = line_start; @@ -244,7 +245,7 @@ int udl_render_hline(struct drm_device *
udl_compress_hline16(&next_pixel, line_end, &base16, - (u8 **) &cmd, (u8 *) cmd_end, bpp); + (u8 **) &cmd, (u8 *) cmd_end, log_bpp);
if (cmd >= cmd_end) { int len = cmd - (u8 *) urb->transfer_buffer;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Buesch m@bues.ch
commit 4d77a89e3924b12f4a5628b21237e57ab4703866 upstream.
strncpy might not NUL-terminate the string, if the name equals the buffer size. Use strlcpy instead.
Signed-off-by: Michael Buesch m@bues.ch Cc: stable@vger.kernel.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/broadcom/b43legacy/leds.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/wireless/broadcom/b43legacy/leds.c +++ b/drivers/net/wireless/broadcom/b43legacy/leds.c @@ -101,7 +101,7 @@ static int b43legacy_register_led(struct led->dev = dev; led->index = led_index; led->activelow = activelow; - strncpy(led->name, name, sizeof(led->name)); + strlcpy(led->name, name, sizeof(led->name));
led->led_dev.name = led->name; led->led_dev.default_trigger = default_trigger;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Buesch m@bues.ch
commit 2aa650d1950fce94f696ebd7db30b8830c2c946f upstream.
strncpy might not NUL-terminate the string, if the name equals the buffer size. Use strlcpy instead.
Signed-off-by: Michael Buesch m@bues.ch Cc: stable@vger.kernel.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/broadcom/b43/leds.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/wireless/broadcom/b43/leds.c +++ b/drivers/net/wireless/broadcom/b43/leds.c @@ -131,7 +131,7 @@ static int b43_register_led(struct b43_w led->wl = dev->wl; led->index = led_index; led->activelow = activelow; - strncpy(led->name, name, sizeof(led->name)); + strlcpy(led->name, name, sizeof(led->name)); atomic_set(&led->state, 0);
led->led_dev.name = led->name;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jerome Brunet jbrunet@baylibre.com
commit 4febced15ac8ddb9cf3e603edb111842e4863d9a upstream.
When merging codec formats, dpcm_runtime_base_format() should skip the codecs which are not supporting the current stream direction.
At the moment, if a BE link has more than one codec, and only one of these codecs has no capture DAI, it becomes impossible to start a capture stream because the merged format would be 0.
Skipping invalid codec DAI solves the problem.
Fixes: b073ed4e2126 ("ASoC: soc-pcm: DPCM cares BE format") Signed-off-by: Jerome Brunet jbrunet@baylibre.com Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/soc/soc-pcm.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/sound/soc/soc-pcm.c +++ b/sound/soc/soc-pcm.c @@ -1694,6 +1694,14 @@ static u64 dpcm_runtime_base_format(stru int i;
for (i = 0; i < be->num_codecs; i++) { + /* + * Skip CODECs which don't support the current stream + * type. See soc_pcm_init_runtime_hw() for more details + */ + if (!snd_soc_dai_stream_valid(be->codec_dais[i], + stream)) + continue; + codec_dai_drv = be->codec_dais[i]->driver; if (stream == SNDRV_PCM_STREAM_PLAYBACK) codec_stream = &codec_dai_drv->playback;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit c889a45d229938a94b50aadb819def8bb11a6a54 upstream.
zx-tdm driver sets the DAI driver definitions with the format bits wrongly set with SNDRV_PCM_FORMAT_*, instead of SNDRV_PCM_FMTBIT_*.
This patch corrects the definitions.
Spotted by a sparse warning: sound/soc/zte/zx-tdm.c:363:35: warning: restricted snd_pcm_format_t degrades to integer
Fixes: 870e0ddc4345 ("ASoC: zx-tdm: add zte's tdm controller driver") Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/soc/zte/zx-tdm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/sound/soc/zte/zx-tdm.c +++ b/sound/soc/zte/zx-tdm.c @@ -144,8 +144,8 @@ static void zx_tdm_rx_dma_en(struct zx_t #define ZX_TDM_RATES (SNDRV_PCM_RATE_8000 | SNDRV_PCM_RATE_16000)
#define ZX_TDM_FMTBIT \ - (SNDRV_PCM_FMTBIT_S16_LE | SNDRV_PCM_FORMAT_MU_LAW | \ - SNDRV_PCM_FORMAT_A_LAW) + (SNDRV_PCM_FMTBIT_S16_LE | SNDRV_PCM_FMTBIT_MU_LAW | \ + SNDRV_PCM_FMTBIT_A_LAW)
static int zx_tdm_dai_probe(struct snd_soc_dai *dai) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo A. R. Silva gustavo@embeddedor.com
commit ae1c696a480c67c45fb23b35162183f72c6be0e1 upstream.
There is a potential execution path in which function platform_get_resource() returns NULL. If this happens, we will end up having a NULL pointer dereference.
Fix this by replacing devm_ioremap with devm_ioremap_resource, which has the NULL check and the memory region request.
This code was detected with the help of Coccinelle.
Cc: stable@vger.kernel.org Fixes: 2bd8d1d5cf89 ("ASoC: sirf: Add audio usp interface driver") Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/soc/sirf/sirf-usp.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/sound/soc/sirf/sirf-usp.c +++ b/sound/soc/sirf/sirf-usp.c @@ -370,10 +370,9 @@ static int sirf_usp_pcm_probe(struct pla platform_set_drvdata(pdev, usp);
mem_res = platform_get_resource(pdev, IORESOURCE_MEM, 0); - base = devm_ioremap(&pdev->dev, mem_res->start, - resource_size(mem_res)); - if (base == NULL) - return -ENOMEM; + base = devm_ioremap_resource(&pdev->dev, mem_res); + if (IS_ERR(base)) + return PTR_ERR(base); usp->regmap = devm_regmap_init_mmio(&pdev->dev, base, &sirf_usp_regmap_config); if (IS_ERR(usp->regmap))
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ajit Pandey ajit.pandey@cirrus.com
commit b1470d4ce77c2d60661c7d5325d4fb8063e15ff8 upstream.
The offset of the DSP core needs to be taken into account for the DSP preloader control get and put. Currently the dsp->preloaded variable will only ever be read/updated on the first DSP, whilst this doesn't affect the operation of the control the readback will be incorrect.
Signed-off-by: Ajit Pandey ajit.pandey@cirrus.com Signed-off-by: Charles Keepax ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/soc/codecs/wm_adsp.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/sound/soc/codecs/wm_adsp.c +++ b/sound/soc/codecs/wm_adsp.c @@ -2642,7 +2642,10 @@ int wm_adsp2_preloader_get(struct snd_kc struct snd_ctl_elem_value *ucontrol) { struct snd_soc_component *component = snd_soc_kcontrol_component(kcontrol); - struct wm_adsp *dsp = snd_soc_component_get_drvdata(component); + struct wm_adsp *dsps = snd_soc_component_get_drvdata(component); + struct soc_mixer_control *mc = + (struct soc_mixer_control *)kcontrol->private_value; + struct wm_adsp *dsp = &dsps[mc->shift - 1];
ucontrol->value.integer.value[0] = dsp->preloaded;
@@ -2654,10 +2657,11 @@ int wm_adsp2_preloader_put(struct snd_kc struct snd_ctl_elem_value *ucontrol) { struct snd_soc_component *component = snd_soc_kcontrol_component(kcontrol); - struct wm_adsp *dsp = snd_soc_component_get_drvdata(component); + struct wm_adsp *dsps = snd_soc_component_get_drvdata(component); struct snd_soc_dapm_context *dapm = snd_soc_component_get_dapm(component); struct soc_mixer_control *mc = (struct soc_mixer_control *)kcontrol->private_value; + struct wm_adsp *dsp = &dsps[mc->shift - 1]; char preload[32];
snprintf(preload, ARRAY_SIZE(preload), "DSP%u Preload", mc->shift);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 78ee559d7fc65e37670a46cfbeaaa62cb014af67 upstream.
Make sure to set the mem device release callback before calling put_device() in a couple of probe error paths so that the containing object also gets freed.
Fixes: d1de6d6c639b ("soc: qcom: Remote filesystem memory driver") Cc: stable stable@vger.kernel.org # 4.15 Cc: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Andy Gross andy.gross@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/soc/qcom/rmtfs_mem.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/soc/qcom/rmtfs_mem.c +++ b/drivers/soc/qcom/rmtfs_mem.c @@ -184,6 +184,7 @@ static int qcom_rmtfs_mem_probe(struct p device_initialize(&rmtfs_mem->dev); rmtfs_mem->dev.parent = &pdev->dev; rmtfs_mem->dev.groups = qcom_rmtfs_mem_groups; + rmtfs_mem->dev.release = qcom_rmtfs_mem_release_device;
rmtfs_mem->base = devm_memremap(&rmtfs_mem->dev, rmtfs_mem->addr, rmtfs_mem->size, MEMREMAP_WC); @@ -206,8 +207,6 @@ static int qcom_rmtfs_mem_probe(struct p goto put_device; }
- rmtfs_mem->dev.release = qcom_rmtfs_mem_release_device; - ret = of_property_read_u32(node, "qcom,vmid", &vmid); if (ret < 0 && ret != -EINVAL) { dev_err(&pdev->dev, "failed to parse qcom,vmid\n");
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Himanshu Madhani himanshu.madhani@cavium.com
commit 15b6c3c9568765f0717b2dd3aa67a5f7eadd9734 upstream.
This patch sets and clears FCF_ASYNC_{SENT|ACTIVE} flags to prevent stalling of relogin attempt. Once flag are correctly set/cleared, relogin timer can retry relogin attempt for driver to continue login.
Fixes: fa83e65885b9 ("scsi: qla2xxx: ensure async flags are reset correctly") Cc: stable@vger.kernel.org #4.17 Signed-off-by: Himanshu Madhani himanshu.madhani@cavium.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/qla2xxx/qla_init.c | 2 +- drivers/scsi/qla2xxx/qla_iocb.c | 1 + 2 files changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -382,7 +382,7 @@ qla2x00_async_adisc_sp_done(void *ptr, i "Async done-%s res %x %8phC\n", sp->name, res, sp->fcport->port_name);
- sp->fcport->flags &= ~FCF_ASYNC_SENT; + sp->fcport->flags &= ~(FCF_ASYNC_SENT | FCF_ASYNC_ACTIVE);
memset(&ea, 0, sizeof(ea)); ea.event = FCME_ADISC_DONE; --- a/drivers/scsi/qla2xxx/qla_iocb.c +++ b/drivers/scsi/qla2xxx/qla_iocb.c @@ -2656,6 +2656,7 @@ qla24xx_els_dcmd2_iocb(scsi_qla_host_t * ql_dbg(ql_dbg_io, vha, 0x3073, "Enter: PLOGI portid=%06x\n", fcport->d_id.b24);
+ fcport->flags |= FCF_ASYNC_SENT; sp->type = SRB_ELS_DCMD; sp->name = "ELS_DCMD"; sp->fcport = fcport;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Samuel Neves sneves@dei.uc.pt
commit e78e5a91456fcecaa2efbb3706572fe043766f4d upstream.
In the __getcpu function, lsl is using the wrong target and destination registers. Luckily, the compiler tends to choose %eax for both variables, so it has been working so far.
Fixes: a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available") Signed-off-by: Samuel Neves sneves@dei.uc.pt Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Andy Lutomirski luto@kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180901201452.27828-1-sneves@dei.uc.pt Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/vgtod.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/include/asm/vgtod.h +++ b/arch/x86/include/asm/vgtod.h @@ -93,7 +93,7 @@ static inline unsigned int __getcpu(void * * If RDPID is available, use it. */ - alternative_io ("lsl %[p],%[seg]", + alternative_io ("lsl %[seg],%[p]", ".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */ X86_FEATURE_RDPID, [p] "=a" (p), [seg] "r" (__PER_CPU_SEG));
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Lutomirski luto@kernel.org
commit 4012e77a903d114f915fc607d6d2ed54a3d6c9b1 upstream.
A NMI can hit in the middle of context switching or in the middle of switch_mm_irqs_off(). In either case, CR3 might not match current->mm, which could cause copy_from_user_nmi() and friends to read the wrong memory.
Fix it by adding a new nmi_uaccess_okay() helper and checking it in copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.
Signed-off-by: Andy Lutomirski luto@kernel.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Rik van Riel riel@surriel.com Cc: Nadav Amit nadav.amit@gmail.com Cc: Borislav Petkov bp@alien8.de Cc: Jann Horn jannh@google.com Cc: Peter Zijlstra peterz@infradead.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/dd956eba16646fd0b15c3c0741269dfd84452dac.153555728... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/events/core.c | 2 +- arch/x86/include/asm/tlbflush.h | 40 ++++++++++++++++++++++++++++++++++++++++ arch/x86/lib/usercopy.c | 5 +++++ arch/x86/mm/tlb.c | 7 +++++++ 4 files changed, 53 insertions(+), 1 deletion(-)
--- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -2465,7 +2465,7 @@ perf_callchain_user(struct perf_callchai
perf_callchain_store(entry, regs->ip);
- if (!current->mm) + if (!nmi_uaccess_okay()) return;
if (perf_callchain_user32(regs, entry)) --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -175,8 +175,16 @@ struct tlb_state { * are on. This means that it may not match current->active_mm, * which will contain the previous user mm when we're in lazy TLB * mode even if we've already switched back to swapper_pg_dir. + * + * During switch_mm_irqs_off(), loaded_mm will be set to + * LOADED_MM_SWITCHING during the brief interrupts-off window + * when CR3 and loaded_mm would otherwise be inconsistent. This + * is for nmi_uaccess_okay()'s benefit. */ struct mm_struct *loaded_mm; + +#define LOADED_MM_SWITCHING ((struct mm_struct *)1) + u16 loaded_mm_asid; u16 next_asid; /* last user mm's ctx id */ @@ -246,6 +254,38 @@ struct tlb_state { }; DECLARE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate);
+/* + * Blindly accessing user memory from NMI context can be dangerous + * if we're in the middle of switching the current user task or + * switching the loaded mm. It can also be dangerous if we + * interrupted some kernel code that was temporarily using a + * different mm. + */ +static inline bool nmi_uaccess_okay(void) +{ + struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm); + struct mm_struct *current_mm = current->mm; + + VM_WARN_ON_ONCE(!loaded_mm); + + /* + * The condition we want to check is + * current_mm->pgd == __va(read_cr3_pa()). This may be slow, though, + * if we're running in a VM with shadow paging, and nmi_uaccess_okay() + * is supposed to be reasonably fast. + * + * Instead, we check the almost equivalent but somewhat conservative + * condition below, and we rely on the fact that switch_mm_irqs_off() + * sets loaded_mm to LOADED_MM_SWITCHING before writing to CR3. + */ + if (loaded_mm != current_mm) + return false; + + VM_WARN_ON_ONCE(current_mm->pgd != __va(read_cr3_pa())); + + return true; +} + /* Initialize cr4 shadow for this CPU. */ static inline void cr4_init_shadow(void) { --- a/arch/x86/lib/usercopy.c +++ b/arch/x86/lib/usercopy.c @@ -7,6 +7,8 @@ #include <linux/uaccess.h> #include <linux/export.h>
+#include <asm/tlbflush.h> + /* * We rely on the nested NMI work to allow atomic faults from the NMI path; the * nested NMI paths are careful to preserve CR2. @@ -19,6 +21,9 @@ copy_from_user_nmi(void *to, const void if (__range_not_ok(from, n, TASK_SIZE)) return n;
+ if (!nmi_uaccess_okay()) + return n; + /* * Even though this function is typically called from NMI/IRQ context * disable pagefaults so that its behaviour is consistent even when --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -298,6 +298,10 @@ void switch_mm_irqs_off(struct mm_struct
choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
+ /* Let nmi_uaccess_okay() know that we're changing CR3. */ + this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING); + barrier(); + if (need_flush) { this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id); this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen); @@ -328,6 +332,9 @@ void switch_mm_irqs_off(struct mm_struct if (next != &init_mm) this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id);
+ /* Make sure we write CR3 before loaded_mm. */ + barrier(); + this_cpu_write(cpu_tlbstate.loaded_mm, next); this_cpu_write(cpu_tlbstate.loaded_mm_asid, new_asid); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nick Desaulniers ndesaulniers@google.com
commit 1f59a4581b5ecfe9b4f049a7a2cf904d8352842d upstream.
This should have been marked extern inline in order to pick up the out of line definition in arch/x86/kernel/irqflags.S.
Fixes: 208cbb325589 ("x86/irqflags: Provide a declaration for native_save_fl") Reported-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Nick Desaulniers ndesaulniers@google.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Juergen Gross jgross@suse.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: Boris Ostrovsky boris.ostrovsky@oracle.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180827214011.55428-1-ndesaulniers@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/irqflags.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/irqflags.h +++ b/arch/x86/include/asm/irqflags.h @@ -33,7 +33,8 @@ extern inline unsigned long native_save_ return flags; }
-static inline void native_restore_fl(unsigned long flags) +extern inline void native_restore_fl(unsigned long flags); +extern inline void native_restore_fl(unsigned long flags) { asm volatile("push %0 ; popf" : /* no output */
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andi Kleen ak@linux.intel.com
commit 1ab534e85c93945f7862378d8c8adcf408205b19 upstream.
The check for Spectre microcodes does not check for family 6, only the model numbers.
Add a family 6 check to avoid ambiguity with other families.
Fixes: a5b296636453 ("x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes") Signed-off-by: Andi Kleen ak@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180824170351.34874-2-andi@firstfloor.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/cpu/intel.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -150,6 +150,9 @@ static bool bad_spectre_microcode(struct if (cpu_has(c, X86_FEATURE_HYPERVISOR)) return false;
+ if (c->x86 != 6) + return false; + for (i = 0; i < ARRAY_SIZE(spectre_bad_microcodes); i++) { if (c->x86_model == spectre_bad_microcodes[i].model && c->x86_stepping == spectre_bad_microcodes[i].stepping)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andi Kleen ak@linux.intel.com
commit cc51e5428ea54f575d49cfcede1d4cb3a72b4ec4 upstream.
On Nehalem and newer core CPUs the CPU cache internally uses 44 bits physical address space. The L1TF workaround is limited by this internal cache address width, and needs to have one bit free there for the mitigation to work.
Older client systems report only 36bit physical address space so the range check decides that L1TF is not mitigated for a 36bit phys/32GB system with some memory holes.
But since these actually have the larger internal cache width this warning is bogus because it would only really be needed if the system had more than 43bits of memory.
Add a new internal x86_cache_bits field. Normally it is the same as the physical bits field reported by CPUID, but for Nehalem and newerforce it to be at least 44bits.
Change the L1TF memory size warning to use the new cache_bits field to avoid bogus warnings and remove the bogus comment about memory size.
Fixes: 17dbca119312 ("x86/speculation/l1tf: Add sysfs reporting for l1tf") Reported-by: George Anchev studio@anchev.net Reported-by: Christopher Snowhill kode54@gmail.com Signed-off-by: Andi Kleen ak@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org Cc: Michael Hocko mhocko@suse.com Cc: vbabka@suse.cz Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180824170351.34874-1-andi@firstfloor.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/processor.h | 4 ++- arch/x86/kernel/cpu/bugs.c | 46 ++++++++++++++++++++++++++++++++++----- arch/x86/kernel/cpu/common.c | 1 3 files changed, 45 insertions(+), 6 deletions(-)
--- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -132,6 +132,8 @@ struct cpuinfo_x86 { /* Index into per_cpu list: */ u16 cpu_index; u32 microcode; + /* Address space bits used by the cache internally */ + u8 x86_cache_bits; unsigned initialized : 1; } __randomize_layout;
@@ -183,7 +185,7 @@ extern void cpu_detect(struct cpuinfo_x8
static inline unsigned long long l1tf_pfn_limit(void) { - return BIT_ULL(boot_cpu_data.x86_phys_bits - 1 - PAGE_SHIFT); + return BIT_ULL(boot_cpu_data.x86_cache_bits - 1 - PAGE_SHIFT); }
extern void early_cpu_init(void); --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -652,6 +652,45 @@ EXPORT_SYMBOL_GPL(l1tf_mitigation); enum vmx_l1d_flush_state l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO; EXPORT_SYMBOL_GPL(l1tf_vmx_mitigation);
+/* + * These CPUs all support 44bits physical address space internally in the + * cache but CPUID can report a smaller number of physical address bits. + * + * The L1TF mitigation uses the top most address bit for the inversion of + * non present PTEs. When the installed memory reaches into the top most + * address bit due to memory holes, which has been observed on machines + * which report 36bits physical address bits and have 32G RAM installed, + * then the mitigation range check in l1tf_select_mitigation() triggers. + * This is a false positive because the mitigation is still possible due to + * the fact that the cache uses 44bit internally. Use the cache bits + * instead of the reported physical bits and adjust them on the affected + * machines to 44bit if the reported bits are less than 44. + */ +static void override_cache_bits(struct cpuinfo_x86 *c) +{ + if (c->x86 != 6) + return; + + switch (c->x86_model) { + case INTEL_FAM6_NEHALEM: + case INTEL_FAM6_WESTMERE: + case INTEL_FAM6_SANDYBRIDGE: + case INTEL_FAM6_IVYBRIDGE: + case INTEL_FAM6_HASWELL_CORE: + case INTEL_FAM6_HASWELL_ULT: + case INTEL_FAM6_HASWELL_GT3E: + case INTEL_FAM6_BROADWELL_CORE: + case INTEL_FAM6_BROADWELL_GT3E: + case INTEL_FAM6_SKYLAKE_MOBILE: + case INTEL_FAM6_SKYLAKE_DESKTOP: + case INTEL_FAM6_KABYLAKE_MOBILE: + case INTEL_FAM6_KABYLAKE_DESKTOP: + if (c->x86_cache_bits < 44) + c->x86_cache_bits = 44; + break; + } +} + static void __init l1tf_select_mitigation(void) { u64 half_pa; @@ -659,6 +698,8 @@ static void __init l1tf_select_mitigatio if (!boot_cpu_has_bug(X86_BUG_L1TF)) return;
+ override_cache_bits(&boot_cpu_data); + switch (l1tf_mitigation) { case L1TF_MITIGATION_OFF: case L1TF_MITIGATION_FLUSH_NOWARN: @@ -678,11 +719,6 @@ static void __init l1tf_select_mitigatio return; #endif
- /* - * This is extremely unlikely to happen because almost all - * systems have far more MAX_PA/2 than RAM can be fit into - * DIMM slots. - */ half_pa = (u64)l1tf_pfn_limit() << PAGE_SHIFT; if (e820__mapped_any(half_pa, ULLONG_MAX - half_pa, E820_TYPE_RAM)) { pr_warn("System has more than MAX_PA/2 memory. L1TF mitigation not effective.\n"); --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -919,6 +919,7 @@ void get_cpu_address_sizes(struct cpuinf else if (cpu_has(c, X86_FEATURE_PAE) || cpu_has(c, X86_FEATURE_PSE36)) c->x86_phys_bits = 36; #endif + c->x86_cache_bits = c->x86_phys_bits; }
static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo A. R. Silva gustavo@embeddedor.com
commit d49dbfade96d5b0863ca8a90122a805edd5ef50a upstream.
val can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
vers/hwmon/nct6775.c:2698 store_pwm_weight_temp_sel() warn: potential spectre issue 'data->temp_src' [r]
Fix this by sanitizing val before using it to index data->temp_src
Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwmon/nct6775.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/hwmon/nct6775.c +++ b/drivers/hwmon/nct6775.c @@ -63,6 +63,7 @@ #include <linux/bitops.h> #include <linux/dmi.h> #include <linux/io.h> +#include <linux/nospec.h> #include "lm75.h"
#define USE_ALTERNATE @@ -2689,6 +2690,7 @@ store_pwm_weight_temp_sel(struct device return err; if (val > NUM_TEMP) return -EINVAL; + val = array_index_nospec(val, NUM_TEMP + 1); if (val && (!(data->have_temp & BIT(val - 1)) || !data->temp_src[val - 1])) return -EINVAL;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
commit f12d11c5c184626b4befdee3d573ec8237405a33 upstream.
Reset the KASAN shadow state of the task stack before rewinding RSP. Without this, a kernel oops will leave parts of the stack poisoned, and code running under do_exit() can trip over such poisoned regions and cause nonsensical false-positive KASAN reports about stack-out-of-bounds bugs.
This does not wipe the exception stacks; if an oops happens on an exception stack, it might result in random KASAN false-positives from other tasks afterwards. This is probably relatively uninteresting, since if the kernel oopses on an exception stack, there are most likely bigger things to worry about. It'd be more interesting if vmapped stacks and KASAN were compatible, since then handle_stack_overflow() would oops from exception stack context.
Fixes: 2deb4be28077 ("x86/dumpstack: When OOPSing, rewind the stack before do_exit()") Signed-off-by: Jann Horn jannh@google.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Andrey Ryabinin aryabinin@virtuozzo.com Cc: Andy Lutomirski luto@kernel.org Cc: Dmitry Vyukov dvyukov@google.com Cc: Alexander Potapenko glider@google.com Cc: Kees Cook keescook@chromium.org Cc: kasan-dev@googlegroups.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180828184033.93712-1-jannh@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/dumpstack.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/arch/x86/kernel/dumpstack.c +++ b/arch/x86/kernel/dumpstack.c @@ -17,6 +17,7 @@ #include <linux/bug.h> #include <linux/nmi.h> #include <linux/sysfs.h> +#include <linux/kasan.h>
#include <asm/cpu_entry_area.h> #include <asm/stacktrace.h> @@ -356,7 +357,10 @@ void oops_end(unsigned long flags, struc * We're not going to return, but we might be on an IST stack or * have very little stack space left. Rewind the stack and kill * the task. + * Before we rewind the stack, we have to tell KASAN that we're going to + * reuse the task stack and that existing poisons are invalid. */ + kasan_unpoison_task_stack(current); rewind_stack_do_exit(signr); } NOKPROBE_SYMBOL(oops_end);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ben Hutchings ben@decadent.org.uk
commit 829fe4aa9ac16417a904ad1de1307de906854bcf upstream.
When bootstrapping an architecture, it's usual to generate the kernel's user-space headers (make headers_install) before building a compiler. Move the compiler check (for asm goto support) to the archprepare target so that it is only done when building code for the target.
Fixes: e501ce957a78 ("x86: Force asm-goto") Reported-by: Helmut Grohne helmutg@debian.org Signed-off-by: Ben Hutchings ben@decadent.org.uk Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180829194317.GA4765@decadent.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/Makefile | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
--- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -180,10 +180,6 @@ ifdef CONFIG_FUNCTION_GRAPH_TRACER endif endif
-ifndef CC_HAVE_ASM_GOTO - $(error Compiler lacks asm-goto support.) -endif - # # Jump labels need '-maccumulate-outgoing-args' for gcc < 4.5.2 to prevent a # GCC bug (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46226). There's no way @@ -317,6 +313,13 @@ PHONY += vdso_install vdso_install: $(Q)$(MAKE) $(build)=arch/x86/entry/vdso $@
+archprepare: checkbin +checkbin: +ifndef CC_HAVE_ASM_GOTO + @echo Compiler lacks asm-goto support. + @exit 1 +endif + archclean: $(Q)rm -rf $(objtree)/arch/i386 $(Q)rm -rf $(objtree)/arch/x86_64
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gerald Schaefer gerald.schaefer@de.ibm.com
commit 37a366face294facb9c9d9fdd9f5b64a27456cbd upstream.
Commit c9b5ad546e7d "s390/mm: tag normal pages vs pages used in page tables" accidentally changed the logic in arch_set_page_states(), which is used by the suspend/resume code. set_page_stable(page, order) was changed to set_page_stable_dat(page, 0). After this, only the first page of higher order pages will be set to stable, and a write to one of the unstable pages will result in an addressing exception.
Fix this by using "order" again, instead of "0".
Fixes: c9b5ad546e7d ("s390/mm: tag normal pages vs pages used in page tables") Cc: stable@vger.kernel.org # 4.14+ Reviewed-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Gerald Schaefer gerald.schaefer@de.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/mm/page-states.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/s390/mm/page-states.c +++ b/arch/s390/mm/page-states.c @@ -271,7 +271,7 @@ void arch_set_page_states(int make_stabl list_for_each(l, &zone->free_area[order].free_list[t]) { page = list_entry(l, struct page, lru); if (make_stable) - set_page_stable_dat(page, 0); + set_page_stable_dat(page, order); else set_page_unused(page, order); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin Schwidefsky schwidefsky@de.ibm.com
commit 5eda25b10297684c1f46a14199ec00210f3c346e upstream.
The memove, memset, memcpy, __memset16, __memset32 and __memset64 function have an additional indirect return branch in form of a "bzr" instruction. These need to use expolines as well.
Cc: stable@vger.kernel.org # v4.17+ Fixes: 97489e0663 ("s390/lib: use expoline for indirect branches") Reviewed-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/lib/mem.S | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
--- a/arch/s390/lib/mem.S +++ b/arch/s390/lib/mem.S @@ -17,7 +17,7 @@ ENTRY(memmove) ltgr %r4,%r4 lgr %r1,%r2 - bzr %r14 + jz .Lmemmove_exit aghi %r4,-1 clgr %r2,%r3 jnh .Lmemmove_forward @@ -36,6 +36,7 @@ ENTRY(memmove) .Lmemmove_forward_remainder: larl %r5,.Lmemmove_mvc ex %r4,0(%r5) +.Lmemmove_exit: BR_EX %r14 .Lmemmove_reverse: ic %r0,0(%r4,%r3) @@ -65,7 +66,7 @@ EXPORT_SYMBOL(memmove) */ ENTRY(memset) ltgr %r4,%r4 - bzr %r14 + jz .Lmemset_exit ltgr %r3,%r3 jnz .Lmemset_fill aghi %r4,-1 @@ -80,6 +81,7 @@ ENTRY(memset) .Lmemset_clear_remainder: larl %r3,.Lmemset_xc ex %r4,0(%r3) +.Lmemset_exit: BR_EX %r14 .Lmemset_fill: cghi %r4,1 @@ -115,7 +117,7 @@ EXPORT_SYMBOL(memset) */ ENTRY(memcpy) ltgr %r4,%r4 - bzr %r14 + jz .Lmemcpy_exit aghi %r4,-1 srlg %r5,%r4,8 ltgr %r5,%r5 @@ -124,6 +126,7 @@ ENTRY(memcpy) .Lmemcpy_remainder: larl %r5,.Lmemcpy_mvc ex %r4,0(%r5) +.Lmemcpy_exit: BR_EX %r14 .Lmemcpy_loop: mvc 0(256,%r1),0(%r3) @@ -145,9 +148,9 @@ EXPORT_SYMBOL(memcpy) .macro __MEMSET bits,bytes,insn ENTRY(__memset\bits) ltgr %r4,%r4 - bzr %r14 + jz .L__memset_exit\bits cghi %r4,\bytes - je .L__memset_exit\bits + je .L__memset_store\bits aghi %r4,-(\bytes+1) srlg %r5,%r4,8 ltgr %r5,%r5 @@ -163,8 +166,9 @@ ENTRY(__memset\bits) larl %r5,.L__memset_mvc\bits ex %r4,0(%r5) BR_EX %r14 -.L__memset_exit\bits: +.L__memset_store\bits: \insn %r3,0(%r2) +.L__memset_exit\bits: BR_EX %r14 .L__memset_mvc\bits: mvc \bytes(1,%r1),0(%r1)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin Schwidefsky schwidefsky@de.ibm.com
commit 26f843848bae973817b3587780ce6b7b0200d3e4 upstream.
For machines without the exrl instruction the BFP jit generates code that uses an "br %r1" instruction located in the lowcore page. Unfortunately there is a cut & paste error that puts an additional "larl %r1,.+14" instruction in the code that clobbers the branch target address in %r1. Remove the larl instruction.
Cc: stable@vger.kernel.org # v4.17+ Fixes: de5cb6eb51 ("s390: use expoline thunks in the BPF JIT") Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/net/bpf_jit_comp.c | 2 -- 1 file changed, 2 deletions(-)
--- a/arch/s390/net/bpf_jit_comp.c +++ b/arch/s390/net/bpf_jit_comp.c @@ -485,8 +485,6 @@ static void bpf_jit_epilogue(struct bpf_ /* br %r1 */ _EMIT2(0x07f1); } else { - /* larl %r1,.+14 */ - EMIT6_PCREL_RILB(0xc0000000, REG_1, jit->prg + 14); /* ex 0,S390_lowcore.br_r1_tampoline */ EMIT4_DISP(0x44000000, REG_0, REG_0, offsetof(struct lowcore, br_r1_trampoline));
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Julian Wiedmann jwi@linux.ibm.com
commit 64e03ff72623b8c2ea89ca3cb660094e019ed4ae upstream.
When allocating a new AOB fails, handle_outbound() is still capable of transmitting the selected buffer (just without async completion).
But if a previous transfer on this queue slot used async completion, its sbal_state flags field is still set to QDIO_OUTBUF_STATE_FLAG_PENDING. So when the upper layer driver sees this stale flag, it expects an async completion that never happens.
Fix this by unconditionally clearing the flags field.
Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks") Cc: stable@vger.kernel.org #v3.2+ Signed-off-by: Julian Wiedmann jwi@linux.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/include/asm/qdio.h | 1 - drivers/s390/cio/qdio_main.c | 5 ++--- 2 files changed, 2 insertions(+), 4 deletions(-)
--- a/arch/s390/include/asm/qdio.h +++ b/arch/s390/include/asm/qdio.h @@ -262,7 +262,6 @@ struct qdio_outbuf_state { void *user; };
-#define QDIO_OUTBUF_STATE_FLAG_NONE 0x00 #define QDIO_OUTBUF_STATE_FLAG_PENDING 0x01
#define CHSC_AC1_INITIATE_INPUTQ 0x80 --- a/drivers/s390/cio/qdio_main.c +++ b/drivers/s390/cio/qdio_main.c @@ -631,21 +631,20 @@ static inline unsigned long qdio_aob_for unsigned long phys_aob = 0;
if (!q->use_cq) - goto out; + return 0;
if (!q->aobs[bufnr]) { struct qaob *aob = qdio_allocate_aob(); q->aobs[bufnr] = aob; } if (q->aobs[bufnr]) { - q->sbal_state[bufnr].flags = QDIO_OUTBUF_STATE_FLAG_NONE; q->sbal_state[bufnr].aob = q->aobs[bufnr]; q->aobs[bufnr]->user1 = (u64) q->sbal_state[bufnr].user; phys_aob = virt_to_phys(q->aobs[bufnr]); WARN_ON_ONCE(phys_aob & 0xFF); }
-out: + q->sbal_state[bufnr].flags = 0; return phys_aob; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin Schwidefsky schwidefsky@de.ibm.com
commit fb7d7518b0d65955f91c7b875c36eae7694c69bd upstream.
The numa_init_early initcall sets the node_to_cpumask_map[0] to the full cpu_possible_mask. Unfortunately this early_initcall is too late, the NUMA setup for numa=emu is done even earlier. The order of calls is numa_setup() -> emu_update_cpu_topology(), then the early_initcalls(), followed by sched_init_domains().
Starting with git commit 051f3ca02e46432c0965e8948f00c07d8a2f09c0 "sched/topology: Introduce NUMA identity node sched domain" the incorrect node_to_cpumask_map[0] really screws up the domain setup and the kernel panics with the follow oops:
Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/numa/numa.c | 16 ++-------------- 1 file changed, 2 insertions(+), 14 deletions(-)
--- a/arch/s390/numa/numa.c +++ b/arch/s390/numa/numa.c @@ -134,6 +134,8 @@ void __init numa_setup(void) { pr_info("NUMA mode: %s\n", mode->name); nodes_clear(node_possible_map); + /* Initially attach all possible CPUs to node 0. */ + cpumask_copy(&node_to_cpumask_map[0], cpu_possible_mask); if (mode->setup) mode->setup(); numa_setup_memory(); @@ -141,20 +143,6 @@ void __init numa_setup(void) }
/* - * numa_init_early() - Initialization initcall - * - * This runs when only one CPU is online and before the first - * topology update is called for by the scheduler. - */ -static int __init numa_init_early(void) -{ - /* Attach all possible CPUs to node 0 for now. */ - cpumask_copy(&node_to_cpumask_map[0], cpu_possible_mask); - return 0; -} -early_initcall(numa_init_early); - -/* * numa_init_late() - Initialization initcall * * Register NUMA nodes.
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sebastian Ott sebott@linux.ibm.com
commit 866f3576a72b2233a76dffb80290f8086dc49e17 upstream.
During interrupt setup we allocate interrupt vectors, walk the list of msi descriptors, and fill in the message data. Requesting more interrupts than supported on s390 can lead to an out of bounds access.
When we restrict the number of interrupts we should also stop walking the msi list after all supported interrupts are handled.
Cc: stable@vger.kernel.org Signed-off-by: Sebastian Ott sebott@linux.ibm.com Signed-off-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/pci/pci.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/s390/pci/pci.c +++ b/arch/s390/pci/pci.c @@ -421,6 +421,8 @@ int arch_setup_msi_irqs(struct pci_dev * hwirq = 0; for_each_pci_msi_entry(msi, pdev) { rc = -EIO; + if (hwirq >= msi_vecs) + break; irq = irq_alloc_desc(0); /* Alloc irq on node 0 */ if (irq < 0) return -ENOMEM;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Rudo prudo@linux.ibm.com
commit ad03b821fbc30395b72af438f5bb41676a5f891d upstream.
When the kernel is built with CONFIG_EXPOLINE=y and a compiler with indirect branch mitigation enabled the purgatory crashes. The reason for that is that the macros defined for expoline are used in mem.S. These macros define new sections (.text.__s390x_indirect_*) which are marked executable. Due to the missing linker script those sections are linked to address 0, just as the .text section. In combination with the entry point also being at address 0 this causes the purgatory load code (kernel/kexec_file.c: kexec_purgatory_setup_sechdrs) to update the entry point twice. Thus the old kernel jumps to some 'random' address causing the crash.
To fix this turn off expolines for the purgatory. There is no problem with this in this case due to the fact that the purgatory only runs once and the tlb is purged (diag 308) in the end.
Fixes: 840798a1f5299 ("s390/kexec_file: Add purgatory") Cc: stable@vger.kernel.org # 4.17 Signed-off-by: Philipp Rudo prudo@linux.ibm.com Reviewed-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/purgatory/Makefile | 1 + 1 file changed, 1 insertion(+)
--- a/arch/s390/purgatory/Makefile +++ b/arch/s390/purgatory/Makefile @@ -23,6 +23,7 @@ KBUILD_CFLAGS += -Wno-pointer-sign -Wno- KBUILD_CFLAGS += -fno-zero-initialized-in-bss -fno-builtin -ffreestanding KBUILD_CFLAGS += -c -MD -Os -m64 -msoft-float KBUILD_CFLAGS += $(call cc-option,-fno-PIE) +KBUILD_AFLAGS := $(filter-out -DCC_USING_EXPOLINE,$(KBUILD_AFLAGS))
$(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE $(call if_changed,ld)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Rudo prudo@linux.ibm.com
commit c315e69308c739a43c4ebc539bedbc1ac8d79854 upstream.
Without FORCE make does not detect changes only made to the command line options. So object files might not be re-built even when they should be. Fix this by adding FORCE where it is missing.
Fixes: 840798a1f5299 ("s390/kexec_file: Add purgatory") Cc: stable@vger.kernel.org # 4.17 Signed-off-by: Philipp Rudo prudo@linux.ibm.com Acked-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/purgatory/Makefile | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/s390/purgatory/Makefile +++ b/arch/s390/purgatory/Makefile @@ -7,13 +7,13 @@ purgatory-y := head.o purgatory.o string targets += $(purgatory-y) purgatory.ro kexec-purgatory.c PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))
-$(obj)/sha256.o: $(srctree)/lib/sha256.c +$(obj)/sha256.o: $(srctree)/lib/sha256.c FORCE $(call if_changed_rule,cc_o_c)
-$(obj)/mem.o: $(srctree)/arch/s390/lib/mem.S +$(obj)/mem.o: $(srctree)/arch/s390/lib/mem.S FORCE $(call if_changed_rule,as_o_S)
-$(obj)/string.o: $(srctree)/arch/s390/lib/string.c +$(obj)/string.o: $(srctree)/arch/s390/lib/string.c FORCE $(call if_changed_rule,cc_o_c)
LDFLAGS_purgatory.ro := -e purgatory_start -r --no-undefined -nostdlib
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu mhiramat@kernel.org
commit ffb9bd68ebdb3b8d00ef5a79bbe8167a3281cace upstream.
Show kprobes blacklist addresses under same condition of showing kallsyms addresses.
Since there are several name conflict for local symbols, kprobe blacklist needs to show each addresses so that user can identify where is on blacklist by comparing with kallsyms.
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Cc: Ananth N Mavinakayanahalli ananth@in.ibm.com Cc: Anil S Keshavamurthy anil.s.keshavamurthy@intel.com Cc: Arnd Bergmann arnd@arndb.de Cc: David Howells dhowells@redhat.com Cc: David S . Miller davem@davemloft.net Cc: Heiko Carstens heiko.carstens@de.ibm.com Cc: Jon Medhurst tixy@linaro.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Thomas Richter tmricht@linux.ibm.com Cc: Tobin C . Harding me@tobin.cc Cc: Will Deacon will.deacon@arm.com Cc: acme@kernel.org Cc: akpm@linux-foundation.org Cc: brueckner@linux.vnet.ibm.com Cc: linux-arch@vger.kernel.org Cc: rostedt@goodmis.org Cc: schwidefsky@de.ibm.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/lkml/152491893217.9916.14760965896164273464.stgit@de... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/kprobes.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -2428,8 +2428,16 @@ static int kprobe_blacklist_seq_show(str struct kprobe_blacklist_entry *ent = list_entry(v, struct kprobe_blacklist_entry, list);
- seq_printf(m, "0x%px-0x%px\t%ps\n", (void *)ent->start_addr, - (void *)ent->end_addr, (void *)ent->start_addr); + /* + * If /proc/kallsyms is not showing kernel address, we won't + * show them here either. + */ + if (!kallsyms_show_value()) + seq_printf(m, "0x%px-0x%px\t%ps\n", NULL, NULL, + (void *)ent->start_addr); + else + seq_printf(m, "0x%px-0x%px\t%ps\n", (void *)ent->start_addr, + (void *)ent->end_addr, (void *)ent->start_addr); return 0; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu mhiramat@kernel.org
commit 4458515b2c52831ee622411d2fe3e774d1f5c49a upstream.
Replace %p with %pS or just remove it if unneeded. And use WARN_ONCE() if it is a single bug.
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Cc: Ananth N Mavinakayanahalli ananth@in.ibm.com Cc: Anil S Keshavamurthy anil.s.keshavamurthy@intel.com Cc: Arnd Bergmann arnd@arndb.de Cc: David Howells dhowells@redhat.com Cc: David S . Miller davem@davemloft.net Cc: Heiko Carstens heiko.carstens@de.ibm.com Cc: Jon Medhurst tixy@linaro.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Thomas Richter tmricht@linux.ibm.com Cc: Tobin C . Harding me@tobin.cc Cc: Will Deacon will.deacon@arm.com Cc: acme@kernel.org Cc: akpm@linux-foundation.org Cc: brueckner@linux.vnet.ibm.com Cc: linux-arch@vger.kernel.org Cc: rostedt@goodmis.org Cc: schwidefsky@de.ibm.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/lkml/152491899284.9916.5350534544808158621.stgit@dev... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/kprobes.c | 22 ++++++++++------------ 1 file changed, 10 insertions(+), 12 deletions(-)
--- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -710,9 +710,7 @@ static void reuse_unused_kprobe(struct k * there is still a relative jump) and disabled. */ op = container_of(ap, struct optimized_kprobe, kp); - if (unlikely(list_empty(&op->list))) - printk(KERN_WARNING "Warning: found a stray unused " - "aggrprobe@%p\n", ap->addr); + WARN_ON_ONCE(list_empty(&op->list)); /* Enable the probe again */ ap->flags &= ~KPROBE_FLAG_DISABLED; /* Optimize it again (remove from op->list) */ @@ -985,7 +983,8 @@ static int arm_kprobe_ftrace(struct kpro ret = ftrace_set_filter_ip(&kprobe_ftrace_ops, (unsigned long)p->addr, 0, 0); if (ret) { - pr_debug("Failed to arm kprobe-ftrace at %p (%d)\n", p->addr, ret); + pr_debug("Failed to arm kprobe-ftrace at %pS (%d)\n", + p->addr, ret); return ret; }
@@ -1025,7 +1024,8 @@ static int disarm_kprobe_ftrace(struct k
ret = ftrace_set_filter_ip(&kprobe_ftrace_ops, (unsigned long)p->addr, 1, 0); - WARN(ret < 0, "Failed to disarm kprobe-ftrace at %p (%d)\n", p->addr, ret); + WARN_ONCE(ret < 0, "Failed to disarm kprobe-ftrace at %pS (%d)\n", + p->addr, ret); return ret; } #else /* !CONFIG_KPROBES_ON_FTRACE */ @@ -2169,11 +2169,12 @@ out: } EXPORT_SYMBOL_GPL(enable_kprobe);
+/* Caller must NOT call this in usual path. This is only for critical case */ void dump_kprobe(struct kprobe *kp) { - printk(KERN_WARNING "Dumping kprobe:\n"); - printk(KERN_WARNING "Name: %s\nAddress: %p\nOffset: %x\n", - kp->symbol_name, kp->addr, kp->offset); + pr_err("Dumping kprobe:\n"); + pr_err("Name: %s\nOffset: %x\nAddress: %pS\n", + kp->symbol_name, kp->offset, kp->addr); } NOKPROBE_SYMBOL(dump_kprobe);
@@ -2196,11 +2197,8 @@ static int __init populate_kprobe_blackl entry = arch_deref_entry_point((void *)*iter);
if (!kernel_text_address(entry) || - !kallsyms_lookup_size_offset(entry, &size, &offset)) { - pr_err("Failed to find blacklist at %p\n", - (void *)entry); + !kallsyms_lookup_size_offset(entry, &size, &offset)) continue; - }
ent = kmalloc(sizeof(*ent), GFP_KERNEL); if (!ent)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu mhiramat@kernel.org
commit 75b2f5f5911fe7a2fc82969b2b24dde34e8f820d upstream.
Fix %p uses in error messages by removing it and using general dumper.
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Cc: Ananth N Mavinakayanahalli ananth@in.ibm.com Cc: Anil S Keshavamurthy anil.s.keshavamurthy@intel.com Cc: Arnd Bergmann arnd@arndb.de Cc: David Howells dhowells@redhat.com Cc: David S . Miller davem@davemloft.net Cc: Heiko Carstens heiko.carstens@de.ibm.com Cc: Jon Medhurst tixy@linaro.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Thomas Richter tmricht@linux.ibm.com Cc: Tobin C . Harding me@tobin.cc Cc: Will Deacon will.deacon@arm.com Cc: acme@kernel.org Cc: akpm@linux-foundation.org Cc: brueckner@linux.vnet.ibm.com Cc: linux-arch@vger.kernel.org Cc: rostedt@goodmis.org Cc: schwidefsky@de.ibm.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/lkml/152491905361.9916.15300852365956231645.stgit@de... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/probes/kprobes/core.c | 4 ++-- arch/arm/probes/kprobes/test-core.c | 1 - 2 files changed, 2 insertions(+), 3 deletions(-)
--- a/arch/arm/probes/kprobes/core.c +++ b/arch/arm/probes/kprobes/core.c @@ -289,8 +289,8 @@ void __kprobes kprobe_handler(struct pt_ break; case KPROBE_REENTER: /* A nested probe was hit in FIQ, it is a BUG */ - pr_warn("Unrecoverable kprobe detected at %p.\n", - p->addr); + pr_warn("Unrecoverable kprobe detected.\n"); + dump_kprobe(p); /* fall through */ default: /* impossible cases */ --- a/arch/arm/probes/kprobes/test-core.c +++ b/arch/arm/probes/kprobes/test-core.c @@ -1461,7 +1461,6 @@ fail: print_registers(&result_regs);
if (mem) { - pr_err("current_stack=%p\n", current_stack); pr_err("expected_memory:\n"); print_memory(expected_memory, mem_size); pr_err("result_memory:\n");
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu mhiramat@kernel.org
commit f2a3ab36077222437b4826fc76111caa14562b7c upstream.
Since the blacklist and list files on debugfs indicates a sensitive address information to reader, it should be restricted to the root user.
Suggested-by: Thomas Richter tmricht@linux.ibm.com Suggested-by: Ingo Molnar mingo@kernel.org Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Cc: Ananth N Mavinakayanahalli ananth@in.ibm.com Cc: Anil S Keshavamurthy anil.s.keshavamurthy@intel.com Cc: Arnd Bergmann arnd@arndb.de Cc: David Howells dhowells@redhat.com Cc: David S . Miller davem@davemloft.net Cc: Heiko Carstens heiko.carstens@de.ibm.com Cc: Jon Medhurst tixy@linaro.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Tobin C . Harding me@tobin.cc Cc: Will Deacon will.deacon@arm.com Cc: acme@kernel.org Cc: akpm@linux-foundation.org Cc: brueckner@linux.vnet.ibm.com Cc: linux-arch@vger.kernel.org Cc: rostedt@goodmis.org Cc: schwidefsky@de.ibm.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/lkml/152491890171.9916.5183693615601334087.stgit@dev... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/kprobes.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -2617,7 +2617,7 @@ static int __init debugfs_kprobe_init(vo if (!dir) return -ENOMEM;
- file = debugfs_create_file("list", 0444, dir, NULL, + file = debugfs_create_file("list", 0400, dir, NULL, &debugfs_kprobes_operations); if (!file) goto error; @@ -2627,7 +2627,7 @@ static int __init debugfs_kprobe_init(vo if (!file) goto error;
- file = debugfs_create_file("blacklist", 0444, dir, NULL, + file = debugfs_create_file("blacklist", 0400, dir, NULL, &debugfs_kprobe_blacklist_ops); if (!file) goto error;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maciej W. Rozycki macro@mips.com
commit f5958b4cf4fc38ed4583ab83fb7c4cd1ab05f47b upstream.
Use the `unsigned long' rather than `__u32' type for DSP accumulator registers, like with the regular MIPS multiply/divide accumulator and general-purpose registers, as all are 64-bit in 64-bit implementations and using a 32-bit data type leads to contents truncation on context saving.
Update `arch_ptrace' and `compat_arch_ptrace' accordingly, removing casts that are similarly not used with multiply/divide accumulator or general-purpose register accesses.
Signed-off-by: Maciej W. Rozycki macro@mips.com Signed-off-by: Paul Burton paul.burton@mips.com Fixes: e50c0a8fa60d ("Support the MIPS32 / MIPS64 DSP ASE.") Patchwork: https://patchwork.linux-mips.org/patch/19329/ Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: James Hogan jhogan@kernel.org Cc: Ralf Baechle ralf@linux-mips.org Cc: linux-fsdevel@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # 2.6.15+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/include/asm/processor.h | 2 +- arch/mips/kernel/ptrace.c | 2 +- arch/mips/kernel/ptrace32.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-)
--- a/arch/mips/include/asm/processor.h +++ b/arch/mips/include/asm/processor.h @@ -141,7 +141,7 @@ struct mips_fpu_struct {
#define NUM_DSP_REGS 6
-typedef __u32 dspreg_t; +typedef unsigned long dspreg_t;
struct mips_dsp_state { dspreg_t dspr[NUM_DSP_REGS]; --- a/arch/mips/kernel/ptrace.c +++ b/arch/mips/kernel/ptrace.c @@ -856,7 +856,7 @@ long arch_ptrace(struct task_struct *chi goto out; } dregs = __get_dsp_regs(child); - tmp = (unsigned long) (dregs[addr - DSP_BASE]); + tmp = dregs[addr - DSP_BASE]; break; } case DSP_CONTROL: --- a/arch/mips/kernel/ptrace32.c +++ b/arch/mips/kernel/ptrace32.c @@ -142,7 +142,7 @@ long compat_arch_ptrace(struct task_stru goto out; } dregs = __get_dsp_regs(child); - tmp = (unsigned long) (dregs[addr - DSP_BASE]); + tmp = dregs[addr - DSP_BASE]; break; } case DSP_CONTROL:
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matt Redfearn matt.redfearn@mips.com
commit b1c03f1ef48d36ff28afb06e8f0c1233ef072f1d upstream.
The __clear_user function is defined to return the number of bytes that could not be cleared. From the underlying memset / bzero implementation this means setting register a2 to that number on return. Currently if a page fault is triggered within the MIPSr6 version of setting of initial unaligned bytes, the value loaded into a2 on return is meaningless.
During the MIPSr6 version of the initial unaligned bytes block, register a2 contains the number of bytes to be set beyond the initial unaligned bytes. The t0 register is initally set to the number of unaligned bytes - STORSIZE, effectively a negative version of the number of unaligned bytes. This is then incremented before each byte is saved.
The label .Lbyte_fixup@ is jumped to on page fault. Currently the value in a2 is incorrectly replaced by 0 - t0 + 1, effectively the number of unaligned bytes remaining. This leads to the failures being reported by the following test code:
static int __init test_clear_user(void) { int j, k;
pr_info("\n\n\nTesting clear_user\n"); for (j = 0; j < 512; j++) { if ((k = clear_user(NULL+3, j)) != j) { pr_err("clear_user (NULL %d) returned %d\n", j, k); } } return 0; } late_initcall(test_clear_user);
Which reports: [ 3.965439] Testing clear_user [ 3.973169] clear_user (NULL 8) returned 6 [ 3.976782] clear_user (NULL 9) returned 6 [ 3.980390] clear_user (NULL 10) returned 6 [ 3.984052] clear_user (NULL 11) returned 6 [ 3.987524] clear_user (NULL 12) returned 6
Fix this by subtracting t0 from a2 (rather than $0), effectivey giving: unset_bytes = (#bytes - (#unaligned bytes)) - (-#unaligned bytes remaining + 1) + 1 a2 = a2 - t0 + 1
This fixes the value returned from __clear user when the number of bytes to set is > LONGSIZE and the address is invalid and unaligned.
Unfortunately, this breaks the fixup handling for unaligned bytes after the final long, where register a2 still contains the number of bytes remaining to be set and the t0 register is to 0 - the number of unaligned bytes remaining.
Because t0 is now is now subtracted from a2 rather than 0, the number of bytes unset is reported incorrectly:
static int __init test_clear_user(void) { char *test; int j, k;
pr_info("\n\n\nTesting clear_user\n"); test = vmalloc(PAGE_SIZE);
for (j = 256; j < 512; j++) { if ((k = clear_user(test + PAGE_SIZE - 254, j)) != j - 254) { pr_err("clear_user (%px %d) returned %d\n", test + PAGE_SIZE - 254, j, k); } } return 0; } late_initcall(test_clear_user);
[ 3.976775] clear_user (c00000000000df02 256) returned 4 [ 3.981957] clear_user (c00000000000df02 257) returned 6 [ 3.986425] clear_user (c00000000000df02 258) returned 8 [ 3.990850] clear_user (c00000000000df02 259) returned 10 [ 3.995332] clear_user (c00000000000df02 260) returned 12 [ 3.999815] clear_user (c00000000000df02 261) returned 14
Fix this by ensuring that a2 is set to 0 during the set of final unaligned bytes.
Signed-off-by: Matt Redfearn matt.redfearn@mips.com Signed-off-by: Paul Burton paul.burton@mips.com Fixes: 8c56208aff77 ("MIPS: lib: memset: Add MIPS R6 support") Patchwork: https://patchwork.linux-mips.org/patch/19338/ Cc: James Hogan jhogan@kernel.org Cc: Ralf Baechle ralf@linux-mips.org Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/lib/memset.S | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/mips/lib/memset.S +++ b/arch/mips/lib/memset.S @@ -195,6 +195,7 @@ #endif #else PTR_SUBU t0, $0, a2 + move a2, zero /* No remaining longs */ PTR_ADDIU t0, 1 STORE_BYTE(0) STORE_BYTE(1) @@ -231,7 +232,7 @@
#ifdef CONFIG_CPU_MIPSR6 .Lbyte_fixup@: - PTR_SUBU a2, $0, t0 + PTR_SUBU a2, t0 jr ra PTR_ADDIU a2, 1 #endif /* CONFIG_CPU_MIPSR6 */
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Burton paul.burton@mips.com
commit 344ebf09949c31bcb8818d8458b65add29f1d67b upstream.
The VDSO Makefile filters CFLAGS to select a subset which it uses whilst building the VDSO ELF. One of the flags it allows through is the -march= flag that selects the architecture/ISA to target.
Unfortunately in cases where CONFIG_CPU_MIPS32_R{1,2}=y and the toolchain defaults to building for MIPS64, the main MIPS Makefile ends up using the short-form -<arch> flags in cflags-y. This is because the calls to cc-option always fail to use the long-form -march=<arch> flag due to the lack of an -mabi=<abi> flag in KBUILD_CFLAGS at the point where the cc-option function is executed. The resulting GCC invocation is something like:
$ mips64-linux-gcc -Werror -march=mips32r2 -c -x c /dev/null -o tmp cc1: error: '-march=mips32r2' is not compatible with the selected ABI
These short-form -<arch> flags are dropped by the VDSO Makefile's filtering, and so we attempt to build the VDSO without specifying any architecture. This results in an attempt to build the VDSO using whatever the compiler's default architecture is, regardless of whether that is suitable for the kernel configuration.
One encountered build failure resulting from this mismatch is a rejection of the sync instruction if the kernel is configured for a MIPS32 or MIPS64 r1 or r2 target but the toolchain defaults to an older architecture revision such as MIPS1 which did not include the sync instruction:
CC arch/mips/vdso/gettimeofday.o /tmp/ccGQKoOj.s: Assembler messages: /tmp/ccGQKoOj.s:273: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:329: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:520: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:714: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1009: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1066: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1114: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1279: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1334: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1374: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1459: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1514: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:1814: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:2002: Error: opcode not supported on this processor: mips1 (mips1) `sync' /tmp/ccGQKoOj.s:2066: Error: opcode not supported on this processor: mips1 (mips1) `sync' make[2]: *** [scripts/Makefile.build:318: arch/mips/vdso/gettimeofday.o] Error 1 make[1]: *** [scripts/Makefile.build:558: arch/mips/vdso] Error 2 make[1]: *** Waiting for unfinished jobs....
This can be reproduced for example by attempting to build pistachio_defconfig using Arnd's GCC 8.1.0 mips64 toolchain from kernel.org:
https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/8.1.0/x...
Resolve this problem by using the long-form -march=<arch> in all cases, which makes it through the arch/mips/vdso/Makefile's filtering & is thus consistently used to build both the kernel proper & the VDSO.
The use of cc-option to prefer the long-form & fall back to the short-form flags makes no sense since the short-form is just an abbreviation for the also-supported long-form in all GCC versions that we support building with. This means there is no case in which we have to use the short-form -<arch> flags, so we can simply remove them.
The manual redefinition of _MIPS_ISA is removed naturally along with the use of the short-form flags that it accompanied, and whilst here we remove the separate assembler ISA selection. I suspect that both of these were only required due to the mips32 vs mips2 mismatch that was introduced by commit 59b3e8e9aac6 ("[MIPS] Makefile crapectomy.") and fixed but not cleaned up by commit 9200c0b2a07c ("[MIPS] Fix Makefile bugs for MIPS32/MIPS64 R1 and R2.").
I've marked this for backport as far as v4.4 where the MIPS VDSO was introduced. In earlier kernels there should be no ill effect to using the short-form flags.
Signed-off-by: Paul Burton paul.burton@mips.com Cc: Ralf Baechle ralf@linux-mips.org Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.4+ Reviewed-by: James Hogan jhogan@kernel.org Patchwork: https://patchwork.linux-mips.org/patch/19579/ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/Makefile | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-)
--- a/arch/mips/Makefile +++ b/arch/mips/Makefile @@ -155,15 +155,11 @@ cflags-$(CONFIG_CPU_R4300) += -march=r43 cflags-$(CONFIG_CPU_VR41XX) += -march=r4100 -Wa,--trap cflags-$(CONFIG_CPU_R4X00) += -march=r4600 -Wa,--trap cflags-$(CONFIG_CPU_TX49XX) += -march=r4600 -Wa,--trap -cflags-$(CONFIG_CPU_MIPS32_R1) += $(call cc-option,-march=mips32,-mips32 -U_MIPS_ISA -D_MIPS_ISA=_MIPS_ISA_MIPS32) \ - -Wa,-mips32 -Wa,--trap -cflags-$(CONFIG_CPU_MIPS32_R2) += $(call cc-option,-march=mips32r2,-mips32r2 -U_MIPS_ISA -D_MIPS_ISA=_MIPS_ISA_MIPS32) \ - -Wa,-mips32r2 -Wa,--trap +cflags-$(CONFIG_CPU_MIPS32_R1) += -march=mips32 -Wa,--trap +cflags-$(CONFIG_CPU_MIPS32_R2) += -march=mips32r2 -Wa,--trap cflags-$(CONFIG_CPU_MIPS32_R6) += -march=mips32r6 -Wa,--trap -modd-spreg -cflags-$(CONFIG_CPU_MIPS64_R1) += $(call cc-option,-march=mips64,-mips64 -U_MIPS_ISA -D_MIPS_ISA=_MIPS_ISA_MIPS64) \ - -Wa,-mips64 -Wa,--trap -cflags-$(CONFIG_CPU_MIPS64_R2) += $(call cc-option,-march=mips64r2,-mips64r2 -U_MIPS_ISA -D_MIPS_ISA=_MIPS_ISA_MIPS64) \ - -Wa,-mips64r2 -Wa,--trap +cflags-$(CONFIG_CPU_MIPS64_R1) += -march=mips64 -Wa,--trap +cflags-$(CONFIG_CPU_MIPS64_R2) += -march=mips64r2 -Wa,--trap cflags-$(CONFIG_CPU_MIPS64_R6) += -march=mips64r6 -Wa,--trap cflags-$(CONFIG_CPU_R5000) += -march=r5000 -Wa,--trap cflags-$(CONFIG_CPU_R5432) += $(call cc-option,-march=r5400,-march=r5000) \
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Huacai Chen chenhc@lemote.com
commit a30718868915fbb991a9ae9e45594b059f28e9ae upstream.
Linux expects that if a CPU modifies a memory location, then that modification will eventually become visible to other CPUs in the system.
Loongson 3 CPUs include a Store Fill Buffer (SFB) which sits between a core & its L1 data cache, queueing memory accesses & allowing for faster forwarding of data from pending stores to younger loads from the core. Unfortunately the SFB prioritizes loads such that a continuous stream of loads may cause a pending write to be buffered indefinitely. This is problematic if we end up with 2 CPUs which each perform a store that the other polls for - one or both CPUs may end up with their stores buffered in the SFB, never reaching cache due to the continuous reads from the poll loop. Such a deadlock condition has been observed whilst running qspinlock code.
This patch changes the definition of cpu_relax() to smp_mb() for Loongson-3, forcing a flush of the SFB on SMP systems which will cause any pending writes to make it as far as the L1 caches where they will become visible to other CPUs. If the kernel is not compiled for SMP support, this will expand to a barrier() as before.
This workaround matches that currently implemented for ARM when CONFIG_ARM_ERRATA_754327=y, which was introduced by commit 534be1d5a2da ("ARM: 6194/1: change definition of cpu_relax() for ARM11MPCore").
Although the workaround is only required when the Loongson 3 SFB functionality is enabled, and we only began explicitly enabling that functionality in v4.7 with commit 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT"), existing or future firmware may enable the SFB which means we may need the workaround backported to earlier kernels too.
[paul.burton@mips.com: - Reword commit message & comment. - Limit stable backport to v3.15+ where we support Loongson 3 CPUs.]
Signed-off-by: Huacai Chen chenhc@lemote.com Signed-off-by: Paul Burton paul.burton@mips.com References: 534be1d5a2da ("ARM: 6194/1: change definition of cpu_relax() for ARM11MPCore") References: 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT") Patchwork: https://patchwork.linux-mips.org/patch/19830/ Cc: Ralf Baechle ralf@linux-mips.org Cc: James Hogan jhogan@kernel.org Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang zhangfx@lemote.com Cc: Zhangjin Wu wuzhangjin@gmail.com Cc: Huacai Chen chenhuacai@gmail.com Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/include/asm/processor.h | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/arch/mips/include/asm/processor.h +++ b/arch/mips/include/asm/processor.h @@ -386,7 +386,20 @@ unsigned long get_wchan(struct task_stru #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
+#ifdef CONFIG_CPU_LOONGSON3 +/* + * Loongson-3's SFB (Store-Fill-Buffer) may buffer writes indefinitely when a + * tight read loop is executed, because reads take priority over writes & the + * hardware (incorrectly) doesn't ensure that writes will eventually occur. + * + * Since spin loops of any kind should have a cpu_relax() in them, force an SFB + * flush from cpu_relax() such that any pending writes will become visible as + * expected. + */ +#define cpu_relax() smp_mb() +#else #define cpu_relax() barrier() +#endif
/* * Return_address is a replacement for __builtin_return_address(count)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Burton paul.burton@mips.com
commit 690d9163bf4b8563a2682e619f938e6a0443947f upstream.
Some versions of GCC suboptimally generate calls to the __multi3() intrinsic for MIPS64r6 builds, resulting in link failures due to the missing function:
LD vmlinux.o MODPOST vmlinux.o kernel/bpf/verifier.o: In function `kmalloc_array': include/linux/slab.h:631: undefined reference to `__multi3' fs/select.o: In function `kmalloc_array': include/linux/slab.h:631: undefined reference to `__multi3' ...
We already have a workaround for this in which we provide the instrinsic, but we do so selectively for GCC 7 only. Unfortunately the issue occurs with older GCC versions too - it has been observed with both GCC 5.4.0 & GCC 6.4.0.
MIPSr6 support was introduced in GCC 5, so all major GCC versions prior to GCC 8 are affected and we extend our workaround accordingly to all MIPS64r6 builds using GCC versions older than GCC 8.
Signed-off-by: Paul Burton paul.burton@mips.com Reported-by: Vladimir Kondratiev vladimir.kondratiev@intel.com Fixes: ebabcf17bcd7 ("MIPS: Implement __multi3 for GCC7 MIPS64r6 builds") Patchwork: https://patchwork.linux-mips.org/patch/20297/ Cc: James Hogan jhogan@kernel.org Cc: Ralf Baechle ralf@linux-mips.org Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # 4.15+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/lib/multi3.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/mips/lib/multi3.c +++ b/arch/mips/lib/multi3.c @@ -4,12 +4,12 @@ #include "libgcc.h"
/* - * GCC 7 suboptimally generates __multi3 calls for mips64r6, so for that - * specific case only we'll implement it here. + * GCC 7 & older can suboptimally generate __multi3 calls for mips64r6, so for + * that specific case only we implement that intrinsic here. * * See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82981 */ -#if defined(CONFIG_64BIT) && defined(CONFIG_CPU_MIPSR6) && (__GNUC__ == 7) +#if defined(CONFIG_64BIT) && defined(CONFIG_CPU_MIPSR6) && (__GNUC__ < 8)
/* multiply 64-bit values, low 64-bits returned */ static inline long long notrace dmulu(long long a, long long b)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ricardo Schwarzmeier Ricardo.Schwarzmeier@infineon.com
commit 36a11029b07ee30bdc4553274d0efea645ed9d91 upstream.
The userpace expects to read the number of bytes stated in the header. Returning the size of the buffer instead would be unexpected.
Cc: stable@vger.kernel.org Fixes: 095531f891e6 ("tpm: return a TPM_RC_COMMAND_CODE response if command is not implemented") Signed-off-by: Ricardo Schwarzmeier Ricardo.Schwarzmeier@infineon.com Reviewed-by: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Signed-off-by: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/char/tpm/tpm-interface.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/char/tpm/tpm-interface.c +++ b/drivers/char/tpm/tpm-interface.c @@ -423,7 +423,7 @@ static ssize_t tpm_try_transmit(struct t header->tag = cpu_to_be16(TPM2_ST_NO_SESSIONS); header->return_code = cpu_to_be32(TPM2_RC_COMMAND_CODE | TSS2_RESMGR_TPM_RC_LAYER); - return bufsiz; + return sizeof(*header); }
if (bufsiz > TPM_BUFSIZE)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tomas Winkler tomas.winkler@intel.com
commit 627448e85c766587f6fdde1ea3886d6615081c77 upstream.
Fix tpm ptt initialization error: tpm tpm0: A TPM error (378) occurred get tpm pcr allocation.
We cannot use go_idle cmd_ready commands via runtime_pm handles as with the introduction of localities this is no longer an optional feature, while runtime pm can be not enabled. Though cmd_ready/go_idle provides a power saving, it's also a part of TPM2 protocol and should be called explicitly. This patch exposes cmd_read/go_idle via tpm class ops and removes runtime pm support as it is not used by any driver.
When calling from nested context always use both flags: TPM_TRANSMIT_UNLOCKED and TPM_TRANSMIT_RAW. Both are needed to resolve tpm spaces and locality request recursive calls to tpm_transmit(). TPM_TRANSMIT_RAW should never be used standalone as it will fail on double locking. While TPM_TRANSMIT_UNLOCKED standalone should be called from non-recursive locked contexts.
New wrappers are added tpm_cmd_ready() and tpm_go_idle() to streamline tpm_try_transmit code.
tpm_crb no longer needs own power saving functions and can drop using tpm_pm_suspend/resume.
This patch cannot be really separated from the locality fix. Fixes: 888d867df441 (tpm: cmd_ready command can be issued only after granting locality)
Cc: stable@vger.kernel.org Fixes: 888d867df441 (tpm: cmd_ready command can be issued only after granting locality) Signed-off-by: Tomas Winkler tomas.winkler@intel.com Tested-by: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Reviewed-by: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Signed-off-by: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/char/tpm/tpm-interface.c | 51 +++++++++++++++---- drivers/char/tpm/tpm.h | 12 +++- drivers/char/tpm/tpm2-space.c | 16 +++--- drivers/char/tpm/tpm_crb.c | 101 ++++++++++----------------------------- include/linux/tpm.h | 2 5 files changed, 90 insertions(+), 92 deletions(-)
--- a/drivers/char/tpm/tpm-interface.c +++ b/drivers/char/tpm/tpm-interface.c @@ -29,7 +29,6 @@ #include <linux/mutex.h> #include <linux/spinlock.h> #include <linux/freezer.h> -#include <linux/pm_runtime.h> #include <linux/tpm_eventlog.h>
#include "tpm.h" @@ -369,10 +368,13 @@ err_len: return -EINVAL; }
-static int tpm_request_locality(struct tpm_chip *chip) +static int tpm_request_locality(struct tpm_chip *chip, unsigned int flags) { int rc;
+ if (flags & TPM_TRANSMIT_RAW) + return 0; + if (!chip->ops->request_locality) return 0;
@@ -385,10 +387,13 @@ static int tpm_request_locality(struct t return 0; }
-static void tpm_relinquish_locality(struct tpm_chip *chip) +static void tpm_relinquish_locality(struct tpm_chip *chip, unsigned int flags) { int rc;
+ if (flags & TPM_TRANSMIT_RAW) + return; + if (!chip->ops->relinquish_locality) return;
@@ -399,6 +404,28 @@ static void tpm_relinquish_locality(stru chip->locality = -1; }
+static int tpm_cmd_ready(struct tpm_chip *chip, unsigned int flags) +{ + if (flags & TPM_TRANSMIT_RAW) + return 0; + + if (!chip->ops->cmd_ready) + return 0; + + return chip->ops->cmd_ready(chip); +} + +static int tpm_go_idle(struct tpm_chip *chip, unsigned int flags) +{ + if (flags & TPM_TRANSMIT_RAW) + return 0; + + if (!chip->ops->go_idle) + return 0; + + return chip->ops->go_idle(chip); +} + static ssize_t tpm_try_transmit(struct tpm_chip *chip, struct tpm_space *space, u8 *buf, size_t bufsiz, @@ -449,14 +476,15 @@ static ssize_t tpm_try_transmit(struct t /* Store the decision as chip->locality will be changed. */ need_locality = chip->locality == -1;
- if (!(flags & TPM_TRANSMIT_RAW) && need_locality) { - rc = tpm_request_locality(chip); + if (need_locality) { + rc = tpm_request_locality(chip, flags); if (rc < 0) goto out_no_locality; }
- if (chip->dev.parent) - pm_runtime_get_sync(chip->dev.parent); + rc = tpm_cmd_ready(chip, flags); + if (rc) + goto out;
rc = tpm2_prepare_space(chip, space, ordinal, buf); if (rc) @@ -516,13 +544,16 @@ out_recv: }
rc = tpm2_commit_space(chip, space, ordinal, buf, &len); + if (rc) + dev_err(&chip->dev, "tpm2_commit_space: error %d\n", rc);
out: - if (chip->dev.parent) - pm_runtime_put_sync(chip->dev.parent); + rc = tpm_go_idle(chip, flags); + if (rc) + goto out;
if (need_locality) - tpm_relinquish_locality(chip); + tpm_relinquish_locality(chip, flags);
out_no_locality: if (chip->ops->clk_enable != NULL) --- a/drivers/char/tpm/tpm.h +++ b/drivers/char/tpm/tpm.h @@ -511,9 +511,17 @@ extern const struct file_operations tpm_ extern const struct file_operations tpmrm_fops; extern struct idr dev_nums_idr;
+/** + * enum tpm_transmit_flags + * + * @TPM_TRANSMIT_UNLOCKED: used to lock sequence of tpm_transmit calls. + * @TPM_TRANSMIT_RAW: prevent recursive calls into setup steps + * (go idle, locality,..). Always use with UNLOCKED + * as it will fail on double locking. + */ enum tpm_transmit_flags { - TPM_TRANSMIT_UNLOCKED = BIT(0), - TPM_TRANSMIT_RAW = BIT(1), + TPM_TRANSMIT_UNLOCKED = BIT(0), + TPM_TRANSMIT_RAW = BIT(1), };
ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space, --- a/drivers/char/tpm/tpm2-space.c +++ b/drivers/char/tpm/tpm2-space.c @@ -39,7 +39,8 @@ static void tpm2_flush_sessions(struct t for (i = 0; i < ARRAY_SIZE(space->session_tbl); i++) { if (space->session_tbl[i]) tpm2_flush_context_cmd(chip, space->session_tbl[i], - TPM_TRANSMIT_UNLOCKED); + TPM_TRANSMIT_UNLOCKED | + TPM_TRANSMIT_RAW); } }
@@ -84,7 +85,7 @@ static int tpm2_load_context(struct tpm_ tpm_buf_append(&tbuf, &buf[*offset], body_size);
rc = tpm_transmit_cmd(chip, NULL, tbuf.data, PAGE_SIZE, 4, - TPM_TRANSMIT_UNLOCKED, NULL); + TPM_TRANSMIT_UNLOCKED | TPM_TRANSMIT_RAW, NULL); if (rc < 0) { dev_warn(&chip->dev, "%s: failed with a system error %d\n", __func__, rc); @@ -133,7 +134,7 @@ static int tpm2_save_context(struct tpm_ tpm_buf_append_u32(&tbuf, handle);
rc = tpm_transmit_cmd(chip, NULL, tbuf.data, PAGE_SIZE, 0, - TPM_TRANSMIT_UNLOCKED, NULL); + TPM_TRANSMIT_UNLOCKED | TPM_TRANSMIT_RAW, NULL); if (rc < 0) { dev_warn(&chip->dev, "%s: failed with a system error %d\n", __func__, rc); @@ -170,7 +171,8 @@ static void tpm2_flush_space(struct tpm_ for (i = 0; i < ARRAY_SIZE(space->context_tbl); i++) if (space->context_tbl[i] && ~space->context_tbl[i]) tpm2_flush_context_cmd(chip, space->context_tbl[i], - TPM_TRANSMIT_UNLOCKED); + TPM_TRANSMIT_UNLOCKED | + TPM_TRANSMIT_RAW);
tpm2_flush_sessions(chip, space); } @@ -377,7 +379,8 @@ static int tpm2_map_response_header(stru
return 0; out_no_slots: - tpm2_flush_context_cmd(chip, phandle, TPM_TRANSMIT_UNLOCKED); + tpm2_flush_context_cmd(chip, phandle, + TPM_TRANSMIT_UNLOCKED | TPM_TRANSMIT_RAW); dev_warn(&chip->dev, "%s: out of slots for 0x%08X\n", __func__, phandle); return -ENOMEM; @@ -465,7 +468,8 @@ static int tpm2_save_space(struct tpm_ch return rc;
tpm2_flush_context_cmd(chip, space->context_tbl[i], - TPM_TRANSMIT_UNLOCKED); + TPM_TRANSMIT_UNLOCKED | + TPM_TRANSMIT_RAW); space->context_tbl[i] = ~0; }
--- a/drivers/char/tpm/tpm_crb.c +++ b/drivers/char/tpm/tpm_crb.c @@ -132,7 +132,7 @@ static bool crb_wait_for_reg_32(u32 __io }
/** - * crb_go_idle - request tpm crb device to go the idle state + * __crb_go_idle - request tpm crb device to go the idle state * * @dev: crb device * @priv: crb private data @@ -147,7 +147,7 @@ static bool crb_wait_for_reg_32(u32 __io * * Return: 0 always */ -static int crb_go_idle(struct device *dev, struct crb_priv *priv) +static int __crb_go_idle(struct device *dev, struct crb_priv *priv) { if ((priv->sm == ACPI_TPM2_START_METHOD) || (priv->sm == ACPI_TPM2_COMMAND_BUFFER_WITH_START_METHOD) || @@ -163,11 +163,20 @@ static int crb_go_idle(struct device *de dev_warn(dev, "goIdle timed out\n"); return -ETIME; } + return 0; }
+static int crb_go_idle(struct tpm_chip *chip) +{ + struct device *dev = &chip->dev; + struct crb_priv *priv = dev_get_drvdata(dev); + + return __crb_go_idle(dev, priv); +} + /** - * crb_cmd_ready - request tpm crb device to enter ready state + * __crb_cmd_ready - request tpm crb device to enter ready state * * @dev: crb device * @priv: crb private data @@ -181,7 +190,7 @@ static int crb_go_idle(struct device *de * * Return: 0 on success -ETIME on timeout; */ -static int crb_cmd_ready(struct device *dev, struct crb_priv *priv) +static int __crb_cmd_ready(struct device *dev, struct crb_priv *priv) { if ((priv->sm == ACPI_TPM2_START_METHOD) || (priv->sm == ACPI_TPM2_COMMAND_BUFFER_WITH_START_METHOD) || @@ -200,6 +209,14 @@ static int crb_cmd_ready(struct device * return 0; }
+static int crb_cmd_ready(struct tpm_chip *chip) +{ + struct device *dev = &chip->dev; + struct crb_priv *priv = dev_get_drvdata(dev); + + return __crb_cmd_ready(dev, priv); +} + static int __crb_request_locality(struct device *dev, struct crb_priv *priv, int loc) { @@ -401,6 +418,8 @@ static const struct tpm_class_ops tpm_cr .send = crb_send, .cancel = crb_cancel, .req_canceled = crb_req_canceled, + .go_idle = crb_go_idle, + .cmd_ready = crb_cmd_ready, .request_locality = crb_request_locality, .relinquish_locality = crb_relinquish_locality, .req_complete_mask = CRB_DRV_STS_COMPLETE, @@ -520,7 +539,7 @@ static int crb_map_io(struct acpi_device * PTT HW bug w/a: wake up the device to access * possibly not retained registers. */ - ret = crb_cmd_ready(dev, priv); + ret = __crb_cmd_ready(dev, priv); if (ret) goto out_relinquish_locality;
@@ -565,7 +584,7 @@ out: if (!ret) priv->cmd_size = cmd_size;
- crb_go_idle(dev, priv); + __crb_go_idle(dev, priv);
out_relinquish_locality:
@@ -628,32 +647,7 @@ static int crb_acpi_add(struct acpi_devi chip->acpi_dev_handle = device->handle; chip->flags = TPM_CHIP_FLAG_TPM2;
- rc = __crb_request_locality(dev, priv, 0); - if (rc) - return rc; - - rc = crb_cmd_ready(dev, priv); - if (rc) - goto out; - - pm_runtime_get_noresume(dev); - pm_runtime_set_active(dev); - pm_runtime_enable(dev); - - rc = tpm_chip_register(chip); - if (rc) { - crb_go_idle(dev, priv); - pm_runtime_put_noidle(dev); - pm_runtime_disable(dev); - goto out; - } - - pm_runtime_put_sync(dev); - -out: - __crb_relinquish_locality(dev, priv, 0); - - return rc; + return tpm_chip_register(chip); }
static int crb_acpi_remove(struct acpi_device *device) @@ -663,52 +657,11 @@ static int crb_acpi_remove(struct acpi_d
tpm_chip_unregister(chip);
- pm_runtime_disable(dev); - return 0; }
-static int __maybe_unused crb_pm_runtime_suspend(struct device *dev) -{ - struct tpm_chip *chip = dev_get_drvdata(dev); - struct crb_priv *priv = dev_get_drvdata(&chip->dev); - - return crb_go_idle(dev, priv); -} - -static int __maybe_unused crb_pm_runtime_resume(struct device *dev) -{ - struct tpm_chip *chip = dev_get_drvdata(dev); - struct crb_priv *priv = dev_get_drvdata(&chip->dev); - - return crb_cmd_ready(dev, priv); -} - -static int __maybe_unused crb_pm_suspend(struct device *dev) -{ - int ret; - - ret = tpm_pm_suspend(dev); - if (ret) - return ret; - - return crb_pm_runtime_suspend(dev); -} - -static int __maybe_unused crb_pm_resume(struct device *dev) -{ - int ret; - - ret = crb_pm_runtime_resume(dev); - if (ret) - return ret; - - return tpm_pm_resume(dev); -} - static const struct dev_pm_ops crb_pm = { - SET_SYSTEM_SLEEP_PM_OPS(crb_pm_suspend, crb_pm_resume) - SET_RUNTIME_PM_OPS(crb_pm_runtime_suspend, crb_pm_runtime_resume, NULL) + SET_SYSTEM_SLEEP_PM_OPS(tpm_pm_suspend, tpm_pm_resume) };
static const struct acpi_device_id crb_device_ids[] = { --- a/include/linux/tpm.h +++ b/include/linux/tpm.h @@ -43,6 +43,8 @@ struct tpm_class_ops { u8 (*status) (struct tpm_chip *chip); bool (*update_timeouts)(struct tpm_chip *chip, unsigned long *timeout_cap); + int (*go_idle)(struct tpm_chip *chip); + int (*cmd_ready)(struct tpm_chip *chip); int (*request_locality)(struct tpm_chip *chip, int loc); int (*relinquish_locality)(struct tpm_chip *chip, int loc); void (*clk_enable)(struct tpm_chip *chip, bool value);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sreekanth Reddy sreekanth.reddy@broadcom.com
commit e70183143cc472960bc60dfee1b7bbe1949feffb upstream.
Below kernel BUG was observed while running IOs with host reset (issued from application),
mpt3sas_cm0: diag reset: SUCCESS ------------[ cut here ]------------ WARNING: CPU: 12 PID: 4336 at drivers/scsi/mpt3sas/mpt3sas_base.c:3282 mpt3sas_base_clear_st+0x3d/0x40 [mpt3sas] Modules linked in: macsec tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag binfmt_misc fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun devlink ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc vfat fat sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support dcdbas pcspkr joydev ipmi_ssif ses enclosure sg ipmi_devintf acpi_pad ipmi_msghandler acpi_power_meter mei_me lpc_ich wmi mei shpchp ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi uas usb_storage mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix mpt3sas libata crct10dif_pclmul crct10dif_common tg3 crc32c_intel i2c_core raid_class ptp scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod CPU: 12 PID: 4336 Comm: python Kdump: loaded Tainted: G W ------------ 3.10.0-875.el7.brdc.x86_64 #1 Hardware name: Dell Inc. PowerEdge R820/0YWR73, BIOS 1.5.0 03/08/2013 Call Trace: [<ffffffff9cf16583>] dump_stack+0x19/0x1b [<ffffffff9c891698>] __warn+0xd8/0x100 [<ffffffff9c8917dd>] warn_slowpath_null+0x1d/0x20 [<ffffffffc04f3f4d>] mpt3sas_base_clear_st+0x3d/0x40 [mpt3sas] [<ffffffffc05047d2>] _scsih_flush_running_cmds+0x92/0xe0 [mpt3sas] [<ffffffffc05095db>] mpt3sas_scsih_reset_handler+0x43b/0xaf0 [mpt3sas] [<ffffffff9c894829>] ? vprintk_default+0x29/0x40 [<ffffffff9cf10531>] ? printk+0x60/0x77 [<ffffffffc04f06c8>] ? _base_diag_reset+0x238/0x340 [mpt3sas] [<ffffffffc04f794d>] mpt3sas_base_hard_reset_handler+0x1ad/0x420 [mpt3sas] [<ffffffffc05132b9>] _ctl_ioctl_main.isra.12+0x11b9/0x1200 [mpt3sas] [<ffffffffc068d585>] ? xfs_file_aio_write+0x155/0x1b0 [xfs] [<ffffffff9ca1a4e3>] ? do_sync_write+0x93/0xe0 [<ffffffffc051337a>] _ctl_ioctl+0x1a/0x20 [mpt3sas] [<ffffffff9ca2fe90>] do_vfs_ioctl+0x350/0x560 [<ffffffff9ca1dec1>] ? __sb_end_write+0x31/0x60 [<ffffffff9ca30141>] SyS_ioctl+0xa1/0xc0 [<ffffffff9cf28715>] ? system_call_after_swapgs+0xa2/0x146 [<ffffffff9cf287d5>] system_call_fastpath+0x1c/0x21 [<ffffffff9cf28721>] ? system_call_after_swapgs+0xae/0x146 ---[ end trace 5dac5b98d89aaa3c ]--- ------------[ cut here ]------------ kernel BUG at block/blk-core.c:1476! invalid opcode: 0000 [#1] SMP Modules linked in: macsec tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag binfmt_misc fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun devlink ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc vfat fat sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support dcdbas pcspkr joydev ipmi_ssif ses enclosure sg ipmi_devintf acpi_pad ipmi_msghandler acpi_power_meter mei_me lpc_ich wmi mei shpchp ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi uas usb_storage mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix mpt3sas libata crct10dif_pclmul crct10dif_common tg3 crc32c_intel i2c_core raid_class ptp scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod CPU: 12 PID: 4336 Comm: python Kdump: loaded Tainted: G W ------------ 3.10.0-875.el7.brdc.x86_64 #1 Hardware name: Dell Inc. PowerEdge R820/0YWR73, BIOS 1.5.0 03/08/2013 task: ffff903fc96e0fd0 ti: ffff903fb1eec000 task.ti: ffff903fb1eec000 RIP: 0010:[<ffffffff9cb19ec0>] [<ffffffff9cb19ec0>] blk_requeue_request+0x90/0xa0 RSP: 0018:ffff903c6b783dc0 EFLAGS: 00010087 RAX: ffff903bb67026d0 RBX: ffff903b7d6a6140 RCX: dead000000000200 RDX: ffff903bb67026d0 RSI: ffff903bb6702580 RDI: ffff903bb67026d0 RBP: ffff903c6b783dd8 R08: ffff903bb67026d0 R09: ffffd97e80000000 R10: ffff903c658bac00 R11: 0000000000000000 R12: ffff903bb6702580 R13: ffff903fa9a292f0 R14: 0000000000000246 R15: 0000000000001057 FS: 00007f7026f5b740(0000) GS:ffff903c6b780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f298877c004 CR3: 00000000caf36000 CR4: 00000000000607e0 Call Trace: <IRQ> [<ffffffff9cca68ff>] __scsi_queue_insert+0xbf/0x110 [<ffffffff9cca79ca>] scsi_io_completion+0x5da/0x6a0 [<ffffffff9cc9ca3c>] scsi_finish_command+0xdc/0x140 [<ffffffff9cca6aa2>] scsi_softirq_done+0x132/0x160 [<ffffffff9cb240c6>] blk_done_softirq+0x96/0xc0 [<ffffffff9c89a905>] __do_softirq+0xf5/0x280 [<ffffffff9cf2bd2c>] call_softirq+0x1c/0x30 [<ffffffff9c82d625>] do_softirq+0x65/0xa0 [<ffffffff9c89ac85>] irq_exit+0x105/0x110 [<ffffffff9cf2d0a8>] smp_apic_timer_interrupt+0x48/0x60 [<ffffffff9cf297f2>] apic_timer_interrupt+0x162/0x170 <EOI> [<ffffffff9cca5f41>] ? scsi_done+0x21/0x60 [<ffffffff9cb5ac18>] ? delay_tsc+0x38/0x60 [<ffffffff9cb5ab5d>] __const_udelay+0x2d/0x30 [<ffffffffc04effde>] _base_handshake_req_reply_wait+0x8e/0x4a0 [mpt3sas] [<ffffffffc04f0b13>] _base_get_ioc_facts+0x123/0x590 [mpt3sas] [<ffffffffc04f06c8>] ? _base_diag_reset+0x238/0x340 [mpt3sas] [<ffffffffc04f7993>] mpt3sas_base_hard_reset_handler+0x1f3/0x420 [mpt3sas] [<ffffffffc05132b9>] _ctl_ioctl_main.isra.12+0x11b9/0x1200 [mpt3sas] [<ffffffffc068d585>] ? xfs_file_aio_write+0x155/0x1b0 [xfs] [<ffffffff9ca1a4e3>] ? do_sync_write+0x93/0xe0 [<ffffffffc051337a>] _ctl_ioctl+0x1a/0x20 [mpt3sas] [<ffffffff9ca2fe90>] do_vfs_ioctl+0x350/0x560 [<ffffffff9ca1dec1>] ? __sb_end_write+0x31/0x60 [<ffffffff9ca30141>] SyS_ioctl+0xa1/0xc0 [<ffffffff9cf28715>] ? system_call_after_swapgs+0xa2/0x146 [<ffffffff9cf287d5>] system_call_fastpath+0x1c/0x21 [<ffffffff9cf28721>] ? system_call_after_swapgs+0xae/0x146 Code: 83 c3 10 4c 89 e2 4c 89 ee e8 8d 21 04 00 48 8b 03 48 85 c0 75 e5 41 f6 44 24 4a 10 74 ad 4c 89 e6 4c 89 ef e8 b2 42 00 00 eb a0 <0f> 0b 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 RIP [<ffffffff9cb19ec0>] blk_requeue_request+0x90/0xa0 RSP <ffff903c6b783dc0>
As a part of host reset operation, driver will flushout all IOs outstanding at driver level with "DID_RESET" result. To find which are all commands outstanding at the driver level, driver loops with smid starting from one to HBA queue depth and calls mpt3sas_scsih_scsi_lookup_get() to get scmd as shown below
for (smid = 1; smid <= ioc->scsiio_depth; smid++) { scmd = mpt3sas_scsih_scsi_lookup_get(ioc, smid); if (!scmd) continue;
But in mpt3sas_scsih_scsi_lookup_get() function, driver returns some scsi cmnds which are not outstanding at the driver level (possibly request is constructed at block layer since QUEUE_FLAG_QUIESCED is not set. Even if driver uses scsi_block_requests and scsi_unblock_requests, issue still persists as they will be just blocking further IO from scsi layer and not from block layer) and these commands are flushed with DID_RESET host bytes thus resulting into above kernel BUG.
This issue got introduced by commit dbec4c9040ed ("scsi: mpt3sas: lockless command submission").
To fix this issue, we have modified the mpt3sas_scsih_scsi_lookup_get() to check for smid equals to zero (note: whenever any scsi cmnd is processing at the driver level then smid for that scsi cmnd will be non-zero, always it starts from one) before it returns the scmd pointer to the caller. If smid is zero then this function returns scmd pointer as NULL and driver won't flushout those scsi cmnds at driver level with DID_RESET host byte thus this issue will not be observed.
[mkp: amended with updated fix from Sreekanth]
Signed-off-by: Sreekanth Reddy sreekanth.reddy@broadcom.com Fixes: dbec4c9040ed ("scsi: mpt3sas: lockless command submission") Cc: stable@vger.kernel.org # v4.16+ Reviewed-by: Tomas Henzl thenzl@redhat.com Reviewed-by: Bart Van Assche bart.vanassche@wdc.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/mpt3sas/mpt3sas_base.c | 1 + drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -3284,6 +3284,7 @@ void mpt3sas_base_clear_st(struct MPT3SA st->cb_idx = 0xFF; st->direct_io = 0; atomic_set(&ioc->chain_lookup[st->smid - 1].chain_offset, 0); + st->smid = 0; }
/** --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -1489,7 +1489,7 @@ mpt3sas_scsih_scsi_lookup_get(struct MPT scmd = scsi_host_find_tag(ioc->shost, unique_tag); if (scmd) { st = scsi_cmd_priv(scmd); - if (st->cb_idx == 0xFF) + if (st->cb_idx == 0xFF || st->smid == 0) scmd = NULL; } }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bart Van Assche bart.vanassche@wdc.com
commit 91b7bdb2c0089cbbb817df6888ab1458c645184e upstream.
This patch avoids that smatch complains about a double unlock on ioc->transport_cmds.mutex.
Fixes: 651a01364994 ("scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough") Signed-off-by: Bart Van Assche bart.vanassche@wdc.com Cc: Christoph Hellwig hch@lst.de Cc: Sathya Prakash sathya.prakash@broadcom.com Cc: Chaitra P B chaitra.basappa@broadcom.com Cc: Suganath Prabu Subramani suganath-prabu.subramani@broadcom.com Cc: stable@vger.kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/mpt3sas/mpt3sas_transport.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/scsi/mpt3sas/mpt3sas_transport.c +++ b/drivers/scsi/mpt3sas/mpt3sas_transport.c @@ -1936,12 +1936,12 @@ _transport_smp_handler(struct bsg_job *j pr_info(MPT3SAS_FMT "%s: host reset in progress!\n", __func__, ioc->name); rc = -EFAULT; - goto out; + goto job_done; }
rc = mutex_lock_interruptible(&ioc->transport_cmds.mutex); if (rc) - goto out; + goto job_done;
if (ioc->transport_cmds.status != MPT3_CMD_NOT_USED) { pr_err(MPT3SAS_FMT "%s: transport_cmds in use\n", ioc->name, @@ -2066,6 +2066,7 @@ _transport_smp_handler(struct bsg_job *j out: ioc->transport_cmds.status = MPT3_CMD_NOT_USED; mutex_unlock(&ioc->transport_cmds.mutex); +job_done: bsg_job_done(job, rc, reslen); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bart Van Assche bart.vanassche@wdc.com
commit 2afc9166f79b8f6da5f347f48515215ceee4ae37 upstream.
Introduce these two functions and export them such that the next patch can add calls to these functions from the SCSI core.
Signed-off-by: Bart Van Assche bart.vanassche@wdc.com Acked-by: Tejun Heo tj@kernel.org Acked-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/sysfs/file.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ include/linux/sysfs.h | 14 ++++++++++++++ 2 files changed, 58 insertions(+)
--- a/fs/sysfs/file.c +++ b/fs/sysfs/file.c @@ -406,6 +406,50 @@ int sysfs_chmod_file(struct kobject *kob EXPORT_SYMBOL_GPL(sysfs_chmod_file);
/** + * sysfs_break_active_protection - break "active" protection + * @kobj: The kernel object @attr is associated with. + * @attr: The attribute to break the "active" protection for. + * + * With sysfs, just like kernfs, deletion of an attribute is postponed until + * all active .show() and .store() callbacks have finished unless this function + * is called. Hence this function is useful in methods that implement self + * deletion. + */ +struct kernfs_node *sysfs_break_active_protection(struct kobject *kobj, + const struct attribute *attr) +{ + struct kernfs_node *kn; + + kobject_get(kobj); + kn = kernfs_find_and_get(kobj->sd, attr->name); + if (kn) + kernfs_break_active_protection(kn); + return kn; +} +EXPORT_SYMBOL_GPL(sysfs_break_active_protection); + +/** + * sysfs_unbreak_active_protection - restore "active" protection + * @kn: Pointer returned by sysfs_break_active_protection(). + * + * Undo the effects of sysfs_break_active_protection(). Since this function + * calls kernfs_put() on the kernfs node that corresponds to the 'attr' + * argument passed to sysfs_break_active_protection() that attribute may have + * been removed between the sysfs_break_active_protection() and + * sysfs_unbreak_active_protection() calls, it is not safe to access @kn after + * this function has returned. + */ +void sysfs_unbreak_active_protection(struct kernfs_node *kn) +{ + struct kobject *kobj = kn->parent->priv; + + kernfs_unbreak_active_protection(kn); + kernfs_put(kn); + kobject_put(kobj); +} +EXPORT_SYMBOL_GPL(sysfs_unbreak_active_protection); + +/** * sysfs_remove_file_ns - remove an object attribute with a custom ns tag * @kobj: object we're acting for * @attr: attribute descriptor --- a/include/linux/sysfs.h +++ b/include/linux/sysfs.h @@ -237,6 +237,9 @@ int __must_check sysfs_create_files(stru const struct attribute **attr); int __must_check sysfs_chmod_file(struct kobject *kobj, const struct attribute *attr, umode_t mode); +struct kernfs_node *sysfs_break_active_protection(struct kobject *kobj, + const struct attribute *attr); +void sysfs_unbreak_active_protection(struct kernfs_node *kn); void sysfs_remove_file_ns(struct kobject *kobj, const struct attribute *attr, const void *ns); bool sysfs_remove_file_self(struct kobject *kobj, const struct attribute *attr); @@ -350,6 +353,17 @@ static inline int sysfs_chmod_file(struc return 0; }
+static inline struct kernfs_node * +sysfs_break_active_protection(struct kobject *kobj, + const struct attribute *attr) +{ + return NULL; +} + +static inline void sysfs_unbreak_active_protection(struct kernfs_node *kn) +{ +} + static inline void sysfs_remove_file_ns(struct kobject *kobj, const struct attribute *attr, const void *ns)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bart Van Assche bart.vanassche@wdc.com
commit 0ee223b2e1f67cb2de9c0e3247c510d846e74d63 upstream.
A long time ago the unfortunate decision was taken to add a self-deletion attribute to the sysfs SCSI device directory. That decision was unfortunate because self-deletion is really tricky. We can't drop that attribute because widely used user space software depends on it, namely the rescan-scsi-bus.sh script. Hence this patch that avoids that writing into that attribute triggers a deadlock. See also commit 7973cbd9fbd9 ("[PATCH] add sysfs attributes to scan and delete scsi_devices").
This patch avoids that self-removal triggers the following deadlock:
====================================================== WARNING: possible circular locking dependency detected 4.18.0-rc2-dbg+ #5 Not tainted ------------------------------------------------------ modprobe/6539 is trying to acquire lock: 000000008323c4cd (kn->count#202){++++}, at: kernfs_remove_by_name_ns+0x45/0x90
but task is already holding lock: 00000000a6ec2c69 (&shost->scan_mutex){+.+.}, at: scsi_remove_host+0x21/0x150 [scsi_mod]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&shost->scan_mutex){+.+.}: __mutex_lock+0xfe/0xc70 mutex_lock_nested+0x1b/0x20 scsi_remove_device+0x26/0x40 [scsi_mod] sdev_store_delete+0x27/0x30 [scsi_mod] dev_attr_store+0x3e/0x50 sysfs_kf_write+0x87/0xa0 kernfs_fop_write+0x190/0x230 __vfs_write+0xd2/0x3b0 vfs_write+0x101/0x270 ksys_write+0xab/0x120 __x64_sys_write+0x43/0x50 do_syscall_64+0x77/0x230 entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #0 (kn->count#202){++++}: lock_acquire+0xd2/0x260 __kernfs_remove+0x424/0x4a0 kernfs_remove_by_name_ns+0x45/0x90 remove_files.isra.1+0x3a/0x90 sysfs_remove_group+0x5c/0xc0 sysfs_remove_groups+0x39/0x60 device_remove_attrs+0x82/0xb0 device_del+0x251/0x580 __scsi_remove_device+0x19f/0x1d0 [scsi_mod] scsi_forget_host+0x37/0xb0 [scsi_mod] scsi_remove_host+0x9b/0x150 [scsi_mod] sdebug_driver_remove+0x4b/0x150 [scsi_debug] device_release_driver_internal+0x241/0x360 device_release_driver+0x12/0x20 bus_remove_device+0x1bc/0x290 device_del+0x259/0x580 device_unregister+0x1a/0x70 sdebug_remove_adapter+0x8b/0xf0 [scsi_debug] scsi_debug_exit+0x76/0xe8 [scsi_debug] __x64_sys_delete_module+0x1c1/0x280 do_syscall_64+0x77/0x230 entry_SYSCALL_64_after_hwframe+0x49/0xbe
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&shost->scan_mutex); lock(kn->count#202); lock(&shost->scan_mutex); lock(kn->count#202);
*** DEADLOCK ***
2 locks held by modprobe/6539: #0: 00000000efaf9298 (&dev->mutex){....}, at: device_release_driver_internal+0x68/0x360 #1: 00000000a6ec2c69 (&shost->scan_mutex){+.+.}, at: scsi_remove_host+0x21/0x150 [scsi_mod]
stack backtrace: CPU: 10 PID: 6539 Comm: modprobe Not tainted 4.18.0-rc2-dbg+ #5 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0xa4/0xf5 print_circular_bug.isra.34+0x213/0x221 __lock_acquire+0x1a7e/0x1b50 lock_acquire+0xd2/0x260 __kernfs_remove+0x424/0x4a0 kernfs_remove_by_name_ns+0x45/0x90 remove_files.isra.1+0x3a/0x90 sysfs_remove_group+0x5c/0xc0 sysfs_remove_groups+0x39/0x60 device_remove_attrs+0x82/0xb0 device_del+0x251/0x580 __scsi_remove_device+0x19f/0x1d0 [scsi_mod] scsi_forget_host+0x37/0xb0 [scsi_mod] scsi_remove_host+0x9b/0x150 [scsi_mod] sdebug_driver_remove+0x4b/0x150 [scsi_debug] device_release_driver_internal+0x241/0x360 device_release_driver+0x12/0x20 bus_remove_device+0x1bc/0x290 device_del+0x259/0x580 device_unregister+0x1a/0x70 sdebug_remove_adapter+0x8b/0xf0 [scsi_debug] scsi_debug_exit+0x76/0xe8 [scsi_debug] __x64_sys_delete_module+0x1c1/0x280 do_syscall_64+0x77/0x230 entry_SYSCALL_64_after_hwframe+0x49/0xbe
See also https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg54525.html.
Fixes: ac0ece9174ac ("scsi: use device_remove_file_self() instead of device_schedule_callback()") Signed-off-by: Bart Van Assche bart.vanassche@wdc.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Acked-by: Tejun Heo tj@kernel.org Cc: Johannes Thumshirn jthumshirn@suse.de Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
--- drivers/scsi/scsi_sysfs.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
--- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -722,8 +722,24 @@ static ssize_t sdev_store_delete(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { - if (device_remove_file_self(dev, attr)) - scsi_remove_device(to_scsi_device(dev)); + struct kernfs_node *kn; + + kn = sysfs_break_active_protection(&dev->kobj, &attr->attr); + WARN_ON_ONCE(!kn); + /* + * Concurrent writes into the "delete" sysfs attribute may trigger + * concurrent calls to device_remove_file() and scsi_remove_device(). + * device_remove_file() handles concurrent removal calls by + * serializing these and by ignoring the second and later removal + * attempts. Concurrent calls of scsi_remove_device() are + * serialized. The second and later calls of scsi_remove_device() are + * ignored because the first call of that function changes the device + * state into SDEV_DEL. + */ + device_remove_file(dev, attr); + scsi_remove_device(to_scsi_device(dev)); + if (kn) + sysfs_unbreak_active_protection(kn); return count; }; static DEVICE_ATTR(delete, S_IWUSR, NULL, sdev_store_delete);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mike Christie mchristi@redhat.com
commit 26abc916a898d34c5ad159315a2f683def3c5555 upstream.
The problem is that iscsi_login_zero_tsih_s1 sets conn->sess early in iscsi_login_set_conn_values. If the function fails later like when we alloc the idr it does kfree(sess) and leaves the conn->sess pointer set. iscsi_login_zero_tsih_s1 then returns -Exyz and we then call iscsi_target_login_sess_out and access the freed memory.
This patch has iscsi_login_zero_tsih_s1 either completely setup the session or completely tear it down, so later in iscsi_target_login_sess_out we can just check for it being set to the connection.
Cc: stable@vger.kernel.org Fixes: 0957627a9960 ("iscsi-target: Fix sess allocation leak in...") Signed-off-by: Mike Christie mchristi@redhat.com Acked-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Matthew Wilcox willy@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/target/iscsi/iscsi_target_login.c | 35 ++++++++++++++++++------------ 1 file changed, 21 insertions(+), 14 deletions(-)
--- a/drivers/target/iscsi/iscsi_target_login.c +++ b/drivers/target/iscsi/iscsi_target_login.c @@ -348,8 +348,7 @@ static int iscsi_login_zero_tsih_s1( pr_err("idr_alloc() for sess_idr failed\n"); iscsit_tx_login_rsp(conn, ISCSI_STATUS_CLS_TARGET_ERR, ISCSI_LOGIN_STATUS_NO_RESOURCES); - kfree(sess); - return -ENOMEM; + goto free_sess; }
sess->creation_time = get_jiffies_64(); @@ -365,20 +364,28 @@ static int iscsi_login_zero_tsih_s1( ISCSI_LOGIN_STATUS_NO_RESOURCES); pr_err("Unable to allocate memory for" " struct iscsi_sess_ops.\n"); - kfree(sess); - return -ENOMEM; + goto remove_idr; }
sess->se_sess = transport_init_session(TARGET_PROT_NORMAL); if (IS_ERR(sess->se_sess)) { iscsit_tx_login_rsp(conn, ISCSI_STATUS_CLS_TARGET_ERR, ISCSI_LOGIN_STATUS_NO_RESOURCES); - kfree(sess->sess_ops); - kfree(sess); - return -ENOMEM; + goto free_ops; }
return 0; + +free_ops: + kfree(sess->sess_ops); +remove_idr: + spin_lock_bh(&sess_idr_lock); + idr_remove(&sess_idr, sess->session_index); + spin_unlock_bh(&sess_idr_lock); +free_sess: + kfree(sess); + conn->sess = NULL; + return -ENOMEM; }
static int iscsi_login_zero_tsih_s2( @@ -1161,13 +1168,13 @@ void iscsi_target_login_sess_out(struct ISCSI_LOGIN_STATUS_INIT_ERR); if (!zero_tsih || !conn->sess) goto old_sess_out; - if (conn->sess->se_sess) - transport_free_session(conn->sess->se_sess); - if (conn->sess->session_index != 0) { - spin_lock_bh(&sess_idr_lock); - idr_remove(&sess_idr, conn->sess->session_index); - spin_unlock_bh(&sess_idr_lock); - } + + transport_free_session(conn->sess->se_sess); + + spin_lock_bh(&sess_idr_lock); + idr_remove(&sess_idr, conn->sess->session_index); + spin_unlock_bh(&sess_idr_lock); + kfree(conn->sess->sess_ops); kfree(conn->sess); conn->sess = NULL;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Boris Brezillon boris.brezillon@bootlin.com
commit 20366e19e28f9954b25580c020d7a4e0db6055c4 upstream.
Modern NAND controller drivers implement ->exec_op() instead of ->cmdfunc(), make sure we don't end up with a NULL pointer dereference when hynix_nand_reg_write_op() is called.
Fixes: 8878b126df76 ("mtd: nand: add ->exec_op() implementation") Cc: stable@vger.kernel.org Signed-off-by: Boris Brezillon boris.brezillon@bootlin.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mtd/nand/raw/nand_hynix.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/drivers/mtd/nand/raw/nand_hynix.c +++ b/drivers/mtd/nand/raw/nand_hynix.c @@ -100,6 +100,16 @@ static int hynix_nand_reg_write_op(struc struct mtd_info *mtd = nand_to_mtd(chip); u16 column = ((u16)addr << 8) | addr;
+ if (chip->exec_op) { + struct nand_op_instr instrs[] = { + NAND_OP_ADDR(1, &addr, 0), + NAND_OP_8BIT_DATA_OUT(1, &val, 0), + }; + struct nand_operation op = NAND_OPERATION(instrs); + + return nand_exec_op(chip, &op); + } + chip->cmdfunc(mtd, NAND_CMD_NONE, column, -1); chip->write_byte(mtd, val);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Boris Brezillon boris.brezillon@bootlin.com
commit 79e1ca37cc0c056f224cc1dd4a301b9dc2f94167 upstream.
chip->read_buf is left unassigned since commit 4da712e70294 ("mtd: nand: fsmc: use ->exec_op()"), leading to a NULL pointer dereference when it's called from fsmc_read_page_hwecc(). Fix that by using the appropriate helper to read data out of the NAND.
Fixes: 4da712e70294 ("mtd: nand: fsmc: use ->exec_op()") Cc: stable@vger.kernel.org Signed-off-by: Boris Brezillon boris.brezillon@bootlin.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mtd/nand/raw/fsmc_nand.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mtd/nand/raw/fsmc_nand.c +++ b/drivers/mtd/nand/raw/fsmc_nand.c @@ -740,7 +740,7 @@ static int fsmc_read_page_hwecc(struct m for (i = 0, s = 0; s < eccsteps; s++, i += eccbytes, p += eccsize) { nand_read_page_op(chip, page, s * eccsize, NULL, 0); chip->ecc.hwctl(mtd, NAND_ECC_READ); - chip->read_buf(mtd, p, eccsize); + nand_read_data_op(chip, p, eccsize, false);
for (j = 0; j < eccbytes;) { struct mtd_oob_region oobregion;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Mack daniel@zonque.org
commit bd9c3f9b3c00da322b5e784e820533f1598f690a upstream.
This patch restores the suspend and resume hooks that the old driver used to have. Apart from stopping and starting the clocks, the resume callback also nullifies the selected_chip pointer, so the next command that is issued will re-select the chip and thereby restore the timing registers.
Factor out some code from marvell_nfc_init() into a new function marvell_nfc_reset() and also call it at resume time to reset some registers that don't retain their contents during low-power mode.
Without this patch, a PXA3xx based system would cough up an error similar to the one below after resume.
[ 44.660162] marvell-nfc 43100000.nand-controller: Timeout waiting for RB signal [ 44.671492] ubi0 error: ubi_io_write: error -110 while writing 2048 bytes to PEB 102:38912, written 0 bytes [ 44.682887] CPU: 0 PID: 1417 Comm: remote-control Not tainted 4.18.0-rc2+ #344 [ 44.691197] Hardware name: Marvell PXA3xx (Device Tree Support) [ 44.697111] Backtrace: [ 44.699593] [<c0106458>] (dump_backtrace) from [<c0106718>] (show_stack+0x18/0x1c) [ 44.708931] r7:00000800 r6:00009800 r5:00000066 r4:c6139000 [ 44.715833] [<c0106700>] (show_stack) from [<c0678a60>] (dump_stack+0x20/0x28) [ 44.724206] [<c0678a40>] (dump_stack) from [<c0456cbc>] (ubi_io_write+0x3d4/0x630) [ 44.732925] [<c04568e8>] (ubi_io_write) from [<c0454428>] (ubi_eba_write_leb+0x690/0x6fc) ...
Fixes: 02f26ecf8c77 ("mtd: nand: add reworked Marvell NAND controller driver") Cc: stable@vger.kernel.org Signed-off-by: Daniel Mack daniel@zonque.org Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mtd/nand/raw/marvell_nand.c | 73 ++++++++++++++++++++++++++++++------ 1 file changed, 62 insertions(+), 11 deletions(-)
--- a/drivers/mtd/nand/raw/marvell_nand.c +++ b/drivers/mtd/nand/raw/marvell_nand.c @@ -2677,6 +2677,21 @@ static int marvell_nfc_init_dma(struct m return 0; }
+static void marvell_nfc_reset(struct marvell_nfc *nfc) +{ + /* + * ECC operations and interruptions are only enabled when specifically + * needed. ECC shall not be activated in the early stages (fails probe). + * Arbiter flag, even if marked as "reserved", must be set (empirical). + * SPARE_EN bit must always be set or ECC bytes will not be at the same + * offset in the read page and this will fail the protection. + */ + writel_relaxed(NDCR_ALL_INT | NDCR_ND_ARB_EN | NDCR_SPARE_EN | + NDCR_RD_ID_CNT(NFCV1_READID_LEN), nfc->regs + NDCR); + writel_relaxed(0xFFFFFFFF, nfc->regs + NDSR); + writel_relaxed(0, nfc->regs + NDECCCTRL); +} + static int marvell_nfc_init(struct marvell_nfc *nfc) { struct device_node *np = nfc->dev->of_node; @@ -2715,17 +2730,7 @@ static int marvell_nfc_init(struct marve if (!nfc->caps->is_nfcv2) marvell_nfc_init_dma(nfc);
- /* - * ECC operations and interruptions are only enabled when specifically - * needed. ECC shall not be activated in the early stages (fails probe). - * Arbiter flag, even if marked as "reserved", must be set (empirical). - * SPARE_EN bit must always be set or ECC bytes will not be at the same - * offset in the read page and this will fail the protection. - */ - writel_relaxed(NDCR_ALL_INT | NDCR_ND_ARB_EN | NDCR_SPARE_EN | - NDCR_RD_ID_CNT(NFCV1_READID_LEN), nfc->regs + NDCR); - writel_relaxed(0xFFFFFFFF, nfc->regs + NDSR); - writel_relaxed(0, nfc->regs + NDECCCTRL); + marvell_nfc_reset(nfc);
return 0; } @@ -2840,6 +2845,51 @@ static int marvell_nfc_remove(struct pla return 0; }
+static int __maybe_unused marvell_nfc_suspend(struct device *dev) +{ + struct marvell_nfc *nfc = dev_get_drvdata(dev); + struct marvell_nand_chip *chip; + + list_for_each_entry(chip, &nfc->chips, node) + marvell_nfc_wait_ndrun(&chip->chip); + + clk_disable_unprepare(nfc->reg_clk); + clk_disable_unprepare(nfc->core_clk); + + return 0; +} + +static int __maybe_unused marvell_nfc_resume(struct device *dev) +{ + struct marvell_nfc *nfc = dev_get_drvdata(dev); + int ret; + + ret = clk_prepare_enable(nfc->core_clk); + if (ret < 0) + return ret; + + if (!IS_ERR(nfc->reg_clk)) { + ret = clk_prepare_enable(nfc->reg_clk); + if (ret < 0) + return ret; + } + + /* + * Reset nfc->selected_chip so the next command will cause the timing + * registers to be restored in marvell_nfc_select_chip(). + */ + nfc->selected_chip = NULL; + + /* Reset registers that have lost their contents */ + marvell_nfc_reset(nfc); + + return 0; +} + +static const struct dev_pm_ops marvell_nfc_pm_ops = { + SET_SYSTEM_SLEEP_PM_OPS(marvell_nfc_suspend, marvell_nfc_resume) +}; + static const struct marvell_nfc_caps marvell_armada_8k_nfc_caps = { .max_cs_nb = 4, .max_rb_nb = 2, @@ -2924,6 +2974,7 @@ static struct platform_driver marvell_nf .driver = { .name = "marvell-nfc", .of_match_table = marvell_nfc_of_ids, + .pm = &marvell_nfc_pm_ops, }, .id_table = marvell_nfc_platform_ids, .probe = marvell_nfc_probe,
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Abhishek Sahu absahu@codeaurora.org
commit 6f20070d51a20e489ef117603210264c6bcde8a5 upstream.
The BAM has 3 channels - tx, rx and command. command channel is used for register read/writes, tx channel for data writes and rx channel for data reads. Currently, the driver assumes the transfer completion once it gets all the command descriptors completed. Sometimes, there is race condition between data channel (tx/rx) and command channel completion. In these cases, the data present in buffer is not valid during small window between command descriptor completion and data descriptor completion.
This patch generates NAND transfer completion when both (Data and Command) DMA channels have completed all its DMA descriptors. It assigns completion callback in last DMA descriptors of that channel and wait for completion.
Fixes: 8d6b6d7e135e ("mtd: nand: qcom: support for command descriptor formation") Cc: stable@vger.kernel.org Signed-off-by: Abhishek Sahu absahu@codeaurora.org Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mtd/nand/raw/qcom_nandc.c | 53 +++++++++++++++++++++++++++++++++++++- 1 file changed, 52 insertions(+), 1 deletion(-)
--- a/drivers/mtd/nand/raw/qcom_nandc.c +++ b/drivers/mtd/nand/raw/qcom_nandc.c @@ -213,6 +213,8 @@ nandc_set_reg(nandc, NAND_READ_LOCATION_ #define QPIC_PER_CW_CMD_SGL 32 #define QPIC_PER_CW_DATA_SGL 8
+#define QPIC_NAND_COMPLETION_TIMEOUT msecs_to_jiffies(2000) + /* * Flags used in DMA descriptor preparation helper functions * (i.e. read_reg_dma/write_reg_dma/read_data_dma/write_data_dma) @@ -245,6 +247,11 @@ nandc_set_reg(nandc, NAND_READ_LOCATION_ * @tx_sgl_start - start index in data sgl for tx. * @rx_sgl_pos - current index in data sgl for rx. * @rx_sgl_start - start index in data sgl for rx. + * @wait_second_completion - wait for second DMA desc completion before making + * the NAND transfer completion. + * @txn_done - completion for NAND transfer. + * @last_data_desc - last DMA desc in data channel (tx/rx). + * @last_cmd_desc - last DMA desc in command channel. */ struct bam_transaction { struct bam_cmd_element *bam_ce; @@ -258,6 +265,10 @@ struct bam_transaction { u32 tx_sgl_start; u32 rx_sgl_pos; u32 rx_sgl_start; + bool wait_second_completion; + struct completion txn_done; + struct dma_async_tx_descriptor *last_data_desc; + struct dma_async_tx_descriptor *last_cmd_desc; };
/* @@ -504,6 +515,8 @@ alloc_bam_transaction(struct qcom_nand_c
bam_txn->data_sgl = bam_txn_buf;
+ init_completion(&bam_txn->txn_done); + return bam_txn; }
@@ -523,11 +536,33 @@ static void clear_bam_transaction(struct bam_txn->tx_sgl_start = 0; bam_txn->rx_sgl_pos = 0; bam_txn->rx_sgl_start = 0; + bam_txn->last_data_desc = NULL; + bam_txn->wait_second_completion = false;
sg_init_table(bam_txn->cmd_sgl, nandc->max_cwperpage * QPIC_PER_CW_CMD_SGL); sg_init_table(bam_txn->data_sgl, nandc->max_cwperpage * QPIC_PER_CW_DATA_SGL); + + reinit_completion(&bam_txn->txn_done); +} + +/* Callback for DMA descriptor completion */ +static void qpic_bam_dma_done(void *data) +{ + struct bam_transaction *bam_txn = data; + + /* + * In case of data transfer with NAND, 2 callbacks will be generated. + * One for command channel and another one for data channel. + * If current transaction has data descriptors + * (i.e. wait_second_completion is true), then set this to false + * and wait for second DMA descriptor completion. + */ + if (bam_txn->wait_second_completion) + bam_txn->wait_second_completion = false; + else + complete(&bam_txn->txn_done); }
static inline struct qcom_nand_host *to_qcom_nand_host(struct nand_chip *chip) @@ -756,6 +791,12 @@ static int prepare_bam_async_desc(struct
desc->dma_desc = dma_desc;
+ /* update last data/command descriptor */ + if (chan == nandc->cmd_chan) + bam_txn->last_cmd_desc = dma_desc; + else + bam_txn->last_data_desc = dma_desc; + list_add_tail(&desc->node, &nandc->desc_list);
return 0; @@ -1273,10 +1314,20 @@ static int submit_descs(struct qcom_nand cookie = dmaengine_submit(desc->dma_desc);
if (nandc->props->is_bam) { + bam_txn->last_cmd_desc->callback = qpic_bam_dma_done; + bam_txn->last_cmd_desc->callback_param = bam_txn; + if (bam_txn->last_data_desc) { + bam_txn->last_data_desc->callback = qpic_bam_dma_done; + bam_txn->last_data_desc->callback_param = bam_txn; + bam_txn->wait_second_completion = true; + } + dma_async_issue_pending(nandc->tx_chan); dma_async_issue_pending(nandc->rx_chan); + dma_async_issue_pending(nandc->cmd_chan);
- if (dma_sync_wait(nandc->cmd_chan, cookie) != DMA_COMPLETE) + if (!wait_for_completion_timeout(&bam_txn->txn_done, + QPIC_NAND_COMPLETION_TIMEOUT)) return -ETIMEDOUT; } else { if (dma_sync_wait(nandc->chan, cookie) != DMA_COMPLETE)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alberto Panizzo alberto@amarulasolutions.com
commit a64ad008980c65d38e6cf6858429c78e6b740c41 upstream.
Register, shift and mask were wrong according to datasheet.
Fixes: 115510053e5e ("clk: rockchip: add clock controller for the RK3399") Cc: stable@vger.kernel.org Signed-off-by: Alberto Panizzo alberto@amarulasolutions.com Signed-off-by: Anthony Brandon anthony@amarulasolutions.com Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clk/rockchip/clk-rk3399.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/clk/rockchip/clk-rk3399.c +++ b/drivers/clk/rockchip/clk-rk3399.c @@ -631,7 +631,7 @@ static struct rockchip_clk_branch rk3399 MUX(0, "clk_i2sout_src", mux_i2sch_p, CLK_SET_RATE_PARENT, RK3399_CLKSEL_CON(31), 0, 2, MFLAGS), COMPOSITE_NODIV(SCLK_I2S_8CH_OUT, "clk_i2sout", mux_i2sout_p, CLK_SET_RATE_PARENT, - RK3399_CLKSEL_CON(30), 8, 2, MFLAGS, + RK3399_CLKSEL_CON(31), 2, 1, MFLAGS, RK3399_CLKGATE_CON(8), 12, GFLAGS),
/* uart */
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo A. R. Silva gustavo@embeddedor.com
commit 450b6b9b169382205f88858541a8b79830262ce7 upstream.
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example:
struct foo { int stuff; void *entry[]; };
instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper:
instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
Notice that, currently, there is a bug during the allocation:
sizeof(npcm7xx_clk_data) should be sizeof(*npcm7xx_clk_data)
Fix this bug by using struct_size() in kzalloc()
This issue was detected with the help of Coccinelle.
Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Avi Fishman avifishman70@gmail.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clk/clk-npcm7xx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/clk/clk-npcm7xx.c +++ b/drivers/clk/clk-npcm7xx.c @@ -558,8 +558,8 @@ static void __init npcm7xx_clk_init(stru if (!clk_base) goto npcm7xx_init_error;
- npcm7xx_clk_data = kzalloc(sizeof(*npcm7xx_clk_data->hws) * - NPCM7XX_NUM_CLOCKS + sizeof(npcm7xx_clk_data), GFP_KERNEL); + npcm7xx_clk_data = kzalloc(struct_size(npcm7xx_clk_data, hws, + NPCM7XX_NUM_CLOCKS), GFP_KERNEL); if (!npcm7xx_clk_data) goto npcm7xx_init_np_err;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@oracle.com
commit 5e2e2f9f76e157063a656351728703cb02b068f1 upstream.
"count" needs to be signed for the error handling to work. I made "i" signed as well so they match.
Fixes: 02113ba93ea4 (PM / clk: Add support for obtaining clocks from device-tree) Cc: 4.6+ stable@vger.kernel.org # 4.6+ Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/base/power/clock_ops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/base/power/clock_ops.c +++ b/drivers/base/power/clock_ops.c @@ -185,7 +185,7 @@ EXPORT_SYMBOL_GPL(of_pm_clk_add_clk); int of_pm_clk_add_clks(struct device *dev) { struct clk **clks; - unsigned int i, count; + int i, count; int ret;
if (!dev || !dev->of_node)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: H. Nikolaus Schaller hns@goldelico.com
commit 932d47448c3caa0fa99e84d7f5bc302aa286efd8 upstream.
We did have sporadic problems in the pinctrl framework during boot where a pin group name unexpectedly became NULL leading to a NULL dereference in strcmp.
Detailled analysis of the failing cases did reveal that there were two devm allocated objects close to each other. The second one was the affected group_desc in pinmux and the first one was the psy_desc->properties buffer of the gab driver.
Review of the gab code showed that the address calculation for one memcpy() is wrong. It does
properties + sizeof(type) * index
but C is defined to do the index multiplication already for pointer + integer additions. Hence the factor was applied twice and the memcpy() does write outside of the properties buffer. Sometimes it happened to be the pinctrl and triggered the strcmp(NULL).
Anyways, it is overkill to use a memcpy() here instead of a simple assignment, which is easier to read and has less risk for wrong address calculations. So we change code to a simple assignment.
If we initialize the index to the first free location, we can even remove the local variable 'properties'.
This bug seems to exist right from the beginning in 3.7-rc1 in
commit e60fea794e6e ("power: battery: Generic battery driver using IIO")
Signed-off-by: H. Nikolaus Schaller hns@goldelico.com Cc: stable@vger.kernel.org Fixes: e60fea794e6e ("power: battery: Generic battery driver using IIO") Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/power/supply/generic-adc-battery.c | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-)
--- a/drivers/power/supply/generic-adc-battery.c +++ b/drivers/power/supply/generic-adc-battery.c @@ -241,10 +241,9 @@ static int gab_probe(struct platform_dev struct power_supply_desc *psy_desc; struct power_supply_config psy_cfg = {}; struct gab_platform_data *pdata = pdev->dev.platform_data; - enum power_supply_property *properties; int ret = 0; int chan; - int index = 0; + int index = ARRAY_SIZE(gab_props);
adc_bat = devm_kzalloc(&pdev->dev, sizeof(*adc_bat), GFP_KERNEL); if (!adc_bat) { @@ -278,8 +277,6 @@ static int gab_probe(struct platform_dev }
memcpy(psy_desc->properties, gab_props, sizeof(gab_props)); - properties = (enum power_supply_property *) - ((char *)psy_desc->properties + sizeof(gab_props));
/* * getting channel from iio and copying the battery properties @@ -293,15 +290,12 @@ static int gab_probe(struct platform_dev adc_bat->channel[chan] = NULL; } else { /* copying properties for supported channels only */ - memcpy(properties + sizeof(*(psy_desc->properties)) * index, - &gab_dyn_props[chan], - sizeof(gab_dyn_props[chan])); - index++; + psy_desc->properties[index++] = gab_dyn_props[chan]; } }
/* none of the channels are supported so let's bail out */ - if (index == 0) { + if (index == ARRAY_SIZE(gab_props)) { ret = -ENODEV; goto second_mem_fail; } @@ -312,7 +306,7 @@ static int gab_probe(struct platform_dev * as come channels may be not be supported by the device.So * we need to take care of that. */ - psy_desc->num_properties = ARRAY_SIZE(gab_props) + index; + psy_desc->num_properties = index;
adc_bat->psy = power_supply_register(&pdev->dev, psy_desc, &psy_cfg); if (IS_ERR(adc_bat->psy)) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: H. Nikolaus Schaller hns@goldelico.com
commit a427503edaaed9b75ed9746a654cece7e93e60a8 upstream.
If an iio channel defines a basic property, there are duplicate entries in /sys/class/power/*/uevent.
So add a check to avoid duplicates. Since all channels may be duplicates, we have to modify the related error check.
Signed-off-by: H. Nikolaus Schaller hns@goldelico.com Cc: stable@vger.kernel.org Fixes: e60fea794e6e ("power: battery: Generic battery driver using IIO") Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/power/supply/generic-adc-battery.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)
--- a/drivers/power/supply/generic-adc-battery.c +++ b/drivers/power/supply/generic-adc-battery.c @@ -244,6 +244,7 @@ static int gab_probe(struct platform_dev int ret = 0; int chan; int index = ARRAY_SIZE(gab_props); + bool any = false;
adc_bat = devm_kzalloc(&pdev->dev, sizeof(*adc_bat), GFP_KERNEL); if (!adc_bat) { @@ -290,12 +291,22 @@ static int gab_probe(struct platform_dev adc_bat->channel[chan] = NULL; } else { /* copying properties for supported channels only */ - psy_desc->properties[index++] = gab_dyn_props[chan]; + int index2; + + for (index2 = 0; index2 < index; index2++) { + if (psy_desc->properties[index2] == + gab_dyn_props[chan]) + break; /* already known */ + } + if (index2 == index) /* really new */ + psy_desc->properties[index++] = + gab_dyn_props[chan]; + any = true; } }
/* none of the channels are supported so let's bail out */ - if (index == ARRAY_SIZE(gab_props)) { + if (!any) { ret = -ENODEV; goto second_mem_fail; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vincent Whitchurch vincent.whitchurch@axis.com
commit cb9d7fd51d9fbb329d182423bd7b92d0f8cb0e01 upstream.
Some architectures need to use stop_machine() to patch functions for ftrace, and the assumption is that the stopped CPUs do not make function calls to traceable functions when they are in the stopped state.
Commit ce4f06dcbb5d ("stop_machine: Touch_nmi_watchdog() after MULTI_STOP_PREPARE") added calls to the watchdog touch functions from the stopped CPUs and those functions lack notrace annotations. This leads to crashes when enabling/disabling ftrace on ARM kernels built with the Thumb-2 instruction set.
Fix it by adding the necessary notrace annotations.
Fixes: ce4f06dcbb5d ("stop_machine: Touch_nmi_watchdog() after MULTI_STOP_PREPARE") Signed-off-by: Vincent Whitchurch vincent.whitchurch@axis.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: Peter Zijlstra peterz@infradead.org Cc: oleg@redhat.com Cc: tj@kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180821152507.18313-1-vincent.whitchurch@axis.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/watchdog.c | 4 ++-- kernel/watchdog_hld.c | 2 +- kernel/workqueue.c | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-)
--- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -266,7 +266,7 @@ static void __touch_watchdog(void) * entering idle state. This should only be used for scheduler events. * Use touch_softlockup_watchdog() for everything else. */ -void touch_softlockup_watchdog_sched(void) +notrace void touch_softlockup_watchdog_sched(void) { /* * Preemption can be enabled. It doesn't matter which CPU's timestamp @@ -275,7 +275,7 @@ void touch_softlockup_watchdog_sched(voi raw_cpu_write(watchdog_touch_ts, 0); }
-void touch_softlockup_watchdog(void) +notrace void touch_softlockup_watchdog(void) { touch_softlockup_watchdog_sched(); wq_watchdog_touch(raw_smp_processor_id()); --- a/kernel/watchdog_hld.c +++ b/kernel/watchdog_hld.c @@ -29,7 +29,7 @@ static struct cpumask dead_events_mask; static unsigned long hardlockup_allcpu_dumped; static atomic_t watchdog_cpus = ATOMIC_INIT(0);
-void arch_touch_nmi_watchdog(void) +notrace void arch_touch_nmi_watchdog(void) { /* * Using __raw here because some code paths have --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -5559,7 +5559,7 @@ static void wq_watchdog_timer_fn(struct mod_timer(&wq_watchdog_timer, jiffies + thresh); }
-void wq_watchdog_touch(int cpu) +notrace void wq_watchdog_touch(int cpu) { if (cpu >= 0) per_cpu(wq_watchdog_touched_cpu, cpu) = jiffies;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Scott Bauer scott.bauer@intel.com
commit 8f3fafc9c2f0ece10832c25f7ffcb07c97a32ad4 upstream.
Like d88b6d04: "cdrom: information leak in cdrom_ioctl_media_changed()"
There is another cast from unsigned long to int which causes a bounds check to fail with specially crafted input. The value is then used as an index in the slot array in cdrom_slot_status().
Signed-off-by: Scott Bauer scott.bauer@intel.com Signed-off-by: Scott Bauer sbauer@plzdonthack.me Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/cdrom/cdrom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/cdrom/cdrom.c +++ b/drivers/cdrom/cdrom.c @@ -2542,7 +2542,7 @@ static int cdrom_ioctl_drive_status(stru if (!CDROM_CAN(CDC_SELECT_DISC) || (arg == CDSL_CURRENT || arg == CDSL_NONE)) return cdi->ops->drive_status(cdi, CDSL_CURRENT); - if (((int)arg >= cdi->capacity)) + if (arg >= cdi->capacity) return -EINVAL; return cdrom_slot_status(cdi, arg); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
commit 342db04ae71273322f0011384a9ed414df8bdae4 upstream.
show_opcodes() is used both for dumping kernel instructions and for dumping user instructions. If userspace causes #PF by jumping to a kernel address, show_opcodes() can be reached with regs->ip controlled by the user, pointing to kernel code. Make sure that userspace can't trick us into dumping kernel memory into dmesg.
Fixes: 7cccf0725cf7 ("x86/dumpstack: Add a show_ip() function") Signed-off-by: Jann Horn jannh@google.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Borislav Petkov bp@suse.de Cc: "H. Peter Anvin" hpa@zytor.com Cc: Andy Lutomirski luto@kernel.org Cc: security@kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180828154901.112726-1-jannh@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/stacktrace.h | 2 +- arch/x86/kernel/dumpstack.c | 21 +++++++++++++++------ arch/x86/mm/fault.c | 2 +- 3 files changed, 17 insertions(+), 8 deletions(-)
--- a/arch/x86/include/asm/stacktrace.h +++ b/arch/x86/include/asm/stacktrace.h @@ -111,6 +111,6 @@ static inline unsigned long caller_frame return (unsigned long)frame; }
-void show_opcodes(u8 *rip, const char *loglvl); +void show_opcodes(struct pt_regs *regs, const char *loglvl); void show_ip(struct pt_regs *regs, const char *loglvl); #endif /* _ASM_X86_STACKTRACE_H */ --- a/arch/x86/kernel/dumpstack.c +++ b/arch/x86/kernel/dumpstack.c @@ -92,23 +92,32 @@ static void printk_stack_address(unsigne * Thus, the 2/3rds prologue and 64 byte OPCODE_BUFSIZE is just a random * guesstimate in attempt to achieve all of the above. */ -void show_opcodes(u8 *rip, const char *loglvl) +void show_opcodes(struct pt_regs *regs, const char *loglvl) { unsigned int code_prologue = OPCODE_BUFSIZE * 2 / 3; u8 opcodes[OPCODE_BUFSIZE]; - u8 *ip; + unsigned long ip; int i; + bool bad_ip;
printk("%sCode: ", loglvl);
- ip = (u8 *)rip - code_prologue; - if (probe_kernel_read(opcodes, ip, OPCODE_BUFSIZE)) { + ip = regs->ip - code_prologue; + + /* + * Make sure userspace isn't trying to trick us into dumping kernel + * memory by pointing the userspace instruction pointer at it. + */ + bad_ip = user_mode(regs) && + __chk_range_not_ok(ip, OPCODE_BUFSIZE, TASK_SIZE_MAX); + + if (bad_ip || probe_kernel_read(opcodes, (u8 *)ip, OPCODE_BUFSIZE)) { pr_cont("Bad RIP value.\n"); return; }
for (i = 0; i < OPCODE_BUFSIZE; i++, ip++) { - if (ip == rip) + if (ip == regs->ip) pr_cont("<%02x> ", opcodes[i]); else pr_cont("%02x ", opcodes[i]); @@ -123,7 +132,7 @@ void show_ip(struct pt_regs *regs, const #else printk("%sRIP: %04x:%pS\n", loglvl, (int)regs->cs, (void *)regs->ip); #endif - show_opcodes((u8 *)regs->ip, loglvl); + show_opcodes(regs, loglvl); }
void show_iret_regs(struct pt_regs *regs) --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -838,7 +838,7 @@ show_signal_msg(struct pt_regs *regs, un
printk(KERN_CONT "\n");
- show_opcodes((u8 *)regs->ip, loglvl); + show_opcodes(regs, loglvl); }
static void
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
commit 7288bde1f9df6c1475675419bdd7725ce84dec56 upstream.
Removing one of the two accesses of the maxphyaddr variable led to a harmless warning:
arch/x86/kvm/x86.c: In function 'kvm_set_mmio_spte_mask': arch/x86/kvm/x86.c:6563:6: error: unused variable 'maxphyaddr' [-Werror=unused-variable]
Removing the #ifdef seems to be the nicest workaround, as it makes the code look cleaner than adding another #ifdef.
Fixes: 28a1f3ac1d0c ("kvm: x86: Set highest physical address bits in non-present/reserved SPTEs") Signed-off-by: Arnd Bergmann arnd@arndb.de Cc: stable@vger.kernel.org # L1TF Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/x86.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -6512,14 +6512,12 @@ static void kvm_set_mmio_spte_mask(void) /* Set the present bit. */ mask |= 1ull;
-#ifdef CONFIG_X86_64 /* * If reserved bit is not supported, clear the present bit to disable * mmio page fault. */ - if (maxphyaddr == 52) + if (IS_ENABLED(CONFIG_X86_64) && maxphyaddr == 52) mask &= ~1ull; -#endif
kvm_mmu_set_mmio_spte_mask(mask, mask); }
On 09/03/18 18:55, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release.
Unfortunately this is busted. First blamed my custom patches, but as it turns out a 100% vanilla 4.18.6 build crashes as well. Single-user starts, but later when starting services and esp. autofs (I think - too much output) explodes with:
... Sep 3 20:19:36 ragnarok kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready Sep 3 20:19:38 ragnarok kernel: BUG: stack guard page was hit at 00000000ab58c99c (stack is 00000000382b9464..00000000d642b9d6) Sep 3 20:19:38 ragnarok kernel: kernel stack overflow (double-fault): 0000 [#1] SMP Sep 3 20:19:38 ragnarok kernel: CPU: 4 PID: 3634 Comm: automount Tainted: G O 4.18.6 #1 Sep 3 20:19:38 ragnarok kernel: Hardware name: Gigabyte Technology Co., Ltd. P67-DS3-B3/P67-DS3-B3, BIOS F1 05/06/2011 Sep 3 20:19:38 ragnarok kernel: RIP: 0010:flush_tlb_func_common.constprop.4+0x23/0x260 Sep 3 20:19:38 ragnarok kernel: Code: 0b eb e5 0f 1f 40 00 66 66 66 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 e4 f0 48 83 ec 20 65 48 8b 04 25 28 00 00 00 <48> 89 44 24 18 31 c0 65 66 8b 1d 96 fd fc 7e 0f b7 c3 65 48 8b 15 Sep 3 20:19:38 ragnarok kernel: RSP: 0018:ffffc9000326bfe0 EFLAGS: 00010082 Sep 3 20:19:38 ragnarok kernel: RAX: cf0a75e3a0e78e00 RBX: ffff880601006cc0 RCX: 0000000000000000 Sep 3 20:19:38 ragnarok kernel: RDX: 00007fb464e7e000 RSI: 0000000000000003 RDI: ffffc9000326c040 Sep 3 20:19:38 ragnarok kernel: RBP: ffffc9000326c030 R08: 00000005fca490e7 R09: 00000000004fa811 Sep 3 20:19:38 ragnarok kernel: R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000004 Sep 3 20:19:38 ragnarok kernel: R13: ffff8805fbab7600 R14: ffff880601006cc0 R15: ffff880602dfb540 Sep 3 20:19:38 ragnarok kernel: FS: 00007fb469245240(0000) GS:ffff88061f500000(0000) knlGS:0000000000000000 Sep 3 20:19:38 ragnarok kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 3 20:19:38 ragnarok kernel: CR2: ffffc9000326bfd8 CR3: 00000005feadb001 CR4: 00000000000606e0 Sep 3 20:19:38 ragnarok kernel: Call Trace: Sep 3 20:19:38 ragnarok kernel: flush_tlb_mm_range+0xff/0x110 Sep 3 20:19:38 ragnarok kernel: ? cpumask_any_but+0x1f/0x40 Sep 3 20:19:38 ragnarok kernel: ? cpumask_any_but+0x1f/0x40 Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x76/0xc0 Sep 3 20:19:38 ragnarok kernel: tlb_table_flush.part.13+0xe/0x30 Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x54/0xc0 ..a few hundred times.. Sep 3 20:19:38 ragnarok kernel: tlb_table_flush.part.13+0xe/0x30 Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x54/0xc0 Sep 3 20:19:38 ragnarok kernel: arch_tlb_finish_mmu+0x3a/0x70 Sep 3 20:19:38 ragnarok kernel: tlb_finish_mmu+0x1f/0x30 Sep 3 20:19:38 ragnarok kernel: unmap_region+0xdd/0x110 Sep 3 20:19:38 ragnarok kernel: ? __vma_rb_erase+0x128/0x250 Sep 3 20:19:38 ragnarok kernel: do_munmap+0x273/0x3f0 Sep 3 20:19:38 ragnarok kernel: vm_munmap+0x5f/0xa0 Sep 3 20:19:38 ragnarok kernel: __x64_sys_munmap+0x22/0x30 Sep 3 20:19:38 ragnarok kernel: do_syscall_64+0x3e/0xe0 Sep 3 20:19:38 ragnarok kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 Sep 3 20:19:38 ragnarok kernel: RIP: 0033:0x7fb469081187 Sep 3 20:19:38 ragnarok kernel: Code: ff ff ff f7 d8 89 05 58 df 20 00 48 c7 c0 ff ff ff ff eb 8a 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8d 0d 29 df 20 00 f7 d8 89 01 48 83 Sep 3 20:19:38 ragnarok kernel: RSP: 002b:00007ffef83ba548 EFLAGS: 00000206 ORIG_RAX: 000000000000000b Sep 3 20:19:38 ragnarok kernel: RAX: ffffffffffffffda RBX: 0000562a1dca9010 RCX: 00007fb469081187 Sep 3 20:19:38 ragnarok kernel: RDX: 0000000000000002 RSI: 0000000000204028 RDI: 00007fb464c79000 Sep 3 20:19:38 ragnarok kernel: RBP: 00007ffef83ba720 R08: 00007fb46928e930 R09: 0000000000000000 Sep 3 20:19:38 ragnarok kernel: R10: 00007fb464e7d000 R11: 0000000000000206 R12: 00007ffef83ba654 Sep 3 20:19:38 ragnarok kernel: R13: 00007ffef83ba610 R14: 00007ffef83ba655 R15: 00007fb46928e000 Sep 3 20:19:38 ragnarok kernel: Modules linked in: autofs4 tcp_bbr sch_fq_codel pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) it87 hwmon_vid x86_pkg_temp_thermal uvcvideo videobuf2_vmalloc videobuf2_memops snd_usb_audio videobuf2_v4l2 snd_hwdep bfq snd_usbmidi_lib videodev snd_rawmidi coretemp snd_seq_device videobuf2_common radeon usbhid kvm_intel i2c_algo_bit kvm snd_hda_codec_realtek irqbypass drm_kms_helper snd_hda_codec_hdmi snd_hda_codec_generic pcbc syscopyarea sysfillrect sysimgblt mq_deadline fb_sys_fops ttm snd_hda_intel aesni_intel snd_hda_codec drm snd_hda_core aes_x86_64 crypto_simd drm_panel_orientation_quirks cryptd snd_pcm glue_helper backlight snd_timer snd i2c_i801 soundcore i2c_core r8169 parport_pc parport mii Sep 3 20:19:38 ragnarok kernel: ---[ end trace cf25033b43d98311 ]--- Sep 3 20:19:38 ragnarok kernel: RIP: 0010:flush_tlb_func_common.constprop.4+0x23/0x260 Sep 3 20:19:38 ragnarok kernel: Code: 0b eb e5 0f 1f 40 00 66 66 66 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 e4 f0 48 83 ec 20 65 48 8b 04 25 28 00 00 00 <48> 89 44 24 18 31 c0 65 66 8b 1d 96 fd fc 7e 0f b7 c3 65 48 8b 15 Sep 3 20:19:38 ragnarok kernel: RSP: 0018:ffffc9000326bfe0 EFLAGS: 00010082 Sep 3 20:19:38 ragnarok kernel: RAX: cf0a75e3a0e78e00 RBX: ffff880601006cc0 RCX: 0000000000000000 Sep 3 20:19:38 ragnarok kernel: RDX: 00007fb464e7e000 RSI: 0000000000000003 RDI: ffffc9000326c040 Sep 3 20:19:38 ragnarok kernel: RBP: ffffc9000326c030 R08: 00000005fca490e7 R09: 00000000004fa811 Sep 3 20:19:38 ragnarok kernel: R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000004 Sep 3 20:19:38 ragnarok kernel: R13: ffff8805fbab7600 R14: ffff880601006cc0 R15: ffff880602dfb540 Sep 3 20:19:38 ragnarok kernel: FS: 00007fb469245240(0000) GS:ffff88061f500000(0000) knlGS:0000000000000000 Sep 3 20:19:38 ragnarok kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 3 20:19:38 ragnarok kernel: CR2: ffffc9000326bfd8 CR3: 00000005feadb001 CR4: 00000000000606e0 Sep 3 20:19:40 ragnarok kernel: r8169 0000:04:00.0 eth0: link up Sep 3 20:19:40 ragnarok kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
This is repeatable; full log is available on request.
Reverting "mm-tlb-x86-mm-support-invalidating-tlb-caches-for-rcu_table_free" makes everything work.
I'm now going back to my custom tree with lazy TLB handling, that worked as advertised. :D
cheers Holger
Le 3/09/18 à 20:39, Holger Hoffstätte a écrit :
On 09/03/18 18:55, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release.
Unfortunately this is busted. First blamed my custom patches, but as it turns out a 100% vanilla 4.18.6 build crashes as well. Single-user starts, but later when starting services and esp. autofs (I think - too much output) explodes with:
... Sep 3 20:19:36 ragnarok kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready Sep 3 20:19:38 ragnarok kernel: BUG: stack guard page was hit at 00000000ab58c99c (stack is 00000000382b9464..00000000d642b9d6) Sep 3 20:19:38 ragnarok kernel: kernel stack overflow (double-fault): 0000 [#1] SMP Sep 3 20:19:38 ragnarok kernel: CPU: 4 PID: 3634 Comm: automount Tainted: G O 4.18.6 #1 Sep 3 20:19:38 ragnarok kernel: Hardware name: Gigabyte Technology Co., Ltd. P67-DS3-B3/P67-DS3-B3, BIOS F1 05/06/2011 Sep 3 20:19:38 ragnarok kernel: RIP: 0010:flush_tlb_func_common.constprop.4+0x23/0x260 Sep 3 20:19:38 ragnarok kernel: Code: 0b eb e5 0f 1f 40 00 66 66 66 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 e4 f0 48 83 ec 20 65 48 8b 04 25 28 00 00 00 <48> 89 44 24 18 31 c0 65 66 8b 1d 96 fd fc 7e 0f b7 c3 65 48 8b 15 Sep 3 20:19:38 ragnarok kernel: RSP: 0018:ffffc9000326bfe0 EFLAGS: 00010082 Sep 3 20:19:38 ragnarok kernel: RAX: cf0a75e3a0e78e00 RBX: ffff880601006cc0 RCX: 0000000000000000 Sep 3 20:19:38 ragnarok kernel: RDX: 00007fb464e7e000 RSI: 0000000000000003 RDI: ffffc9000326c040 Sep 3 20:19:38 ragnarok kernel: RBP: ffffc9000326c030 R08: 00000005fca490e7 R09: 00000000004fa811 Sep 3 20:19:38 ragnarok kernel: R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000004 Sep 3 20:19:38 ragnarok kernel: R13: ffff8805fbab7600 R14: ffff880601006cc0 R15: ffff880602dfb540 Sep 3 20:19:38 ragnarok kernel: FS: 00007fb469245240(0000) GS:ffff88061f500000(0000) knlGS:0000000000000000 Sep 3 20:19:38 ragnarok kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 3 20:19:38 ragnarok kernel: CR2: ffffc9000326bfd8 CR3: 00000005feadb001 CR4: 00000000000606e0 Sep 3 20:19:38 ragnarok kernel: Call Trace: Sep 3 20:19:38 ragnarok kernel: flush_tlb_mm_range+0xff/0x110 Sep 3 20:19:38 ragnarok kernel: ? cpumask_any_but+0x1f/0x40 Sep 3 20:19:38 ragnarok kernel: ? cpumask_any_but+0x1f/0x40 Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x76/0xc0 Sep 3 20:19:38 ragnarok kernel: tlb_table_flush.part.13+0xe/0x30 Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x54/0xc0 ..a few hundred times.. Sep 3 20:19:38 ragnarok kernel: tlb_table_flush.part.13+0xe/0x30 Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x54/0xc0 Sep 3 20:19:38 ragnarok kernel: arch_tlb_finish_mmu+0x3a/0x70 Sep 3 20:19:38 ragnarok kernel: tlb_finish_mmu+0x1f/0x30 Sep 3 20:19:38 ragnarok kernel: unmap_region+0xdd/0x110 Sep 3 20:19:38 ragnarok kernel: ? __vma_rb_erase+0x128/0x250 Sep 3 20:19:38 ragnarok kernel: do_munmap+0x273/0x3f0 Sep 3 20:19:38 ragnarok kernel: vm_munmap+0x5f/0xa0 Sep 3 20:19:38 ragnarok kernel: __x64_sys_munmap+0x22/0x30 Sep 3 20:19:38 ragnarok kernel: do_syscall_64+0x3e/0xe0 Sep 3 20:19:38 ragnarok kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 Sep 3 20:19:38 ragnarok kernel: RIP: 0033:0x7fb469081187 Sep 3 20:19:38 ragnarok kernel: Code: ff ff ff f7 d8 89 05 58 df 20 00 48 c7 c0 ff ff ff ff eb 8a 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8d 0d 29 df 20 00 f7 d8 89 01 48 83 Sep 3 20:19:38 ragnarok kernel: RSP: 002b:00007ffef83ba548 EFLAGS: 00000206 ORIG_RAX: 000000000000000b Sep 3 20:19:38 ragnarok kernel: RAX: ffffffffffffffda RBX: 0000562a1dca9010 RCX: 00007fb469081187 Sep 3 20:19:38 ragnarok kernel: RDX: 0000000000000002 RSI: 0000000000204028 RDI: 00007fb464c79000 Sep 3 20:19:38 ragnarok kernel: RBP: 00007ffef83ba720 R08: 00007fb46928e930 R09: 0000000000000000 Sep 3 20:19:38 ragnarok kernel: R10: 00007fb464e7d000 R11: 0000000000000206 R12: 00007ffef83ba654 Sep 3 20:19:38 ragnarok kernel: R13: 00007ffef83ba610 R14: 00007ffef83ba655 R15: 00007fb46928e000 Sep 3 20:19:38 ragnarok kernel: Modules linked in: autofs4 tcp_bbr sch_fq_codel pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) it87 hwmon_vid x86_pkg_temp_thermal uvcvideo videobuf2_vmalloc videobuf2_memops snd_usb_audio videobuf2_v4l2 snd_hwdep bfq snd_usbmidi_lib videodev snd_rawmidi coretemp snd_seq_device videobuf2_common radeon usbhid kvm_intel i2c_algo_bit kvm snd_hda_codec_realtek irqbypass drm_kms_helper snd_hda_codec_hdmi snd_hda_codec_generic pcbc syscopyarea sysfillrect sysimgblt mq_deadline fb_sys_fops ttm snd_hda_intel aesni_intel snd_hda_codec drm snd_hda_core aes_x86_64 crypto_simd drm_panel_orientation_quirks cryptd snd_pcm glue_helper backlight snd_timer snd i2c_i801 soundcore i2c_core r8169 parport_pc parport mii Sep 3 20:19:38 ragnarok kernel: ---[ end trace cf25033b43d98311 ]--- Sep 3 20:19:38 ragnarok kernel: RIP: 0010:flush_tlb_func_common.constprop.4+0x23/0x260 Sep 3 20:19:38 ragnarok kernel: Code: 0b eb e5 0f 1f 40 00 66 66 66 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 e4 f0 48 83 ec 20 65 48 8b 04 25 28 00 00 00 <48> 89 44 24 18 31 c0 65 66 8b 1d 96 fd fc 7e 0f b7 c3 65 48 8b 15 Sep 3 20:19:38 ragnarok kernel: RSP: 0018:ffffc9000326bfe0 EFLAGS: 00010082 Sep 3 20:19:38 ragnarok kernel: RAX: cf0a75e3a0e78e00 RBX: ffff880601006cc0 RCX: 0000000000000000 Sep 3 20:19:38 ragnarok kernel: RDX: 00007fb464e7e000 RSI: 0000000000000003 RDI: ffffc9000326c040 Sep 3 20:19:38 ragnarok kernel: RBP: ffffc9000326c030 R08: 00000005fca490e7 R09: 00000000004fa811 Sep 3 20:19:38 ragnarok kernel: R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000004 Sep 3 20:19:38 ragnarok kernel: R13: ffff8805fbab7600 R14: ffff880601006cc0 R15: ffff880602dfb540 Sep 3 20:19:38 ragnarok kernel: FS: 00007fb469245240(0000) GS:ffff88061f500000(0000) knlGS:0000000000000000 Sep 3 20:19:38 ragnarok kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 3 20:19:38 ragnarok kernel: CR2: ffffc9000326bfd8 CR3: 00000005feadb001 CR4: 00000000000606e0 Sep 3 20:19:40 ragnarok kernel: r8169 0000:04:00.0 eth0: link up Sep 3 20:19:40 ragnarok kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
This is repeatable; full log is available on request.
Reverting "mm-tlb-x86-mm-support-invalidating-tlb-caches-for-rcu_table_free" makes everything work.
I'm now going back to my custom tree with lazy TLB handling, that worked as advertised. :D
cheers Holger
I confirm this also for the 4.14 tree. I get the same errors and reverting the same patch also fixes the problem.
François Valenduc
On 4 September 2018 at 02:46, François Valenduc francoisvalenduc@gmail.com wrote:
Le 3/09/18 à 20:39, Holger Hoffstätte a écrit :
On 09/03/18 18:55, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release.
Unfortunately this is busted. First blamed my custom patches, but as it turns out a 100% vanilla 4.18.6 build crashes as well. Single-user starts, but later when starting services and esp. autofs (I think - too much output) explodes with:
... Sep 3 20:19:36 ragnarok kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready Sep 3 20:19:38 ragnarok kernel: BUG: stack guard page was hit at 00000000ab58c99c (stack is 00000000382b9464..00000000d642b9d6) Sep 3 20:19:38 ragnarok kernel: kernel stack overflow (double-fault): 0000 [#1] SMP Sep 3 20:19:38 ragnarok kernel: CPU: 4 PID: 3634 Comm: automount Tainted: G O 4.18.6 #1 Sep 3 20:19:38 ragnarok kernel: Hardware name: Gigabyte Technology Co., Ltd. P67-DS3-B3/P67-DS3-B3, BIOS F1 05/06/2011 Sep 3 20:19:38 ragnarok kernel: RIP: 0010:flush_tlb_func_common.constprop.4+0x23/0x260 Sep 3 20:19:38 ragnarok kernel: Code: 0b eb e5 0f 1f 40 00 66 66 66 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 e4 f0 48 83 ec 20 65 48 8b 04 25 28 00 00 00 <48> 89 44 24 18 31 c0 65 66 8b 1d 96 fd fc 7e 0f b7 c3 65 48 8b 15 Sep 3 20:19:38 ragnarok kernel: RSP: 0018:ffffc9000326bfe0 EFLAGS: 00010082 Sep 3 20:19:38 ragnarok kernel: RAX: cf0a75e3a0e78e00 RBX: ffff880601006cc0 RCX: 0000000000000000 Sep 3 20:19:38 ragnarok kernel: RDX: 00007fb464e7e000 RSI: 0000000000000003 RDI: ffffc9000326c040 Sep 3 20:19:38 ragnarok kernel: RBP: ffffc9000326c030 R08: 00000005fca490e7 R09: 00000000004fa811 Sep 3 20:19:38 ragnarok kernel: R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000004 Sep 3 20:19:38 ragnarok kernel: R13: ffff8805fbab7600 R14: ffff880601006cc0 R15: ffff880602dfb540 Sep 3 20:19:38 ragnarok kernel: FS: 00007fb469245240(0000) GS:ffff88061f500000(0000) knlGS:0000000000000000 Sep 3 20:19:38 ragnarok kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 3 20:19:38 ragnarok kernel: CR2: ffffc9000326bfd8 CR3: 00000005feadb001 CR4: 00000000000606e0 Sep 3 20:19:38 ragnarok kernel: Call Trace: Sep 3 20:19:38 ragnarok kernel: flush_tlb_mm_range+0xff/0x110 Sep 3 20:19:38 ragnarok kernel: ? cpumask_any_but+0x1f/0x40 Sep 3 20:19:38 ragnarok kernel: ? cpumask_any_but+0x1f/0x40 Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x76/0xc0 Sep 3 20:19:38 ragnarok kernel: tlb_table_flush.part.13+0xe/0x30 Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x54/0xc0 ..a few hundred times.. Sep 3 20:19:38 ragnarok kernel: tlb_table_flush.part.13+0xe/0x30 Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x54/0xc0 Sep 3 20:19:38 ragnarok kernel: arch_tlb_finish_mmu+0x3a/0x70 Sep 3 20:19:38 ragnarok kernel: tlb_finish_mmu+0x1f/0x30 Sep 3 20:19:38 ragnarok kernel: unmap_region+0xdd/0x110 Sep 3 20:19:38 ragnarok kernel: ? __vma_rb_erase+0x128/0x250 Sep 3 20:19:38 ragnarok kernel: do_munmap+0x273/0x3f0 Sep 3 20:19:38 ragnarok kernel: vm_munmap+0x5f/0xa0 Sep 3 20:19:38 ragnarok kernel: __x64_sys_munmap+0x22/0x30 Sep 3 20:19:38 ragnarok kernel: do_syscall_64+0x3e/0xe0 Sep 3 20:19:38 ragnarok kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 Sep 3 20:19:38 ragnarok kernel: RIP: 0033:0x7fb469081187 Sep 3 20:19:38 ragnarok kernel: Code: ff ff ff f7 d8 89 05 58 df 20 00 48 c7 c0 ff ff ff ff eb 8a 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8d 0d 29 df 20 00 f7 d8 89 01 48 83 Sep 3 20:19:38 ragnarok kernel: RSP: 002b:00007ffef83ba548 EFLAGS: 00000206 ORIG_RAX: 000000000000000b Sep 3 20:19:38 ragnarok kernel: RAX: ffffffffffffffda RBX: 0000562a1dca9010 RCX: 00007fb469081187 Sep 3 20:19:38 ragnarok kernel: RDX: 0000000000000002 RSI: 0000000000204028 RDI: 00007fb464c79000 Sep 3 20:19:38 ragnarok kernel: RBP: 00007ffef83ba720 R08: 00007fb46928e930 R09: 0000000000000000 Sep 3 20:19:38 ragnarok kernel: R10: 00007fb464e7d000 R11: 0000000000000206 R12: 00007ffef83ba654 Sep 3 20:19:38 ragnarok kernel: R13: 00007ffef83ba610 R14: 00007ffef83ba655 R15: 00007fb46928e000 Sep 3 20:19:38 ragnarok kernel: Modules linked in: autofs4 tcp_bbr sch_fq_codel pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) it87 hwmon_vid x86_pkg_temp_thermal uvcvideo videobuf2_vmalloc videobuf2_memops snd_usb_audio videobuf2_v4l2 snd_hwdep bfq snd_usbmidi_lib videodev snd_rawmidi coretemp snd_seq_device videobuf2_common radeon usbhid kvm_intel i2c_algo_bit kvm snd_hda_codec_realtek irqbypass drm_kms_helper snd_hda_codec_hdmi snd_hda_codec_generic pcbc syscopyarea sysfillrect sysimgblt mq_deadline fb_sys_fops ttm snd_hda_intel aesni_intel snd_hda_codec drm snd_hda_core aes_x86_64 crypto_simd drm_panel_orientation_quirks cryptd snd_pcm glue_helper backlight snd_timer snd i2c_i801 soundcore i2c_core r8169 parport_pc parport mii Sep 3 20:19:38 ragnarok kernel: ---[ end trace cf25033b43d98311 ]--- Sep 3 20:19:38 ragnarok kernel: RIP: 0010:flush_tlb_func_common.constprop.4+0x23/0x260 Sep 3 20:19:38 ragnarok kernel: Code: 0b eb e5 0f 1f 40 00 66 66 66 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 e4 f0 48 83 ec 20 65 48 8b 04 25 28 00 00 00 <48> 89 44 24 18 31 c0 65 66 8b 1d 96 fd fc 7e 0f b7 c3 65 48 8b 15 Sep 3 20:19:38 ragnarok kernel: RSP: 0018:ffffc9000326bfe0 EFLAGS: 00010082 Sep 3 20:19:38 ragnarok kernel: RAX: cf0a75e3a0e78e00 RBX: ffff880601006cc0 RCX: 0000000000000000 Sep 3 20:19:38 ragnarok kernel: RDX: 00007fb464e7e000 RSI: 0000000000000003 RDI: ffffc9000326c040 Sep 3 20:19:38 ragnarok kernel: RBP: ffffc9000326c030 R08: 00000005fca490e7 R09: 00000000004fa811 Sep 3 20:19:38 ragnarok kernel: R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000004 Sep 3 20:19:38 ragnarok kernel: R13: ffff8805fbab7600 R14: ffff880601006cc0 R15: ffff880602dfb540 Sep 3 20:19:38 ragnarok kernel: FS: 00007fb469245240(0000) GS:ffff88061f500000(0000) knlGS:0000000000000000 Sep 3 20:19:38 ragnarok kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 3 20:19:38 ragnarok kernel: CR2: ffffc9000326bfd8 CR3: 00000005feadb001 CR4: 00000000000606e0 Sep 3 20:19:40 ragnarok kernel: r8169 0000:04:00.0 eth0: link up Sep 3 20:19:40 ragnarok kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
This is repeatable; full log is available on request.
Reverting "mm-tlb-x86-mm-support-invalidating-tlb-caches-for-rcu_table_free" makes everything work.
I'm now going back to my custom tree with lazy TLB handling, that worked as advertised. :D
cheers Holger
I confirm this also for the 4.14 tree. I get the same errors and reverting the same patch also fixes the problem.
I do see this crash log on 4.18.6-rc1, 4.18.6-rc1 run full log, https://lkft.validation.linaro.org/scheduler/job/404027#L3244
François Valenduc
Best regards Naresh Kamboju
On Mon, Sep 3, 2018 at 11:39 AM Holger Hoffstätte holger@applied-asynchrony.com wrote:
Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x76/0xc0 Sep 3 20:19:38 ragnarok kernel: tlb_table_flush.part.13+0xe/0x30 Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x54/0xc0 ..a few hundred times.. Sep 3 20:19:38 ragnarok kernel: tlb_table_flush.part.13+0xe/0x30 Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x54/0xc0 Sep 3 20:19:38 ragnarok kernel: arch_tlb_finish_mmu+0x3a/0x70 Sep 3 20:19:38 ragnarok kernel: tlb_finish_mmu+0x1f/0x30
Yeah, so what seems to have happened is that commit db7ddef30112 ("mm: move tlb_table_flush to tlb_flush_mmu_free") wasn't applied to the stable tree (because it wasn't an obvious dependency).
And without that, the backport of d86564a2f085 ("mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE") ends up with recursion from tlb_flush_mmu_tlbonly() calling tlb_table_flush(), which in turn calls tlb_table_invalidate(), which calls back to tlb_flush_mmu_tlbonly().
So you have endless recursion - at least until you run out of stack. Then, if you have VMAP_STACK enabled (x86-64 without KASAN), you get a nice clean kernel stack overflow message like you did.
Or if you have KASAN enabled and no VMAP stack, you just end up with random hangs and huge memory corruption as the recursion stomps all over your memory.
Linus
On Tue, Sep 04, 2018 at 10:12:13AM -0700, Linus Torvalds wrote:
On Mon, Sep 3, 2018 at 11:39 AM Holger Hoffstätte holger@applied-asynchrony.com wrote:
Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x76/0xc0 Sep 3 20:19:38 ragnarok kernel: tlb_table_flush.part.13+0xe/0x30 Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x54/0xc0 ..a few hundred times.. Sep 3 20:19:38 ragnarok kernel: tlb_table_flush.part.13+0xe/0x30 Sep 3 20:19:38 ragnarok kernel: tlb_flush_mmu_tlbonly+0x54/0xc0 Sep 3 20:19:38 ragnarok kernel: arch_tlb_finish_mmu+0x3a/0x70 Sep 3 20:19:38 ragnarok kernel: tlb_finish_mmu+0x1f/0x30
Yeah, so what seems to have happened is that commit db7ddef30112 ("mm: move tlb_table_flush to tlb_flush_mmu_free") wasn't applied to the stable tree (because it wasn't an obvious dependency).
And without that, the backport of d86564a2f085 ("mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE") ends up with recursion from tlb_flush_mmu_tlbonly() calling tlb_table_flush(), which in turn calls tlb_table_invalidate(), which calls back to tlb_flush_mmu_tlbonly().
So you have endless recursion - at least until you run out of stack. Then, if you have VMAP_STACK enabled (x86-64 without KASAN), you get a nice clean kernel stack overflow message like you did.
Or if you have KASAN enabled and no VMAP stack, you just end up with random hangs and huge memory corruption as the recursion stomps all over your memory.
Ok, I will go queue this patch up now, it was in my very-long "to-apply" queue, but I didn't catch the dependancy here.
thanks,
greg k-h
On Mon, Sep 03, 2018 at 06:55:44PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Sep 5 16:56:53 UTC 2018. Anything received after that time might be too late.
Not directly related to v4.18.6-rc1. I have seen the following hang several times with v4.18.5. It happens on a quite regular basis after a suspend-resume cycle. CPU is Ryzen 1700X.
Guenter
--- [ 9990.754641] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:155] [ 9990.762549] Modules linked in: ipt_REJECT nf_reject_ipv4 xt_multiport sp5100_tco squashfs iptable_filter snd_hda_codec_hdmi binfmt_misc edac_mce_amd kvm snd_hda_codec_realtek irqbypass snd_hda_codec_generic snd_seq_midi snd_seq_midi_event crct10dif_pclmul ghash_clmulni_intel snd_rawmidi aesni_intel snd_hda_intel aes_x86_64 crypto_simd cryptd glue_helper snd_hda_codec snd_hda_core wmi_bmof snd_hwdep snd_seq snd_pcm k10temp snd_seq_device snd_timer snd soundcore sch_fq_codel parport_pc sunrpc ppdev lp parport ip_tables x_tables autofs4 hid_generic nouveau mxm_wmi video ttm drm_kms_helper usbhid syscopyarea sysfillrect hid sysimgblt igb fb_sys_fops dca drm i2c_algo_bit i2c_piix4 i2c_core r8169 ahci mii libahci wmi [ 9990.762589] CPU: 5 PID: 155 Comm: kworker/5:1 Tainted: G L 4.18.5+ #1 [ 9990.762591] Hardware name: Gigabyte Technology Co., Ltd. AB350M-Gaming 3/AB350M-Gaming 3-CF, BIOS F23 08/08/2018 [ 9990.762596] Workqueue: events free_work [ 9990.762601] RIP: 0010:smp_call_function_many+0x208/0x270 [ 9990.762601] Code: e8 0d d1 77 00 3b 05 cb f0 24 01 0f 83 86 fe ff ff 48 63 d0 49 8b 0c 24 48 03 0c d5 00 f7 11 a7 8b 51 18 83 e2 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c7 0f b6 4d d0 4c 89 f2 4c 89 ee 44 89 [ 9990.762626] RSP: 0018:ffff95ebc3effd20 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13 [ 9990.762628] RAX: 000000000000000c RBX: ffff94eeded63cc8 RCX: ffff94eedef27bc0 [ 9990.762629] RDX: 0000000000000001 RSI: 0000000000000100 RDI: ffff94eeded63cc8 [ 9990.762630] RBP: ffff95ebc3effd60 R08: 00000000fffffff0 R09: 00000000000000ff [ 9990.762631] R10: ffff94eeded63ce8 R11: ffff94eeded63cc8 R12: ffff94eeded63cc0 [ 9990.762632] R13: ffffffffa6076150 R14: 0000000000000000 R15: 0000000000000100 [ 9990.762633] FS: 0000000000000000(0000) GS:ffff94eeded40000(0000) knlGS:0000000000000000 [ 9990.762635] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9990.762636] CR2: 0000000000a67000 CR3: 00000006f120c000 CR4: 00000000003406e0 [ 9990.762637] Call Trace: [ 9990.762642] ? load_new_mm_cr3+0xe0/0xe0 [ 9990.762644] on_each_cpu+0x2d/0x60 [ 9990.762647] flush_tlb_kernel_range+0x4b/0x80 [ 9990.762648] ? vunmap_page_range+0x1fe/0x310 [ 9990.762650] __purge_vmap_area_lazy+0x50/0xb0 [ 9990.762652] free_vmap_area_noflush+0x7d/0x90 [ 9990.762654] remove_vm_area+0x74/0x80 [ 9990.762656] __vunmap+0x3b/0xc0 [ 9990.762657] free_work+0x25/0x40 [ 9990.762660] process_one_work+0x15e/0x3f0 [ 9990.762662] worker_thread+0x4a/0x440 [ 9990.762664] kthread+0x105/0x140 [ 9990.762666] ? process_one_work+0x3f0/0x3f0 [ 9990.762668] ? kthread_destroy_worker+0x50/0x50 [ 9990.762670] ret_from_fork+0x22/0x40
On Tue, Sep 04, 2018 at 09:24:34AM -0700, Guenter Roeck wrote:
On Mon, Sep 03, 2018 at 06:55:44PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Sep 5 16:56:53 UTC 2018. Anything received after that time might be too late.
Not directly related to v4.18.6-rc1. I have seen the following hang several times with v4.18.5. It happens on a quite regular basis after a suspend-resume cycle. CPU is Ryzen 1700X.
Guenter
[ 9990.754641] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:155] [ 9990.762549] Modules linked in: ipt_REJECT nf_reject_ipv4 xt_multiport sp5100_tco squashfs iptable_filter snd_hda_codec_hdmi binfmt_misc edac_mce_amd kvm snd_hda_codec_realtek irqbypass snd_hda_codec_generic snd_seq_midi snd_seq_midi_event crct10dif_pclmul ghash_clmulni_intel snd_rawmidi aesni_intel snd_hda_intel aes_x86_64 crypto_simd cryptd glue_helper snd_hda_codec snd_hda_core wmi_bmof snd_hwdep snd_seq snd_pcm k10temp snd_seq_device snd_timer snd soundcore sch_fq_codel parport_pc sunrpc ppdev lp parport ip_tables x_tables autofs4 hid_generic nouveau mxm_wmi video ttm drm_kms_helper usbhid syscopyarea sysfillrect hid sysimgblt igb fb_sys_fops dca drm i2c_algo_bit i2c_piix4 i2c_core r8169 ahci mii libahci wmi [ 9990.762589] CPU: 5 PID: 155 Comm: kworker/5:1 Tainted: G L 4.18.5+ #1 [ 9990.762591] Hardware name: Gigabyte Technology Co., Ltd. AB350M-Gaming 3/AB350M-Gaming 3-CF, BIOS F23 08/08/2018 [ 9990.762596] Workqueue: events free_work [ 9990.762601] RIP: 0010:smp_call_function_many+0x208/0x270 [ 9990.762601] Code: e8 0d d1 77 00 3b 05 cb f0 24 01 0f 83 86 fe ff ff 48 63 d0 49 8b 0c 24 48 03 0c d5 00 f7 11 a7 8b 51 18 83 e2 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c7 0f b6 4d d0 4c 89 f2 4c 89 ee 44 89 [ 9990.762626] RSP: 0018:ffff95ebc3effd20 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13 [ 9990.762628] RAX: 000000000000000c RBX: ffff94eeded63cc8 RCX: ffff94eedef27bc0 [ 9990.762629] RDX: 0000000000000001 RSI: 0000000000000100 RDI: ffff94eeded63cc8 [ 9990.762630] RBP: ffff95ebc3effd60 R08: 00000000fffffff0 R09: 00000000000000ff [ 9990.762631] R10: ffff94eeded63ce8 R11: ffff94eeded63cc8 R12: ffff94eeded63cc0 [ 9990.762632] R13: ffffffffa6076150 R14: 0000000000000000 R15: 0000000000000100 [ 9990.762633] FS: 0000000000000000(0000) GS:ffff94eeded40000(0000) knlGS:0000000000000000 [ 9990.762635] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9990.762636] CR2: 0000000000a67000 CR3: 00000006f120c000 CR4: 00000000003406e0 [ 9990.762637] Call Trace: [ 9990.762642] ? load_new_mm_cr3+0xe0/0xe0 [ 9990.762644] on_each_cpu+0x2d/0x60 [ 9990.762647] flush_tlb_kernel_range+0x4b/0x80 [ 9990.762648] ? vunmap_page_range+0x1fe/0x310 [ 9990.762650] __purge_vmap_area_lazy+0x50/0xb0 [ 9990.762652] free_vmap_area_noflush+0x7d/0x90 [ 9990.762654] remove_vm_area+0x74/0x80 [ 9990.762656] __vunmap+0x3b/0xc0 [ 9990.762657] free_work+0x25/0x40 [ 9990.762660] process_one_work+0x15e/0x3f0 [ 9990.762662] worker_thread+0x4a/0x440 [ 9990.762664] kthread+0x105/0x140 [ 9990.762666] ? process_one_work+0x3f0/0x3f0 [ 9990.762668] ? kthread_destroy_worker+0x50/0x50 [ 9990.762670] ret_from_fork+0x22/0x40
Odd. Do you see this on Linus's tree?
thanks,
greg k-h
On 09/05/2018 02:01 AM, Greg Kroah-Hartman wrote:
On Tue, Sep 04, 2018 at 09:24:34AM -0700, Guenter Roeck wrote:
On Mon, Sep 03, 2018 at 06:55:44PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Sep 5 16:56:53 UTC 2018. Anything received after that time might be too late.
Not directly related to v4.18.6-rc1. I have seen the following hang several times with v4.18.5. It happens on a quite regular basis after a suspend-resume cycle. CPU is Ryzen 1700X.
Guenter
[ 9990.754641] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:155] [ 9990.762549] Modules linked in: ipt_REJECT nf_reject_ipv4 xt_multiport sp5100_tco squashfs iptable_filter snd_hda_codec_hdmi binfmt_misc edac_mce_amd kvm snd_hda_codec_realtek irqbypass snd_hda_codec_generic snd_seq_midi snd_seq_midi_event crct10dif_pclmul ghash_clmulni_intel snd_rawmidi aesni_intel snd_hda_intel aes_x86_64 crypto_simd cryptd glue_helper snd_hda_codec snd_hda_core wmi_bmof snd_hwdep snd_seq snd_pcm k10temp snd_seq_device snd_timer snd soundcore sch_fq_codel parport_pc sunrpc ppdev lp parport ip_tables x_tables autofs4 hid_generic nouveau mxm_wmi video ttm drm_kms_helper usbhid syscopyarea sysfillrect hid sysimgblt igb fb_sys_fops dca drm i2c_algo_bit i2c_piix4 i2c_core r8169 ahci mii libahci wmi [ 9990.762589] CPU: 5 PID: 155 Comm: kworker/5:1 Tainted: G L 4.18.5+ #1 [ 9990.762591] Hardware name: Gigabyte Technology Co., Ltd. AB350M-Gaming 3/AB350M-Gaming 3-CF, BIOS F23 08/08/2018 [ 9990.762596] Workqueue: events free_work [ 9990.762601] RIP: 0010:smp_call_function_many+0x208/0x270 [ 9990.762601] Code: e8 0d d1 77 00 3b 05 cb f0 24 01 0f 83 86 fe ff ff 48 63 d0 49 8b 0c 24 48 03 0c d5 00 f7 11 a7 8b 51 18 83 e2 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c7 0f b6 4d d0 4c 89 f2 4c 89 ee 44 89 [ 9990.762626] RSP: 0018:ffff95ebc3effd20 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13 [ 9990.762628] RAX: 000000000000000c RBX: ffff94eeded63cc8 RCX: ffff94eedef27bc0 [ 9990.762629] RDX: 0000000000000001 RSI: 0000000000000100 RDI: ffff94eeded63cc8 [ 9990.762630] RBP: ffff95ebc3effd60 R08: 00000000fffffff0 R09: 00000000000000ff [ 9990.762631] R10: ffff94eeded63ce8 R11: ffff94eeded63cc8 R12: ffff94eeded63cc0 [ 9990.762632] R13: ffffffffa6076150 R14: 0000000000000000 R15: 0000000000000100 [ 9990.762633] FS: 0000000000000000(0000) GS:ffff94eeded40000(0000) knlGS:0000000000000000 [ 9990.762635] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9990.762636] CR2: 0000000000a67000 CR3: 00000006f120c000 CR4: 00000000003406e0 [ 9990.762637] Call Trace: [ 9990.762642] ? load_new_mm_cr3+0xe0/0xe0 [ 9990.762644] on_each_cpu+0x2d/0x60 [ 9990.762647] flush_tlb_kernel_range+0x4b/0x80 [ 9990.762648] ? vunmap_page_range+0x1fe/0x310 [ 9990.762650] __purge_vmap_area_lazy+0x50/0xb0 [ 9990.762652] free_vmap_area_noflush+0x7d/0x90 [ 9990.762654] remove_vm_area+0x74/0x80 [ 9990.762656] __vunmap+0x3b/0xc0 [ 9990.762657] free_work+0x25/0x40 [ 9990.762660] process_one_work+0x15e/0x3f0 [ 9990.762662] worker_thread+0x4a/0x440 [ 9990.762664] kthread+0x105/0x140 [ 9990.762666] ? process_one_work+0x3f0/0x3f0 [ 9990.762668] ? kthread_destroy_worker+0x50/0x50 [ 9990.762670] ret_from_fork+0x22/0x40
Odd. Do you see this on Linus's tree?
Not tested, but I see it in v4.17.19 and in v4.18.6-rc2. Turns out it is related to heavy load, not to suspend/resume. At this point I suspect that it may be an AMD/Ryzen specific problem - it looks like it disappears if I add "kernel.randomize_va_space = 0" to /etc/sysctl.conf. No idea if it is a CPU bug or some AMD specific code problem. I'll try to analyze it further.
Either case, it is not a concern for the current release since it affects other kernel versions.
Guenter
On Wed, Sep 5, 2018 at 8:34 AM Guenter Roeck linux@roeck-us.net wrote:
On 09/05/2018 02:01 AM, Greg Kroah-Hartman wrote:
[ 9990.754641] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:155] [ 9990.762601] RIP: 0010:smp_call_function_many+0x208/0x270 [ 9990.762601] Code: e8 0d d1 77 00 3b 05 cb f0 24 01 0f 83 86 fe ff ff 48 63 d0 49 8b 0c 24 48 03 0c d5 00 f7 11 a7 8b 51 18 83 e2 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c7 0f b6 4d d0 4c 89 f2 4c 89 ee 44 89
It's stuck in this loop:
loop: pause mov 0x18(%rcx),%edx and $0x1,%edx jne loop
which is csd_lock_wait().
Judging by the offset in smp_call_function_many(), it's the final one (there's two: the other one is part of "csd_lock()"). But that's just a guess.
Anyway, it means that we're waiting for another CPU to finish processing an IPI - either a previous one we sent asynchronously (if it's the earlier csd_lock() case) or the TLB IPI we just sent and we're waiting for completion of.
Not tested, but I see it in v4.17.19 and in v4.18.6-rc2. Turns out it is related to heavy load, not to suspend/resume. At this point I suspect that it may be an AMD/Ryzen specific problem - it looks like it disappears if I add "kernel.randomize_va_space = 0" to /etc/sysctl.conf. No idea if it is a CPU bug or some AMD specific code problem. I'll try to analyze it further.
Ouch. Some IPI sending/receiving problem would be very very painful to debug if it's hw related.
Linus
On 09/05/2018 10:01 AM, Linus Torvalds wrote:
On Wed, Sep 5, 2018 at 8:34 AM Guenter Roeck linux@roeck-us.net wrote:
On 09/05/2018 02:01 AM, Greg Kroah-Hartman wrote:
[ 9990.754641] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [kworker/5:1:155] [ 9990.762601] RIP: 0010:smp_call_function_many+0x208/0x270 [ 9990.762601] Code: e8 0d d1 77 00 3b 05 cb f0 24 01 0f 83 86 fe ff ff 48 63 d0 49 8b 0c 24 48 03 0c d5 00 f7 11 a7 8b 51 18 83 e2 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c7 0f b6 4d d0 4c 89 f2 4c 89 ee 44 89
It's stuck in this loop:
loop: pause mov 0x18(%rcx),%edx and $0x1,%edx jne loop
which is csd_lock_wait().
Judging by the offset in smp_call_function_many(), it's the final one (there's two: the other one is part of "csd_lock()"). But that's just a guess.
Anyway, it means that we're waiting for another CPU to finish processing an IPI - either a previous one we sent asynchronously (if it's the earlier csd_lock() case) or the TLB IPI we just sent and we're waiting for completion of.
Not tested, but I see it in v4.17.19 and in v4.18.6-rc2. Turns out it is related to heavy load, not to suspend/resume. At this point I suspect that it may be an AMD/Ryzen specific problem - it looks like it disappears if I add "kernel.randomize_va_space = 0" to /etc/sysctl.conf. No idea if it is a CPU bug or some AMD specific code problem. I'll try to analyze it further.
Ouch. Some IPI sending/receiving problem would be very very painful to debug if it's hw related.
Turns out this is a well known problem with Ryzen CPUs:
https://bugzilla.kernel.org/show_bug.cgi?id=196683
Guenter
On Mon, Sep 03, 2018 at 06:55:44PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Sep 5 16:56:53 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.18.y and the diffstat can be found below.
I have released -rc2 to fix a reported problem: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.6-rc2....
On 09/04/2018 01:32 PM, Greg Kroah-Hartman wrote:
On Mon, Sep 03, 2018 at 06:55:44PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Sep 5 16:56:53 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.18.y and the diffstat can be found below.
I have released -rc2 to fix a reported problem: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.6-rc2....
It hasn't shown up on kernel.org yet. Found patch-4.14.68-rc2.gz
thanks, -- Shuah
On 5 September 2018 at 01:02, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Sep 03, 2018 at 06:55:44PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Sep 5 16:56:53 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.18.y and the diffstat can be found below.
I have released -rc2 to fix a reported problem: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.6-rc2....
I get to see 4.18.6-rc1 not rc2. With the current results on given commit id are looking good.
Results from Linaro’s test farm. No regressions on arm64, arm and x86_64.
Summary ------------------------------------------------------------------------
kernel: 4.18.6-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.18.y git commit: a6a229cf7e7f147eb6d118815a01758749fa6e8d git describe: v4.18.5-124-ga6a229cf7e7f Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.18-oe/build/v4.18.5-124...
No regressions (compared to build v4.18.5)
Ran 21181 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - arm64 - hi6220-hikey - arm64 - i386 - juno-r2 - arm64 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - arm - x86_64
Test Suites ----------- * boot * kselftest * libhugetlbfs * ltp-cap_bounds-tests * ltp-containers-tests * ltp-cve-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests * prep-inline * ltp-open-posix-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
On Wed, Sep 05, 2018 at 04:08:35PM +0530, Naresh Kamboju wrote:
On 5 September 2018 at 01:02, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Sep 03, 2018 at 06:55:44PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Sep 5 16:56:53 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.18.y and the diffstat can be found below.
I have released -rc2 to fix a reported problem: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.6-rc2....
I get to see 4.18.6-rc1 not rc2.
Odd. Something is up with the kernel.org mirroring right now. Let's wait for people to wake up to look into it...
With the current results on given commit id are looking good.
Wonderful, thanks for testing and letting me know.
greg k-h
On 09/05/2018 03:43 AM, Greg Kroah-Hartman wrote:
On Wed, Sep 05, 2018 at 04:08:35PM +0530, Naresh Kamboju wrote:
On 5 September 2018 at 01:02, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Sep 03, 2018 at 06:55:44PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Sep 5 16:56:53 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.18.y and the diffstat can be found below.
I have released -rc2 to fix a reported problem: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.6-rc2....
I get to see 4.18.6-rc1 not rc2.
Odd. Something is up with the kernel.org mirroring right now. Let's wait for people to wake up to look into it...
Same here (rc1 vs. rc2). The necessary added patch was there, so I figured you did not update the release number and ignored it.
Guenter
With the current results on given commit id are looking good.
Wonderful, thanks for testing and letting me know.
greg k-h
On Wed, Sep 05, 2018 at 04:08:35PM +0530, Naresh Kamboju wrote:
On 5 September 2018 at 01:02, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Sep 03, 2018 at 06:55:44PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Sep 5 16:56:53 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.18.y and the diffstat can be found below.
I have released -rc2 to fix a reported problem: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.6-rc2....
I get to see 4.18.6-rc1 not rc2. With the current results on given commit id are looking good.
Results from Linaro’s test farm. No regressions on arm64, arm and x86_64.
You may have noticed i386 listed below under Environments. This was added last week, and runs functional testing on an i386 kernel under QEMU emulation, as well as on x86_64 server hardware. This also accounts for our total test count (unique tests * test environments) surpassing 20,000.
We'll therefore be updating the email header to read:
No regressions on arm64, arm, x86_64, and i386.
Thanks, Dan
Summary
kernel: 4.18.6-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.18.y git commit: a6a229cf7e7f147eb6d118815a01758749fa6e8d git describe: v4.18.5-124-ga6a229cf7e7f Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.18-oe/build/v4.18.5-124...
No regressions (compared to build v4.18.5)
Ran 21181 total tests in the following environments and test suites.
Environments
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64
Test Suites
- boot
- kselftest
- libhugetlbfs
- ltp-cap_bounds-tests
- ltp-containers-tests
- ltp-cve-tests
- ltp-fcntl-locktests-tests
- ltp-filecaps-tests
- ltp-fs-tests > * ltp-fs_bind-tests > * ltp-fs_perms_simple-tests
- ltp-fsx-tests
- ltp-hugetlb-tests
- ltp-io-tests
- ltp-ipc-tests
- ltp-math-tests
- ltp-nptl-tests
- ltp-pty-tests
- ltp-sched-tests
- ltp-securebits-tests
- ltp-syscalls-tests
- ltp-timers-tests
- prep-inline
- ltp-open-posix-tests
- kselftest-vsyscall-mode-native
- kselftest-vsyscall-mode-none
-- Linaro LKFT https://lkft.linaro.org
On 09/03/2018 09:55 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Sep 5 16:56:53 UTC 2018. Anything received after that time might be too late.
Build results: total: 136 pass: 136 fail: 0 Qemu test results: total: 314 pass: 314 fail: 0
Details are available at https://kerneltests.org/builders/.
Guenter
On Tue, Sep 04, 2018 at 03:53:32PM -0700, Guenter Roeck wrote:
On 09/03/2018 09:55 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.6 release. There are 123 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Sep 5 16:56:53 UTC 2018. Anything received after that time might be too late.
Build results: total: 136 pass: 136 fail: 0 Qemu test results: total: 314 pass: 314 fail: 0
Details are available at https://kerneltests.org/builders/.
Thanks for testing all of these and letting me know.
greg k-h
linux-stable-mirror@lists.linaro.org