This is the start of the stable review cycle for the 5.2.3 release. There are 413 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Jul 2019 07:13:35 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.2.3-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.2.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.2.3-rc1
Junxiao Bi junxiao.bi@oracle.com dm bufio: fix deadlock with loop device
Mike Snitzer snitzer@redhat.com dm thin metadata: check if in fail_io mode when setting needs_check
Bjorn Andersson bjorn.andersson@linaro.org phy: qcom-qmp: Correct READY_STATUS poll break condition
Norbert Manthey nmanthey@amazon.de pstore: Fix double-free in pstore_mkfile() failure path
Josua Mayer josua@solid-run.com dt-bindings: allow up to four clocks for orion-mdio
Josua Mayer josua@solid-run.com net: mvmdio: allow up to four clocks to be specified for orion-mdio
Tejun Heo tj@kernel.org blkcg: update blkcg_print_stat() to handle larger outputs
Tejun Heo tj@kernel.org blk-iolatency: clear use_delay when io.latency is set to zero
Peng Fan peng.fan@nxp.com clk: imx: imx8mm: correct audio_pll2_clk to audio_pll2_out
Konstantin Khlebnikov khlebnikov@yandex-team.ru blk-throttle: fix zero wait time for iops throttled group
Lee, Chiasheng chiasheng.lee@intel.com usb: Handle USB3 remote wakeup for LPM enabled devices correctly
Matthew Wilcox (Oracle) willy@infradead.org dax: Fix missed wakeup with PMD faults
Szymon Janc szymon.janc@codecoup.pl Bluetooth: Add SMP workaround Microsoft Surface Precision Mouse bug
Alexander Shishkin alexander.shishkin@linux.intel.com intel_th: msu: Fix single mode with disabled IOMMU
YueHaibing yuehaibing@huawei.com intel_th: msu: Remove set but not used variable 'last'
liaoweixiong liaoweixiong@allwinnertech.com mtd: spinand: read returns badly if the last page has bitflips
Xiaolei Li xiaolei.li@mediatek.com mtd: rawnand: mtk: Correct low level time calculation of r/w cycle
Dan Carpenter dan.carpenter@oracle.com eCryptfs: fix a couple type promotion bugs
Jorge Ramirez-Ortiz jorge.ramirez-ortiz@linaro.org mmc: sdhci-msm: fix mutex while in spinlock
Nathan Lynch nathanl@linux.ibm.com powerpc/pseries: Fix oops in hotplug memory notifier
Greg Kurz groug@kaod.org powerpc/pseries: Fix xive=off command line
Alexey Kardashevskiy aik@ozlabs.ru powerpc/powernv: Fix stale iommu table base after VFIO
Athira Rajeev atrajeev@linux.vnet.ibm.com powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state.
Greg Kurz groug@kaod.org powerpc/powernv/npu: Fix reference leak
Ravi Bangoria ravi.bangoria@linux.ibm.com powerpc/watchpoint: Restore NV GPRs while returning from exception
Andreas Schwab schwab@linux-m68k.org powerpc/mm/32s: fix condition that is always true
Christophe Leroy christophe.leroy@c-s.fr powerpc/32s: fix suspend/resume when IBATs 4-7 are used
Helge Deller deller@gmx.de parisc: Fix kernel panic due invalid values in IAOQ0 or IAOQ1
Helge Deller deller@gmx.de parisc: Avoid kernel panic triggered by invalid kprobe
Helge Deller deller@gmx.de parisc: Ensure userspace privilege for ptraced processes in regset functions
Steve Longerbeam slongerbeam@gmail.com gpu: ipu-v3: ipu-ic: Fix saturation bit offset in TPMEM
Nadav Amit namit@vmware.com resource: fix locking in find_next_iomem_res()
Drew Davenport ddavenport@chromium.org include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures
Jan Harkes jaharkes@cs.cmu.edu coda: pass the host file in vma->vm_file on mmap
Henry Burns henryburns@google.com mm/z3fold.c: lock z3fold page before __SetPageMovable()
Yafang Shao laoar.shao@gmail.com mm/memcontrol: fix wrong statistics in memory.stat
Dan Williams dan.j.williams@intel.com libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields
Aneesh Kumar K.V aneesh.kumar@linux.ibm.com mm/nvdimm: add is_ioremap_addr and use that to check ioremap address
Kuo-Hsin Yang vovoy@chromium.org mm: vmscan: scan anonymous pages on file refaults
Aaron Armstrong Skomra skomra@gmail.com HID: wacom: correct touch resolution x/y typo
Aaron Armstrong Skomra skomra@gmail.com HID: wacom: generic: Correct pad syncing
Aaron Armstrong Skomra skomra@gmail.com HID: wacom: generic: only switch the mode on devices with LEDs
Danit Goldberg danitg@mellanox.com IB/mlx5: Report correctly tag matching rendezvous capability
Filipe Manana fdmanana@suse.com Btrfs: add missing inode version, ctime and mtime updates when punching hole
Filipe Manana fdmanana@suse.com Btrfs: fix fsync not persisting dentry deletions due to inode evictions
Filipe Manana fdmanana@suse.com Btrfs: fix data loss after inode eviction, renaming it, and fsync it
Johannes Thumshirn jthumshirn@suse.de btrfs: correctly validate compression type
Niklas Cassel niklas.cassel@linaro.org PCI: qcom: Ensure that PERST is asserted for at least 100 ms
Mika Westerberg mika.westerberg@linux.intel.com PCI: Do not poll for PME if the device is in D3cold
Dexuan Cui decui@microsoft.com PCI: hv: Fix a use-after-free bug in hv_eject_device_work()
Alexander Shishkin alexander.shishkin@linux.intel.com intel_th: pci: Add Ice Lake NNPI support
Jason Gunthorpe jgg@ziepe.ca RDMA/odp: Fix missed unlock in non-blocking invalidate_start
Bart Van Assche bvanassche@acm.org RDMA/srp: Accept again source addresses that do not have a port number
Damien Le Moal damien.lemoal@wdc.com block: Fix potential overflow in blk_report_zones()
Damien Le Moal damien.lemoal@wdc.com block: Allow mapping of vmalloc-ed buffers
Andres Rodriguez andresx7@gmail.com drm/edid: parse CEA blocks embedded in DisplayID
Eiichi Tsukata devel@etsukata.com x86/stacktrace: Prevent infinite loop in arch_stack_walk_user()
Kim Phillips kim.phillips@amd.com perf/x86/amd/uncore: Set the thread mask for F17h L3 PMCs
Kim Phillips kim.phillips@amd.com perf/x86/amd/uncore: Do not set 'ThreadMask' and 'SliceMask' for non-L3 PMCs
Kan Liang kan.liang@linux.intel.com perf/x86/intel: Fix spurious NMI on fixed counter
David Rientjes rientjes@google.com x86/boot: Fix memory leak in default_get_smp_config()
Dexuan Cui decui@microsoft.com x86/hyper-v: Zero out the VP ASSIST PAGE on allocation
Soeren Moch smoch@web.de rt2x00usb: fix rx queue hang
YueHaibing yuehaibing@huawei.com 9p/virtio: Add cleanup path in p9_virtio_init
YueHaibing yuehaibing@huawei.com 9p/xen: Add cleanup path in p9_trans_xen_init
Juergen Gross jgross@suse.com xen/events: fix binding user event channels to cpus
Damien Le Moal damien.lemoal@wdc.com dm zoned: fix zone state management race
Daniel Jordan daniel.m.jordan@oracle.com padata: use smp_mb in padata_reorder to avoid orphaned padata jobs
Lyude Paul lyude@redhat.com drm/nouveau/i2c: Enable i2c pads & busses during preinit
Linus Walleij linus.walleij@linaro.org ARM: dts: gemini: Set DIR-685 SPI CS as active low
Vitor Soares Vitor.Soares@synopsys.com i3c: fix i2c and i3c scl rate by bus mode
Radoslaw Burny rburny@google.com fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.
Eric W. Biederman ebiederm@xmission.com signal: Correct namespace fixups of si_pid and si_uid
Eric W. Biederman ebiederm@xmission.com signal/usb: Replace kill_pid_info_as_cred with kill_pid_usb_asyncio
Shaokun Zhang zhangshaokun@hisilicon.com intel_th: msu: Fix unused variable warning on arm64 platform
Julien Thierry julien.thierry@arm.com arm64: Fix incorrect irqflag restore for priority masking
Julien Thierry julien.thierry@arm.com arm64: irqflags: Add condition flags to inline asm clobber list
Jon Hunter jonathanh@nvidia.com arm64: tegra: Fix AGIC register range
Like Xu like.xu@linux.intel.com KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed
Michael Neuling mikey@neuling.org KVM: PPC: Book3S HV: Fix CR0 setting in TM emulation
Suraj Jitindar Singh sjitindarsingh@gmail.com KVM: PPC: Book3S HV: Clear pending decrementer exceptions on nested guest entry
Suraj Jitindar Singh sjitindarsingh@gmail.com KVM: PPC: Book3S HV: Signed extend decrementer value if not using large decrementer
KarimAllah Ahmed karahmed@amazon.de KVM: Properly check if "page" is valid in kvm_vcpu_unmap
Wanpeng Li wanpengli@tencent.com KVM: VMX: check CPUID before allowing read/write of IA32_XSS
Sean Christopherson sean.j.christopherson@intel.com KVM: VMX: Fix handling of #MC that occurs during VM-Entry
Sean Christopherson sean.j.christopherson@intel.com KVM: nVMX: Always sync GUEST_BNDCFGS when it comes from vmcs01
Sean Christopherson sean.j.christopherson@intel.com KVM: VMX: Always signal #GP on WRMSR to MSR_IA32_CR_PAT with bad value
Sean Christopherson sean.j.christopherson@intel.com KVM: nVMX: Don't dump VMCS if virtual APIC page can't be mapped
Sakari Ailus sakari.ailus@linux.intel.com media: videobuf2-dma-sg: Prevent size from overflowing
Sakari Ailus sakari.ailus@linux.intel.com media: videobuf2-core: Prevent size alignment wrapping buffer size to 0
Ezequiel Garcia ezequiel@collabora.com media: coda: Remove unbalanced and unneeded mutex unlock
Boris Brezillon boris.brezillon@collabora.com media: v4l2: Test type instead of cfg->type in v4l2_ctrl_new_custom()
Yan, Zheng zyan@redhat.com ceph: use ceph_evict_inode to cleanup inode's resource
Luis Henriques lhenriques@suse.com ceph: fix end offset in truncate_inode_pages_range call
Takashi Iwai tiwai@suse.de ALSA: hda/hdmi - Fix i915 reverse port/pin mapping
Takashi Iwai tiwai@suse.de ALSA: hda/hdmi - Remove duplicated define
Hui Wang hui.wang@canonical.com ALSA: hda/realtek: apply ALC891 headset fixup to one Dell machine
Kailang Yang kailang@realtek.com ALSA: hda/realtek - Fixed Headphone Mic can't record on Dell platform
Takashi Iwai tiwai@suse.de ALSA: hda - Don't resume forcibly i915 HDMI/DP codec
Takashi Iwai tiwai@suse.de ALSA: seq: Break too long mutex context in the write loop
Masahiro Yamada yamada.masahiro@socionext.com kconfig: fix missing choice values in auto.conf
Xiao Ni xni@redhat.com raid5-cache: Need to do start() part job after adding journal device
Mark Brown broonie@kernel.org ASoC: core: Adapt for debugfs API change
Mark Brown broonie@kernel.org ASoC: dapm: Adapt for debugfs API change
Christophe Leroy christophe.leroy@c-s.fr lib/scatterlist: Fix mapping iterator when sg->offset is greater than PAGE_SIZE
Trond Myklebust trond.myklebust@hammerspace.com SUNRPC: Ensure the bvecs are reset when we re-encode the RPC request
Trond Myklebust trond.myklebust@hammerspace.com pnfs: Fix a problem where we gratuitously start doing I/O through the MDS
Trond Myklebust trond.myklebust@hammerspace.com pnfs/flexfiles: Fix PTR_ERR() dereferences in ff_layout_track_ds_error
Max Kellermann mk@cm4all.com Revert "NFS: readdirplus optimization by cache mechanism" (memleak)
Trond Myklebust trond.myklebust@hammerspace.com NFSv4: Handle the special Linux file open access mode
Eiichi Tsukata devel@etsukata.com tracing: Fix user stack trace "??" output
Julien Thierry julien.thierry@arm.com arm64: Fix interrupt tracing in the presence of NMIs
Dmitry Osipenko digetx@gmail.com opp: Don't use IS_ERR on invalid supplies
Emmanuel Grumbach emmanuel.grumbach@intel.com iwlwifi: mvm: clear rfkill_safe_init_done when we start the firmware
Johannes Berg johannes.berg@intel.com iwlwifi: mvm: delay GTK setting in FW in AP mode
Emmanuel Grumbach emmanuel.grumbach@intel.com iwlwifi: fix RF-Kill interrupt while FW load for gen2 devices
Emmanuel Grumbach emmanuel.grumbach@intel.com iwlwifi: don't WARN when calling iwl_get_shared_mem_conf with RF-Kill
Emmanuel Grumbach emmanuel.grumbach@intel.com iwlwifi: pcie: fix ALIVE interrupt handling for gen2 devices w/o MSI-X
Emmanuel Grumbach emmanuel.grumbach@intel.com iwlwifi: pcie: don't service an interrupt that was masked
Oren Givon oren.givon@intel.com iwlwifi: add support for hr1 RF ID
Jon Hunter jonathanh@nvidia.com arm64: tegra: Fix Jetson Nano GPU regulator
Jon Hunter jonathanh@nvidia.com arm64: tegra: Update Jetson TX1 GPU regulator timings
Krzysztof Kozlowski krzk@kernel.org regulator: s2mps11: Fix buck7 and buck8 wrong voltages
Krzysztof Kozlowski krzk@kernel.org regulator: s2mps11: Fix ERR_PTR dereference on GPIO lookup failure
Hui Wang hui.wang@canonical.com Input: alps - fix a mismatch between a condition check and its comment
Nick Black dankamongmen@gmail.com Input: synaptics - whitelist Lenovo T580 SMBus intertouch
Hui Wang hui.wang@canonical.com Input: alps - don't handle ALPS cs19 trackpoint-only device
Grant Hernandez granthernandez@google.com Input: gtco - bounds check collection indent level
Coly Li colyli@suse.de bcache: destroy dc->writeback_write_wq if failed to create dc->writeback_thread
Coly Li colyli@suse.de bcache: fix mistaken sysfs entry for io_error counter
Coly Li colyli@suse.de bcache: ignore read-ahead request failure on backing device
Coly Li colyli@suse.de bcache: Revert "bcache: free heap cache_set->flush_btree in bch_journal_free"
Coly Li colyli@suse.de bcache: Revert "bcache: fix high CPU occupancy during journal"
Coly Li colyli@suse.de Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()"
Aurelien Aptel aaptel@suse.com CIFS: fix deadlock in cached root handling
Ronnie Sahlberg lsahlber@redhat.com cifs: flush before set-info if we have writeable handles
Paulo Alcantara (SUSE) paulo@paulo.ac cifs: Properly handle auto disabling of serverino option
Ronnie Sahlberg lsahlber@redhat.com cifs: fix crash in smb2_compound_op()/smb2_set_next_command()
Ronnie Sahlberg lsahlber@redhat.com cifs: always add credits back for unsolicited PDUs
Wen Yang wen.yang99@zte.com.cn crypto: crypto4xx - fix a potential double free in ppc4xx_trng_probe
Cfir Cohen cfir@google.com crypto: ccp/gcm - use const time tag comparison.
Hook, Gary Gary.Hook@amd.com crypto: ccp - memset structure fields to zero before reuse
Christian Lamparter chunkeey@gmail.com crypto: crypto4xx - block ciphers should only accept complete blocks
Christian Lamparter chunkeey@gmail.com crypto: crypto4xx - fix blocksize for cfb and ofb
Christian Lamparter chunkeey@gmail.com crypto: crypto4xx - fix AES CTR blocksize value
Eric Biggers ebiggers@google.com crypto: chacha20poly1305 - fix atomic sleep when using async algorithm
Elena Petrova lenaptr@google.com crypto: arm64/sha2-ce - correct digest for empty data in finup
Elena Petrova lenaptr@google.com crypto: arm64/sha1-ce - correct digest for empty data in finup
Hook, Gary Gary.Hook@amd.com crypto: ccp - Validate the the error value used to index error messages
Ard Biesheuvel ard.biesheuvel@linaro.org crypto: caam - limit output IV to CBC to work around CTR mode DMA issue
Eric Biggers ebiggers@google.com crypto: ghash - fix unaligned memory access in ghash_setkey()
Finn Thain fthain@telegraphics.com.au scsi: mac_scsi: Fix pseudo DMA implementation, take 2
Finn Thain fthain@telegraphics.com.au scsi: mac_scsi: Increase PIO/PDMA transfer length threshold
Shivasharan S shivasharan.srikanteshwara@broadcom.com scsi: megaraid_sas: Fix calculation of target ID
Benjamin Block bblock@linux.ibm.com scsi: zfcp: fix request object use-after-free in send path causing wrong traces
Benjamin Block bblock@linux.ibm.com scsi: zfcp: fix request object use-after-free in send path causing seqno errors
Damien Le Moal damien.lemoal@wdc.com scsi: sd_zbc: Fix compilation warning
Ming Lei ming.lei@redhat.com scsi: core: Fix race on creating sense cache
Finn Thain fthain@telegraphics.com.au Revert "scsi: ncr5380: Increase register polling limit"
Finn Thain fthain@telegraphics.com.au scsi: NCR5380: Handle PDMA failure reliably
Finn Thain fthain@telegraphics.com.au scsi: NCR5380: Always re-enable reselection interrupt
Juergen Gross jgross@suse.com xen: let alloc_xenballooned_pages() fail if not enough memory free
Denis Efremov efremov@ispras.ru floppy: fix out-of-bounds read in copy_buffer
Denis Efremov efremov@ispras.ru floppy: fix invalid pointer dereference in drive_name
Denis Efremov efremov@ispras.ru floppy: fix out-of-bounds read in next_valid_format
Denis Efremov efremov@ispras.ru floppy: fix div-by-zero in setup_format_params
Andrii Nakryiko andriin@fb.com libbpf: fix another GCC8 warning for strncpy
Dennis Zhou dennis@kernel.org blk-iolatency: fix STS_AGAIN handling
Colin Ian King colin.king@canonical.com iavf: fix dereference of null rx_buffer pointer
Huazhong Tan tanhuazhong@huawei.com net: hns3: fix __QUEUE_STATE_STACK_XOFF not cleared issue
Josua Mayer josua@solid-run.com net: mvmdio: defer probe of orion-mdio if a clock is not ready
Ilya Maximets i.maximets@samsung.com xdp: fix race on generic receive path
Taehee Yoo ap420073@gmail.com gtp: fix use-after-free in gtp_newlink()
Taehee Yoo ap420073@gmail.com gtp: fix use-after-free in gtp_encap_destroy()
Taehee Yoo ap420073@gmail.com gtp: fix Illegal context switch in RCU read-side critical section.
Taehee Yoo ap420073@gmail.com gtp: fix suspicious RCU usage
csonsino csonsino@gmail.com Bluetooth: validate BLE connection interval updates
Taehee Yoo ap420073@gmail.com gtp: add missing gtp_encap_disable_sock() in gtp_encap_enable()
Dan Carpenter dan.carpenter@oracle.com Bluetooth: hidp: NUL terminate a string in the compat ioctl
Matias Karhumaa matias.karhumaa@gmail.com Bluetooth: Check state in l2cap_disconnect_rsp
Seeteena Thoufeek s1seetee@linux.vnet.ibm.com perf tests: Fix record+probe_libc_inet_pton.sh for powerpc64
Shijith Thotton sthotton@marvell.com genirq: Update irq stats from NMI handlers
Josua Mayer josua.mayer@jm0.eu Bluetooth: 6lowpan: search for destination address in all peers
João Paulo Rechi Vita jprvita@gmail.com Bluetooth: Add new 13d3:3501 QCA_ROME device
João Paulo Rechi Vita jprvita@gmail.com Bluetooth: Add new 13d3:3491 QCA_ROME device
Tomas Bortoli tomasbortoli@gmail.com Bluetooth: hci_bcsp: Fix memory leak in rx_skb
Jian Shen shenjian15@huawei.com net: hns3: fix port capbility updating issue
Jian Shen shenjian15@huawei.com net: hns3: enable broadcast promisc mode when initializing VF
Jiri Olsa jolsa@redhat.com tools: bpftool: Fix json dump crash on powerpc
Wen Yang wen.yang99@zte.com.cn ASoC: audio-graph-card: fix use-after-free in graph_for_each_link
Jean-Philippe Brucker jean-philippe.brucker@arm.com iommu/arm-smmu-v3: Invalidate ATC when detaching a device
Geert Uytterhoeven geert+renesas@glider.be gpiolib: Fix references to gpiod_[gs]et_*value_cansleep() variants
Cong Wang xiyou.wangcong@gmail.com bonding: validate ip header before check IPPROTO_IGMP
Jiri Benc jbenc@redhat.com selftests: bpf: fix inlines in test_lwt_seg6local
Leo Yan leo.yan@linaro.org bpf, libbpf, smatch: Fix potential NULL pointer dereference
Andrii Nakryiko andriin@fb.com libbpf: fix GCC8 warning for strncpy
David Howells dhowells@redhat.com rxrpc: Fix oops in tracepoint
Phong Tran tranmanphong@gmail.com net: usb: asix: init MAC address buffers
Guilherme G. Piccoli gpiccoli@canonical.com bnx2x: Prevent ptp_task to be rescheduled indefinitely
Taehee Yoo ap420073@gmail.com vxlan: do not destroy fdb if register_netdevice() is failed
Andi Kleen ak@linux.intel.com perf stat: Fix group lookup for metric group
Andi Kleen ak@linux.intel.com perf stat: Don't merge events in the same PMU
Andi Kleen ak@linux.intel.com perf stat: Fix metrics with --no-merge
Andi Kleen ak@linux.intel.com perf stat: Make metric event lookup more robust
Rander Wang rander.wang@linux.intel.com ALSA: hda: Fix a headphone detection issue when using SOF
Michael Chan michael.chan@broadcom.com bnxt_en: Cap the returned MSIX vectors to the RDMA driver.
Michael Chan michael.chan@broadcom.com bnxt_en: Fix statistics context reservation logic for RDMA driver.
Michael Chan michael.chan@broadcom.com bnxt_en: Disable bus master during PCI shutdown and driver unload.
Shahar S Matityahu shahar.s.matityahu@intel.com iwlwifi: dbg: fix debug monitor stop and restart delays
He Zhe zhe.he@windriver.com netfilter: Fix remainder of pseudo-header protocol 0
Baruch Siach baruch@tkos.co.il bpf: fix uapi bpf_prog_info fields alignment
Andrei Otcheretianski andrei.otcheretianski@intel.com iwlwifi: mvm: Drop large non sta frames
Dann Frazier dann.frazier@canonical.com ixgbe: Avoid NULL pointer dereference with VF on non-IPsec hw
Marek Vasut marex@denx.de net: ethernet: ti: cpsw: Assign OF node to slave devices
Yonglong Liu liuyonglong@huawei.com net: hns3: add Asym Pause support to fix autoneg problem
Vedang Patel vedang.patel@intel.com igb: clear out skb->tstamp after reading the txtime
Maxime Chevallier maxime.chevallier@bootlin.com net: mvpp2: prs: Don't override the sign bit in SRAM parser shift
Wen Gong wgong@codeaurora.org ath10k: destroy sdio workqueue while remove sdio module
Dundi Raviteja dundi@codeaurora.org ath10k: Fix memory leak in qmi
Yunsheng Lin linyunsheng@huawei.com net: hns3: add some error checking in hclge_tm module
Yonglong Liu liuyonglong@huawei.com net: hns3: fix a -Wformat-nonliteral compile warning
Coly Li colyli@suse.de bcache: fix potential deadlock in cached_def_free()
Coly Li colyli@suse.de bcache: avoid a deadlock in bcache_reboot()
Coly Li colyli@suse.de bcache: check c->gc_thread by IS_ERR_OR_NULL in cache_set_flush()
Coly Li colyli@suse.de bcache: acquire bch_register_lock later in cached_dev_free()
Coly Li colyli@suse.de bcache: check CACHE_SET_IO_DISABLE bit in bch_journal()
Coly Li colyli@suse.de bcache: check CACHE_SET_IO_DISABLE in allocator code
Coly Li colyli@suse.de bcache: fix return value error in bch_journal_read()
Maxim Mikityanskiy maximmi@mellanox.com net/mlx5e: Attach/detach XDP program safely
Eiichi Tsukata devel@etsukata.com EDAC: Fix global-out-of-bounds write when setting edac_mc_poll_msec
Ahmad Masri amasri@codeaurora.org wil6210: drop old event after wmi_call timeout
Zefir Kurtisi zefir.kurtisi@neratec.com ath9k: correctly handle short radar pulses
Arnd Bergmann arnd@arndb.de crypto: asymmetric_keys - select CRYPTO_HASH where needed
Arnd Bergmann arnd@arndb.de crypto: serpent - mark __serpent_setkey_sbox noinline
Mauro S. M. Rodrigues maurosr@linux.vnet.ibm.com ixgbe: Check DDM existence in transceiver before access
Jianbo Liu jianbol@mellanox.com net/mlx5: Get vport ACL namespace by vport index
Jian Shen shenjian15@huawei.com net: hns3: restore the MAC autoneg state after reset
Waibel Georg Georg.Waibel@sensor-technik.de gpio: Fix return value mismatch of function gpiod_get_from_of_node()
Ferdinand Blomqvist ferdinand.blomqvist@gmail.com rslib: Fix handling of of caller provided syndrome
Jiong Wang jiong.wang@netronome.com bpf: fix BPF_ALU32 | BPF_ARSH on BE arches
Ferdinand Blomqvist ferdinand.blomqvist@gmail.com rslib: Fix decoding of shortened codes
Nathan Chancellor natechancellor@gmail.com xsk: Properly terminate assignment in xskq_produce_flush_desc
Felix Kaechele felix@kaechele.ca netfilter: ctnetlink: Fix regression in conntrack entry deletion
Marek Szyprowski m.szyprowski@samsung.com clocksource/drivers/exynos_mct: Increase priority over ARM arch timer
Dmitry Osipenko digetx@gmail.com clocksource/drivers/tegra: Restore base address before cleanup
Tejun Heo tj@kernel.org libata: don't request sense data on !ZAC ATA devices
Dmitry Osipenko digetx@gmail.com clocksource/drivers/tegra: Release all IRQ's on request_irq() error
Paolo Valente paolo.valente@linaro.org block, bfq: fix rq_in_driver check in bfq_update_inject_limit
Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com ASoC: Intel: hdac_hdmi: Set ops to NULL on remove
Kyle Meyer kyle.meyer@hpe.com perf tools: Increase MAX_NR_CPUS and MAX_CACHES
Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com ALSA: hdac: Fix codec name after machine driver is unloaded and reloaded
Miaoqing Pan miaoqing@codeaurora.org ath10k: fix PCIE device wake up failed
Miaoqing Pan miaoqing@codeaurora.org ath10k: fix fw crash by moving chip reset after napi disabled
Claire Chang tientzu@chromium.org ath10k: add missing error handling
Lorenzo Bianconi lorenzo@kernel.org mt76: mt7615: do not process rx packets if the device is not initialized
Julian Anastasov ja@ssi.bg ipvs: fix tinfo memory leak in start_sync_thread
Lorenzo Bianconi lorenzo@kernel.org mt7601u: fix possible memory leak when the device is disconnected
Masahiro Yamada yamada.masahiro@socionext.com x86/build: Add 'set -e' to mkcapflags.sh to delete broken capflags.c
Lorenzo Bianconi lorenzo@kernel.org mt7601u: do not schedule rx_tasklet when the device has been disconnected
Ping-Ke Shih pkshih@realtek.com rtlwifi: rtl8192cu: fix error handle when usb probe failed
Icenowy Zheng icenowy@aosc.io net: stmmac: sun8i: force select external PHY when no internal one
Hans Verkuil hverkuil@xs4all.nl media: hdpvr: fix locking and a missing msleep
André Almeida andrealmeid@collabora.com media: vimc: cap: check v4l2_fill_pixfmt return value
Philipp Zabel p.zabel@pengutronix.de media: coda: increment sequence offset for the last returned frame
Marco Felsch m.felsch@pengutronix.de media: coda: fix last buffer handling in V4L2_ENC_CMD_STOP
Philipp Zabel p.zabel@pengutronix.de media: coda: fix mpeg2 sequence number handling
Ard Biesheuvel ard.biesheuvel@linaro.org acpi/arm64: ignore 5.1 FADTs that are reported as 5.0
Kuninori Morimoto kuninori.morimoto.gx@renesas.com ASoC: soc-core: call snd_soc_unbind_card() under mutex_lock;
Robert Jarzmik robert.jarzmik@free.fr media: mt9m111: fix fw-node refactoring
Nathan Huckleberry nhuck@google.com timer_list: Guard procfs specific code
Miroslav Lichvar mlichvar@redhat.com ntp: Limit TAI-UTC offset
Anders Roxell anders.roxell@linaro.org media: i2c: fix warning same module names
Marek Szyprowski m.szyprowski@samsung.com media: s5p-mfc: Make additional clocks optional
Julian Anastasov ja@ssi.bg ipvs: defer hook registration to avoid leaks
Colin Ian King colin.king@canonical.com media: staging: davinci: fix memory leaks and check for allocation failure
Arnd Bergmann arnd@arndb.de ipsec: select crypto ciphers for xfrm_algo
Julien Thierry julien.thierry@arm.com arm64: Do not enable IRQs for ct_user_exit
Minwoo Im minwoo.im.dev@gmail.com nvme-pci: adjust irq max_vector using num_possible_cpus()
Geert Uytterhoeven geert@linux-m68k.org lightnvm: fix uninitialized pointer in nvm_remove_tgt()
Heiner Litz hlitz@ucsc.edu lightnvm: pblk: fix freeing of merged pages
Chaitanya Kulkarni chaitanya.kulkarni@wdc.com nvme-pci: set the errno on ctrl state change error
Minwoo Im minwoo.im.dev@gmail.com nvme-pci: properly report state change failure in nvme_reset_work
Anton Eidelman anton@lightbitslabs.com nvme: fix possible io failures when removing multipathed ns
Pan Bian bianpan2016@163.com EDAC/sysfs: Fix memory leak when creating a csrow object
Greg KH gregkh@linuxfoundation.org EDAC/sysfs: Drop device references properly
Tudor Ambarus tudor.ambarus@microchip.com spi: fix ctrl->num_chipselect constraint
Rafael J. Wysocki rafael.j.wysocki@intel.com ACPICA: Clear status of GPEs on first direct enable
Dennis Zhou dennis@kernel.org blk-iolatency: only account submitted bios
Qian Cai cai@lca.pw x86/cacheinfo: Fix a -Wtype-limits warning
Ilias Apalodimas ilias.apalodimas@linaro.org net: netsec: initialize tx ring on ndo_open
Mika Westerberg mika.westerberg@linux.intel.com PCI: Add missing link delays required by the PCIe spec
Arnaldo Carvalho de Melo acme@redhat.com perf build: Handle slang being in /usr/include and in /usr/include/slang/
Alexei Starovoitov ast@kernel.org bpf: fix callees pruning callers
Arnaldo Carvalho de Melo acme@redhat.com tools build: Fix the zstd test in the test-all.c common case feature test
Nilkanth Ahirrao anilkanth@jp.adit-jv.com ASoC: rsnd: fixup mod ID calculation in rsnd_ctu_probe_
Denis Kirjanov kda@linux-powerpc.org ipoib: correcly show a VF hardware address
Mitch Williams mitch.a.williams@intel.com iavf: allow null RX descriptors
Jason Wang jasowang@redhat.com vhost_net: disable zerocopy by default
Arnaldo Carvalho de Melo acme@redhat.com perf evsel: Make perf_evsel__name() accept a NULL argument
Peter Zijlstra peterz@infradead.org x86/atomic: Fix smp_mb__{before,after}_atomic()
Geert Uytterhoeven geert@linux-m68k.org integrity: Fix __integrity_init_keyring() section mismatch
Kan Liang kan.liang@linux.intel.com perf/x86/intel/uncore: Handle invalid event coding for free-running counter
Jiri Olsa jolsa@redhat.com perf/x86/intel: Disable check_msr for real HW
Qian Cai cai@lca.pw sched/fair: Fix "runnable_avg_yN_inv" not used warnings
Kan Liang kan.liang@linux.intel.com perf/x86/intel: Add more Icelake CPUIDs
Gao Xiang gaoxiang25@huawei.com sched/core: Add __sched tag for io_schedule()
Nicolas Dichtel nicolas.dichtel@6wind.com xfrm: fix sa selector validation
Tejun Heo tj@kernel.org blkcg, writeback: dead memcgs shouldn't contribute to writeback ownership arbitration
Bob Liu bob.liu@oracle.com block: null_blk: fix race condition for null_del_dev
Yunsheng Lin linyunsheng@huawei.com net: hns3: delay ring buffer clearing during reset
Yunsheng Lin linyunsheng@huawei.com net: hns3: fix for skb leak when doing selftest
Yunsheng Lin linyunsheng@huawei.com net: hns3: fix for dereferencing before null checking
Michal Kalderon michal.kalderon@marvell.com qed: iWARP - Fix tc for MPA ll2 connection
Aaron Lewis aaronlewis@google.com x86/cpufeatures: Add FDP_EXCPTN_ONLY and ZERO_FCS_FDS
Rajneesh Bhardwaj rajneesh.bhardwaj@linux.intel.com perf/x86: Add Intel Ice Lake NNPI uncore support
Waiman Long longman@redhat.com rcu: Force inlining of rcu_read_lock()
Jerome Brunet jbrunet@baylibre.com ASoC: meson: axg-tdm: fix sample clock inversion
Rajneesh Bhardwaj rajneesh.bhardwaj@linux.intel.com x86/cpu: Add Ice Lake NNPI to Intel family
Eric Biggers ebiggers@google.com crypto: testmgr - add some more preemption points
Ondrej Mosnacek omosnace@redhat.com selinux: fix empty write to keycreate file
Marek Szyprowski m.szyprowski@samsung.com media: s5p-mfc: fix reading min scratch buffer size on MFC v6/v7
Valdis Kletnieks valdis.kletnieks@vt.edu bpf: silence warning messages in core
Young Xiao 92siuyang@gmail.com media: davinci: vpif_capture: fix memory leak in vpif_probe()
Tony Lindgren tony@atomide.com gpio: omap: Fix lost edge wake-up interrupts
Srinivas Kandagatla srinivas.kandagatla@linaro.org regmap: fix bulk writes on paged registers
Russell King rmk+kernel@armlinux.org.uk gpio: omap: ensure irq is enabled before wakeup
Russell King rmk+kernel@armlinux.org.uk gpio: omap: fix lack of irqstatus_raw0 for OMAP4
Eric Auger eric.auger@redhat.com iommu: Fix a leak in iommu_insert_resv_region
Kieran Bingham kieran.bingham+renesas@ideasonboard.com media: fdp1: Support M3N and E3 platforms
Oliver Neukum oneukum@suse.com media: uvcvideo: Fix access to uninitialized fields on probe error
Xingyu Chen xingyu.chen@amlogic.com irqchip/meson-gpio: Add support for Meson-G12A SoC
Hechao Li hechaol@fb.com selftests/bpf : clean up feature/ when make clean
Thomas Richter tmricht@linux.ibm.com perf report: Fix OOM error in TUI mode on s390
Thomas Richter tmricht@linux.ibm.com perf test 6: Fix missing kvm module load for s390
Mathieu Poirier mathieu.poirier@linaro.org perf cs-etm: Properly set the value of 'old' and 'head' in snapshot mode
Stefano Brivio sbrivio@redhat.com ipset: Fix memory accounting for hash types on resize
Aditya Pakki pakki001@umn.edu netfilter: ipset: fix a missing check of nla_parse
Robert Hancock hancock@sedsystems.ca net: sfp: add mutex to prevent concurrent state checks
Borislav Petkov bp@suse.de RAS/CEC: Fix pfn insertion
Julian Wiedmann jwi@linux.ibm.com s390/qdio: handle PENDING state for QEBSM devices
Robert Hancock hancock@sedsystems.ca net: axienet: Fix race condition causing TX hang
Fabio Estevam festevam@gmail.com net: fec: Do not use netdev messages too early
Antoine Tenart antoine.tenart@bootlin.com crypto: inside-secure - do not rely on the hardware last bit for result descriptors
Biao Huang biao.huang@mediatek.com net: stmmac: modify default value of tx-frames
Biao Huang biao.huang@mediatek.com net: stmmac: dwmac4: fix flow control issue
Jae Hyun Yoo jae.hyun.yoo@linux.intel.com media: aspeed: fix a kernel warning on clk control
Jae Hyun Yoo jae.hyun.yoo@linux.intel.com media: aspeed: change irq to threaded irq
Jiri Olsa jolsa@redhat.com perf jvmti: Address gcc string overflow warning for strncpy()
Fabio Estevam festevam@gmail.com media: imx7-mipi-csis: Propagate the error if clock enabling fails
Miles Chen miles.chen@mediatek.com arm64: mm: make CONFIG_ZONE_DMA32 configurable
Abhishek Goel huntbag@linux.vnet.ibm.com cpupower : frequency-set -r option misses the last cpu in related cpu list
Weihang Li liweihang@hisilicon.com net: hns3: set ops to null when unregister ad_dev
Weihang Li liweihang@hisilicon.com net: hns3: add a check to pointer in error_detected and slot_reset
Kefeng Wang wangkefeng.wang@huawei.com media: wl128x: Fix some error handling in fm_v4l2_init_video_device()
Neil Armstrong narmstrong@baylibre.com media: platform: ao-cec-g12a: disable regmap fast_io for cec bus regmap
Imre Deak imre.deak@intel.com locking/lockdep: Fix merging of hlocks with non-zero references
Imre Deak imre.deak@intel.com locking/lockdep: Fix OOO unlock when hlocks need merging
Sven Eckelmann sven@narfation.org batman-adv: Fix duplicated OGMs on NETDEV_UP
David S. Miller davem@davemloft.net tua6100: Avoid build warnings.
Christophe Leroy christophe.leroy@c-s.fr crypto: talitos - Align SEC1 accesses to 32 bits boundaries.
Christophe Leroy christophe.leroy@c-s.fr crypto: talitos - properly handle split ICV.
Vladimir Oltean olteanv@gmail.com net: dsa: sja1105: Fix broken fixed-link interfaces on user ports
Ioana Ciornei ioana.ciornei@nxp.com net: phy: Check against net_device being NULL
Shailendra Verma shailendra.v@samsung.com media: staging: media: davinci_vpfe: - Fix for memory leak if decoder initialization fails.
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: Intel: sof-rt5682: fix undefined references with Baytrail-only support
Kefeng Wang wangkefeng.wang@huawei.com media: saa7164: fix remove_proc_entry warning
Hans Verkuil hverkuil@xs4all.nl media: mc-device.c: don't memset __user pointer contents
Mitch Williams mitch.a.williams@intel.com ice: Check all VFs for MDD activity, don't disable
Arnaldo Carvalho de Melo acme@redhat.com perf annotate TUI browser: Do not use member from variable within its own initialization
Vandana BN bnvandana@gmail.com media: usb:zr364xx:Fix KASAN:null-ptr-deref Read in zr364xx_vidioc_querycap
Eric Biggers ebiggers@google.com fscrypt: clean up some BUG_ON()s in block encryption/decryption
sumitg sumitg@nvidia.com media: v4l2-core: fix use-after-free error
Kefeng Wang wangkefeng.wang@huawei.com media: vim2m: fix two double-free issues
Anirudh Gupta anirudhrudr@gmail.com xfrm: Fix xfrm sel prefix length validation
Jeremy Sowden jeremy@azazel.net af_key: fix leaks in key_pol_get_resp and dump_sp.
Eric W. Biederman ebiederm@xmission.com signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig
Eric W. Biederman ebiederm@xmission.com signal/pid_namespace: Fix reboot_pid_ns to use send_sig not force_sig
Michal Kalderon michal.kalderon@marvell.com qed: Set the doorbell address correctly
Jian Shen shenjian15@huawei.com net: hns3: fix for FEC configuration
Jian Shen shenjian15@huawei.com net: hns3: initialize CPU reverse mapping
Maxime Chevallier maxime.chevallier@bootlin.com net: mvpp2: cls: Extract the RSS context when parsing the ethtool rule
Brett Creeley brett.creeley@intel.com ice: Fix couple of issues in ice_vsi_release
Jose Abreu Jose.Abreu@synopsys.com net: stmmac: Prevent missing interrupts when running NAPI
Jose Abreu Jose.Abreu@synopsys.com net: stmmac: dwmac4/5: Clear unused address entries
Jose Abreu Jose.Abreu@synopsys.com net: stmmac: dwmac1000: Clear unused address entries
Horia Geantă horia.geanta@nxp.com crypto: caam - avoid S/G table fetching for AEAD zero-length output
Wen Yang wen.yang99@zte.com.cn media: venus: firmware: fix leaked of_node references
Brett Creeley brett.creeley@intel.com ice: Gracefully handle reset failure in ice_alloc_vfs()
Jungo Lin jungo.lin@mediatek.com media: media_device_enum_links32: clean a reserved field
Kangjie Lu kjlu@umn.edu media: vpss: fix a potential NULL pointer dereference
Alexei Starovoitov ast@kernel.org selftests/bpf: adjust verifier scale test
Lubomir Rintel lkundrak@v3.sk media: marvell-ccic: fix DMA s/g desc number calculation
Akinobu Mita akinobu.mita@gmail.com media: ov7740: avoid invalid framesize setting
Christophe Leroy christophe.leroy@c-s.fr crypto: talitos - fix skcipher failure due to wrong output IV
Daniel Gomez dagmcr@gmail.com media: spi: IR LED: add missing of table registration
Oliver Neukum oneukum@suse.com media: dvb: usb: fix use after free in dvb_usb_device_exit
Jeremy Sowden jeremy@azazel.net batman-adv: fix for leaked TVLV handler.
Daniel Baluta daniel.baluta@nxp.com regmap: debugfs: Fix memory leak in regmap_debugfs_init
Rakesh Pillai pillair@codeaurora.org ath10k: Fix encoding for protected management frames
Anilkumar Kolli akolli@codeaurora.org ath: DFS JP domain W56 fixed pulse type 3 RADAR detection
Maya Erez merez@codeaurora.org wil6210: fix spurious interrupts in 3-msi
Wen Gong wgong@codeaurora.org ath10k: add peer id check in ath10k_peer_find_by_id
Dan Carpenter dan.carpenter@oracle.com ath6kl: add some bounds checking
Maya Erez merez@codeaurora.org wil6210: fix missed MISC mbox interrupt
Surabhi Vishnoi svishnoi@codeaurora.org ath10k: Fix the wrong value of enums for wmi tlv stats id
Tim Schumacher timschumi@gmx.de ath9k: Check for errors when reading SREV register
Emil Renner Berthing kernel@esmil.dk spi: rockchip: turn down tx dma bursts
Surabhi Vishnoi svishnoi@codeaurora.org ath10k: Do not send probe response template for mesh
Gustavo A. R. Silva gustavo@embeddedor.com wil6210: fix potential out-of-bounds read
Toke Høiland-Jørgensen toke@redhat.com ath9k: Don't trust TX status TID number when reporting airtime
Pradeep kumar Chitrapu pradeepc@codeaurora.org ath10k: fix incorrect multicast/broadcast rate setting
Alagu Sankar alagusankar@silex-india.com ath10k: htt: don't use txdone_fifo with SDIO
Yingying Tang yintang@codeaurora.org ath10k: Check tx_stats before use it
-------------
Diffstat:
Documentation/atomic_t.txt | 3 + .../devicetree/bindings/net/marvell-orion-mdio.txt | 2 +- Documentation/scheduler/sched-pelt.c | 3 +- Makefile | 4 +- arch/arm/boot/dts/gemini-dlink-dir-685.dts | 2 +- arch/arm64/Kconfig | 3 +- arch/arm64/boot/dts/nvidia/tegra210-p2180.dtsi | 3 +- arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts | 17 +- arch/arm64/boot/dts/nvidia/tegra210.dtsi | 2 +- arch/arm64/crypto/sha1-ce-glue.c | 2 +- arch/arm64/crypto/sha2-ce-glue.c | 2 +- arch/arm64/include/asm/arch_gicv3.h | 4 +- arch/arm64/include/asm/daifflags.h | 68 ++-- arch/arm64/include/asm/irqflags.h | 65 ++-- arch/arm64/include/asm/kvm_host.h | 7 +- arch/arm64/include/asm/ptrace.h | 10 +- arch/arm64/kernel/acpi.c | 10 +- arch/arm64/kernel/entry.S | 84 ++++- arch/arm64/kernel/irq.c | 17 + arch/arm64/kernel/process.c | 2 +- arch/arm64/kernel/smp.c | 8 +- arch/arm64/kvm/hyp/switch.c | 2 +- arch/arm64/mm/init.c | 5 +- arch/parisc/kernel/kprobes.c | 3 + arch/parisc/kernel/ptrace.c | 31 +- arch/powerpc/include/asm/pgtable.h | 14 + arch/powerpc/kernel/exceptions-64s.S | 9 +- arch/powerpc/kernel/prom_init.c | 16 +- arch/powerpc/kernel/swsusp_32.S | 73 +++- arch/powerpc/kvm/book3s_hv.c | 13 +- arch/powerpc/kvm/book3s_hv_tm.c | 6 +- arch/powerpc/mm/pgtable_32.c | 2 +- arch/powerpc/platforms/powermac/sleep.S | 68 +++- arch/powerpc/platforms/powernv/idle.c | 2 +- arch/powerpc/platforms/powernv/npu-dma.c | 15 +- arch/powerpc/platforms/powernv/pci-ioda.c | 10 + arch/powerpc/platforms/pseries/hotplug-memory.c | 3 + arch/powerpc/sysdev/xive/spapr.c | 52 ++- arch/x86/events/amd/uncore.c | 15 +- arch/x86/events/intel/core.c | 29 +- arch/x86/events/intel/uncore.c | 1 + arch/x86/events/intel/uncore.h | 10 + arch/x86/events/intel/uncore_snbep.c | 1 + arch/x86/hyperv/hv_init.c | 13 +- arch/x86/include/asm/atomic.h | 8 +- arch/x86/include/asm/atomic64_64.h | 8 +- arch/x86/include/asm/barrier.h | 4 +- arch/x86/include/asm/cpufeatures.h | 2 + arch/x86/include/asm/intel-family.h | 1 + arch/x86/kernel/cpu/cacheinfo.c | 3 +- arch/x86/kernel/cpu/mkcapflags.sh | 2 + arch/x86/kernel/mpparse.c | 10 +- arch/x86/kernel/stacktrace.c | 8 +- arch/x86/kvm/pmu.c | 4 +- arch/x86/kvm/vmx/nested.c | 16 +- arch/x86/kvm/vmx/vmx.c | 35 +- block/bfq-iosched.c | 8 +- block/bio.c | 28 +- block/blk-cgroup.c | 8 +- block/blk-iolatency.c | 51 +-- block/blk-throttle.c | 9 +- block/blk-zoned.c | 2 +- crypto/asymmetric_keys/Kconfig | 3 + crypto/chacha20poly1305.c | 30 +- crypto/ghash-generic.c | 8 +- crypto/serpent_generic.c | 8 +- crypto/testmgr.c | 6 + drivers/acpi/acpica/acevents.h | 3 +- drivers/acpi/acpica/evgpe.c | 8 +- drivers/acpi/acpica/evgpeblk.c | 2 +- drivers/acpi/acpica/evxface.c | 2 +- drivers/acpi/acpica/evxfgpe.c | 2 +- drivers/ata/libata-eh.c | 8 +- drivers/base/regmap/regmap-debugfs.c | 2 + drivers/base/regmap/regmap.c | 2 + drivers/block/floppy.c | 34 +- drivers/block/null_blk_main.c | 11 +- drivers/bluetooth/btusb.c | 2 + drivers/bluetooth/hci_bcsp.c | 5 + drivers/clk/imx/clk-imx8mm.c | 6 +- drivers/clocksource/exynos_mct.c | 4 +- drivers/clocksource/timer-tegra20.c | 7 +- drivers/crypto/amcc/crypto4xx_alg.c | 36 +- drivers/crypto/amcc/crypto4xx_core.c | 24 +- drivers/crypto/amcc/crypto4xx_core.h | 10 +- drivers/crypto/amcc/crypto4xx_trng.c | 1 - drivers/crypto/caam/caamalg.c | 10 +- drivers/crypto/caam/caamalg_qi.c | 2 +- drivers/crypto/caam/caamalg_qi2.c | 9 + drivers/crypto/caam/qi.c | 3 + drivers/crypto/ccp/ccp-dev.c | 96 +++--- drivers/crypto/ccp/ccp-dev.h | 2 +- drivers/crypto/ccp/ccp-ops.c | 15 +- drivers/crypto/inside-secure/safexcel_cipher.c | 24 +- drivers/crypto/talitos.c | 35 +- drivers/edac/edac_mc_sysfs.c | 34 +- drivers/edac/edac_module.h | 2 +- drivers/gpio/gpio-omap.c | 29 +- drivers/gpio/gpiolib.c | 13 +- drivers/gpu/drm/drm_edid.c | 81 ++++- drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c | 20 ++ drivers/gpu/ipu-v3/ipu-ic.c | 2 +- drivers/hid/wacom_sys.c | 3 + drivers/hid/wacom_wac.c | 19 +- drivers/hid/wacom_wac.h | 1 + drivers/hwtracing/intel_th/msu.c | 45 ++- drivers/hwtracing/intel_th/pci.c | 5 + drivers/i3c/master.c | 51 ++- drivers/infiniband/core/umem_odp.c | 14 +- drivers/infiniband/hw/mlx5/main.c | 8 +- drivers/infiniband/ulp/ipoib/ipoib_main.c | 1 + drivers/infiniband/ulp/srp/ib_srp.c | 21 +- drivers/input/mouse/alps.c | 32 ++ drivers/input/mouse/synaptics.c | 1 + drivers/input/tablet/gtco.c | 20 +- drivers/iommu/arm-smmu-v3.c | 5 +- drivers/iommu/iommu.c | 8 +- drivers/irqchip/irq-gic-v3.c | 7 + drivers/irqchip/irq-meson-gpio.c | 1 + drivers/lightnvm/core.c | 2 +- drivers/lightnvm/pblk-core.c | 18 +- drivers/md/bcache/alloc.c | 9 + drivers/md/bcache/bcache.h | 2 - drivers/md/bcache/io.c | 12 + drivers/md/bcache/journal.c | 54 ++- drivers/md/bcache/super.c | 65 ++-- drivers/md/bcache/sysfs.c | 30 +- drivers/md/bcache/util.h | 2 - drivers/md/bcache/writeback.c | 5 + drivers/md/dm-bufio.c | 4 +- drivers/md/dm-thin-metadata.c | 7 +- drivers/md/dm-zoned-metadata.c | 24 -- drivers/md/dm-zoned.h | 28 +- drivers/md/raid5.c | 11 +- drivers/media/common/videobuf2/videobuf2-core.c | 4 + drivers/media/common/videobuf2/videobuf2-dma-sg.c | 2 +- drivers/media/dvb-frontends/tua6100.c | 22 +- drivers/media/i2c/Makefile | 2 +- drivers/media/i2c/{adv7511.c => adv7511-v4l2.c} | 5 + drivers/media/i2c/mt9m111.c | 8 +- drivers/media/i2c/ov7740.c | 6 +- drivers/media/media-device.c | 10 +- drivers/media/pci/saa7164/saa7164-core.c | 33 +- drivers/media/platform/aspeed-video.c | 16 +- drivers/media/platform/coda/coda-bit.c | 9 +- drivers/media/platform/coda/coda-common.c | 2 + drivers/media/platform/davinci/vpif_capture.c | 16 +- drivers/media/platform/davinci/vpss.c | 5 + drivers/media/platform/marvell-ccic/mcam-core.c | 5 +- drivers/media/platform/meson/ao-cec-g12a.c | 1 - drivers/media/platform/qcom/venus/firmware.c | 6 +- drivers/media/platform/rcar_fdp1.c | 8 + drivers/media/platform/s5p-mfc/s5p_mfc.c | 3 +- drivers/media/platform/s5p-mfc/s5p_mfc_pm.c | 5 + drivers/media/platform/vim2m.c | 6 +- drivers/media/platform/vimc/vimc-capture.c | 5 +- drivers/media/radio/wl128x/fmdrv_v4l2.c | 3 + drivers/media/rc/ir-spi.c | 1 + drivers/media/usb/dvb-usb/dvb-usb-init.c | 7 +- drivers/media/usb/hdpvr/hdpvr-video.c | 17 +- drivers/media/usb/uvc/uvc_ctrl.c | 4 +- drivers/media/usb/zr364xx/zr364xx.c | 3 +- drivers/media/v4l2-core/v4l2-ctrls.c | 27 +- drivers/mmc/host/sdhci-msm.c | 9 +- drivers/mtd/nand/raw/mtk_nand.c | 24 +- drivers/mtd/nand/spi/core.c | 2 +- drivers/net/bonding/bond_main.c | 37 +- drivers/net/dsa/sja1105/sja1105_main.c | 11 +- drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 5 +- .../net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c | 4 +- drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 33 +- drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h | 3 + drivers/net/ethernet/broadcom/bnxt/bnxt.c | 20 +- drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 4 +- drivers/net/ethernet/freescale/fec_main.c | 6 +- drivers/net/ethernet/hisilicon/hns3/hnae3.c | 2 + drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 146 ++++---- drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 6 +- .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 21 +- .../ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c | 7 + .../net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 6 +- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 14 +- drivers/net/ethernet/intel/iavf/iavf_txrx.c | 27 +- drivers/net/ethernet/intel/ice/ice.h | 1 - drivers/net/ethernet/intel/ice/ice_lib.c | 24 +- drivers/net/ethernet/intel/ice/ice_main.c | 25 +- drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c | 11 +- drivers/net/ethernet/intel/igb/igb_main.c | 1 + drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 3 +- drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 3 + drivers/net/ethernet/intel/ixgbe/ixgbe_phy.h | 1 + drivers/net/ethernet/marvell/mvmdio.c | 7 +- drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c | 6 + drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 3 +- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 31 +- drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 4 +- drivers/net/ethernet/qlogic/qed/qed_dev.c | 29 +- drivers/net/ethernet/qlogic/qed/qed_iwarp.c | 2 + drivers/net/ethernet/qlogic/qed/qed_rdma.c | 2 +- drivers/net/ethernet/socionext/netsec.c | 32 +- drivers/net/ethernet/stmicro/stmmac/common.h | 2 +- drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 5 + .../net/ethernet/stmicro/stmmac/dwmac1000_core.c | 6 + drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 18 +- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 + drivers/net/ethernet/ti/cpsw.c | 3 + drivers/net/ethernet/ti/cpsw_priv.h | 1 + drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 20 +- drivers/net/gtp.c | 36 +- drivers/net/phy/phy_device.c | 6 + drivers/net/phy/sfp.c | 6 +- drivers/net/usb/asix_devices.c | 6 +- drivers/net/vxlan.c | 37 +- drivers/net/wireless/ath/ath10k/debugfs_sta.c | 7 + drivers/net/wireless/ath/ath10k/htt_rx.c | 4 +- drivers/net/wireless/ath/ath10k/hw.c | 2 +- drivers/net/wireless/ath/ath10k/mac.c | 14 +- drivers/net/wireless/ath/ath10k/pci.c | 9 +- drivers/net/wireless/ath/ath10k/qmi.c | 1 + drivers/net/wireless/ath/ath10k/sdio.c | 7 + drivers/net/wireless/ath/ath10k/txrx.c | 3 + drivers/net/wireless/ath/ath10k/wmi-tlv.c | 4 +- drivers/net/wireless/ath/ath10k/wmi.h | 7 +- drivers/net/wireless/ath/ath6kl/wmi.c | 10 +- drivers/net/wireless/ath/ath9k/hw.c | 32 +- drivers/net/wireless/ath/ath9k/recv.c | 6 +- drivers/net/wireless/ath/ath9k/xmit.c | 7 +- drivers/net/wireless/ath/dfs_pattern_detector.c | 2 +- drivers/net/wireless/ath/wil6210/interrupt.c | 67 ++-- drivers/net/wireless/ath/wil6210/txrx.c | 1 + drivers/net/wireless/ath/wil6210/wmi.c | 13 +- drivers/net/wireless/intel/iwlwifi/fw/dbg.c | 2 - drivers/net/wireless/intel/iwlwifi/fw/dbg.h | 6 +- drivers/net/wireless/intel/iwlwifi/fw/smem.c | 12 +- drivers/net/wireless/intel/iwlwifi/iwl-csr.h | 1 + drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 8 +- drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c | 53 ++- drivers/net/wireless/intel/iwlwifi/mvm/mvm.h | 3 + drivers/net/wireless/intel/iwlwifi/mvm/tx.c | 3 + .../wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c | 2 +- .../net/wireless/intel/iwlwifi/pcie/ctxt-info.c | 2 +- drivers/net/wireless/intel/iwlwifi/pcie/internal.h | 27 ++ drivers/net/wireless/intel/iwlwifi/pcie/rx.c | 66 ++-- .../net/wireless/intel/iwlwifi/pcie/trans-gen2.c | 9 + drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 8 +- drivers/net/wireless/mediatek/mt76/mt7615/mac.c | 3 + drivers/net/wireless/mediatek/mt7601u/dma.c | 54 +-- drivers/net/wireless/mediatek/mt7601u/tx.c | 4 +- drivers/net/wireless/ralink/rt2x00/rt2x00usb.c | 12 +- drivers/net/wireless/realtek/rtlwifi/usb.c | 5 +- drivers/nvdimm/dax_devs.c | 2 +- drivers/nvdimm/pfn.h | 1 + drivers/nvdimm/pfn_devs.c | 18 +- drivers/nvme/host/core.c | 14 +- drivers/nvme/host/pci.c | 14 +- drivers/opp/core.c | 2 +- drivers/pci/controller/dwc/pcie-qcom.c | 2 + drivers/pci/controller/pci-hyperv.c | 15 +- drivers/pci/pci.c | 36 +- drivers/pci/pci.h | 1 + drivers/pci/pcie/portdrv_core.c | 66 ++++ drivers/phy/qualcomm/phy-qcom-qmp.c | 4 +- drivers/ras/cec.c | 4 +- drivers/regulator/da9211-regulator.c | 2 + drivers/regulator/s2mps11.c | 9 +- drivers/regulator/s5m8767.c | 4 +- drivers/regulator/tps65090-regulator.c | 7 +- drivers/s390/cio/qdio_main.c | 1 + drivers/s390/scsi/zfcp_fsf.c | 55 ++- drivers/scsi/NCR5380.c | 18 +- drivers/scsi/NCR5380.h | 2 +- drivers/scsi/mac_scsi.c | 375 ++++++++++++--------- drivers/scsi/megaraid/megaraid_sas_base.c | 3 +- drivers/scsi/scsi_lib.c | 6 +- drivers/scsi/sd_zbc.c | 2 +- drivers/spi/spi-rockchip.c | 4 +- drivers/spi/spi.c | 12 +- drivers/staging/media/davinci_vpfe/dm365_ipipe.c | 15 +- drivers/staging/media/davinci_vpfe/vpfe_video.c | 3 + drivers/staging/media/imx/imx7-mipi-csis.c | 14 +- drivers/usb/core/devio.c | 48 +-- drivers/usb/core/hub.c | 7 +- drivers/vhost/net.c | 2 +- drivers/xen/balloon.c | 16 +- drivers/xen/events/events_base.c | 12 +- drivers/xen/evtchn.c | 2 +- fs/btrfs/compression.c | 16 + fs/btrfs/compression.h | 1 + fs/btrfs/file.c | 5 + fs/btrfs/props.c | 6 +- fs/btrfs/tree-log.c | 40 ++- fs/ceph/file.c | 2 +- fs/ceph/inode.c | 7 +- fs/ceph/super.c | 2 +- fs/ceph/super.h | 2 +- fs/cifs/cifs_fs_sb.h | 5 + fs/cifs/connect.c | 12 +- fs/cifs/inode.c | 16 + fs/cifs/misc.c | 1 + fs/cifs/smb2inode.c | 12 + fs/cifs/smb2ops.c | 57 +++- fs/coda/file.c | 70 +++- fs/crypto/crypto.c | 15 +- fs/dax.c | 53 +-- fs/ecryptfs/crypto.c | 12 +- fs/fs-writeback.c | 8 +- fs/nfs/dir.c | 90 +---- fs/nfs/flexfilelayout/flexfilelayoutdev.c | 2 +- fs/nfs/inode.c | 1 + fs/nfs/internal.h | 3 +- fs/nfs/nfs4file.c | 2 +- fs/nfs/pnfs.c | 2 +- fs/proc/proc_sysctl.c | 4 + fs/pstore/inode.c | 13 +- include/asm-generic/bug.h | 6 +- include/drm/drm_displayid.h | 10 + include/linux/blkdev.h | 4 +- include/linux/cpuhotplug.h | 2 +- include/linux/mm.h | 5 + include/linux/rcupdate.h | 2 +- include/linux/sched/signal.h | 2 +- include/net/ip_vs.h | 6 +- include/net/xdp_sock.h | 2 + include/rdma/ib_verbs.h | 4 +- include/sound/hda_codec.h | 2 + include/trace/events/rxrpc.h | 2 +- include/uapi/linux/bpf.h | 1 + include/xen/events.h | 3 +- kernel/bpf/Makefile | 1 + kernel/bpf/core.c | 4 +- kernel/bpf/verifier.c | 11 +- kernel/iomem.c | 2 +- kernel/irq/chip.c | 4 + kernel/irq/irqdesc.c | 16 +- kernel/locking/lockdep.c | 59 ++-- kernel/padata.c | 12 + kernel/pid_namespace.c | 2 +- kernel/resource.c | 20 +- kernel/sched/core.c | 2 +- kernel/sched/sched-pelt.h | 2 +- kernel/signal.c | 136 ++++++-- kernel/time/ntp.c | 4 +- kernel/time/timer_list.c | 36 +- kernel/trace/trace_output.c | 9 +- lib/reed_solomon/decode_rs.c | 18 +- lib/scatterlist.c | 9 +- mm/memcontrol.c | 5 +- mm/vmscan.c | 6 +- mm/z3fold.c | 12 +- net/9p/trans_virtio.c | 8 +- net/9p/trans_xen.c | 8 +- net/batman-adv/bat_iv_ogm.c | 4 +- net/batman-adv/hard-interface.c | 3 + net/batman-adv/translation-table.c | 2 + net/batman-adv/types.h | 3 + net/bluetooth/6lowpan.c | 14 +- net/bluetooth/hci_event.c | 5 + net/bluetooth/hidp/core.c | 2 +- net/bluetooth/hidp/sock.c | 1 + net/bluetooth/l2cap_core.c | 15 +- net/bluetooth/smp.c | 13 + net/key/af_key.c | 8 +- net/netfilter/ipset/ip_set_core.c | 10 +- net/netfilter/ipset/ip_set_hash_gen.h | 2 +- net/netfilter/ipvs/ip_vs_core.c | 21 +- net/netfilter/ipvs/ip_vs_ctl.c | 4 - net/netfilter/ipvs/ip_vs_sync.c | 134 ++++---- net/netfilter/nf_conntrack_netlink.c | 7 +- net/netfilter/nf_conntrack_proto_icmp.c | 2 +- net/netfilter/nf_nat_proto.c | 2 +- net/netfilter/utils.c | 5 +- net/sunrpc/clnt.c | 3 +- net/sunrpc/xprt.c | 2 + net/sunrpc/xprtsock.c | 1 + net/xdp/xsk.c | 31 +- net/xdp/xsk_queue.h | 2 +- net/xfrm/Kconfig | 2 + net/xfrm/xfrm_user.c | 19 ++ scripts/kconfig/confdata.c | 7 +- scripts/kconfig/expr.h | 1 + security/integrity/digsig.c | 5 +- security/selinux/hooks.c | 11 +- sound/core/seq/seq_clientmgr.c | 11 +- sound/hda/ext/hdac_ext_bus.c | 8 +- sound/hda/hdac_controller.c | 5 +- sound/pci/hda/hda_codec.c | 8 +- sound/pci/hda/patch_hdmi.c | 31 +- sound/pci/hda/patch_realtek.c | 10 +- sound/soc/codecs/hdac_hdmi.c | 6 + sound/soc/generic/audio-graph-card.c | 6 +- sound/soc/intel/boards/Kconfig | 2 +- sound/soc/meson/axg-tdm.h | 2 +- sound/soc/sh/rcar/ctu.c | 2 +- sound/soc/soc-core.c | 20 +- sound/soc/soc-dapm.c | 18 +- tools/bpf/bpftool/jit_disasm.c | 11 +- tools/build/feature/test-all.c | 2 +- tools/include/uapi/linux/bpf.h | 1 + tools/lib/bpf/libbpf.c | 8 +- tools/lib/bpf/xsk.c | 6 +- tools/perf/Makefile.config | 11 +- tools/perf/arch/arm/util/cs-etm.c | 127 ++++++- tools/perf/jvmti/libjvmti.c | 4 +- tools/perf/perf.h | 2 +- tools/perf/tests/parse-events.c | 27 ++ .../tests/shell/record+probe_libc_inet_pton.sh | 2 +- tools/perf/ui/browsers/annotate.c | 5 +- tools/perf/ui/libslang.h | 5 + tools/perf/util/annotate.c | 5 +- tools/perf/util/evsel.c | 8 +- tools/perf/util/header.c | 2 +- tools/perf/util/metricgroup.c | 47 ++- tools/perf/util/stat-display.c | 3 +- tools/perf/util/stat-shadow.c | 23 +- tools/power/cpupower/utils/cpufreq-set.c | 2 + tools/testing/selftests/bpf/Makefile | 3 +- .../selftests/bpf/progs/test_lwt_seg6local.c | 12 +- tools/testing/selftests/bpf/test_verifier.c | 31 +- virt/kvm/kvm_main.c | 2 +- 419 files changed, 4191 insertions(+), 1794 deletions(-)
[ Upstream commit 9e7251fa38978b85108c44743e1436d48e8d0d76 ]
tx_stats will be freed and set to NULL before debugfs_sta node is removed in station disconnetion process. So if read the debugfs_sta node there may be NULL pointer error. Add check for tx_stats before use it to resove this issue.
Signed-off-by: Yingying Tang yintang@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/debugfs_sta.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/wireless/ath/ath10k/debugfs_sta.c b/drivers/net/wireless/ath/ath10k/debugfs_sta.c index c704ae371c4d..42931a669b02 100644 --- a/drivers/net/wireless/ath/ath10k/debugfs_sta.c +++ b/drivers/net/wireless/ath/ath10k/debugfs_sta.c @@ -663,6 +663,13 @@ static ssize_t ath10k_dbg_sta_dump_tx_stats(struct file *file,
mutex_lock(&ar->conf_mutex);
+ if (!arsta->tx_stats) { + ath10k_warn(ar, "failed to get tx stats"); + mutex_unlock(&ar->conf_mutex); + kfree(buf); + return 0; + } + spin_lock_bh(&ar->data_lock); for (k = 0; k < ATH10K_STATS_TYPE_MAX; k++) { for (j = 0; j < ATH10K_COUNTER_TYPE_MAX; j++) {
[ Upstream commit e2a6b711282a371c5153239e0468a48254f17ca6 ]
HTT High Latency (ATH10K_DEV_TYPE_HL) does not use txdone_fifo at all, we don't even initialise it by skipping ath10k_htt_tx_alloc_buf() in ath10k_htt_tx_start(). Because of this using QCA6174 SDIO ath10k_htt_rx_tx_compl_ind() will crash when it accesses unitialised txdone_fifo. So skip txdone_fifo when using High Latency mode.
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00007-QCARMSWP-1.
Co-developed-by: Wen Gong wgong@codeaurora.org Signed-off-by: Alagu Sankar alagusankar@silex-india.com Signed-off-by: Wen Gong wgong@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/htt_rx.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c index 1acc622d2183..f22840bbc389 100644 --- a/drivers/net/wireless/ath/ath10k/htt_rx.c +++ b/drivers/net/wireless/ath/ath10k/htt_rx.c @@ -2277,7 +2277,9 @@ static void ath10k_htt_rx_tx_compl_ind(struct ath10k *ar, * Note that with only one concurrent reader and one concurrent * writer, you don't need extra locking to use these macro. */ - if (!kfifo_put(&htt->txdone_fifo, tx_done)) { + if (ar->bus_param.dev_type == ATH10K_DEV_TYPE_HL) { + ath10k_txrx_tx_unref(htt, &tx_done); + } else if (!kfifo_put(&htt->txdone_fifo, tx_done)) { ath10k_warn(ar, "txdone fifo overrun, msdu_id %d status %d\n", tx_done.msdu_id, tx_done.status); ath10k_txrx_tx_unref(htt, &tx_done);
[ Upstream commit 93ee3d108fc77e19efeac3ec5aa7d5886711bfef ]
Invalid rate code is sent to firmware when multicast rate value of 0 is sent to driver indicating disabled case, causing broken mesh path. so fix that.
Tested on QCA9984 with firmware 10.4-3.6.1-00827
Sven tested on IPQ4019 with 10.4-3.5.3-00057 and QCA9888 with 10.4-3.5.3-00053 (ath10k-firmware) and 10.4-3.6-00140 (linux-firmware 2018-12-16-211de167).
Fixes: cd93b83ad92 ("ath10k: support for multicast rate control") Co-developed-by: Zhi Chen zhichen@codeaurora.org Signed-off-by: Zhi Chen zhichen@codeaurora.org Signed-off-by: Pradeep Kumar Chitrapu pradeepc@codeaurora.org Tested-by: Sven Eckelmann sven@narfation.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/mac.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c index 9c703d287333..e8997e22ceec 100644 --- a/drivers/net/wireless/ath/ath10k/mac.c +++ b/drivers/net/wireless/ath/ath10k/mac.c @@ -5588,8 +5588,8 @@ static void ath10k_bss_info_changed(struct ieee80211_hw *hw, struct cfg80211_chan_def def; u32 vdev_param, pdev_param, slottime, preamble; u16 bitrate, hw_value; - u8 rate, basic_rate_idx; - int rateidx, ret = 0, hw_rate_code; + u8 rate, basic_rate_idx, rateidx; + int ret = 0, hw_rate_code, mcast_rate; enum nl80211_band band; const struct ieee80211_supported_band *sband;
@@ -5776,7 +5776,11 @@ static void ath10k_bss_info_changed(struct ieee80211_hw *hw, if (changed & BSS_CHANGED_MCAST_RATE && !ath10k_mac_vif_chan(arvif->vif, &def)) { band = def.chan->band; - rateidx = vif->bss_conf.mcast_rate[band] - 1; + mcast_rate = vif->bss_conf.mcast_rate[band]; + if (mcast_rate > 0) + rateidx = mcast_rate - 1; + else + rateidx = ffs(vif->bss_conf.basic_rates) - 1;
if (ar->phy_capability & WHAL_WLAN_11A_CAPABILITY) rateidx += ATH10K_MAC_FIRST_OFDM_RATE_IDX;
[ Upstream commit 389b72e58259336c2d56d58b660b79cf4b9e0dcb ]
As already noted a comment in ath_tx_complete_aggr(), the hardware will occasionally send a TX status with the wrong tid number. If we trust the value, airtime usage will be reported to the wrong AC, which can cause the deficit on that AC to become very low, blocking subsequent attempts to transmit.
To fix this, account airtime usage to the TID number from the original skb, instead of the one in the hardware TX status report.
Reported-by: Miguel Catalan Cid miguel.catalan@i2cat.net Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath9k/xmit.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c index b17e1ca40995..3be0aeedb9b5 100644 --- a/drivers/net/wireless/ath/ath9k/xmit.c +++ b/drivers/net/wireless/ath/ath9k/xmit.c @@ -668,7 +668,8 @@ static bool bf_is_ampdu_not_probing(struct ath_buf *bf) static void ath_tx_count_airtime(struct ath_softc *sc, struct ieee80211_sta *sta, struct ath_buf *bf, - struct ath_tx_status *ts) + struct ath_tx_status *ts, + u8 tid) { u32 airtime = 0; int i; @@ -679,7 +680,7 @@ static void ath_tx_count_airtime(struct ath_softc *sc, airtime += rate_dur * bf->rates[i].count; }
- ieee80211_sta_register_airtime(sta, ts->tid, airtime, 0); + ieee80211_sta_register_airtime(sta, tid, airtime, 0); }
static void ath_tx_process_buffer(struct ath_softc *sc, struct ath_txq *txq, @@ -709,7 +710,7 @@ static void ath_tx_process_buffer(struct ath_softc *sc, struct ath_txq *txq, if (sta) { struct ath_node *an = (struct ath_node *)sta->drv_priv; tid = ath_get_skb_tid(sc, an, bf->bf_mpdu); - ath_tx_count_airtime(sc, sta, bf, ts); + ath_tx_count_airtime(sc, sta, bf, ts, tid->tidno); if (ts->ts_status & (ATH9K_TXERR_FILT | ATH9K_TXERR_XRETRY)) tid->clear_ps_filter = true; }
[ Upstream commit bfabdd6997323adbedccb13a3fed1967fb8cf8f5 ]
Notice that *rc* can evaluate to up to 5, include/linux/netdevice.h:
enum gro_result { GRO_MERGED, GRO_MERGED_FREE, GRO_HELD, GRO_NORMAL, GRO_DROP, GRO_CONSUMED, }; typedef enum gro_result gro_result_t;
In case *rc* evaluates to 5, we end up having an out-of-bounds read at drivers/net/wireless/ath/wil6210/txrx.c:821:
wil_dbg_txrx(wil, "Rx complete %d bytes => %s\n", len, gro_res_str[rc]);
Fix this by adding element "GRO_CONSUMED" to array gro_res_str.
Addresses-Coverity-ID: 1444666 ("Out-of-bounds read") Fixes: 194b482b5055 ("wil6210: Debug print GRO Rx result") Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com Reviewed-by: Maya Erez merez@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/wil6210/txrx.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/wireless/ath/wil6210/txrx.c b/drivers/net/wireless/ath/wil6210/txrx.c index 4ccfd1404458..d74837cce67f 100644 --- a/drivers/net/wireless/ath/wil6210/txrx.c +++ b/drivers/net/wireless/ath/wil6210/txrx.c @@ -750,6 +750,7 @@ void wil_netif_rx_any(struct sk_buff *skb, struct net_device *ndev) [GRO_HELD] = "GRO_HELD", [GRO_NORMAL] = "GRO_NORMAL", [GRO_DROP] = "GRO_DROP", + [GRO_CONSUMED] = "GRO_CONSUMED", };
wil->txrx_ops.get_netif_rx_params(skb, &cid, &security);
[ Upstream commit 97354f2c432788e3163134df6bb144f4b6289d87 ]
Currently mac80211 do not support probe response template for mesh point. When WMI_SERVICE_BEACON_OFFLOAD is enabled, host driver tries to configure probe response template for mesh, but it fails because the interface type is not NL80211_IFTYPE_AP but NL80211_IFTYPE_MESH_POINT.
To avoid this failure, skip sending probe response template to firmware for mesh point.
Tested HW: WCN3990/QCA6174/QCA9984
Signed-off-by: Surabhi Vishnoi svishnoi@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/mac.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c index e8997e22ceec..b500fd427595 100644 --- a/drivers/net/wireless/ath/ath10k/mac.c +++ b/drivers/net/wireless/ath/ath10k/mac.c @@ -1630,6 +1630,10 @@ static int ath10k_mac_setup_prb_tmpl(struct ath10k_vif *arvif) if (arvif->vdev_type != WMI_VDEV_TYPE_AP) return 0;
+ /* For mesh, probe response and beacon share the same template */ + if (ieee80211_vif_is_mesh(vif)) + return 0; + prb = ieee80211_proberesp_get(hw, vif); if (!prb) { ath10k_warn(ar, "failed to get probe resp template from mac80211\n");
[ Upstream commit 47300728fb213486a830565d2af49da967c9d16a ]
This fixes tx and bi-directional dma transfers on rk3399-gru-kevin.
It seems the SPI fifo must have room for 2 bursts when the dma_tx_req signal is generated or it might skip some words. This in turn makes the rx dma channel never complete for bi-directional transfers.
Fix it by setting tx burst length to fifo_len / 4 and the dma watermark to fifo_len / 2.
However the rk3399 TRM says (sic): "DMAC support incrementing-address burst and fixed-address burst. But in the case of access SPI and UART at byte or halfword size, DMAC only support fixed-address burst and the address must be aligned to word."
So this relies on fifo_len being a multiple of 16 such that the burst length (= fifo_len / 4) is a multiple of 4 and the addresses will be word-aligned.
Fixes: dcfc861d24ec ("spi: rockchip: adjust dma watermark and burstlen") Signed-off-by: Emil Renner Berthing kernel@esmil.dk Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-rockchip.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/spi/spi-rockchip.c b/drivers/spi/spi-rockchip.c index 9b91188a85f9..2cc6d9951b52 100644 --- a/drivers/spi/spi-rockchip.c +++ b/drivers/spi/spi-rockchip.c @@ -417,7 +417,7 @@ static int rockchip_spi_prepare_dma(struct rockchip_spi *rs, .direction = DMA_MEM_TO_DEV, .dst_addr = rs->dma_addr_tx, .dst_addr_width = rs->n_bytes, - .dst_maxburst = rs->fifo_len / 2, + .dst_maxburst = rs->fifo_len / 4, };
dmaengine_slave_config(master->dma_tx, &txconf); @@ -518,7 +518,7 @@ static void rockchip_spi_config(struct rockchip_spi *rs, else writel_relaxed(rs->fifo_len / 2 - 1, rs->regs + ROCKCHIP_SPI_RXFTLR);
- writel_relaxed(rs->fifo_len / 2 - 1, rs->regs + ROCKCHIP_SPI_DMATDLR); + writel_relaxed(rs->fifo_len / 2, rs->regs + ROCKCHIP_SPI_DMATDLR); writel_relaxed(0, rs->regs + ROCKCHIP_SPI_DMARDLR); writel_relaxed(dmacr, rs->regs + ROCKCHIP_SPI_DMACR);
[ Upstream commit 2f90c7e5d09437a4d8d5546feaae9f1cf48cfbe1 ]
Right now, if an error is encountered during the SREV register read (i.e. an EIO in ath9k_regread()), that error code gets passed all the way to __ath9k_hw_init(), where it is visible during the "Chip rev not supported" message.
ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits ath: phy2: Mac Chip Rev 0x0f.3 is not supported by this driver ath: phy2: Unable to initialize hardware; initialization status: -95 ath: phy2: Unable to initialize hardware; initialization status: -95 ath9k_htc: Failed to initialize the device
Check for -EIO explicitly in ath9k_hw_read_revisions() and return a boolean based on the success of the operation. Check for that in __ath9k_hw_init() and abort with a more debugging-friendly message if reading the revisions wasn't successful.
ath9k_htc 1-1.4:1.0: ath9k_htc: HTC initialized with 33 credits ath: phy2: Failed to read SREV register ath: phy2: Could not read hardware revision ath: phy2: Unable to initialize hardware; initialization status: -95 ath: phy2: Unable to initialize hardware; initialization status: -95 ath9k_htc: Failed to initialize the device
This helps when debugging by directly showing the first point of failure and it could prevent possible errors if a 0x0f.3 revision is ever supported.
Signed-off-by: Tim Schumacher timschumi@gmx.de Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath9k/hw.c | 32 +++++++++++++++++++++-------- 1 file changed, 23 insertions(+), 9 deletions(-)
diff --git a/drivers/net/wireless/ath/ath9k/hw.c b/drivers/net/wireless/ath/ath9k/hw.c index 8581d917635a..b6773d613f0c 100644 --- a/drivers/net/wireless/ath/ath9k/hw.c +++ b/drivers/net/wireless/ath/ath9k/hw.c @@ -252,8 +252,9 @@ void ath9k_hw_get_channel_centers(struct ath_hw *ah, /* Chip Revisions */ /******************/
-static void ath9k_hw_read_revisions(struct ath_hw *ah) +static bool ath9k_hw_read_revisions(struct ath_hw *ah) { + u32 srev; u32 val;
if (ah->get_mac_revision) @@ -269,25 +270,33 @@ static void ath9k_hw_read_revisions(struct ath_hw *ah) val = REG_READ(ah, AR_SREV); ah->hw_version.macRev = MS(val, AR_SREV_REVISION2); } - return; + return true; case AR9300_DEVID_AR9340: ah->hw_version.macVersion = AR_SREV_VERSION_9340; - return; + return true; case AR9300_DEVID_QCA955X: ah->hw_version.macVersion = AR_SREV_VERSION_9550; - return; + return true; case AR9300_DEVID_AR953X: ah->hw_version.macVersion = AR_SREV_VERSION_9531; - return; + return true; case AR9300_DEVID_QCA956X: ah->hw_version.macVersion = AR_SREV_VERSION_9561; - return; + return true; }
- val = REG_READ(ah, AR_SREV) & AR_SREV_ID; + srev = REG_READ(ah, AR_SREV); + + if (srev == -EIO) { + ath_err(ath9k_hw_common(ah), + "Failed to read SREV register"); + return false; + } + + val = srev & AR_SREV_ID;
if (val == 0xFF) { - val = REG_READ(ah, AR_SREV); + val = srev; ah->hw_version.macVersion = (val & AR_SREV_VERSION2) >> AR_SREV_TYPE2_S; ah->hw_version.macRev = MS(val, AR_SREV_REVISION2); @@ -306,6 +315,8 @@ static void ath9k_hw_read_revisions(struct ath_hw *ah) if (ah->hw_version.macVersion == AR_SREV_VERSION_5416_PCIE) ah->is_pciexpress = true; } + + return true; }
/************************************/ @@ -559,7 +570,10 @@ static int __ath9k_hw_init(struct ath_hw *ah) struct ath_common *common = ath9k_hw_common(ah); int r = 0;
- ath9k_hw_read_revisions(ah); + if (!ath9k_hw_read_revisions(ah)) { + ath_err(common, "Could not read hardware revisions"); + return -EOPNOTSUPP; + }
switch (ah->hw_version.macVersion) { case AR_SREV_VERSION_5416_PCI:
[ Upstream commit 9280f4fc06f44d0b4dc9e831f72d97b3d7cd35d3 ]
The enum value for WMI_TLV_STAT_PDEV, WMI_TLV_STAT_VDEV and WMI_TLV_STAT_PEER is wrong, due to which the vdev stats are not received from firmware in wmi_update_stats event.
Fix the enum values for above stats to receive all stats from firmware in WMI_TLV_UPDATE_STATS_EVENTID.
Tested HW: WCN3990 Tested FW: WLAN.HL.3.1-00784-QCAHLSWMTPLZ-1
Fixes: f40a307eb92c ("ath10k: Fill rx duration for each peer in fw_stats for WCN3990) Signed-off-by: Surabhi Vishnoi svishnoi@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/wmi.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireless/ath/ath10k/wmi.h b/drivers/net/wireless/ath/ath10k/wmi.h index e1c40bb69932..12f57f9adbba 100644 --- a/drivers/net/wireless/ath/ath10k/wmi.h +++ b/drivers/net/wireless/ath/ath10k/wmi.h @@ -4535,9 +4535,10 @@ enum wmi_10_4_stats_id { };
enum wmi_tlv_stats_id { - WMI_TLV_STAT_PDEV = BIT(0), - WMI_TLV_STAT_VDEV = BIT(1), - WMI_TLV_STAT_PEER = BIT(2), + WMI_TLV_STAT_PEER = BIT(0), + WMI_TLV_STAT_AP = BIT(1), + WMI_TLV_STAT_PDEV = BIT(2), + WMI_TLV_STAT_VDEV = BIT(3), WMI_TLV_STAT_PEER_EXTD = BIT(10), };
[ Upstream commit 7441be71ba7e07791fd4fa2b07c932dff14ff4d9 ]
When MISC interrupt is triggered due to HALP bit, in parallel to mbox events handling by the MISC threaded IRQ, new mbox interrupt can be missed in the following scenario: 1. MISC ICR is read in the IRQ handler 2. Threaded IRQ is completed and all MISC interrupts are unmasked 3. mbox interrupt is set by FW 4. HALP is masked The mbox interrupt in step 3 can be missed due to constant high level of ICM. Masking all MISC IRQs instead of masking only HALP bit in step 4 will guarantee that ICM will drop to 0 and interrupt will be triggered once MISC interrupts will be unmasked.
Signed-off-by: Maya Erez merez@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/wil6210/interrupt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/wil6210/interrupt.c b/drivers/net/wireless/ath/wil6210/interrupt.c index 3f5bd177d55f..e41ba24011d8 100644 --- a/drivers/net/wireless/ath/wil6210/interrupt.c +++ b/drivers/net/wireless/ath/wil6210/interrupt.c @@ -580,7 +580,7 @@ static irqreturn_t wil6210_irq_misc(int irq, void *cookie) /* no need to handle HALP ICRs until next vote */ wil->halp.handle_icr = false; wil_dbg_irq(wil, "irq_misc: HALP IRQ invoked\n"); - wil6210_mask_halp(wil); + wil6210_mask_irq_misc(wil, true); complete(&wil->halp.comp); } }
[ Upstream commit 5d6751eaff672ea77642e74e92e6c0ac7f9709ab ]
The "ev->traffic_class" and "reply->ac" variables come from the network and they're used as an offset into the wmi->stream_exist_for_ac[] array. Those variables are u8 so they can be 0-255 but the stream_exist_for_ac[] array only has WMM_NUM_AC (4) elements. We need to add a couple bounds checks to prevent array overflows.
I also modified one existing check from "if (traffic_class > 3) {" to "if (traffic_class >= WMM_NUM_AC) {" just to make them all consistent.
Fixes: bdcd81707973 (" Add ath6kl cleaned up driver") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath6kl/wmi.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/ath6kl/wmi.c b/drivers/net/wireless/ath/ath6kl/wmi.c index 68854c45d0a4..9ab6aa9ded5c 100644 --- a/drivers/net/wireless/ath/ath6kl/wmi.c +++ b/drivers/net/wireless/ath/ath6kl/wmi.c @@ -1176,6 +1176,10 @@ static int ath6kl_wmi_pstream_timeout_event_rx(struct wmi *wmi, u8 *datap, return -EINVAL;
ev = (struct wmi_pstream_timeout_event *) datap; + if (ev->traffic_class >= WMM_NUM_AC) { + ath6kl_err("invalid traffic class: %d\n", ev->traffic_class); + return -EINVAL; + }
/* * When the pstream (fat pipe == AC) timesout, it means there were @@ -1517,6 +1521,10 @@ static int ath6kl_wmi_cac_event_rx(struct wmi *wmi, u8 *datap, int len, return -EINVAL;
reply = (struct wmi_cac_event *) datap; + if (reply->ac >= WMM_NUM_AC) { + ath6kl_err("invalid AC: %d\n", reply->ac); + return -EINVAL; + }
if ((reply->cac_indication == CAC_INDICATION_ADMISSION_RESP) && (reply->status_code != IEEE80211_TSPEC_STATUS_ADMISS_ACCEPTED)) { @@ -2633,7 +2641,7 @@ int ath6kl_wmi_delete_pstream_cmd(struct wmi *wmi, u8 if_idx, u8 traffic_class, u16 active_tsids = 0; int ret;
- if (traffic_class > 3) { + if (traffic_class >= WMM_NUM_AC) { ath6kl_err("invalid traffic class: %d\n", traffic_class); return -EINVAL; }
[ Upstream commit 49ed34b835e231aa941257394716bc689bc98d9f ]
For some SDIO chip, the peer id is 65535 for MPDU with error status, then test_bit will trigger buffer overflow for peer's memory, if kasan enabled, it will report error.
Reason is when station is in disconnecting status, firmware do not delete the peer info since it not disconnected completely, meanwhile some AP will still send data packet to station, then hardware will receive the packet and send to firmware, firmware's logic will report peer id of 65535 for MPDU with error status.
Add check for overflow the size of peer's peer_ids will avoid the buffer overflow access.
Call trace of kasan: dump_backtrace+0x0/0x2ec show_stack+0x20/0x2c __dump_stack+0x20/0x28 dump_stack+0xc8/0xec print_address_description+0x74/0x240 kasan_report+0x250/0x26c __asan_report_load8_noabort+0x20/0x2c ath10k_peer_find_by_id+0x180/0x1e4 [ath10k_core] ath10k_htt_t2h_msg_handler+0x100c/0x2fd4 [ath10k_core] ath10k_htt_htc_t2h_msg_handler+0x20/0x34 [ath10k_core] ath10k_sdio_irq_handler+0xcc8/0x1678 [ath10k_sdio] process_sdio_pending_irqs+0xec/0x370 sdio_run_irqs+0x68/0xe4 sdio_irq_work+0x1c/0x28 process_one_work+0x3d8/0x8b0 worker_thread+0x508/0x7cc kthread+0x24c/0x264 ret_from_fork+0x10/0x18
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00007-QCARMSWP-1.
Signed-off-by: Wen Gong wgong@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/txrx.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/wireless/ath/ath10k/txrx.c b/drivers/net/wireless/ath/ath10k/txrx.c index c5818d28f55a..4102df016931 100644 --- a/drivers/net/wireless/ath/ath10k/txrx.c +++ b/drivers/net/wireless/ath/ath10k/txrx.c @@ -150,6 +150,9 @@ struct ath10k_peer *ath10k_peer_find_by_id(struct ath10k *ar, int peer_id) { struct ath10k_peer *peer;
+ if (peer_id >= BITS_PER_TYPE(peer->peer_ids)) + return NULL; + lockdep_assert_held(&ar->data_lock);
list_for_each_entry(peer, &ar->peers, list)
[ Upstream commit e10b0eddd5235aa5aef4e40b970e34e735611a80 ]
Interrupt is set in ICM (ICR & ~IMV) rising trigger. As the driver masks the IRQ after clearing it, there can be a race where an additional spurious interrupt is triggered when the driver unmask the IRQ. This can happen in case HW triggers an interrupt after the clear and before the mask.
To prevent the second spurious interrupt the driver needs to mask the IRQ before reading and clearing it.
Signed-off-by: Maya Erez merez@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/wil6210/interrupt.c | 65 ++++++++++++-------- 1 file changed, 40 insertions(+), 25 deletions(-)
diff --git a/drivers/net/wireless/ath/wil6210/interrupt.c b/drivers/net/wireless/ath/wil6210/interrupt.c index e41ba24011d8..b00a13d6d530 100644 --- a/drivers/net/wireless/ath/wil6210/interrupt.c +++ b/drivers/net/wireless/ath/wil6210/interrupt.c @@ -296,21 +296,24 @@ void wil_configure_interrupt_moderation(struct wil6210_priv *wil) static irqreturn_t wil6210_irq_rx(int irq, void *cookie) { struct wil6210_priv *wil = cookie; - u32 isr = wil_ioread32_and_clear(wil->csr + - HOSTADDR(RGF_DMA_EP_RX_ICR) + - offsetof(struct RGF_ICR, ICR)); + u32 isr; bool need_unmask = true;
+ wil6210_mask_irq_rx(wil); + + isr = wil_ioread32_and_clear(wil->csr + + HOSTADDR(RGF_DMA_EP_RX_ICR) + + offsetof(struct RGF_ICR, ICR)); + trace_wil6210_irq_rx(isr); wil_dbg_irq(wil, "ISR RX 0x%08x\n", isr);
if (unlikely(!isr)) { wil_err_ratelimited(wil, "spurious IRQ: RX\n"); + wil6210_unmask_irq_rx(wil); return IRQ_NONE; }
- wil6210_mask_irq_rx(wil); - /* RX_DONE and RX_HTRSH interrupts are the same if interrupt * moderation is not used. Interrupt moderation may cause RX * buffer overflow while RX_DONE is delayed. The required @@ -355,21 +358,24 @@ static irqreturn_t wil6210_irq_rx(int irq, void *cookie) static irqreturn_t wil6210_irq_rx_edma(int irq, void *cookie) { struct wil6210_priv *wil = cookie; - u32 isr = wil_ioread32_and_clear(wil->csr + - HOSTADDR(RGF_INT_GEN_RX_ICR) + - offsetof(struct RGF_ICR, ICR)); + u32 isr; bool need_unmask = true;
+ wil6210_mask_irq_rx_edma(wil); + + isr = wil_ioread32_and_clear(wil->csr + + HOSTADDR(RGF_INT_GEN_RX_ICR) + + offsetof(struct RGF_ICR, ICR)); + trace_wil6210_irq_rx(isr); wil_dbg_irq(wil, "ISR RX 0x%08x\n", isr);
if (unlikely(!isr)) { wil_err(wil, "spurious IRQ: RX\n"); + wil6210_unmask_irq_rx_edma(wil); return IRQ_NONE; }
- wil6210_mask_irq_rx_edma(wil); - if (likely(isr & BIT_RX_STATUS_IRQ)) { wil_dbg_irq(wil, "RX status ring\n"); isr &= ~BIT_RX_STATUS_IRQ; @@ -403,21 +409,24 @@ static irqreturn_t wil6210_irq_rx_edma(int irq, void *cookie) static irqreturn_t wil6210_irq_tx_edma(int irq, void *cookie) { struct wil6210_priv *wil = cookie; - u32 isr = wil_ioread32_and_clear(wil->csr + - HOSTADDR(RGF_INT_GEN_TX_ICR) + - offsetof(struct RGF_ICR, ICR)); + u32 isr; bool need_unmask = true;
+ wil6210_mask_irq_tx_edma(wil); + + isr = wil_ioread32_and_clear(wil->csr + + HOSTADDR(RGF_INT_GEN_TX_ICR) + + offsetof(struct RGF_ICR, ICR)); + trace_wil6210_irq_tx(isr); wil_dbg_irq(wil, "ISR TX 0x%08x\n", isr);
if (unlikely(!isr)) { wil_err(wil, "spurious IRQ: TX\n"); + wil6210_unmask_irq_tx_edma(wil); return IRQ_NONE; }
- wil6210_mask_irq_tx_edma(wil); - if (likely(isr & BIT_TX_STATUS_IRQ)) { wil_dbg_irq(wil, "TX status ring\n"); isr &= ~BIT_TX_STATUS_IRQ; @@ -446,21 +455,24 @@ static irqreturn_t wil6210_irq_tx_edma(int irq, void *cookie) static irqreturn_t wil6210_irq_tx(int irq, void *cookie) { struct wil6210_priv *wil = cookie; - u32 isr = wil_ioread32_and_clear(wil->csr + - HOSTADDR(RGF_DMA_EP_TX_ICR) + - offsetof(struct RGF_ICR, ICR)); + u32 isr; bool need_unmask = true;
+ wil6210_mask_irq_tx(wil); + + isr = wil_ioread32_and_clear(wil->csr + + HOSTADDR(RGF_DMA_EP_TX_ICR) + + offsetof(struct RGF_ICR, ICR)); + trace_wil6210_irq_tx(isr); wil_dbg_irq(wil, "ISR TX 0x%08x\n", isr);
if (unlikely(!isr)) { wil_err_ratelimited(wil, "spurious IRQ: TX\n"); + wil6210_unmask_irq_tx(wil); return IRQ_NONE; }
- wil6210_mask_irq_tx(wil); - if (likely(isr & BIT_DMA_EP_TX_ICR_TX_DONE)) { wil_dbg_irq(wil, "TX done\n"); isr &= ~BIT_DMA_EP_TX_ICR_TX_DONE; @@ -532,20 +544,23 @@ static bool wil_validate_mbox_regs(struct wil6210_priv *wil) static irqreturn_t wil6210_irq_misc(int irq, void *cookie) { struct wil6210_priv *wil = cookie; - u32 isr = wil_ioread32_and_clear(wil->csr + - HOSTADDR(RGF_DMA_EP_MISC_ICR) + - offsetof(struct RGF_ICR, ICR)); + u32 isr; + + wil6210_mask_irq_misc(wil, false); + + isr = wil_ioread32_and_clear(wil->csr + + HOSTADDR(RGF_DMA_EP_MISC_ICR) + + offsetof(struct RGF_ICR, ICR));
trace_wil6210_irq_misc(isr); wil_dbg_irq(wil, "ISR MISC 0x%08x\n", isr);
if (!isr) { wil_err(wil, "spurious IRQ: MISC\n"); + wil6210_unmask_irq_misc(wil, false); return IRQ_NONE; }
- wil6210_mask_irq_misc(wil, false); - if (isr & ISR_MISC_FW_ERROR) { u32 fw_assert_code = wil_r(wil, wil->rgf_fw_assert_code_addr); u32 ucode_assert_code =
[ Upstream commit d8792393a783158cbb2c39939cb897dc5e5299b6 ]
Increase pulse width range from 1-2usec to 0-4usec. During data traffic HW occasionally fails detecting radar pulses, so that SW cannot get enough radar reports to achieve the success rate.
Tested ath10k hw and fw: * QCA9888(10.4-3.5.1-00052) * QCA4019(10.4-3.2.1.1-00017) * QCA9984(10.4-3.6-00104) * QCA988X(10.2.4-1.0-00041)
Tested ath9k hw: AR9300
Tested-by: Tamizh chelvam tamizhr@codeaurora.org Signed-off-by: Tamizh chelvam tamizhr@codeaurora.org Signed-off-by: Anilkumar Kolli akolli@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/dfs_pattern_detector.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/dfs_pattern_detector.c b/drivers/net/wireless/ath/dfs_pattern_detector.c index d52b31b45df7..a274eb0d1968 100644 --- a/drivers/net/wireless/ath/dfs_pattern_detector.c +++ b/drivers/net/wireless/ath/dfs_pattern_detector.c @@ -111,7 +111,7 @@ static const struct radar_detector_specs jp_radar_ref_types[] = { JP_PATTERN(0, 0, 1, 1428, 1428, 1, 18, 29, false), JP_PATTERN(1, 2, 3, 3846, 3846, 1, 18, 29, false), JP_PATTERN(2, 0, 1, 1388, 1388, 1, 18, 50, false), - JP_PATTERN(3, 1, 2, 4000, 4000, 1, 18, 50, false), + JP_PATTERN(3, 0, 4, 4000, 4000, 1, 18, 50, false), JP_PATTERN(4, 0, 5, 150, 230, 1, 23, 50, false), JP_PATTERN(5, 6, 10, 200, 500, 1, 16, 50, false), JP_PATTERN(6, 11, 20, 200, 500, 1, 12, 50, false),
[ Upstream commit 42f1bc43e6a97b9ddbe976eba9bd05306c990c75 ]
Currently the protected management frames are not appended with the MIC_LEN which results in the protected management frames being encoded incorrectly.
Add the extra space at the end of the protected management frames to fix this encoding error for the protected management frames.
Tested HW: WCN3990 Tested FW: WLAN.HL.3.1-00784-QCAHLSWMTPLZ-1
Fixes: 1807da49733e ("ath10k: wmi: add management tx by reference support over wmi") Signed-off-by: Rakesh Pillai pillair@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/wmi-tlv.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/ath10k/wmi-tlv.c b/drivers/net/wireless/ath/ath10k/wmi-tlv.c index 582fb11f648a..02709fc99034 100644 --- a/drivers/net/wireless/ath/ath10k/wmi-tlv.c +++ b/drivers/net/wireless/ath/ath10k/wmi-tlv.c @@ -2840,8 +2840,10 @@ ath10k_wmi_tlv_op_gen_mgmt_tx_send(struct ath10k *ar, struct sk_buff *msdu, if ((ieee80211_is_action(hdr->frame_control) || ieee80211_is_deauth(hdr->frame_control) || ieee80211_is_disassoc(hdr->frame_control)) && - ieee80211_has_protected(hdr->frame_control)) + ieee80211_has_protected(hdr->frame_control)) { + skb_put(msdu, IEEE80211_CCMP_MIC_LEN); buf_len += IEEE80211_CCMP_MIC_LEN; + }
buf_len = min_t(u32, buf_len, WMI_TLV_MGMT_TX_FRAME_MAX_LEN); buf_len = round_up(buf_len, 4);
[ Upstream commit 2899872b627e99b7586fe3b6c9f861da1b4d5072 ]
As detected by kmemleak running on i.MX6ULL board:
nreferenced object 0xd8366600 (size 64): comm "swapper/0", pid 1, jiffies 4294937370 (age 933.220s) hex dump (first 32 bytes): 64 75 6d 6d 79 2d 69 6f 6d 75 78 63 2d 67 70 72 dummy-iomuxc-gpr 40 32 30 65 34 30 30 30 00 e3 f3 ab fe d1 1b dd @20e4000........ backtrace: [<b0402aec>] kasprintf+0x2c/0x54 [<a6fbad2c>] regmap_debugfs_init+0x7c/0x31c [<9c8d91fa>] __regmap_init+0xb5c/0xcf4 [<5b1c3d2a>] of_syscon_register+0x164/0x2c4 [<596a5d80>] syscon_node_to_regmap+0x64/0x90 [<49bd597b>] imx6ul_init_machine+0x34/0xa0 [<250a4dac>] customize_machine+0x1c/0x30 [<2d19fdaf>] do_one_initcall+0x7c/0x398 [<e6084469>] kernel_init_freeable+0x328/0x448 [<168c9101>] kernel_init+0x8/0x114 [<913268aa>] ret_from_fork+0x14/0x20 [<ce7b131a>] 0x0
Root cause is that map->debugfs_name is allocated using kasprintf and then the pointer is lost by assigning it other memory address.
Reported-by: Stefan Wahren stefan.wahren@i2se.com Signed-off-by: Daniel Baluta daniel.baluta@nxp.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/regmap/regmap-debugfs.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/base/regmap/regmap-debugfs.c b/drivers/base/regmap/regmap-debugfs.c index 263f82516ff4..e5e1b3a01b1a 100644 --- a/drivers/base/regmap/regmap-debugfs.c +++ b/drivers/base/regmap/regmap-debugfs.c @@ -579,6 +579,8 @@ void regmap_debugfs_init(struct regmap *map, const char *name) }
if (!strcmp(name, "dummy")) { + kfree(map->debugfs_name); + map->debugfs_name = kasprintf(GFP_KERNEL, "dummy%d", dummy_index); name = map->debugfs_name;
[ Upstream commit 17f78dd1bd624a4dd78ed5db3284a63ee807fcc3 ]
A handler for BATADV_TVLV_ROAM was being registered when the translation-table was initialized, but not unregistered when the translation-table was freed. Unregister it.
Fixes: 122edaa05940 ("batman-adv: tvlv - convert roaming adv packet to use tvlv unicast packets") Reported-by: syzbot+d454a826e670502484b8@syzkaller.appspotmail.com Signed-off-by: Jeremy Sowden jeremy@azazel.net Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/batman-adv/translation-table.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c index 1ddfd5e011ee..8a482c5ec67b 100644 --- a/net/batman-adv/translation-table.c +++ b/net/batman-adv/translation-table.c @@ -3813,6 +3813,8 @@ static void batadv_tt_purge(struct work_struct *work) */ void batadv_tt_free(struct batadv_priv *bat_priv) { + batadv_tvlv_handler_unregister(bat_priv, BATADV_TVLV_ROAM, 1); + batadv_tvlv_container_unregister(bat_priv, BATADV_TVLV_TT, 1); batadv_tvlv_handler_unregister(bat_priv, BATADV_TVLV_TT, 1);
[ Upstream commit 6cf97230cd5f36b7665099083272595c55d72be7 ]
dvb_usb_device_exit() frees and uses the device name in that order. Fix by storing the name in a buffer before freeing it.
Signed-off-by: Oliver Neukum oneukum@suse.com Reported-by: syzbot+26ec41e9f788b3eba396@syzkaller.appspotmail.com Signed-off-by: Sean Young sean@mess.org Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/usb/dvb-usb/dvb-usb-init.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/media/usb/dvb-usb/dvb-usb-init.c b/drivers/media/usb/dvb-usb/dvb-usb-init.c index e97f6edc98de..65f2b1a20ca1 100644 --- a/drivers/media/usb/dvb-usb/dvb-usb-init.c +++ b/drivers/media/usb/dvb-usb/dvb-usb-init.c @@ -284,12 +284,15 @@ EXPORT_SYMBOL(dvb_usb_device_init); void dvb_usb_device_exit(struct usb_interface *intf) { struct dvb_usb_device *d = usb_get_intfdata(intf); - const char *name = "generic DVB-USB module"; + const char *default_name = "generic DVB-USB module"; + char name[40];
usb_set_intfdata(intf, NULL); if (d != NULL && d->desc != NULL) { - name = d->desc->name; + strscpy(name, d->desc->name, sizeof(name)); dvb_usb_exit(d); + } else { + strscpy(name, default_name, sizeof(name)); } info("%s successfully deinitialized and disconnected.", name);
[ Upstream commit 24e4cf770371df6ad49ed873f21618d9878f64c8 ]
MODULE_DEVICE_TABLE(of, <of_match_table> should be called to complete DT OF mathing mechanism and register it.
Before this patch: modinfo drivers/media/rc/ir-spi.ko | grep alias
After this patch: modinfo drivers/media/rc/ir-spi.ko | grep alias alias: of:N*T*Cir-spi-ledC* alias: of:N*T*Cir-spi-led
Reported-by: Javier Martinez Canillas javier@dowhile0.org Signed-off-by: Daniel Gomez dagmcr@gmail.com Signed-off-by: Sean Young sean@mess.org Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/rc/ir-spi.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/media/rc/ir-spi.c b/drivers/media/rc/ir-spi.c index 66334e8d63ba..c58f2d38a458 100644 --- a/drivers/media/rc/ir-spi.c +++ b/drivers/media/rc/ir-spi.c @@ -161,6 +161,7 @@ static const struct of_device_id ir_spi_of_match[] = { { .compatible = "ir-spi-led" }, {}, }; +MODULE_DEVICE_TABLE(of, ir_spi_of_match);
static struct spi_driver ir_spi_driver = { .probe = ir_spi_probe,
[ Upstream commit 3e03e792865ae48b8cfc69a0b4d65f02f467389f ]
Selftests report the following:
[ 2.984845] alg: skcipher: cbc-aes-talitos encryption test failed (wrong output IV) on test vector 0, cfg="in-place" [ 2.995377] 00000000: 3d af ba 42 9d 9e b4 30 b4 22 da 80 2c 9f ac 41 [ 3.032673] alg: skcipher: cbc-des-talitos encryption test failed (wrong output IV) on test vector 0, cfg="in-place" [ 3.043185] 00000000: fe dc ba 98 76 54 32 10 [ 3.063238] alg: skcipher: cbc-3des-talitos encryption test failed (wrong output IV) on test vector 0, cfg="in-place" [ 3.073818] 00000000: 7d 33 88 93 0f 93 b2 42
This above dumps show that the actual output IV is indeed the input IV. This is due to the IV not being copied back into the request.
This patch fixes that.
Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr Reviewed-by: Horia Geantă horia.geanta@nxp.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/talitos.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c index 8c57c5af0930..396199b2db7d 100644 --- a/drivers/crypto/talitos.c +++ b/drivers/crypto/talitos.c @@ -1606,11 +1606,15 @@ static void ablkcipher_done(struct device *dev, int err) { struct ablkcipher_request *areq = context; + struct crypto_ablkcipher *cipher = crypto_ablkcipher_reqtfm(areq); + struct talitos_ctx *ctx = crypto_ablkcipher_ctx(cipher); + unsigned int ivsize = crypto_ablkcipher_ivsize(cipher); struct talitos_edesc *edesc;
edesc = container_of(desc, struct talitos_edesc, desc);
common_nonsnoop_unmap(dev, edesc, areq); + memcpy(areq->info, ctx->iv, ivsize);
kfree(edesc);
[ Upstream commit 6e4ab830ac6d6a0d7cd7f87dc5d6536369bf24a8 ]
If the requested framesize by VIDIOC_SUBDEV_S_FMT is larger than supported framesizes, it causes an out of bounds array access and the resulting framesize is unexpected.
Avoid out of bounds array access and select the default framesize.
Cc: Wenyou Yang wenyou.yang@microchip.com Cc: Eugen Hristev eugen.hristev@microchip.com Signed-off-by: Akinobu Mita akinobu.mita@gmail.com Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/i2c/ov7740.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/media/i2c/ov7740.c b/drivers/media/i2c/ov7740.c index 54e80a60aa57..63011d4b4738 100644 --- a/drivers/media/i2c/ov7740.c +++ b/drivers/media/i2c/ov7740.c @@ -785,7 +785,11 @@ static int ov7740_try_fmt_internal(struct v4l2_subdev *sd,
fsize++; } - + if (i >= ARRAY_SIZE(ov7740_framesizes)) { + fsize = &ov7740_framesizes[0]; + fmt->width = fsize->width; + fmt->height = fsize->height; + } if (ret_frmsize != NULL) *ret_frmsize = fsize;
[ Upstream commit 0c7aa32966dab0b8a7424e1b34c7f206817953ec ]
The commit d790b7eda953 ("[media] vb2-dma-sg: move dma_(un)map_sg here") left dma_desc_nent unset. It previously contained the number of DMA descriptors as returned from dma_map_sg().
We can now (since the commit referred to above) obtain the same value from the sg_table and drop dma_desc_nent altogether.
Tested on OLPC XO-1.75 machine. Doesn't affect the OLPC XO-1's Cafe driver, since that one doesn't do DMA.
[mchehab+samsung@kernel.org: fix a checkpatch warning]
Fixes: d790b7eda953 ("[media] vb2-dma-sg: move dma_(un)map_sg here") Signed-off-by: Lubomir Rintel lkundrak@v3.sk Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/marvell-ccic/mcam-core.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/media/platform/marvell-ccic/mcam-core.c b/drivers/media/platform/marvell-ccic/mcam-core.c index f1b301810260..0a6411b877e9 100644 --- a/drivers/media/platform/marvell-ccic/mcam-core.c +++ b/drivers/media/platform/marvell-ccic/mcam-core.c @@ -200,7 +200,6 @@ struct mcam_vb_buffer { struct list_head queue; struct mcam_dma_desc *dma_desc; /* Descriptor virtual address */ dma_addr_t dma_desc_pa; /* Descriptor physical address */ - int dma_desc_nent; /* Number of mapped descriptors */ };
static inline struct mcam_vb_buffer *vb_to_mvb(struct vb2_v4l2_buffer *vb) @@ -608,9 +607,11 @@ static void mcam_dma_contig_done(struct mcam_camera *cam, int frame) static void mcam_sg_next_buffer(struct mcam_camera *cam) { struct mcam_vb_buffer *buf; + struct sg_table *sg_table;
buf = list_first_entry(&cam->buffers, struct mcam_vb_buffer, queue); list_del_init(&buf->queue); + sg_table = vb2_dma_sg_plane_desc(&buf->vb_buf.vb2_buf, 0); /* * Very Bad Not Good Things happen if you don't clear * C1_DESC_ENA before making any descriptor changes. @@ -618,7 +619,7 @@ static void mcam_sg_next_buffer(struct mcam_camera *cam) mcam_reg_clear_bit(cam, REG_CTRL1, C1_DESC_ENA); mcam_reg_write(cam, REG_DMA_DESC_Y, buf->dma_desc_pa); mcam_reg_write(cam, REG_DESC_LEN_Y, - buf->dma_desc_nent*sizeof(struct mcam_dma_desc)); + sg_table->nents * sizeof(struct mcam_dma_desc)); mcam_reg_write(cam, REG_DESC_LEN_U, 0); mcam_reg_write(cam, REG_DESC_LEN_V, 0); mcam_reg_set_bit(cam, REG_CTRL1, C1_DESC_ENA);
[ Upstream commit 7c0c6095d48dcd0e67c917aa73cdbb2715aafc36 ]
Adjust scale tests to check for new jmp sequence limit.
BPF_JGT had to be changed to BPF_JEQ because the verifier was too smart. It tracked the known safe range of R0 values and pruned the search earlier before hitting exact 8192 limit. bpf_semi_rand_get() was too (un)?lucky.
k = 0; was missing in bpf_fill_scale2. It was testing a bit shorter sequence of jumps than intended.
Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Andrii Nakryiko andriin@fb.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/bpf/test_verifier.c | 31 +++++++++++---------- 1 file changed, 17 insertions(+), 14 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c index 288cb740e005..6438d4dc8ae1 100644 --- a/tools/testing/selftests/bpf/test_verifier.c +++ b/tools/testing/selftests/bpf/test_verifier.c @@ -207,33 +207,35 @@ static void bpf_fill_rand_ld_dw(struct bpf_test *self) self->retval = (uint32_t)res; }
-/* test the sequence of 1k jumps */ +#define MAX_JMP_SEQ 8192 + +/* test the sequence of 8k jumps */ static void bpf_fill_scale1(struct bpf_test *self) { struct bpf_insn *insn = self->fill_insns; int i = 0, k = 0;
insn[i++] = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1); - /* test to check that the sequence of 1024 jumps is acceptable */ - while (k++ < 1024) { + /* test to check that the long sequence of jumps is acceptable */ + while (k++ < MAX_JMP_SEQ) { insn[i++] = BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_prandom_u32); - insn[i++] = BPF_JMP_IMM(BPF_JGT, BPF_REG_0, bpf_semi_rand_get(), 2); + insn[i++] = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, bpf_semi_rand_get(), 2); insn[i++] = BPF_MOV64_REG(BPF_REG_1, BPF_REG_10); insn[i++] = BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_6, -8 * (k % 64 + 1)); } - /* every jump adds 1024 steps to insn_processed, so to stay exactly - * within 1m limit add MAX_TEST_INSNS - 1025 MOVs and 1 EXIT + /* every jump adds 1 step to insn_processed, so to stay exactly + * within 1m limit add MAX_TEST_INSNS - MAX_JMP_SEQ - 1 MOVs and 1 EXIT */ - while (i < MAX_TEST_INSNS - 1025) + while (i < MAX_TEST_INSNS - MAX_JMP_SEQ - 1) insn[i++] = BPF_ALU32_IMM(BPF_MOV, BPF_REG_0, 42); insn[i] = BPF_EXIT_INSN(); self->prog_len = i + 1; self->retval = 42; }
-/* test the sequence of 1k jumps in inner most function (function depth 8)*/ +/* test the sequence of 8k jumps in inner most function (function depth 8)*/ static void bpf_fill_scale2(struct bpf_test *self) { struct bpf_insn *insn = self->fill_insns; @@ -245,19 +247,20 @@ static void bpf_fill_scale2(struct bpf_test *self) insn[i++] = BPF_EXIT_INSN(); } insn[i++] = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1); - /* test to check that the sequence of 1024 jumps is acceptable */ - while (k++ < 1024) { + /* test to check that the long sequence of jumps is acceptable */ + k = 0; + while (k++ < MAX_JMP_SEQ) { insn[i++] = BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_prandom_u32); - insn[i++] = BPF_JMP_IMM(BPF_JGT, BPF_REG_0, bpf_semi_rand_get(), 2); + insn[i++] = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, bpf_semi_rand_get(), 2); insn[i++] = BPF_MOV64_REG(BPF_REG_1, BPF_REG_10); insn[i++] = BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_6, -8 * (k % (64 - 4 * FUNC_NEST) + 1)); } - /* every jump adds 1024 steps to insn_processed, so to stay exactly - * within 1m limit add MAX_TEST_INSNS - 1025 MOVs and 1 EXIT + /* every jump adds 1 step to insn_processed, so to stay exactly + * within 1m limit add MAX_TEST_INSNS - MAX_JMP_SEQ - 1 MOVs and 1 EXIT */ - while (i < MAX_TEST_INSNS - 1025) + while (i < MAX_TEST_INSNS - MAX_JMP_SEQ - 1) insn[i++] = BPF_ALU32_IMM(BPF_MOV, BPF_REG_0, 42); insn[i] = BPF_EXIT_INSN(); self->prog_len = i + 1;
[ Upstream commit e08f0761234def47961d3252eac09ccedfe4c6a0 ]
In case ioremap fails, the fix returns -ENOMEM to avoid NULL pointer dereference.
Signed-off-by: Kangjie Lu kjlu@umn.edu Acked-by: Lad, Prabhakar prabhakar.csengg@gmail.com Reviewed-by: Mukesh Ojha mojha@codeaurora.org Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/davinci/vpss.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/media/platform/davinci/vpss.c b/drivers/media/platform/davinci/vpss.c index 3f079ac1b080..be91b0c7d20b 100644 --- a/drivers/media/platform/davinci/vpss.c +++ b/drivers/media/platform/davinci/vpss.c @@ -509,6 +509,11 @@ static int __init vpss_init(void) return -EBUSY;
oper_cfg.vpss_regs_base2 = ioremap(VPSS_CLK_CTRL, 4); + if (unlikely(!oper_cfg.vpss_regs_base2)) { + release_mem_region(VPSS_CLK_CTRL, 4); + return -ENOMEM; + } + writel(VPSS_CLK_CTRL_VENCCLKEN | VPSS_CLK_CTRL_DACCLKEN, oper_cfg.vpss_regs_base2);
[ Upstream commit f49308878d7202e07d8761238e01bd0e5fce2750 ]
In v4l2-compliance utility, test MEDIA_IOC_ENUM_ENTITIES will check whether reserved field of media_links_enum filled with zero.
However, for 32 bit program, the reserved field is missing copy from kernel space to user space in media_device_enum_links32 function.
This patch adds the cleaning a reserved field logic in media_device_enum_links32 function.
Signed-off-by: Jungo Lin jungo.lin@mediatek.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/media-device.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/media/media-device.c b/drivers/media/media-device.c index 9ae481ddd975..b9bb4904bba1 100644 --- a/drivers/media/media-device.c +++ b/drivers/media/media-device.c @@ -494,6 +494,7 @@ static long media_device_enum_links32(struct media_device *mdev, { struct media_links_enum links; compat_uptr_t pads_ptr, links_ptr; + int ret;
memset(&links, 0, sizeof(links));
@@ -505,7 +506,13 @@ static long media_device_enum_links32(struct media_device *mdev, links.pads = compat_ptr(pads_ptr); links.links = compat_ptr(links_ptr);
- return media_device_enum_links(mdev, &links); + ret = media_device_enum_links(mdev, &links); + if (ret) + return ret; + + memset(ulinks->reserved, 0, sizeof(ulinks->reserved)); + + return 0; }
#define MEDIA_IOC_ENUM_LINKS32 _IOWR('|', 0x02, struct media_links_enum32)
[ Upstream commit 72f9c2039859e6303550f202d6cc6b8d8af0178c ]
Currently if ice_reset_all_vfs() fails in ice_alloc_vfs() we fail to free some resources, reset variables, and return an error value. Fix this by adding another unroll case to free the pf->vf array, set the pf->num_alloc_vfs to 0, and return an error code.
Without this, if ice_reset_all_vfs() fails in ice_alloc_vfs() we will not be able to do SRIOV without hard rebooting the system because rmmod'ing the driver does not work.
Signed-off-by: Brett Creeley brett.creeley@intel.com Signed-off-by: Anirudh Venkataramanan anirudh.venkataramanan@intel.com Tested-by: Andrew Bowers andrewx.bowers@intel.com Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c index a805cbdd69be..81ea77978355 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c @@ -1134,7 +1134,7 @@ static int ice_alloc_vfs(struct ice_pf *pf, u16 num_alloc_vfs) GFP_KERNEL); if (!vfs) { ret = -ENOMEM; - goto err_unroll_sriov; + goto err_pci_disable_sriov; } pf->vf = vfs;
@@ -1154,12 +1154,19 @@ static int ice_alloc_vfs(struct ice_pf *pf, u16 num_alloc_vfs) pf->num_alloc_vfs = num_alloc_vfs;
/* VF resources get allocated during reset */ - if (!ice_reset_all_vfs(pf, true)) + if (!ice_reset_all_vfs(pf, true)) { + ret = -EIO; goto err_unroll_sriov; + }
goto err_unroll_intr;
err_unroll_sriov: + pf->vf = NULL; + devm_kfree(&pf->pdev->dev, vfs); + vfs = NULL; + pf->num_alloc_vfs = 0; +err_pci_disable_sriov: pci_disable_sriov(pf->pdev); err_unroll_intr: /* rearm interrupts here */
[ Upstream commit 2c41cc0be07b5ee2f1167f41cd8a86fc5b53d82c ]
The call to of_parse_phandle returns a node pointer with refcount incremented thus it must be explicitly decremented after the last usage.
Detected by coccinelle with the following warnings: drivers/media/platform/qcom/venus/firmware.c:90:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 82, but without a corresponding object release within this function. drivers/media/platform/qcom/venus/firmware.c:94:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 82, but without a corresponding object release within this function. drivers/media/platform/qcom/venus/firmware.c:128:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 82, but without a corresponding object release within this function.
Signed-off-by: Wen Yang wen.yang99@zte.com.cn Acked-by: Stanimir Varbanov stanimir.varbanov@linaro.org Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/qcom/venus/firmware.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/media/platform/qcom/venus/firmware.c b/drivers/media/platform/qcom/venus/firmware.c index 1eba23409ff3..d3d1748a7ef6 100644 --- a/drivers/media/platform/qcom/venus/firmware.c +++ b/drivers/media/platform/qcom/venus/firmware.c @@ -78,11 +78,11 @@ static int venus_load_fw(struct venus_core *core, const char *fwname,
ret = of_address_to_resource(node, 0, &r); if (ret) - return ret; + goto err_put_node;
ret = request_firmware(&mdt, fwname, dev); if (ret < 0) - return ret; + goto err_put_node;
fw_size = qcom_mdt_get_size(mdt); if (fw_size < 0) { @@ -116,6 +116,8 @@ static int venus_load_fw(struct venus_core *core, const char *fwname, memunmap(mem_va); err_release_fw: release_firmware(mdt); +err_put_node: + of_node_put(node); return ret; }
[ Upstream commit dcd9c76e5a183af4f793beb5141efcd260b8d09f ]
When enabling IOMMU support, the following issue becomes visible in the AEAD zero-length case.
Even though the output sequence length is set to zero, the crypto engine tries to prefetch 4 S/G table entries (since SGF bit is set in SEQ OUT PTR command - which is either generated in SW in case of caam/jr or in HW in case of caam/qi, caam/qi2). The DMA read operation will trigger an IOMMU fault since the address in the SEQ OUT PTR is "dummy" (set to zero / not obtained via DMA API mapping).
1. In case of caam/jr, avoid the IOMMU fault by clearing the SGF bit in SEQ OUT PTR command.
2. In case of caam/qi - setting address, bpid, length to zero for output entry in the compound frame has a special meaning (cf. CAAM RM): "Output frame = Unspecified, Input address = Y. A unspecified frame is indicated by an unused SGT entry (an entry in which the Address, Length, and BPID fields are all zero). SEC obtains output buffers from BMan as prescribed by the preheader."
Since no output buffers are needed, modify the preheader by setting (ABS = 1, ADDBUF = 0): -"ABS = 1 means obtain the number of buffers in ADDBUF (0 or 1) from the pool POOL ID" -ADDBUF: "If ABS is set, ADD BUF specifies whether to allocate a buffer or not"
3. In case of caam/qi2, since engine: -does not support FLE[FMT]=2'b11 ("unused" entry) mentioned in DPAA2 RM -requires output entry to be present, even if not used the solution chosen is to leave output frame list entry zeroized.
Fixes: 763069ba49d3 ("crypto: caam - handle zero-length AEAD output") Signed-off-by: Horia Geantă horia.geanta@nxp.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/caam/caamalg.c | 1 + drivers/crypto/caam/caamalg_qi.c | 2 +- drivers/crypto/caam/caamalg_qi2.c | 9 +++++++++ drivers/crypto/caam/qi.c | 3 +++ 4 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c index c0ece44f303b..df416e6c1468 100644 --- a/drivers/crypto/caam/caamalg.c +++ b/drivers/crypto/caam/caamalg.c @@ -1106,6 +1106,7 @@ static void init_aead_job(struct aead_request *req, if (unlikely(req->src != req->dst)) { if (!edesc->mapped_dst_nents) { dst_dma = 0; + out_options = 0; } else if (edesc->mapped_dst_nents == 1) { dst_dma = sg_dma_address(req->dst); out_options = 0; diff --git a/drivers/crypto/caam/caamalg_qi.c b/drivers/crypto/caam/caamalg_qi.c index d290d6b41825..116cbc81fa8d 100644 --- a/drivers/crypto/caam/caamalg_qi.c +++ b/drivers/crypto/caam/caamalg_qi.c @@ -1109,7 +1109,7 @@ static struct aead_edesc *aead_edesc_alloc(struct aead_request *req, dma_to_qm_sg_one_ext(&fd_sgt[0], qm_sg_dma + (1 + !!ivsize) * sizeof(*sg_table), out_len, 0); - } else if (mapped_dst_nents == 1) { + } else if (mapped_dst_nents <= 1) { dma_to_qm_sg_one(&fd_sgt[0], sg_dma_address(req->dst), out_len, 0); } else { diff --git a/drivers/crypto/caam/caamalg_qi2.c b/drivers/crypto/caam/caamalg_qi2.c index 2b2980a8a9b9..b949944c8e55 100644 --- a/drivers/crypto/caam/caamalg_qi2.c +++ b/drivers/crypto/caam/caamalg_qi2.c @@ -559,6 +559,14 @@ static struct aead_edesc *aead_edesc_alloc(struct aead_request *req, dpaa2_fl_set_addr(out_fle, qm_sg_dma + (1 + !!ivsize) * sizeof(*sg_table)); } + } else if (!mapped_dst_nents) { + /* + * crypto engine requires the output entry to be present when + * "frame list" FD is used. + * Since engine does not support FMT=2'b11 (unused entry type), + * leaving out_fle zeroized is the best option. + */ + goto skip_out_fle; } else if (mapped_dst_nents == 1) { dpaa2_fl_set_format(out_fle, dpaa2_fl_single); dpaa2_fl_set_addr(out_fle, sg_dma_address(req->dst)); @@ -570,6 +578,7 @@ static struct aead_edesc *aead_edesc_alloc(struct aead_request *req,
dpaa2_fl_set_len(out_fle, out_len);
+skip_out_fle: return edesc; }
diff --git a/drivers/crypto/caam/qi.c b/drivers/crypto/caam/qi.c index 9f08f84cca59..2d9b0485141f 100644 --- a/drivers/crypto/caam/qi.c +++ b/drivers/crypto/caam/qi.c @@ -18,6 +18,7 @@ #include "desc_constr.h"
#define PREHDR_RSLS_SHIFT 31 +#define PREHDR_ABS BIT(25)
/* * Use a reasonable backlog of frames (per CPU) as congestion threshold, @@ -346,6 +347,7 @@ int caam_drv_ctx_update(struct caam_drv_ctx *drv_ctx, u32 *sh_desc) */ drv_ctx->prehdr[0] = cpu_to_caam32((1 << PREHDR_RSLS_SHIFT) | num_words); + drv_ctx->prehdr[1] = cpu_to_caam32(PREHDR_ABS); memcpy(drv_ctx->sh_desc, sh_desc, desc_bytes(sh_desc)); dma_sync_single_for_device(qidev, drv_ctx->context_a, sizeof(drv_ctx->sh_desc) + @@ -401,6 +403,7 @@ struct caam_drv_ctx *caam_drv_ctx_init(struct device *qidev, */ drv_ctx->prehdr[0] = cpu_to_caam32((1 << PREHDR_RSLS_SHIFT) | num_words); + drv_ctx->prehdr[1] = cpu_to_caam32(PREHDR_ABS); memcpy(drv_ctx->sh_desc, sh_desc, desc_bytes(sh_desc)); size = sizeof(drv_ctx->prehdr) + sizeof(drv_ctx->sh_desc); hwdesc = dma_map_single(qidev, drv_ctx->prehdr, size,
[ Upstream commit 9463c445590091202659cdfdd44b236acadfbd84 ]
In case we don't use a given address entry we need to clear it because it could contain previous values that are no longer valid.
Found out while running stmmac selftests.
Signed-off-by: Jose Abreu joabreu@synopsys.com Cc: Joao Pinto jpinto@synopsys.com Cc: David S. Miller davem@davemloft.net Cc: Giuseppe Cavallaro peppe.cavallaro@st.com Cc: Alexandre Torgue alexandre.torgue@st.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c index 9fff81170163..54f4ffb36d60 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c @@ -206,6 +206,12 @@ static void dwmac1000_set_filter(struct mac_device_info *hw, GMAC_ADDR_LOW(reg)); reg++; } + + while (reg <= perfect_addr_number) { + writel(0, ioaddr + GMAC_ADDR_HIGH(reg)); + writel(0, ioaddr + GMAC_ADDR_LOW(reg)); + reg++; + } }
#ifdef FRAME_FILTER_DEBUG
[ Upstream commit 0620ec6c62a5a07625b65f699adc5d1b90394ee6 ]
In case we don't use a given address entry we need to clear it because it could contain previous values that are no longer valid.
Found out while running stmmac selftests.
Signed-off-by: Jose Abreu joabreu@synopsys.com Cc: Joao Pinto jpinto@synopsys.com Cc: David S. Miller davem@davemloft.net Cc: Giuseppe Cavallaro peppe.cavallaro@st.com Cc: Alexandre Torgue alexandre.torgue@st.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c index 99d772517242..206170d0bf81 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c @@ -443,14 +443,20 @@ static void dwmac4_set_filter(struct mac_device_info *hw, * are required */ value |= GMAC_PACKET_FILTER_PR; - } else if (!netdev_uc_empty(dev)) { - int reg = 1; + } else { struct netdev_hw_addr *ha; + int reg = 1;
netdev_for_each_uc_addr(ha, dev) { dwmac4_set_umac_addr(hw, ha->addr, reg); reg++; } + + while (reg <= GMAC_MAX_PERFECT_ADDRESSES) { + writel(0, ioaddr + GMAC_ADDR_HIGH(reg)); + writel(0, ioaddr + GMAC_ADDR_LOW(reg)); + reg++; + } }
writel(value, ioaddr + GMAC_PACKET_FILTER);
[ Upstream commit a976ca79e23f13bff79c14e7266cea4a0ea51e67 ]
When we trigger NAPI we are disabling interrupts but in case we receive or send a packet in the meantime, as interrupts are disabled, we will miss this event.
Trigger both NAPI instances (RX and TX) when at least one event happens so that we don't miss any interrupts.
Signed-off-by: Jose Abreu joabreu@synopsys.com Cc: Joao Pinto jpinto@synopsys.com Cc: David S. Miller davem@davemloft.net Cc: Giuseppe Cavallaro peppe.cavallaro@st.com Cc: Alexandre Torgue alexandre.torgue@st.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 06358fe5b245..dbee9b0113e3 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -2048,6 +2048,9 @@ static int stmmac_napi_check(struct stmmac_priv *priv, u32 chan) &priv->xstats, chan); struct stmmac_channel *ch = &priv->channel[chan];
+ if (status) + status |= handle_rx | handle_tx; + if ((status & handle_rx) && (chan < priv->plat->rx_queues_to_use)) { stmmac_disable_dma_irq(priv, priv->ioaddr, chan); napi_schedule_irqoff(&ch->rx_napi);
[ Upstream commit aa6ccf3f2d7042f94c4e91538956ce7051e7856e ]
Currently the driver is calling ice_napi_del() and then unregister_netdev(). The call to unregister_netdev() will result in a call to ice_stop() and then ice_vsi_close(). This is where we call napi_disable() for all the MSI-X vectors. This flow is reversed so make the changes to ensure napi_disable() happens prior to napi_del().
Before calling napi_del() and free_netdev() make sure unregister_netdev() was called. This is done by making sure the __ICE_DOWN bit is set in the vsi->state for the interested VSI.
Signed-off-by: Brett Creeley brett.creeley@intel.com Signed-off-by: Anirudh Venkataramanan anirudh.venkataramanan@intel.com Tested-by: Andrew Bowers andrewx.bowers@intel.com Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice.h | 1 - drivers/net/ethernet/intel/ice/ice_lib.c | 24 ++++++++++++----------- drivers/net/ethernet/intel/ice/ice_main.c | 2 +- 3 files changed, 14 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h index 792e6e42030e..754c7080c3fc 100644 --- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -451,7 +451,6 @@ int ice_set_rss(struct ice_vsi *vsi, u8 *seed, u8 *lut, u16 lut_size); int ice_get_rss(struct ice_vsi *vsi, u8 *seed, u8 *lut, u16 lut_size); void ice_fill_rss_lut(u8 *lut, u16 rss_table_size, u16 rss_size); void ice_print_link_msg(struct ice_vsi *vsi, bool isup); -void ice_napi_del(struct ice_vsi *vsi); #ifdef CONFIG_DCB int ice_pf_ena_all_vsi(struct ice_pf *pf, bool locked); void ice_pf_dis_all_vsi(struct ice_pf *pf, bool locked); diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c index fbf1eba0cc2a..f14fa51cc704 100644 --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -2754,19 +2754,14 @@ int ice_vsi_release(struct ice_vsi *vsi)
if (vsi->type == ICE_VSI_VF) vf = &pf->vf[vsi->vf_id]; - /* do not unregister and free netdevs while driver is in the reset - * recovery pending state. Since reset/rebuild happens through PF - * service task workqueue, its not a good idea to unregister netdev - * that is associated to the PF that is running the work queue items - * currently. This is done to avoid check_flush_dependency() warning - * on this wq + /* do not unregister while driver is in the reset recovery pending + * state. Since reset/rebuild happens through PF service task workqueue, + * it's not a good idea to unregister netdev that is associated to the + * PF that is running the work queue items currently. This is done to + * avoid check_flush_dependency() warning on this wq */ - if (vsi->netdev && !ice_is_reset_in_progress(pf->state)) { - ice_napi_del(vsi); + if (vsi->netdev && !ice_is_reset_in_progress(pf->state)) unregister_netdev(vsi->netdev); - free_netdev(vsi->netdev); - vsi->netdev = NULL; - }
if (test_bit(ICE_FLAG_RSS_ENA, pf->flags)) ice_rss_clean(vsi); @@ -2799,6 +2794,13 @@ int ice_vsi_release(struct ice_vsi *vsi) ice_rm_vsi_lan_cfg(vsi->port_info, vsi->idx); ice_vsi_delete(vsi); ice_vsi_free_q_vectors(vsi); + + /* make sure unregister_netdev() was called by checking __ICE_DOWN */ + if (vsi->netdev && test_bit(__ICE_DOWN, vsi->state)) { + free_netdev(vsi->netdev); + vsi->netdev = NULL; + } + ice_vsi_clear_rings(vsi);
ice_vsi_put_qs(vsi); diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 7843abf4d44d..dbf3d39ad8b1 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -1667,7 +1667,7 @@ static int ice_req_irq_msix_misc(struct ice_pf *pf) * ice_napi_del - Remove NAPI handler for the VSI * @vsi: VSI for which NAPI handler is to be removed */ -void ice_napi_del(struct ice_vsi *vsi) +static void ice_napi_del(struct ice_vsi *vsi) { int v_idx;
[ Upstream commit c561da68038a738f30eca21456534c2d1872d13d ]
ethtool_rx_flow_rule_create takes into parameter the ethtool flow spec, which doesn't contain the rss context id. We therefore need to extract it ourself before parsing the ethtool rule.
The FLOW_RSS flag is only set in info->fs.flow_type, and not info->flow_type.
Signed-off-by: Maxime Chevallier maxime.chevallier@bootlin.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c index a57d17ab91f0..fb06c0aa620a 100644 --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c @@ -1242,6 +1242,12 @@ int mvpp2_ethtool_cls_rule_ins(struct mvpp2_port *port,
input.fs = &info->fs;
+ /* We need to manually set the rss_ctx, since this info isn't present + * in info->fs + */ + if (info->fs.flow_type & FLOW_RSS) + input.rss_ctx = info->rss_context; + ethtool_rule = ethtool_rx_flow_rule_create(&input); if (IS_ERR(ethtool_rule)) { ret = PTR_ERR(ethtool_rule);
[ Upstream commit ffab9691bcb2fe2594f4c38bfceb4d9685b93b87 ]
Allocate CPU rmap and add entry for each irq. CPU rmap is used in aRFS to get the queue number of the rx completion interrupts.
In additional, remove the calling of irq_set_affinity_notifier() in hns3_nic_init_irq(), because we have registered notifier in irq_cpu_rmap_add() for each vector, otherwise it may cause use-after-free issue.
Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/hisilicon/hns3/hns3_enet.c | 77 ++++++++++++------- 1 file changed, 48 insertions(+), 29 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index f326805543a4..cd59c0cc636a 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -4,6 +4,9 @@ #include <linux/dma-mapping.h> #include <linux/etherdevice.h> #include <linux/interrupt.h> +#ifdef CONFIG_RFS_ACCEL +#include <linux/cpu_rmap.h> +#endif #include <linux/if_vlan.h> #include <linux/ip.h> #include <linux/ipv6.h> @@ -79,23 +82,6 @@ static irqreturn_t hns3_irq_handle(int irq, void *vector) return IRQ_HANDLED; }
-/* This callback function is used to set affinity changes to the irq affinity - * masks when the irq_set_affinity_notifier function is used. - */ -static void hns3_nic_irq_affinity_notify(struct irq_affinity_notify *notify, - const cpumask_t *mask) -{ - struct hns3_enet_tqp_vector *tqp_vectors = - container_of(notify, struct hns3_enet_tqp_vector, - affinity_notify); - - tqp_vectors->affinity_mask = *mask; -} - -static void hns3_nic_irq_affinity_release(struct kref *ref) -{ -} - static void hns3_nic_uninit_irq(struct hns3_nic_priv *priv) { struct hns3_enet_tqp_vector *tqp_vectors; @@ -107,8 +93,7 @@ static void hns3_nic_uninit_irq(struct hns3_nic_priv *priv) if (tqp_vectors->irq_init_flag != HNS3_VECTOR_INITED) continue;
- /* clear the affinity notifier and affinity mask */ - irq_set_affinity_notifier(tqp_vectors->vector_irq, NULL); + /* clear the affinity mask */ irq_set_affinity_hint(tqp_vectors->vector_irq, NULL);
/* release the irq resource */ @@ -161,12 +146,6 @@ static int hns3_nic_init_irq(struct hns3_nic_priv *priv) return ret; }
- tqp_vectors->affinity_notify.notify = - hns3_nic_irq_affinity_notify; - tqp_vectors->affinity_notify.release = - hns3_nic_irq_affinity_release; - irq_set_affinity_notifier(tqp_vectors->vector_irq, - &tqp_vectors->affinity_notify); irq_set_affinity_hint(tqp_vectors->vector_irq, &tqp_vectors->affinity_mask);
@@ -340,6 +319,40 @@ static void hns3_tqp_disable(struct hnae3_queue *tqp) hns3_write_dev(tqp, HNS3_RING_EN_REG, rcb_reg); }
+static void hns3_free_rx_cpu_rmap(struct net_device *netdev) +{ +#ifdef CONFIG_RFS_ACCEL + free_irq_cpu_rmap(netdev->rx_cpu_rmap); + netdev->rx_cpu_rmap = NULL; +#endif +} + +static int hns3_set_rx_cpu_rmap(struct net_device *netdev) +{ +#ifdef CONFIG_RFS_ACCEL + struct hns3_nic_priv *priv = netdev_priv(netdev); + struct hns3_enet_tqp_vector *tqp_vector; + int i, ret; + + if (!netdev->rx_cpu_rmap) { + netdev->rx_cpu_rmap = alloc_irq_cpu_rmap(priv->vector_num); + if (!netdev->rx_cpu_rmap) + return -ENOMEM; + } + + for (i = 0; i < priv->vector_num; i++) { + tqp_vector = &priv->tqp_vector[i]; + ret = irq_cpu_rmap_add(netdev->rx_cpu_rmap, + tqp_vector->vector_irq); + if (ret) { + hns3_free_rx_cpu_rmap(netdev); + return ret; + } + } +#endif + return 0; +} + static int hns3_nic_net_up(struct net_device *netdev) { struct hns3_nic_priv *priv = netdev_priv(netdev); @@ -351,11 +364,16 @@ static int hns3_nic_net_up(struct net_device *netdev) if (ret) return ret;
+ /* the device can work without cpu rmap, only aRFS needs it */ + ret = hns3_set_rx_cpu_rmap(netdev); + if (ret) + netdev_warn(netdev, "set rx cpu rmap fail, ret=%d!\n", ret); + /* get irq resource for all vectors */ ret = hns3_nic_init_irq(priv); if (ret) { netdev_err(netdev, "hns init irq failed! ret=%d\n", ret); - return ret; + goto free_rmap; }
clear_bit(HNS3_NIC_STATE_DOWN, &priv->state); @@ -384,7 +402,8 @@ static int hns3_nic_net_up(struct net_device *netdev) hns3_vector_disable(&priv->tqp_vector[j]);
hns3_nic_uninit_irq(priv); - +free_rmap: + hns3_free_rx_cpu_rmap(netdev); return ret; }
@@ -467,6 +486,8 @@ static void hns3_nic_net_down(struct net_device *netdev) if (ops->stop) ops->stop(priv->ae_handle);
+ hns3_free_rx_cpu_rmap(netdev); + /* free irq resources */ hns3_nic_uninit_irq(priv);
@@ -3331,8 +3352,6 @@ static void hns3_nic_uninit_vector_data(struct hns3_nic_priv *priv) hns3_free_vector_ring_chain(tqp_vector, &vector_ring_chain);
if (tqp_vector->irq_init_flag == HNS3_VECTOR_INITED) { - irq_set_affinity_notifier(tqp_vector->vector_irq, - NULL); irq_set_affinity_hint(tqp_vector->vector_irq, NULL); free_irq(tqp_vector->vector_irq, tqp_vector); tqp_vector->irq_init_flag = HNS3_VECTOR_NOT_INITED;
[ Upstream commit f438bfe9d4fe2e491505abfbf04d7c506e00d146 ]
The FEC capbility may be changed with port speed changes. Driver needs to read the active FEC mode, and update FEC capability when port speed changes.
Fixes: 7e6ec9148a1d ("net: hns3: add support for FEC encoding control") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index d3b1f8cb1155..4d9bcad26f06 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -2508,6 +2508,9 @@ static void hclge_update_link_status(struct hclge_dev *hdev)
static void hclge_update_port_capability(struct hclge_mac *mac) { + /* update fec ability by speed */ + hclge_convert_setting_fec(mac); + /* firmware can not identify back plane type, the media type * read from configuration can help deal it */ @@ -2580,6 +2583,10 @@ static int hclge_get_sfp_info(struct hclge_dev *hdev, struct hclge_mac *mac) mac->speed_ability = le32_to_cpu(resp->speed_ability); mac->autoneg = resp->autoneg; mac->support_autoneg = resp->autoneg_ability; + if (!resp->active_fec) + mac->fec_mode = 0; + else + mac->fec_mode = BIT(resp->active_fec); } else { mac->speed_type = QUERY_SFP_SPEED; }
[ Upstream commit 8366d520019f366fabd6c7a13032bdcd837e18d4 ]
In 100g mode the doorbell bar is united for both engines. Set the correct offset in the hwfn so that the doorbell returned for RoCE is in the affined hwfn.
Signed-off-by: Ariel Elior ariel.elior@marvell.com Signed-off-by: Denis Bolotin denis.bolotin@marvell.com Signed-off-by: Michal Kalderon michal.kalderon@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_dev.c | 29 ++++++++++++++-------- drivers/net/ethernet/qlogic/qed/qed_rdma.c | 2 +- 2 files changed, 19 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c index fccdb06fc5c5..8c40739e0d1b 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_dev.c +++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c @@ -3443,6 +3443,7 @@ static void qed_nvm_info_free(struct qed_hwfn *p_hwfn) static int qed_hw_prepare_single(struct qed_hwfn *p_hwfn, void __iomem *p_regview, void __iomem *p_doorbells, + u64 db_phys_addr, enum qed_pci_personality personality) { struct qed_dev *cdev = p_hwfn->cdev; @@ -3451,6 +3452,7 @@ static int qed_hw_prepare_single(struct qed_hwfn *p_hwfn, /* Split PCI bars evenly between hwfns */ p_hwfn->regview = p_regview; p_hwfn->doorbells = p_doorbells; + p_hwfn->db_phys_addr = db_phys_addr;
if (IS_VF(p_hwfn->cdev)) return qed_vf_hw_prepare(p_hwfn); @@ -3546,7 +3548,9 @@ int qed_hw_prepare(struct qed_dev *cdev, /* Initialize the first hwfn - will learn number of hwfns */ rc = qed_hw_prepare_single(p_hwfn, cdev->regview, - cdev->doorbells, personality); + cdev->doorbells, + cdev->db_phys_addr, + personality); if (rc) return rc;
@@ -3555,22 +3559,25 @@ int qed_hw_prepare(struct qed_dev *cdev, /* Initialize the rest of the hwfns */ if (cdev->num_hwfns > 1) { void __iomem *p_regview, *p_doorbell; - u8 __iomem *addr; + u64 db_phys_addr; + u32 offset;
/* adjust bar offset for second engine */ - addr = cdev->regview + - qed_hw_bar_size(p_hwfn, p_hwfn->p_main_ptt, - BAR_ID_0) / 2; - p_regview = addr; + offset = qed_hw_bar_size(p_hwfn, p_hwfn->p_main_ptt, + BAR_ID_0) / 2; + p_regview = cdev->regview + offset;
- addr = cdev->doorbells + - qed_hw_bar_size(p_hwfn, p_hwfn->p_main_ptt, - BAR_ID_1) / 2; - p_doorbell = addr; + offset = qed_hw_bar_size(p_hwfn, p_hwfn->p_main_ptt, + BAR_ID_1) / 2; + + p_doorbell = cdev->doorbells + offset; + + db_phys_addr = cdev->db_phys_addr + offset;
/* prepare second hw function */ rc = qed_hw_prepare_single(&cdev->hwfns[1], p_regview, - p_doorbell, personality); + p_doorbell, db_phys_addr, + personality);
/* in case of error, need to free the previously * initiliazed hwfn 0. diff --git a/drivers/net/ethernet/qlogic/qed/qed_rdma.c b/drivers/net/ethernet/qlogic/qed/qed_rdma.c index 7873d6dfd91f..13802b825d65 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_rdma.c +++ b/drivers/net/ethernet/qlogic/qed/qed_rdma.c @@ -803,7 +803,7 @@ static int qed_rdma_add_user(void *rdma_cxt, dpi_start_offset + ((out_params->dpi) * p_hwfn->dpi_size));
- out_params->dpi_phys_addr = p_hwfn->cdev->db_phys_addr + + out_params->dpi_phys_addr = p_hwfn->db_phys_addr + dpi_start_offset + ((out_params->dpi) * p_hwfn->dpi_size);
[ Upstream commit f9070dc94542093fd516ae4ccea17ef46a4362c5 ]
The locking in force_sig_info is not prepared to deal with a task that exits or execs (as sighand may change). The is not a locking problem in force_sig as force_sig is only built to handle synchronous exceptions.
Further the function force_sig_info changes the signal state if the signal is ignored, or blocked or if SIGNAL_UNKILLABLE will prevent the delivery of the signal. The signal SIGKILL can not be ignored and can not be blocked and SIGNAL_UNKILLABLE won't prevent it from being delivered.
So using force_sig rather than send_sig for SIGKILL is confusing and pointless.
Because it won't impact the sending of the signal and and because using force_sig is wrong, replace force_sig with send_sig.
Cc: Daniel Lezcano daniel.lezcano@free.fr Cc: Serge Hallyn serge@hallyn.com Cc: Oleg Nesterov oleg@redhat.com Fixes: cf3f89214ef6 ("pidns: add reboot_pid_ns() to handle the reboot syscall") Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/pid_namespace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c index f54bc7cb6c2d..6d726cef241c 100644 --- a/kernel/pid_namespace.c +++ b/kernel/pid_namespace.c @@ -326,7 +326,7 @@ int reboot_pid_ns(struct pid_namespace *pid_ns, int cmd) }
read_lock(&tasklist_lock); - force_sig(SIGKILL, pid_ns->child_reaper); + send_sig(SIGKILL, pid_ns->child_reaper, 1); read_unlock(&tasklist_lock);
do_exit(0);
[ Upstream commit 72abe3bcf0911d69b46c1e8bdb5612675e0ac42c ]
The locking in force_sig_info is not prepared to deal with a task that exits or execs (as sighand may change). The is not a locking problem in force_sig as force_sig is only built to handle synchronous exceptions.
Further the function force_sig_info changes the signal state if the signal is ignored, or blocked or if SIGNAL_UNKILLABLE will prevent the delivery of the signal. The signal SIGKILL can not be ignored and can not be blocked and SIGNAL_UNKILLABLE won't prevent it from being delivered.
So using force_sig rather than send_sig for SIGKILL is confusing and pointless.
Because it won't impact the sending of the signal and and because using force_sig is wrong, replace force_sig with send_sig.
Cc: Namjae Jeon namjae.jeon@samsung.com Cc: Jeff Layton jlayton@primarydata.com Cc: Steve French smfrench@gmail.com Fixes: a5c3e1c725af ("Revert "cifs: No need to send SIGKILL to demux_thread during umount"") Fixes: e7ddee9037e7 ("cifs: disable sharing session and tcon and add new TCP sharing code") Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/connect.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index 8dd6637a3cbb..714a359c7c8d 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -2631,7 +2631,7 @@ cifs_put_tcp_session(struct TCP_Server_Info *server, int from_reconnect)
task = xchg(&server->tsk, NULL); if (task) - force_sig(SIGKILL, task); + send_sig(SIGKILL, task, 1); }
static struct TCP_Server_Info *
Note that this patch causes a regression (removing cifs module fails, due to unmount leaking a thread with this change).
We are testing a workaround to cifs.ko which would be needed if this patch were to be backported.
On Wed, Jul 24, 2019 at 2:26 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
[ Upstream commit 72abe3bcf0911d69b46c1e8bdb5612675e0ac42c ]
The locking in force_sig_info is not prepared to deal with a task that exits or execs (as sighand may change). The is not a locking problem in force_sig as force_sig is only built to handle synchronous exceptions.
Further the function force_sig_info changes the signal state if the signal is ignored, or blocked or if SIGNAL_UNKILLABLE will prevent the delivery of the signal. The signal SIGKILL can not be ignored and can not be blocked and SIGNAL_UNKILLABLE won't prevent it from being delivered.
So using force_sig rather than send_sig for SIGKILL is confusing and pointless.
Because it won't impact the sending of the signal and and because using force_sig is wrong, replace force_sig with send_sig.
Cc: Namjae Jeon namjae.jeon@samsung.com Cc: Jeff Layton jlayton@primarydata.com Cc: Steve French smfrench@gmail.com Fixes: a5c3e1c725af ("Revert "cifs: No need to send SIGKILL to demux_thread during umount"") Fixes: e7ddee9037e7 ("cifs: disable sharing session and tcon and add new TCP sharing code") Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Sasha Levin sashal@kernel.org
fs/cifs/connect.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index 8dd6637a3cbb..714a359c7c8d 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -2631,7 +2631,7 @@ cifs_put_tcp_session(struct TCP_Server_Info *server, int from_reconnect)
task = xchg(&server->tsk, NULL); if (task)
force_sig(SIGKILL, task);
send_sig(SIGKILL, task, 1);
}
static struct TCP_Server_Info *
2.20.1
On Wed, Jul 24, 2019 at 03:49:32PM -0500, Steve French wrote:
Note that this patch causes a regression (removing cifs module fails, due to unmount leaking a thread with this change).
We are testing a workaround to cifs.ko which would be needed if this patch were to be backported.
I've now dropped this from all of the stable queues. If you all figure this out, please let us know and we will be glad to queue this up, along with the fix.
thanks,
greg k-h
[ Upstream commit 7c80eb1c7e2b8420477fbc998971d62a648035d9 ]
In both functions, if pfkey_xfrm_policy2msg failed we leaked the newly allocated sk_buff. Free it on error.
Fixes: 55569ce256ce ("Fix conversion between IPSEC_MODE_xxx and XFRM_MODE_xxx.") Reported-by: syzbot+4f0529365f7f2208d9f0@syzkaller.appspotmail.com Signed-off-by: Jeremy Sowden jeremy@azazel.net Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/key/af_key.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/key/af_key.c b/net/key/af_key.c index a50dd6f34b91..fe5fc4bab7ee 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -2438,8 +2438,10 @@ static int key_pol_get_resp(struct sock *sk, struct xfrm_policy *xp, const struc goto out; } err = pfkey_xfrm_policy2msg(out_skb, xp, dir); - if (err < 0) + if (err < 0) { + kfree_skb(out_skb); goto out; + }
out_hdr = (struct sadb_msg *) out_skb->data; out_hdr->sadb_msg_version = hdr->sadb_msg_version; @@ -2690,8 +2692,10 @@ static int dump_sp(struct xfrm_policy *xp, int dir, int count, void *ptr) return PTR_ERR(out_skb);
err = pfkey_xfrm_policy2msg(out_skb, xp, dir); - if (err < 0) + if (err < 0) { + kfree_skb(out_skb); return err; + }
out_hdr = (struct sadb_msg *) out_skb->data; out_hdr->sadb_msg_version = pfk->dump.msg_version;
[ Upstream commit b38ff4075a80b4da5cb2202d7965332ca0efb213 ]
Family of src/dst can be different from family of selector src/dst. Use xfrm selector family to validate address prefix length, while verifying new sa from userspace.
Validated patch with this command: ip xfrm state add src 1.1.6.1 dst 1.1.6.2 proto esp spi 4260196 \ reqid 20004 mode tunnel aead "rfc4106(gcm(aes))" \ 0x1111016400000000000000000000000044440001 128 \ sel src 1011:1:4::2/128 sel dst 1021:1:4::2/128 dev Port5
Fixes: 07bf7908950a ("xfrm: Validate address prefix lengths in the xfrm selector.") Signed-off-by: Anirudh Gupta anirudh.gupta@sophos.com Acked-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/xfrm/xfrm_user.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 173477211e40..76ad7e201626 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -151,6 +151,22 @@ static int verify_newsa_info(struct xfrm_usersa_info *p,
err = -EINVAL; switch (p->family) { + case AF_INET: + break; + + case AF_INET6: +#if IS_ENABLED(CONFIG_IPV6) + break; +#else + err = -EAFNOSUPPORT; + goto out; +#endif + + default: + goto out; + } + + switch (p->sel.family) { case AF_INET: if (p->sel.prefixlen_d > 32 || p->sel.prefixlen_s > 32) goto out;
[ Upstream commit 20059cbbf981ca954be56f7963ae494d18e2dda1 ]
vim2m_device_release() will be called by video_unregister_device() to release various objects.
There are two double-free issue, 1. dev->m2m_dev will be freed twice in error_m2m path/vim2m_device_release 2. the error_v4l2 and error_free path in vim2m_probe() will release same objects, since vim2m_device_release has done.
Fixes: ea6c7e34f3b2 ("media: vim2m: replace devm_kzalloc by kzalloc")
Cc: Laurent Pinchart laurent.pinchart@ideasonboard.com Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/vim2m.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/media/platform/vim2m.c b/drivers/media/platform/vim2m.c index 243c82b5d537..acd3bd48c7e2 100644 --- a/drivers/media/platform/vim2m.c +++ b/drivers/media/platform/vim2m.c @@ -1359,7 +1359,7 @@ static int vim2m_probe(struct platform_device *pdev) MEDIA_ENT_F_PROC_VIDEO_SCALER); if (ret) { v4l2_err(&dev->v4l2_dev, "Failed to init mem2mem media controller\n"); - goto error_m2m; + goto error_dev; }
ret = media_device_register(&dev->mdev); @@ -1373,11 +1373,11 @@ static int vim2m_probe(struct platform_device *pdev) #ifdef CONFIG_MEDIA_CONTROLLER error_m2m_mc: v4l2_m2m_unregister_media_controller(dev->m2m_dev); -error_m2m: - v4l2_m2m_release(dev->m2m_dev); #endif error_dev: video_unregister_device(&dev->vfd); + /* vim2m_device_release called by video_unregister_device to release various objects */ + return ret; error_v4l2: v4l2_device_unregister(&dev->v4l2_dev); error_free:
[ Upstream commit 3e0f724346e96daae7792262c6767449795ac3b5 ]
Fixing use-after-free within __v4l2_ctrl_handler_setup(). Memory is being freed with kfree(new_ref) for duplicate control reference entry but ctrl->cluster pointer is still referring to freed duplicate entry resulting in error on access. Change done to update cluster pointer only when new control reference is added.
================================================================== BUG: KASAN: use-after-free in __v4l2_ctrl_handler_setup+0x388/0x428 Read of size 8 at addr ffffffc324e78618 by task systemd-udevd/312
Allocated by task 312:
Freed by task 312:
The buggy address belongs to the object at ffffffc324e78600 which belongs to the cache kmalloc-64 of size 64 The buggy address is located 24 bytes inside of 64-byte region [ffffffc324e78600, ffffffc324e78640) The buggy address belongs to the page: page:ffffffbf0c939e00 count:1 mapcount:0 mapping: (null) index:0xffffffc324e78f80 flags: 0x4000000000000100(slab) raw: 4000000000000100 0000000000000000 ffffffc324e78f80 000000018020001a raw: 0000000000000000 0000000100000001 ffffffc37040fb80 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffffffc324e78500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffffffc324e78580: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffffffc324e78600: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
^ ffffffc324e78680: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc ffffffc324e78700: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc ==================================================================
Suggested-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sumit Gupta sumitg@nvidia.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/v4l2-core/v4l2-ctrls.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c index 7d3a33258748..3c720f54efa8 100644 --- a/drivers/media/v4l2-core/v4l2-ctrls.c +++ b/drivers/media/v4l2-core/v4l2-ctrls.c @@ -2149,15 +2149,6 @@ static int handler_new_ref(struct v4l2_ctrl_handler *hdl, if (size_extra_req) new_ref->p_req.p = &new_ref[1];
- if (ctrl->handler == hdl) { - /* By default each control starts in a cluster of its own. - new_ref->ctrl is basically a cluster array with one - element, so that's perfect to use as the cluster pointer. - But only do this for the handler that owns the control. */ - ctrl->cluster = &new_ref->ctrl; - ctrl->ncontrols = 1; - } - INIT_LIST_HEAD(&new_ref->node);
mutex_lock(hdl->lock); @@ -2190,6 +2181,15 @@ static int handler_new_ref(struct v4l2_ctrl_handler *hdl, hdl->buckets[bucket] = new_ref; if (ctrl_ref) *ctrl_ref = new_ref; + if (ctrl->handler == hdl) { + /* By default each control starts in a cluster of its own. + * new_ref->ctrl is basically a cluster array with one + * element, so that's perfect to use as the cluster pointer. + * But only do this for the handler that owns the control. + */ + ctrl->cluster = &new_ref->ctrl; + ctrl->ncontrols = 1; + }
unlock: mutex_unlock(hdl->lock);
[ Upstream commit eeacfdc68a104967162dfcba60f53f6f5b62a334 ]
Replace some BUG_ON()s with WARN_ON_ONCE() and returning an error code, and move the check for len divisible by FS_CRYPTO_BLOCK_SIZE into fscrypt_crypt_block() so that it's done for both encryption and decryption, not just encryption.
Reviewed-by: Chandan Rajendra chandan@linux.ibm.com Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/crypto/crypto.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c index 335a362ee446..6f753198eeef 100644 --- a/fs/crypto/crypto.c +++ b/fs/crypto/crypto.c @@ -154,7 +154,10 @@ int fscrypt_do_page_crypto(const struct inode *inode, fscrypt_direction_t rw, struct crypto_skcipher *tfm = ci->ci_ctfm; int res = 0;
- BUG_ON(len == 0); + if (WARN_ON_ONCE(len <= 0)) + return -EINVAL; + if (WARN_ON_ONCE(len % FS_CRYPTO_BLOCK_SIZE != 0)) + return -EINVAL;
fscrypt_generate_iv(&iv, lblk_num, ci);
@@ -238,8 +241,6 @@ struct page *fscrypt_encrypt_page(const struct inode *inode, struct page *ciphertext_page = page; int err;
- BUG_ON(len % FS_CRYPTO_BLOCK_SIZE != 0); - if (inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES) { /* with inplace-encryption we just encrypt the page */ err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, lblk_num, page, @@ -251,7 +252,8 @@ struct page *fscrypt_encrypt_page(const struct inode *inode, return ciphertext_page; }
- BUG_ON(!PageLocked(page)); + if (WARN_ON_ONCE(!PageLocked(page))) + return ERR_PTR(-EINVAL);
ctx = fscrypt_get_ctx(gfp_flags); if (IS_ERR(ctx)) @@ -299,8 +301,9 @@ EXPORT_SYMBOL(fscrypt_encrypt_page); int fscrypt_decrypt_page(const struct inode *inode, struct page *page, unsigned int len, unsigned int offs, u64 lblk_num) { - if (!(inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES)) - BUG_ON(!PageLocked(page)); + if (WARN_ON_ONCE(!PageLocked(page) && + !(inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES))) + return -EINVAL;
return fscrypt_do_page_crypto(inode, FS_DECRYPT, lblk_num, page, page, len, offs, GFP_NOFS);
[ Upstream commit 5d2e73a5f80a5b5aff3caf1ec6d39b5b3f54b26e ]
SyzKaller hit the null pointer deref while reading from uninitialized udev->product in zr364xx_vidioc_querycap().
================================================================== BUG: KASAN: null-ptr-deref in read_word_at_a_time+0xe/0x20 include/linux/compiler.h:274 Read of size 1 at addr 0000000000000000 by task v4l_id/5287
CPU: 1 PID: 5287 Comm: v4l_id Not tainted 5.1.0-rc3-319004-g43151d6 #6 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xe8/0x16e lib/dump_stack.c:113 kasan_report.cold+0x5/0x3c mm/kasan/report.c:321 read_word_at_a_time+0xe/0x20 include/linux/compiler.h:274 strscpy+0x8a/0x280 lib/string.c:207 zr364xx_vidioc_querycap+0xb5/0x210 drivers/media/usb/zr364xx/zr364xx.c:706 v4l_querycap+0x12b/0x340 drivers/media/v4l2-core/v4l2-ioctl.c:1062 __video_do_ioctl+0x5bb/0xb40 drivers/media/v4l2-core/v4l2-ioctl.c:2874 video_usercopy+0x44e/0xf00 drivers/media/v4l2-core/v4l2-ioctl.c:3056 v4l2_ioctl+0x14e/0x1a0 drivers/media/v4l2-core/v4l2-dev.c:364 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0xced/0x12f0 fs/ioctl.c:696 ksys_ioctl+0xa0/0xc0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x74/0xb0 fs/ioctl.c:718 do_syscall_64+0xcf/0x4f0 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f3b56d8b347 Code: 90 90 90 48 8b 05 f1 fa 2a 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 90 90 90 90 90 90 90 90 90 90 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c1 fa 2a 00 31 d2 48 29 c2 64 RSP: 002b:00007ffe005d5d68 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f3b56d8b347 RDX: 00007ffe005d5d70 RSI: 0000000080685600 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000400884 R13: 00007ffe005d5ec0 R14: 0000000000000000 R15: 0000000000000000 ==================================================================
For this device udev->product is not initialized and accessing it causes a NULL pointer deref.
The fix is to check for NULL before strscpy() and copy empty string, if product is NULL
Reported-by: syzbot+66010012fd4c531a1a96@syzkaller.appspotmail.com Signed-off-by: Vandana BN bnvandana@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/usb/zr364xx/zr364xx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/media/usb/zr364xx/zr364xx.c b/drivers/media/usb/zr364xx/zr364xx.c index 37a7992585df..48803eb773ed 100644 --- a/drivers/media/usb/zr364xx/zr364xx.c +++ b/drivers/media/usb/zr364xx/zr364xx.c @@ -694,7 +694,8 @@ static int zr364xx_vidioc_querycap(struct file *file, void *priv, struct zr364xx_camera *cam = video_drvdata(file);
strscpy(cap->driver, DRIVER_DESC, sizeof(cap->driver)); - strscpy(cap->card, cam->udev->product, sizeof(cap->card)); + if (cam->udev->product) + strscpy(cap->card, cam->udev->product, sizeof(cap->card)); strscpy(cap->bus_info, dev_name(&cam->udev->dev), sizeof(cap->bus_info)); cap->device_caps = V4L2_CAP_VIDEO_CAPTURE |
[ Upstream commit da2019633f0b5c105ce658aada333422d8cb28fe ]
Some compilers will complain when using a member of a struct to initialize another member, in the same struct initialization.
For instance:
debian:8 Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0) oraclelinux:7 clang version 3.4.2 (tags/RELEASE_34/dot2-final)
Produce:
ui/browsers/annotate.c:104:12: error: variable 'ops' is uninitialized when used within its own initialization [-Werror,-Wuninitialized] (!ops.current_entry || ^~~ 1 error generated.
So use an extra variable, initialized just before that struct, to have the value used in the expressions used to init two of the struct members.
Cc: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Fixes: c298304bd747 ("perf annotate: Use a ops table for annotation_line__write()") Link: https://lkml.kernel.org/n/tip-f9nexro58q62l3o9hez8hr0i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/ui/browsers/annotate.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c index 98d934a36d86..b0d089a95dac 100644 --- a/tools/perf/ui/browsers/annotate.c +++ b/tools/perf/ui/browsers/annotate.c @@ -97,11 +97,12 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int struct annotate_browser *ab = container_of(browser, struct annotate_browser, b); struct annotation *notes = browser__annotation(browser); struct annotation_line *al = list_entry(entry, struct annotation_line, node); + const bool is_current_entry = ui_browser__is_current_entry(browser, row); struct annotation_write_ops ops = { .first_line = row == 0, - .current_entry = ui_browser__is_current_entry(browser, row), + .current_entry = is_current_entry, .change_color = (!notes->options->hide_src_code && - (!ops.current_entry || + (!is_current_entry || (browser->use_navkeypressed && !browser->navkeypressed))), .width = browser->width,
[ Upstream commit 23c0112246b454e408fb0579b3f9089353d4d327 ]
Don't use the mdd_detected variable as an exit condition for this loop; the first VF to NOT have an MDD event will cause the loop to terminate.
Instead just look at all of the VFs, but don't disable them. This prevents proper release of resources if the VFs are rebooted or the VF driver reloaded. Instead, just log a message and call out repeat offenders.
To make it clear what we are doing, use a differently-named variable in the loop.
Signed-off-by: Mitch Williams mitch.a.williams@intel.com Signed-off-by: Anirudh Venkataramanan anirudh.venkataramanan@intel.com Tested-by: Andrew Bowers andrewx.bowers@intel.com Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_main.c | 23 +++++++++++------------ 1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index dbf3d39ad8b1..1c803106e301 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -1161,16 +1161,16 @@ static void ice_handle_mdd_event(struct ice_pf *pf) } }
- /* see if one of the VFs needs to be reset */ - for (i = 0; i < pf->num_alloc_vfs && mdd_detected; i++) { + /* check to see if one of the VFs caused the MDD */ + for (i = 0; i < pf->num_alloc_vfs; i++) { struct ice_vf *vf = &pf->vf[i];
- mdd_detected = false; + bool vf_mdd_detected = false;
reg = rd32(hw, VP_MDET_TX_PQM(i)); if (reg & VP_MDET_TX_PQM_VALID_M) { wr32(hw, VP_MDET_TX_PQM(i), 0xFFFF); - mdd_detected = true; + vf_mdd_detected = true; dev_info(&pf->pdev->dev, "TX driver issue detected on VF %d\n", i); } @@ -1178,7 +1178,7 @@ static void ice_handle_mdd_event(struct ice_pf *pf) reg = rd32(hw, VP_MDET_TX_TCLAN(i)); if (reg & VP_MDET_TX_TCLAN_VALID_M) { wr32(hw, VP_MDET_TX_TCLAN(i), 0xFFFF); - mdd_detected = true; + vf_mdd_detected = true; dev_info(&pf->pdev->dev, "TX driver issue detected on VF %d\n", i); } @@ -1186,7 +1186,7 @@ static void ice_handle_mdd_event(struct ice_pf *pf) reg = rd32(hw, VP_MDET_TX_TDPU(i)); if (reg & VP_MDET_TX_TDPU_VALID_M) { wr32(hw, VP_MDET_TX_TDPU(i), 0xFFFF); - mdd_detected = true; + vf_mdd_detected = true; dev_info(&pf->pdev->dev, "TX driver issue detected on VF %d\n", i); } @@ -1194,19 +1194,18 @@ static void ice_handle_mdd_event(struct ice_pf *pf) reg = rd32(hw, VP_MDET_RX(i)); if (reg & VP_MDET_RX_VALID_M) { wr32(hw, VP_MDET_RX(i), 0xFFFF); - mdd_detected = true; + vf_mdd_detected = true; dev_info(&pf->pdev->dev, "RX driver issue detected on VF %d\n", i); }
- if (mdd_detected) { + if (vf_mdd_detected) { vf->num_mdd_events++; - dev_info(&pf->pdev->dev, - "Use PF Control I/F to re-enable the VF\n"); - set_bit(ICE_VF_STATE_DIS, vf->vf_states); + if (vf->num_mdd_events > 1) + dev_info(&pf->pdev->dev, "VF %d has had %llu MDD events since last boot\n", + i, vf->num_mdd_events); } } - }
/**
[ Upstream commit 518fa4e0e0da97ea2e17c95ab57647ce748a96e2 ]
You can't memset the contents of a __user pointer. Instead, call copy_to_user to copy links.reserved (which is zeroed) to the user memory.
This fixes this sparse warning:
SPARSE:drivers/media/mc/mc-device.c drivers/media/mc/mc-device.c:521:16: warning: incorrect type in argument 1 (different address spaces)
Fixes: f49308878d720 ("media: media_device_enum_links32: clean a reserved field")
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Reviewed-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/media-device.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/media/media-device.c b/drivers/media/media-device.c index b9bb4904bba1..e19df5165e78 100644 --- a/drivers/media/media-device.c +++ b/drivers/media/media-device.c @@ -510,8 +510,9 @@ static long media_device_enum_links32(struct media_device *mdev, if (ret) return ret;
- memset(ulinks->reserved, 0, sizeof(ulinks->reserved)); - + if (copy_to_user(ulinks->reserved, links.reserved, + sizeof(ulinks->reserved))) + return -EFAULT; return 0; }
[ Upstream commit 50710eeefbc1ed25375942aad0c4d1eb4af0f330 ]
if saa7164_proc_create() fails, saa7164_fini() will trigger a warning,
name 'saa7164' WARNING: CPU: 1 PID: 6311 at fs/proc/generic.c:672 remove_proc_entry+0x1e8/0x3a0 ? remove_proc_entry+0x1e8/0x3a0 ? try_stop_module+0x7b/0x240 ? proc_readdir+0x70/0x70 ? rcu_read_lock_sched_held+0xd7/0x100 saa7164_fini+0x13/0x1f [saa7164] __x64_sys_delete_module+0x30c/0x480 ? __ia32_sys_delete_module+0x480/0x480 ? __x64_sys_clock_gettime+0x11e/0x1c0 ? __x64_sys_timer_create+0x1a0/0x1a0 ? trace_hardirqs_off_caller+0x40/0x180 ? do_syscall_64+0x18/0x450 do_syscall_64+0x9f/0x450 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fix it by checking the return of proc_create_single() before calling remove_proc_entry().
Signed-off-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl [hverkuil-cisco@xs4all.nl: use 0444 instead of S_IRUGO] [hverkuil-cisco@xs4all.nl: use pr_info instead of KERN_INFO] Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/pci/saa7164/saa7164-core.c | 33 ++++++++++++++++-------- 1 file changed, 22 insertions(+), 11 deletions(-)
diff --git a/drivers/media/pci/saa7164/saa7164-core.c b/drivers/media/pci/saa7164/saa7164-core.c index c594aff92e70..9ae04e18e6c6 100644 --- a/drivers/media/pci/saa7164/saa7164-core.c +++ b/drivers/media/pci/saa7164/saa7164-core.c @@ -1112,16 +1112,25 @@ static int saa7164_proc_show(struct seq_file *m, void *v) return 0; }
+static struct proc_dir_entry *saa7164_pe; + static int saa7164_proc_create(void) { - struct proc_dir_entry *pe; - - pe = proc_create_single("saa7164", S_IRUGO, NULL, saa7164_proc_show); - if (!pe) + saa7164_pe = proc_create_single("saa7164", 0444, NULL, saa7164_proc_show); + if (!saa7164_pe) return -ENOMEM;
return 0; } + +static void saa7164_proc_destroy(void) +{ + if (saa7164_pe) + remove_proc_entry("saa7164", NULL); +} +#else +static int saa7164_proc_create(void) { return 0; } +static void saa7164_proc_destroy(void) {} #endif
static int saa7164_thread_function(void *data) @@ -1493,19 +1502,21 @@ static struct pci_driver saa7164_pci_driver = {
static int __init saa7164_init(void) { - printk(KERN_INFO "saa7164 driver loaded\n"); + int ret = pci_register_driver(&saa7164_pci_driver); + + if (ret) + return ret;
-#ifdef CONFIG_PROC_FS saa7164_proc_create(); -#endif - return pci_register_driver(&saa7164_pci_driver); + + pr_info("saa7164 driver loaded\n"); + + return 0; }
static void __exit saa7164_fini(void) { -#ifdef CONFIG_PROC_FS - remove_proc_entry("saa7164", NULL); -#endif + saa7164_proc_destroy(); pci_unregister_driver(&saa7164_pci_driver); }
[ Upstream commit 17fc24875da1bef4650cf007edae3b2e26d2fa4e ]
The sof-rt5682 machine driver supports both legacy Baytrail devices and more recent ApolloLake/CometLake platforms. When only Baytrail is selected, the compilation fails with the following errors:
ERROR: "hdac_hdmi_jack_port_init" [sound/soc/intel/boards/snd-soc-sof_rt5682.ko] undefined!
ERROR: "hdac_hdmi_jack_init" [sound/soc/intel/boards/snd-soc-sof_rt5682.ko] undefined!
Fix by selecting SND_SOC_HDAC_HDMI unconditionally. The code for HDMI support is not reachable on Baytrail so this change has no functional impact.
Fixes: f70abd75b7c6 ("ASoC: Intel: add sof-rt5682 machine driver") Reported-by: kbuild test robot lkp@intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/boards/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/intel/boards/Kconfig b/sound/soc/intel/boards/Kconfig index 5407d217228e..c0aef45d335a 100644 --- a/sound/soc/intel/boards/Kconfig +++ b/sound/soc/intel/boards/Kconfig @@ -392,7 +392,7 @@ config SND_SOC_INTEL_SOF_RT5682_MACH (SND_SOC_SOF_BAYTRAIL && X86_INTEL_LPSS) select SND_SOC_RT5682 select SND_SOC_DMIC - select SND_SOC_HDAC_HDMI if SND_SOC_SOF_HDA_COMMON + select SND_SOC_HDAC_HDMI help This adds support for ASoC machine driver for SOF platforms with rt5682 codec.
[ Upstream commit 6995a659101bd4effa41cebb067f9dc18d77520d ]
Fix to avoid possible memory leak if the decoder initialization got failed.Free the allocated memory for file handle object before return in case decoder initialization fails.
Signed-off-by: Shailendra Verma shailendra.v@samsung.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/media/davinci_vpfe/vpfe_video.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/staging/media/davinci_vpfe/vpfe_video.c b/drivers/staging/media/davinci_vpfe/vpfe_video.c index 510202a3b091..84cca18e3e9d 100644 --- a/drivers/staging/media/davinci_vpfe/vpfe_video.c +++ b/drivers/staging/media/davinci_vpfe/vpfe_video.c @@ -419,6 +419,9 @@ static int vpfe_open(struct file *file) /* If decoder is not initialized. initialize it */ if (!video->initialized && vpfe_update_pipe_state(video)) { mutex_unlock(&video->lock); + v4l2_fh_del(&handle->vfh); + v4l2_fh_exit(&handle->vfh); + kfree(handle); return -ENODEV; } /* Increment device users counter */
[ Upstream commit 82c76aca81187b3d28a6fb3062f6916450ce955e ]
In general, we don't want MAC drivers calling phy_attach_direct with the net_device being NULL. Add checks against this in all the functions calling it: phy_attach() and phy_connect_direct().
Signed-off-by: Ioana Ciornei ioana.ciornei@nxp.com Suggested-by: Andrew Lunn andrew@lunn.ch Reviewed-by: Andrew Lunn andrew@lunn.ch Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/phy_device.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index dcc93a873174..a3f8740c6163 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -948,6 +948,9 @@ int phy_connect_direct(struct net_device *dev, struct phy_device *phydev, { int rc;
+ if (!dev) + return -EINVAL; + rc = phy_attach_direct(dev, phydev, phydev->dev_flags, interface); if (rc) return rc; @@ -1290,6 +1293,9 @@ struct phy_device *phy_attach(struct net_device *dev, const char *bus_id, struct device *d; int rc;
+ if (!dev) + return ERR_PTR(-EINVAL); + /* Search the list of PHY devices on the mdio bus for the * PHY with the requested name */
[ Upstream commit af7cd0366ee994e8b35985d407261dc0ed9dfb4d ]
PHYLIB and PHYLINK handle fixed-link interfaces differently. PHYLIB wraps them in a software PHY ("pseudo fixed link") phydev construct such that .adjust_link driver callbacks see an unified API. Whereas PHYLINK simply creates a phylink_link_state structure and passes it to .mac_config.
At the time the driver was introduced, DSA was using PHYLIB for the CPU/cascade ports (the ones with no net devices) and PHYLINK for everything else.
As explained below:
commit aab9c4067d2389d0adfc9c53806437df7b0fe3d5 Author: Florian Fainelli f.fainelli@gmail.com Date: Thu May 10 13:17:36 2018 -0700
net: dsa: Plug in PHYLINK support
Drivers that utilize fixed links for user-facing ports (e.g: bcm_sf2) will need to implement phylink_mac_ops from now on to preserve functionality, since PHYLINK *does not* create a phy_device instance for fixed links.
In the above patch, DSA guards the .phylink_mac_config callback against a NULL phydev pointer. Therefore, .adjust_link is not called in case of a fixed-link user port.
This patch fixes the situation by converting the driver from using .adjust_link to .phylink_mac_config. This can be done now in a unified fashion for both slave and CPU/cascade ports because DSA now uses PHYLINK for all ports.
Signed-off-by: Vladimir Oltean olteanv@gmail.com Signed-off-by: Ioana Ciornei ioana.ciornei@nxp.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/sja1105/sja1105_main.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c index 1c3959efebc4..844e038f3dc6 100644 --- a/drivers/net/dsa/sja1105/sja1105_main.c +++ b/drivers/net/dsa/sja1105/sja1105_main.c @@ -734,15 +734,16 @@ static int sja1105_adjust_port_config(struct sja1105_private *priv, int port, return sja1105_clocking_setup_port(priv, port); }
-static void sja1105_adjust_link(struct dsa_switch *ds, int port, - struct phy_device *phydev) +static void sja1105_mac_config(struct dsa_switch *ds, int port, + unsigned int link_an_mode, + const struct phylink_link_state *state) { struct sja1105_private *priv = ds->priv;
- if (!phydev->link) + if (!state->link) sja1105_adjust_port_config(priv, port, 0, false); else - sja1105_adjust_port_config(priv, port, phydev->speed, true); + sja1105_adjust_port_config(priv, port, state->speed, true); }
static void sja1105_phylink_validate(struct dsa_switch *ds, int port, @@ -1515,9 +1516,9 @@ static int sja1105_set_ageing_time(struct dsa_switch *ds, static const struct dsa_switch_ops sja1105_switch_ops = { .get_tag_protocol = sja1105_get_tag_protocol, .setup = sja1105_setup, - .adjust_link = sja1105_adjust_link, .set_ageing_time = sja1105_set_ageing_time, .phylink_validate = sja1105_phylink_validate, + .phylink_mac_config = sja1105_mac_config, .get_strings = sja1105_get_strings, .get_ethtool_stats = sja1105_get_ethtool_stats, .get_sset_count = sja1105_get_sset_count,
[ Upstream commit eae55a586c3c8b50982bad3c3426e9c9dd7a0075 ]
The driver assumes that the ICV is as a single piece in the last element of the scatterlist. This assumption is wrong.
This patch ensures that the ICV is properly handled regardless of the scatterlist layout.
Fixes: 9c4a79653b35 ("crypto: talitos - Freescale integrated security engine (SEC) driver") Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/talitos.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c index 396199b2db7d..fb852727ee1a 100644 --- a/drivers/crypto/talitos.c +++ b/drivers/crypto/talitos.c @@ -1036,7 +1036,6 @@ static void ipsec_esp_encrypt_done(struct device *dev, unsigned int authsize = crypto_aead_authsize(authenc); unsigned int ivsize = crypto_aead_ivsize(authenc); struct talitos_edesc *edesc; - struct scatterlist *sg; void *icvdata;
edesc = container_of(desc, struct talitos_edesc, desc); @@ -1050,9 +1049,8 @@ static void ipsec_esp_encrypt_done(struct device *dev, else icvdata = &edesc->link_tbl[edesc->src_nents + edesc->dst_nents + 2]; - sg = sg_last(areq->dst, edesc->dst_nents); - memcpy((char *)sg_virt(sg) + sg->length - authsize, - icvdata, authsize); + sg_pcopy_from_buffer(areq->dst, edesc->dst_nents ? : 1, icvdata, + authsize, areq->assoclen + areq->cryptlen); }
dma_unmap_single(dev, edesc->iv_dma, ivsize, DMA_TO_DEVICE); @@ -1070,7 +1068,6 @@ static void ipsec_esp_decrypt_swauth_done(struct device *dev, struct crypto_aead *authenc = crypto_aead_reqtfm(req); unsigned int authsize = crypto_aead_authsize(authenc); struct talitos_edesc *edesc; - struct scatterlist *sg; char *oicv, *icv; struct talitos_private *priv = dev_get_drvdata(dev); bool is_sec1 = has_ftr_sec1(priv); @@ -1080,9 +1077,18 @@ static void ipsec_esp_decrypt_swauth_done(struct device *dev, ipsec_esp_unmap(dev, edesc, req);
if (!err) { + char icvdata[SHA512_DIGEST_SIZE]; + int nents = edesc->dst_nents ? : 1; + unsigned int len = req->assoclen + req->cryptlen; + /* auth check */ - sg = sg_last(req->dst, edesc->dst_nents ? : 1); - icv = (char *)sg_virt(sg) + sg->length - authsize; + if (nents > 1) { + sg_pcopy_to_buffer(req->dst, nents, icvdata, authsize, + len - authsize); + icv = icvdata; + } else { + icv = (char *)sg_virt(req->dst) + len - authsize; + }
if (edesc->dma_len) { if (is_sec1) @@ -1498,7 +1504,6 @@ static int aead_decrypt(struct aead_request *req) struct talitos_ctx *ctx = crypto_aead_ctx(authenc); struct talitos_private *priv = dev_get_drvdata(ctx->dev); struct talitos_edesc *edesc; - struct scatterlist *sg; void *icvdata;
req->cryptlen -= authsize; @@ -1532,9 +1537,8 @@ static int aead_decrypt(struct aead_request *req) else icvdata = &edesc->link_tbl[0];
- sg = sg_last(req->src, edesc->src_nents ? : 1); - - memcpy(icvdata, (char *)sg_virt(sg) + sg->length - authsize, authsize); + sg_pcopy_to_buffer(req->src, edesc->src_nents ? : 1, icvdata, authsize, + req->assoclen + req->cryptlen - authsize);
return ipsec_esp(edesc, req, ipsec_esp_decrypt_swauth_done); }
[ Upstream commit c9cca7034b34a2d82e9a03b757de2485c294851c ]
The MPC885 reference manual states:
SEC Lite-initiated 8xx writes can occur only on 32-bit-word boundaries, but reads can occur on any byte boundary. Writing back a header read from a non-32-bit-word boundary will yield unpredictable results.
In order to ensure that, cra_alignmask is set to 3 for SEC1.
Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr Fixes: 9c4a79653b35 ("crypto: talitos - Freescale integrated security engine (SEC) driver") Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/talitos.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c index fb852727ee1a..710e09e28227 100644 --- a/drivers/crypto/talitos.c +++ b/drivers/crypto/talitos.c @@ -3261,7 +3261,10 @@ static struct talitos_crypto_alg *talitos_alg_alloc(struct device *dev, alg->cra_priority = t_alg->algt.priority; else alg->cra_priority = TALITOS_CRA_PRIORITY; - alg->cra_alignmask = 0; + if (has_ftr_sec1(priv)) + alg->cra_alignmask = 3; + else + alg->cra_alignmask = 0; alg->cra_ctxsize = sizeof(struct talitos_ctx); alg->cra_flags |= CRYPTO_ALG_KERN_DRIVER_ONLY;
[ Upstream commit 621ccc6cc5f8d6730b740d31d4818227866c93c9 ]
Rename _P to _P_VAL and _R to _R_VAL to avoid global namespace conflicts:
drivers/media/dvb-frontends/tua6100.c: In function ‘tua6100_set_params’: drivers/media/dvb-frontends/tua6100.c:79: warning: "_P" redefined #define _P 32
In file included from ./include/acpi/platform/aclinux.h:54, from ./include/acpi/platform/acenv.h:152, from ./include/acpi/acpi.h:22, from ./include/linux/acpi.h:34, from ./include/linux/i2c.h:17, from drivers/media/dvb-frontends/tua6100.h:30, from drivers/media/dvb-frontends/tua6100.c:32: ./include/linux/ctype.h:14: note: this is the location of the previous definition #define _P 0x10 /* punct */
Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/dvb-frontends/tua6100.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/media/dvb-frontends/tua6100.c b/drivers/media/dvb-frontends/tua6100.c index f7c3e6be8e4d..2483f614d0e7 100644 --- a/drivers/media/dvb-frontends/tua6100.c +++ b/drivers/media/dvb-frontends/tua6100.c @@ -67,8 +67,8 @@ static int tua6100_set_params(struct dvb_frontend *fe) struct i2c_msg msg1 = { .addr = priv->i2c_address, .flags = 0, .buf = reg1, .len = 4 }; struct i2c_msg msg2 = { .addr = priv->i2c_address, .flags = 0, .buf = reg2, .len = 3 };
-#define _R 4 -#define _P 32 +#define _R_VAL 4 +#define _P_VAL 32 #define _ri 4000000
// setup register 0 @@ -83,14 +83,14 @@ static int tua6100_set_params(struct dvb_frontend *fe) else reg1[1] = 0x0c;
- if (_P == 64) + if (_P_VAL == 64) reg1[1] |= 0x40; if (c->frequency >= 1525000) reg1[1] |= 0x80;
// register 2 - reg2[1] = (_R >> 8) & 0x03; - reg2[2] = _R; + reg2[1] = (_R_VAL >> 8) & 0x03; + reg2[2] = _R_VAL; if (c->frequency < 1455000) reg2[1] |= 0x1c; else if (c->frequency < 1630000) @@ -102,18 +102,18 @@ static int tua6100_set_params(struct dvb_frontend *fe) * The N divisor ratio (note: c->frequency is in kHz, but we * need it in Hz) */ - prediv = (c->frequency * _R) / (_ri / 1000); - div = prediv / _P; + prediv = (c->frequency * _R_VAL) / (_ri / 1000); + div = prediv / _P_VAL; reg1[1] |= (div >> 9) & 0x03; reg1[2] = div >> 1; reg1[3] = (div << 7); - priv->frequency = ((div * _P) * (_ri / 1000)) / _R; + priv->frequency = ((div * _P_VAL) * (_ri / 1000)) / _R_VAL;
// Finally, calculate and store the value for A - reg1[3] |= (prediv - (div*_P)) & 0x7f; + reg1[3] |= (prediv - (div*_P_VAL)) & 0x7f;
-#undef _R -#undef _P +#undef _R_VAL +#undef _P_VAL #undef _ri
if (fe->ops.i2c_gate_ctrl)
[ Upstream commit 9e6b5648bbc4cd48fab62cecbb81e9cc3c6e7e88 ]
The state of slave interfaces are handled differently depending on whether the interface is up or not. All active interfaces (IFF_UP) will transmit OGMs. But for B.A.T.M.A.N. IV, also non-active interfaces are scheduling (low TTL) OGMs on active interfaces. The code which setups and schedules the OGMs must therefore already be called when the interfaces gets added as slave interface and the transmit function must then check whether it has to send out the OGM or not on the specific slave interface.
But the commit f0d97253fb5f ("batman-adv: remove ogm_emit and ogm_schedule API calls") moved the setup code from the enable function to the activate function. The latter is called either when the added slave was already up when batadv_hardif_enable_interface processed the new interface or when a NETDEV_UP event was received for this slave interfac. As result, each NETDEV_UP would schedule a new OGM worker for the interface and thus OGMs would be send a lot more than expected.
Fixes: f0d97253fb5f ("batman-adv: remove ogm_emit and ogm_schedule API calls") Reported-by: Linus Lüssing linus.luessing@c0d3.blue Tested-by: Linus Lüssing linus.luessing@c0d3.blue Acked-by: Marek Lindner mareklindner@neomailbox.ch Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/batman-adv/bat_iv_ogm.c | 4 ++-- net/batman-adv/hard-interface.c | 3 +++ net/batman-adv/types.h | 3 +++ 3 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c index bd4138ddf7e0..240ed70912d6 100644 --- a/net/batman-adv/bat_iv_ogm.c +++ b/net/batman-adv/bat_iv_ogm.c @@ -2337,7 +2337,7 @@ batadv_iv_ogm_neigh_is_sob(struct batadv_neigh_node *neigh1, return ret; }
-static void batadv_iv_iface_activate(struct batadv_hard_iface *hard_iface) +static void batadv_iv_iface_enabled(struct batadv_hard_iface *hard_iface) { /* begin scheduling originator messages on that interface */ batadv_iv_ogm_schedule(hard_iface); @@ -2683,8 +2683,8 @@ static void batadv_iv_gw_dump(struct sk_buff *msg, struct netlink_callback *cb, static struct batadv_algo_ops batadv_batman_iv __read_mostly = { .name = "BATMAN_IV", .iface = { - .activate = batadv_iv_iface_activate, .enable = batadv_iv_ogm_iface_enable, + .enabled = batadv_iv_iface_enabled, .disable = batadv_iv_ogm_iface_disable, .update_mac = batadv_iv_ogm_iface_update_mac, .primary_set = batadv_iv_ogm_primary_iface_set, diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c index 79d1731b8306..3719cfd026f0 100644 --- a/net/batman-adv/hard-interface.c +++ b/net/batman-adv/hard-interface.c @@ -795,6 +795,9 @@ int batadv_hardif_enable_interface(struct batadv_hard_iface *hard_iface,
batadv_hardif_recalc_extra_skbroom(soft_iface);
+ if (bat_priv->algo_ops->iface.enabled) + bat_priv->algo_ops->iface.enabled(hard_iface); + out: return 0;
diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h index 74b644738a36..e0b25104cbfa 100644 --- a/net/batman-adv/types.h +++ b/net/batman-adv/types.h @@ -2129,6 +2129,9 @@ struct batadv_algo_iface_ops { /** @enable: init routing info when hard-interface is enabled */ int (*enable)(struct batadv_hard_iface *hard_iface);
+ /** @enabled: notification when hard-interface was enabled (optional) */ + void (*enabled)(struct batadv_hard_iface *hard_iface); + /** @disable: de-init routing info when hard-interface is disabled */ void (*disable)(struct batadv_hard_iface *hard_iface);
[ Upstream commit 8c8889d8eaf4501ae4aaf870b6f8f55db5d5109a ]
The sequence
static DEFINE_WW_CLASS(test_ww_class);
struct ww_acquire_ctx ww_ctx; struct ww_mutex ww_lock_a; struct ww_mutex ww_lock_b; struct mutex lock_c; struct mutex lock_d;
ww_acquire_init(&ww_ctx, &test_ww_class);
ww_mutex_init(&ww_lock_a, &test_ww_class); ww_mutex_init(&ww_lock_b, &test_ww_class);
mutex_init(&lock_c);
ww_mutex_lock(&ww_lock_a, &ww_ctx);
mutex_lock(&lock_c);
ww_mutex_lock(&ww_lock_b, &ww_ctx);
mutex_unlock(&lock_c); (*)
ww_mutex_unlock(&ww_lock_b); ww_mutex_unlock(&ww_lock_a);
ww_acquire_fini(&ww_ctx);
triggers the following WARN in __lock_release() when doing the unlock at *:
DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth - 1);
The problem is that the WARN check doesn't take into account the merging of ww_lock_a and ww_lock_b which results in decreasing curr->lockdep_depth by 2 not only 1.
Note that the following sequence doesn't trigger the WARN, since there won't be any hlock merging.
ww_acquire_init(&ww_ctx, &test_ww_class);
ww_mutex_init(&ww_lock_a, &test_ww_class); ww_mutex_init(&ww_lock_b, &test_ww_class);
mutex_init(&lock_c); mutex_init(&lock_d);
ww_mutex_lock(&ww_lock_a, &ww_ctx);
mutex_lock(&lock_c); mutex_lock(&lock_d);
ww_mutex_lock(&ww_lock_b, &ww_ctx);
mutex_unlock(&lock_d);
ww_mutex_unlock(&ww_lock_b); ww_mutex_unlock(&ww_lock_a);
mutex_unlock(&lock_c);
ww_acquire_fini(&ww_ctx);
In general both of the above two sequences are valid and shouldn't trigger any lockdep warning.
Fix this by taking the decrement due to the hlock merging into account during lock release and hlock class re-setting. Merging can't happen during lock downgrading since there won't be a new possibility to merge hlocks in that case, so add a WARN if merging still happens then.
Signed-off-by: Imre Deak imre.deak@intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Will Deacon will.deacon@arm.com Cc: ville.syrjala@linux.intel.com Link: https://lkml.kernel.org/r/20190524201509.9199-1-imre.deak@intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/lockdep.c | 41 ++++++++++++++++++++++++++++------------ 1 file changed, 29 insertions(+), 12 deletions(-)
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c index c47788fa85f9..82361e1bce0f 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -3715,7 +3715,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass, hlock->references = 2; }
- return 1; + return 2; } }
@@ -3921,22 +3921,33 @@ static struct held_lock *find_held_lock(struct task_struct *curr, }
static int reacquire_held_locks(struct task_struct *curr, unsigned int depth, - int idx) + int idx, unsigned int *merged) { struct held_lock *hlock; + int first_idx = idx;
if (DEBUG_LOCKS_WARN_ON(!irqs_disabled())) return 0;
for (hlock = curr->held_locks + idx; idx < depth; idx++, hlock++) { - if (!__lock_acquire(hlock->instance, + switch (__lock_acquire(hlock->instance, hlock_class(hlock)->subclass, hlock->trylock, hlock->read, hlock->check, hlock->hardirqs_off, hlock->nest_lock, hlock->acquire_ip, - hlock->references, hlock->pin_count)) + hlock->references, hlock->pin_count)) { + case 0: return 1; + case 1: + break; + case 2: + *merged += (idx == first_idx); + break; + default: + WARN_ON(1); + return 0; + } } return 0; } @@ -3947,9 +3958,9 @@ __lock_set_class(struct lockdep_map *lock, const char *name, unsigned long ip) { struct task_struct *curr = current; + unsigned int depth, merged = 0; struct held_lock *hlock; struct lock_class *class; - unsigned int depth; int i;
if (unlikely(!debug_locks)) @@ -3974,14 +3985,14 @@ __lock_set_class(struct lockdep_map *lock, const char *name, curr->lockdep_depth = i; curr->curr_chain_key = hlock->prev_chain_key;
- if (reacquire_held_locks(curr, depth, i)) + if (reacquire_held_locks(curr, depth, i, &merged)) return 0;
/* * I took it apart and put it back together again, except now I have * these 'spare' parts.. where shall I put them. */ - if (DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth)) + if (DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth - merged)) return 0; return 1; } @@ -3989,8 +4000,8 @@ __lock_set_class(struct lockdep_map *lock, const char *name, static int __lock_downgrade(struct lockdep_map *lock, unsigned long ip) { struct task_struct *curr = current; + unsigned int depth, merged = 0; struct held_lock *hlock; - unsigned int depth; int i;
if (unlikely(!debug_locks)) @@ -4015,7 +4026,11 @@ static int __lock_downgrade(struct lockdep_map *lock, unsigned long ip) hlock->read = 1; hlock->acquire_ip = ip;
- if (reacquire_held_locks(curr, depth, i)) + if (reacquire_held_locks(curr, depth, i, &merged)) + return 0; + + /* Merging can't happen with unchanged classes.. */ + if (DEBUG_LOCKS_WARN_ON(merged)) return 0;
/* @@ -4024,6 +4039,7 @@ static int __lock_downgrade(struct lockdep_map *lock, unsigned long ip) */ if (DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth)) return 0; + return 1; }
@@ -4038,8 +4054,8 @@ static int __lock_release(struct lockdep_map *lock, int nested, unsigned long ip) { struct task_struct *curr = current; + unsigned int depth, merged = 1; struct held_lock *hlock; - unsigned int depth; int i;
if (unlikely(!debug_locks)) @@ -4094,14 +4110,15 @@ __lock_release(struct lockdep_map *lock, int nested, unsigned long ip) if (i == depth-1) return 1;
- if (reacquire_held_locks(curr, depth, i + 1)) + if (reacquire_held_locks(curr, depth, i + 1, &merged)) return 0;
/* * We had N bottles of beer on the wall, we drank one, but now * there's not N-1 bottles of beer left on the wall... + * Pouring two of the bottles together is acceptable. */ - DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth-1); + DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth - merged);
/* * Since reacquire_held_locks() would have called check_chain_key()
[ Upstream commit d9349850e188b8b59e5322fda17ff389a1c0cd7d ]
The sequence
static DEFINE_WW_CLASS(test_ww_class);
struct ww_acquire_ctx ww_ctx; struct ww_mutex ww_lock_a; struct ww_mutex ww_lock_b; struct ww_mutex ww_lock_c; struct mutex lock_c;
ww_acquire_init(&ww_ctx, &test_ww_class);
ww_mutex_init(&ww_lock_a, &test_ww_class); ww_mutex_init(&ww_lock_b, &test_ww_class); ww_mutex_init(&ww_lock_c, &test_ww_class);
mutex_init(&lock_c);
ww_mutex_lock(&ww_lock_a, &ww_ctx);
mutex_lock(&lock_c);
ww_mutex_lock(&ww_lock_b, &ww_ctx); ww_mutex_lock(&ww_lock_c, &ww_ctx);
mutex_unlock(&lock_c); (*)
ww_mutex_unlock(&ww_lock_c); ww_mutex_unlock(&ww_lock_b); ww_mutex_unlock(&ww_lock_a);
ww_acquire_fini(&ww_ctx); (**)
will trigger the following error in __lock_release() when calling mutex_release() at **:
DEBUG_LOCKS_WARN_ON(depth <= 0)
The problem is that the hlock merging happening at * updates the references for test_ww_class incorrectly to 3 whereas it should've updated it to 4 (representing all the instances for ww_ctx and ww_lock_[abc]).
Fix this by updating the references during merging correctly taking into account that we can have non-zero references (both for the hlock that we merge into another hlock or for the hlock we are merging into).
Signed-off-by: Imre Deak imre.deak@intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= ville.syrjala@linux.intel.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Will Deacon will.deacon@arm.com Link: https://lkml.kernel.org/r/20190524201509.9199-2-imre.deak@intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/lockdep.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c index 82361e1bce0f..dbc936ccf149 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -3703,17 +3703,17 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass, if (depth) { hlock = curr->held_locks + depth - 1; if (hlock->class_idx == class_idx && nest_lock) { - if (hlock->references) { - /* - * Check: unsigned int references:12, overflow. - */ - if (DEBUG_LOCKS_WARN_ON(hlock->references == (1 << 12)-1)) - return 0; + if (!references) + references++;
+ if (!hlock->references) hlock->references++; - } else { - hlock->references = 2; - } + + hlock->references += references; + + /* Overflow */ + if (DEBUG_LOCKS_WARN_ON(hlock->references < references)) + return 0;
return 2; }
[ Upstream commit 9f7406d6b56b4b71a12480b68221755ea7b3e0ee ]
With fast_io enabled, spinlock_irq is used for read/write operations, thus leading to : BUG: sleeping function called from invalid context at [snip]/ao-cec-g12a.c:379 in_atomic(): 1, irqs_disabled(): 128, pid: 1451, name: irq/14-ff800280 [snip] Call trace: dump_backtrace+0x0/0x180 show_stack+0x14/0x1c dump_stack+0xa8/0xe0 ___might_sleep+0xf4/0x104 __might_sleep+0x4c/0x80 meson_ao_cec_g12a_read+0x7c/0x164 regmap_read+0x16c/0x1b0 meson_ao_cec_g12a_irq_thread+0xcc/0x200 irq_thread_fn+0x2c/0x60 irq_thread+0x14c/0x1fc kthread+0x11c/0x12c ret_from_fork+0x10/0x18
Simply remove fast_io to use mutexes instead.
Fixes: b7778c46683c ("media: platform: meson: Add Amlogic Meson G12A AO CEC Controller driver")
Signed-off-by: Neil Armstrong narmstrong@baylibre.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/meson/ao-cec-g12a.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/media/platform/meson/ao-cec-g12a.c b/drivers/media/platform/meson/ao-cec-g12a.c index 3620a1e310f5..ddfd060625da 100644 --- a/drivers/media/platform/meson/ao-cec-g12a.c +++ b/drivers/media/platform/meson/ao-cec-g12a.c @@ -415,7 +415,6 @@ static const struct regmap_config meson_ao_cec_g12a_cec_regmap_conf = { .reg_read = meson_ao_cec_g12a_read, .reg_write = meson_ao_cec_g12a_write, .max_register = 0xffff, - .fast_io = true, };
static inline void
[ Upstream commit 69fbb3f47327d959830c94bf31893972b8c8f700 ]
X-Originating-IP: [10.175.113.25] X-CFilter-Loop: Reflected The fm_v4l2_init_video_device() forget to unregister v4l2/video device in the error path, it could lead to UAF issue, eg,
BUG: KASAN: use-after-free in atomic64_read include/asm-generic/atomic-instrumented.h:836 [inline] BUG: KASAN: use-after-free in atomic_long_read include/asm-generic/atomic-long.h:28 [inline] BUG: KASAN: use-after-free in __mutex_unlock_slowpath+0x92/0x690 kernel/locking/mutex.c:1206 Read of size 8 at addr ffff8881e84a7c70 by task v4l_id/3659
CPU: 1 PID: 3659 Comm: v4l_id Not tainted 5.1.0 #8 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xa9/0x10e lib/dump_stack.c:113 print_address_description+0x65/0x270 mm/kasan/report.c:187 kasan_report+0x149/0x18d mm/kasan/report.c:317 atomic64_read include/asm-generic/atomic-instrumented.h:836 [inline] atomic_long_read include/asm-generic/atomic-long.h:28 [inline] __mutex_unlock_slowpath+0x92/0x690 kernel/locking/mutex.c:1206 fm_v4l2_fops_open+0xac/0x120 [fm_drv] v4l2_open+0x191/0x390 [videodev] chrdev_open+0x20d/0x570 fs/char_dev.c:417 do_dentry_open+0x700/0xf30 fs/open.c:777 do_last fs/namei.c:3416 [inline] path_openat+0x7c4/0x2a90 fs/namei.c:3532 do_filp_open+0x1a5/0x2b0 fs/namei.c:3563 do_sys_open+0x302/0x490 fs/open.c:1069 do_syscall_64+0x9f/0x450 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f8180c17c8e ... Allocated by task 3642: set_track mm/kasan/common.c:87 [inline] __kasan_kmalloc.constprop.3+0xa0/0xd0 mm/kasan/common.c:497 fm_drv_init+0x13/0x1000 [fm_drv] do_one_initcall+0xbc/0x47d init/main.c:901 do_init_module+0x1b5/0x547 kernel/module.c:3456 load_module+0x6405/0x8c10 kernel/module.c:3804 __do_sys_finit_module+0x162/0x190 kernel/module.c:3898 do_syscall_64+0x9f/0x450 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 3642: set_track mm/kasan/common.c:87 [inline] __kasan_slab_free+0x130/0x180 mm/kasan/common.c:459 slab_free_hook mm/slub.c:1429 [inline] slab_free_freelist_hook mm/slub.c:1456 [inline] slab_free mm/slub.c:3003 [inline] kfree+0xe1/0x270 mm/slub.c:3958 fm_drv_init+0x1e6/0x1000 [fm_drv] do_one_initcall+0xbc/0x47d init/main.c:901 do_init_module+0x1b5/0x547 kernel/module.c:3456 load_module+0x6405/0x8c10 kernel/module.c:3804 __do_sys_finit_module+0x162/0x190 kernel/module.c:3898 do_syscall_64+0x9f/0x450 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Add relevant unregister functions to fix it.
Cc: Hans Verkuil hans.verkuil@cisco.com Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Kefeng Wang wangkefeng.wang@huawei.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/radio/wl128x/fmdrv_v4l2.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/media/radio/wl128x/fmdrv_v4l2.c b/drivers/media/radio/wl128x/fmdrv_v4l2.c index c80a6df47f5e..469366dae1d5 100644 --- a/drivers/media/radio/wl128x/fmdrv_v4l2.c +++ b/drivers/media/radio/wl128x/fmdrv_v4l2.c @@ -541,6 +541,7 @@ int fm_v4l2_init_video_device(struct fmdev *fmdev, int radio_nr)
/* Register with V4L2 subsystem as RADIO device */ if (video_register_device(&gradio_dev, VFL_TYPE_RADIO, radio_nr)) { + v4l2_device_unregister(&fmdev->v4l2_dev); fmerr("Could not register video device\n"); return -ENOMEM; } @@ -554,6 +555,8 @@ int fm_v4l2_init_video_device(struct fmdev *fmdev, int radio_nr) if (ret < 0) { fmerr("(fmdev): Can't init ctrl handler\n"); v4l2_ctrl_handler_free(&fmdev->ctrl_handler); + video_unregister_device(fmdev->radio_dev); + v4l2_device_unregister(&fmdev->v4l2_dev); return -EBUSY; }
[ Upstream commit 661262bc3e0ecc9a1aed39c6b2a99766da2c22e2 ]
If we add a VF without loading hclgevf.ko and then there is a RAS error occurs, PCIe AER will call error_detected and slot_reset of all functions, and will get a NULL pointer when we check ad_dev->ops->handle_hw_ras_error. This will cause a call trace and failures on handling of follow-up RAS errors.
This patch check ae_dev and ad_dev->ops at first to solve above issues.
Signed-off-by: Weihang Li liweihang@hisilicon.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index cd59c0cc636a..5611b990ac34 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -1916,9 +1916,9 @@ static pci_ers_result_t hns3_error_detected(struct pci_dev *pdev, if (state == pci_channel_io_perm_failure) return PCI_ERS_RESULT_DISCONNECT;
- if (!ae_dev) { + if (!ae_dev || !ae_dev->ops) { dev_err(&pdev->dev, - "Can't recover - error happened during device init\n"); + "Can't recover - error happened before device initialized\n"); return PCI_ERS_RESULT_NONE; }
@@ -1937,6 +1937,9 @@ static pci_ers_result_t hns3_slot_reset(struct pci_dev *pdev)
dev_info(dev, "requesting reset due to PCI error\n");
+ if (!ae_dev || !ae_dev->ops) + return PCI_ERS_RESULT_NONE; + /* request the reset */ if (ae_dev->ops->reset_event) { if (!ae_dev->override_pci_need_reset)
[ Upstream commit 594a81b39525f0a17e92c2e0b167ae1400650380 ]
The hclge/hclgevf and hns3 module can be unloaded independently, when hclge/hclgevf unloaded firstly, the ops of ae_dev should be set to NULL, otherwise it will cause an use-after-free problem.
Fixes: 38caee9d3ee8 ("net: hns3: Add support of the HNAE3 framework") Signed-off-by: Weihang Li liweihang@hisilicon.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hnae3.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.c b/drivers/net/ethernet/hisilicon/hns3/hnae3.c index fa8b8506b120..738e01393b68 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hnae3.c +++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.c @@ -251,6 +251,7 @@ void hnae3_unregister_ae_algo(struct hnae3_ae_algo *ae_algo)
ae_algo->ops->uninit_ae_dev(ae_dev); hnae3_set_bit(ae_dev->flag, HNAE3_DEV_INITED_B, 0); + ae_dev->ops = NULL; }
list_del(&ae_algo->node); @@ -351,6 +352,7 @@ void hnae3_unregister_ae_dev(struct hnae3_ae_dev *ae_dev)
ae_algo->ops->uninit_ae_dev(ae_dev); hnae3_set_bit(ae_dev->flag, HNAE3_DEV_INITED_B, 0); + ae_dev->ops = NULL; }
list_del(&ae_dev->node);
[ Upstream commit 04507c0a9385cc8280f794a36bfff567c8cc1042 ]
To set frequency on specific cpus using cpupower, following syntax can be used : cpupower -c #i frequency-set -f #f -r
While setting frequency using cpupower frequency-set command, if we use '-r' option, it is expected to set frequency for all cpus related to cpu #i. But it is observed to be missing the last cpu in related cpu list. This patch fixes the problem.
Signed-off-by: Abhishek Goel huntbag@linux.vnet.ibm.com Reviewed-by: Thomas Renninger trenn@suse.de Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/power/cpupower/utils/cpufreq-set.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/power/cpupower/utils/cpufreq-set.c b/tools/power/cpupower/utils/cpufreq-set.c index f49bc4aa2a08..6ed82fba5aaa 100644 --- a/tools/power/cpupower/utils/cpufreq-set.c +++ b/tools/power/cpupower/utils/cpufreq-set.c @@ -305,6 +305,8 @@ int cmd_freq_set(int argc, char **argv) bitmask_setbit(cpus_chosen, cpus->cpu); cpus = cpus->next; } + /* Set the last cpu in related cpus list */ + bitmask_setbit(cpus_chosen, cpus->cpu); cpufreq_put_related_cpus(cpus); } }
[ Upstream commit 0c1f14ed12262f45a3af1d588e4d7bd12438b8f5 ]
This change makes CONFIG_ZONE_DMA32 defuly y and allows users to overwrite it only when CONFIG_EXPERT=y.
For the SoCs that do not need CONFIG_ZONE_DMA32, this is the first step to manage all available memory by a single zone(normal zone) to reduce the overhead of multiple zones.
The change also fixes a build error when CONFIG_NUMA=y and CONFIG_ZONE_DMA32=n.
arch/arm64/mm/init.c:195:17: error: use of undeclared identifier 'ZONE_DMA32' max_zone_pfns[ZONE_DMA32] = PFN_DOWN(max_zone_dma_phys());
Change since v1: 1. only expose CONFIG_ZONE_DMA32 when CONFIG_EXPERT=y 2. remove redundant IS_ENABLED(CONFIG_ZONE_DMA32)
Cc: Robin Murphy robin.murphy@arm.com Signed-off-by: Miles Chen miles.chen@mediatek.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/Kconfig | 3 ++- arch/arm64/mm/init.c | 5 +++-- 2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 697ea0510729..cf5f1dafcf74 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -260,7 +260,8 @@ config GENERIC_CALIBRATE_DELAY def_bool y
config ZONE_DMA32 - def_bool y + bool "Support DMA32 zone" if EXPERT + default y
config HAVE_GENERIC_GUP def_bool y diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 749c9b269f08..f3c795278def 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -180,8 +180,9 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max) { unsigned long max_zone_pfns[MAX_NR_ZONES] = {0};
- if (IS_ENABLED(CONFIG_ZONE_DMA32)) - max_zone_pfns[ZONE_DMA32] = PFN_DOWN(max_zone_dma_phys()); +#ifdef CONFIG_ZONE_DMA32 + max_zone_pfns[ZONE_DMA32] = PFN_DOWN(max_zone_dma_phys()); +#endif max_zone_pfns[ZONE_NORMAL] = max;
free_area_init_nodes(max_zone_pfns);
[ Upstream commit 2b393f91c651c16d5c09f5c7aa689e58a79df34e ]
Currently the return value from clk_bulk_prepare_enable() is checked, but it is not propagate it in the case of failure.
Fix it and also move the error message to the caller of mipi_csis_clk_enable().
Signed-off-by: Fabio Estevam festevam@gmail.com Reviewed-by: Rui Miguel Silva rmfrfs@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/media/imx/imx7-mipi-csis.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/staging/media/imx/imx7-mipi-csis.c b/drivers/staging/media/imx/imx7-mipi-csis.c index 19455f425416..7d7bdfdd852a 100644 --- a/drivers/staging/media/imx/imx7-mipi-csis.c +++ b/drivers/staging/media/imx/imx7-mipi-csis.c @@ -456,13 +456,9 @@ static void mipi_csis_set_params(struct csi_state *state) MIPI_CSIS_CMN_CTRL_UPDATE_SHADOW_CTRL); }
-static void mipi_csis_clk_enable(struct csi_state *state) +static int mipi_csis_clk_enable(struct csi_state *state) { - int ret; - - ret = clk_bulk_prepare_enable(state->num_clks, state->clks); - if (ret < 0) - dev_err(state->dev, "failed to enable clocks\n"); + return clk_bulk_prepare_enable(state->num_clks, state->clks); }
static void mipi_csis_clk_disable(struct csi_state *state) @@ -973,7 +969,11 @@ static int mipi_csis_probe(struct platform_device *pdev) if (ret < 0) return ret;
- mipi_csis_clk_enable(state); + ret = mipi_csis_clk_enable(state); + if (ret < 0) { + dev_err(state->dev, "failed to enable clocks: %d\n", ret); + return ret; + }
ret = devm_request_irq(dev, state->irq, mipi_csis_irq_handler, 0, dev_name(dev), state);
[ Upstream commit 279ab04dbea1370d2eac0f854270369ccaef8a44 ]
We are getting false positive gcc warning when we compile with gcc9 (9.1.1):
CC jvmti/libjvmti.o In file included from /usr/include/string.h:494, from jvmti/libjvmti.c:5: In function ‘strncpy’, inlined from ‘copy_class_filename.constprop’ at jvmti/libjvmti.c:166:3: /usr/include/bits/string_fortified.h:106:10: error: ‘__builtin_strncpy’ specified bound depends on the length of the source argument [-Werror=stringop-overflow=] 106 | return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest)); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ jvmti/libjvmti.c: In function ‘copy_class_filename.constprop’: jvmti/libjvmti.c:165:26: note: length computed here 165 | size_t file_name_len = strlen(file_name); | ^~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors
As per Arnaldo's suggestion use strlcpy(), which does the same thing and keeps gcc silent.
Suggested-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Jiri Olsa jolsa@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ben Gainey ben.gainey@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Link: http://lkml.kernel.org/r/20190531131321.GB1281@krava Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/jvmti/libjvmti.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/jvmti/libjvmti.c b/tools/perf/jvmti/libjvmti.c index aea7b1fe85aa..c441a34cb1c0 100644 --- a/tools/perf/jvmti/libjvmti.c +++ b/tools/perf/jvmti/libjvmti.c @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 #include <linux/compiler.h> +#include <linux/string.h> #include <sys/types.h> #include <stdio.h> #include <string.h> @@ -162,8 +163,7 @@ copy_class_filename(const char * class_sign, const char * file_name, char * resu result[i] = '\0'; } else { /* fallback case */ - size_t file_name_len = strlen(file_name); - strncpy(result, file_name, file_name_len < max_length ? file_name_len : max_length); + strlcpy(result, file_name, max_length); } }
[ Upstream commit 12ae1c1bf5db2f33fcd9092a96f630291c4b181a ]
Differently from other Aspeed drivers, this driver calls clock control APIs in interrupt context. Since ECLK is coupled with a reset bit in clk-aspeed module, aspeed_clk_enable will make 10ms of busy waiting delay for triggering the reset and it will eventually disturb other drivers' interrupt handling. To fix this issue, this commit changes this driver's irq to threaded irq so that the delay can be happened in a thread context.
Signed-off-by: Jae Hyun Yoo jae.hyun.yoo@linux.intel.com Reviewed-by: Eddie James eajames@linux.ibm.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/aspeed-video.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/media/platform/aspeed-video.c b/drivers/media/platform/aspeed-video.c index 8144fe36ad48..76d7512c82a3 100644 --- a/drivers/media/platform/aspeed-video.c +++ b/drivers/media/platform/aspeed-video.c @@ -1589,8 +1589,9 @@ static int aspeed_video_init(struct aspeed_video *video) return -ENODEV; }
- rc = devm_request_irq(dev, irq, aspeed_video_irq, IRQF_SHARED, - DEVICE_NAME, video); + rc = devm_request_threaded_irq(dev, irq, NULL, aspeed_video_irq, + IRQF_ONESHOT | IRQF_SHARED, DEVICE_NAME, + video); if (rc < 0) { dev_err(dev, "Unable to request IRQ %d\n", irq); return rc;
[ Upstream commit 9698ed4d4a2993ce54b9f7d71a2891e972caa117 ]
Video engine clock control can be double disabled and eventually it causes a kernel warning with stack dump printing out like below:
[ 515.540498] ------------[ cut here ]------------ [ 515.545174] WARNING: CPU: 0 PID: 1310 at drivers/clk/clk.c:684 clk_core_unprepare+0x13c/0x170 [ 515.553806] vclk-gate already unprepared [ 515.557841] CPU: 0 PID: 1310 Comm: obmc-ikvm Tainted: G W 5.0.6-df66fbc97853fbba90a0bfa44de32f3d5f7602b4 #1 [ 515.568973] Hardware name: Generic DT based system [ 515.573777] Backtrace: [ 515.576272] [<80107cdc>] (dump_backtrace) from [<80107f10>] (show_stack+0x20/0x24) [ 515.583930] r7:803a5614 r6:00000009 r5:00000000 r4:9d88fe1c [ 515.589712] [<80107ef0>] (show_stack) from [<80690184>] (dump_stack+0x20/0x28) [ 515.597053] [<80690164>] (dump_stack) from [<80116044>] (__warn.part.3+0xb4/0xdc) [ 515.604557] [<80115f90>] (__warn.part.3) from [<801160d8>] (warn_slowpath_fmt+0x6c/0x90) [ 515.612734] r6:000002ac r5:8080befc r4:80a07008 [ 515.617463] [<80116070>] (warn_slowpath_fmt) from [<803a5614>] (clk_core_unprepare+0x13c/0x170) [ 515.626167] r3:8080cdf4 r2:8080bfc0 [ 515.629834] r7:98d682a8 r6:9d8a9200 r5:9e5151a0 r4:97abd620 [ 515.635530] [<803a54d8>] (clk_core_unprepare) from [<803a76a4>] (clk_unprepare+0x34/0x3c) [ 515.643812] r5:9e5151a0 r4:97abd620 [ 515.647529] [<803a7670>] (clk_unprepare) from [<804f36ec>] (aspeed_video_off+0x38/0x50) [ 515.655539] r5:9e5151a0 r4:9e504000 [ 515.659242] [<804f36b4>] (aspeed_video_off) from [<804f4358>] (aspeed_video_release+0x90/0x114) [ 515.668036] r5:9e5044b0 r4:9e504000 [ 515.671643] [<804f42c8>] (aspeed_video_release) from [<804d302c>] (v4l2_release+0xd4/0xe8) [ 515.679999] r7:98d682a8 r6:9d087810 r5:9d8a9200 r4:9e504318 [ 515.685695] [<804d2f58>] (v4l2_release) from [<80236454>] (__fput+0x98/0x1c4) [ 515.692914] r5:9e51b608 r4:9d8a9200 [ 515.696597] [<802363bc>] (__fput) from [<802365e8>] (____fput+0x18/0x1c) [ 515.703315] r9:80a0700c r8:801011e4 r7:00000000 r6:80a64b9c r5:9d8e35a0 r4:9d8e38dc [ 515.711167] [<802365d0>] (____fput) from [<80131ca4>] (task_work_run+0x7c/0xa0) [ 515.718596] [<80131c28>] (task_work_run) from [<80106884>] (do_work_pending+0x4a8/0x578) [ 515.726777] r7:801011e4 r6:80a07008 r5:9d88ffb0 r4:ffffe000 [ 515.732466] [<801063dc>] (do_work_pending) from [<8010106c>] (slow_work_pending+0xc/0x20) [ 515.740727] Exception stack(0x9d88ffb0 to 0x9d88fff8) [ 515.745840] ffa0: 00000000 76f18094 00000000 00000000 [ 515.754122] ffc0: 00000007 00176778 7eda4c20 00000006 00000000 00000000 48e20fa4 00000000 [ 515.762386] ffe0: 00000002 7eda4b08 00000000 48f91efc 80000010 00000007 [ 515.769097] r10:00000000 r9:9d88e000 r8:801011e4 r7:00000006 r6:7eda4c20 r5:00176778 [ 515.777006] r4:00000007 [ 515.779558] ---[ end trace 12c04aadef8afbbb ]--- [ 515.784176] ------------[ cut here ]------------ [ 515.788817] WARNING: CPU: 0 PID: 1310 at drivers/clk/clk.c:825 clk_core_disable+0x18c/0x204 [ 515.797161] eclk-gate already disabled [ 515.800916] CPU: 0 PID: 1310 Comm: obmc-ikvm Tainted: G W 5.0.6-df66fbc97853fbba90a0bfa44de32f3d5f7602b4 #1 [ 515.811945] Hardware name: Generic DT based system [ 515.816730] Backtrace: [ 515.819210] [<80107cdc>] (dump_backtrace) from [<80107f10>] (show_stack+0x20/0x24) [ 515.826782] r7:803a5900 r6:00000009 r5:00000000 r4:9d88fe04 [ 515.832454] [<80107ef0>] (show_stack) from [<80690184>] (dump_stack+0x20/0x28) [ 515.839687] [<80690164>] (dump_stack) from [<80116044>] (__warn.part.3+0xb4/0xdc) [ 515.847170] [<80115f90>] (__warn.part.3) from [<801160d8>] (warn_slowpath_fmt+0x6c/0x90) [ 515.855247] r6:00000339 r5:8080befc r4:80a07008 [ 515.859868] [<80116070>] (warn_slowpath_fmt) from [<803a5900>] (clk_core_disable+0x18c/0x204) [ 515.868385] r3:8080cdd0 r2:8080c00c [ 515.871957] r7:98d682a8 r6:9d8a9200 r5:97abd560 r4:97abd560 [ 515.877615] [<803a5774>] (clk_core_disable) from [<803a59a0>] (clk_core_disable_lock+0x28/0x34) [ 515.886301] r7:98d682a8 r6:9d8a9200 r5:97abd560 r4:a0000013 [ 515.891960] [<803a5978>] (clk_core_disable_lock) from [<803a7714>] (clk_disable+0x2c/0x30) [ 515.900216] r5:9e5151a0 r4:9e515f60 [ 515.903816] [<803a76e8>] (clk_disable) from [<804f36f8>] (aspeed_video_off+0x44/0x50) [ 515.911656] [<804f36b4>] (aspeed_video_off) from [<804f4358>] (aspeed_video_release+0x90/0x114) [ 515.920341] r5:9e5044b0 r4:9e504000 [ 515.923921] [<804f42c8>] (aspeed_video_release) from [<804d302c>] (v4l2_release+0xd4/0xe8) [ 515.932184] r7:98d682a8 r6:9d087810 r5:9d8a9200 r4:9e504318 [ 515.937851] [<804d2f58>] (v4l2_release) from [<80236454>] (__fput+0x98/0x1c4) [ 515.944980] r5:9e51b608 r4:9d8a9200 [ 515.948559] [<802363bc>] (__fput) from [<802365e8>] (____fput+0x18/0x1c) [ 515.955257] r9:80a0700c r8:801011e4 r7:00000000 r6:80a64b9c r5:9d8e35a0 r4:9d8e38dc [ 515.963008] [<802365d0>] (____fput) from [<80131ca4>] (task_work_run+0x7c/0xa0) [ 515.970333] [<80131c28>] (task_work_run) from [<80106884>] (do_work_pending+0x4a8/0x578) [ 515.978421] r7:801011e4 r6:80a07008 r5:9d88ffb0 r4:ffffe000 [ 515.984086] [<801063dc>] (do_work_pending) from [<8010106c>] (slow_work_pending+0xc/0x20) [ 515.992247] Exception stack(0x9d88ffb0 to 0x9d88fff8) [ 515.997296] ffa0: 00000000 76f18094 00000000 00000000 [ 516.005473] ffc0: 00000007 00176778 7eda4c20 00000006 00000000 00000000 48e20fa4 00000000 [ 516.013642] ffe0: 00000002 7eda4b08 00000000 48f91efc 80000010 00000007 [ 516.020257] r10:00000000 r9:9d88e000 r8:801011e4 r7:00000006 r6:7eda4c20 r5:00176778 [ 516.028072] r4:00000007 [ 516.030606] ---[ end trace 12c04aadef8afbbc ]---
To prevent this issue, this commit adds clock status checking logic into the Aspeed video engine driver.
Signed-off-by: Jae Hyun Yoo jae.hyun.yoo@linux.intel.com Reviewed-by: Eddie James eajames@linux.ibm.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/aspeed-video.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/media/platform/aspeed-video.c b/drivers/media/platform/aspeed-video.c index 76d7512c82a3..de0f192afa8b 100644 --- a/drivers/media/platform/aspeed-video.c +++ b/drivers/media/platform/aspeed-video.c @@ -187,6 +187,7 @@ enum { VIDEO_STREAMING, VIDEO_FRAME_INPRG, VIDEO_STOPPED, + VIDEO_CLOCKS_ON, };
struct aspeed_video_addr { @@ -483,19 +484,29 @@ static void aspeed_video_enable_mode_detect(struct aspeed_video *video)
static void aspeed_video_off(struct aspeed_video *video) { + if (!test_bit(VIDEO_CLOCKS_ON, &video->flags)) + return; + /* Disable interrupts */ aspeed_video_write(video, VE_INTERRUPT_CTRL, 0);
/* Turn off the relevant clocks */ clk_disable_unprepare(video->vclk); clk_disable_unprepare(video->eclk); + + clear_bit(VIDEO_CLOCKS_ON, &video->flags); }
static void aspeed_video_on(struct aspeed_video *video) { + if (test_bit(VIDEO_CLOCKS_ON, &video->flags)) + return; + /* Turn on the relevant clocks */ clk_prepare_enable(video->eclk); clk_prepare_enable(video->vclk); + + set_bit(VIDEO_CLOCKS_ON, &video->flags); }
static void aspeed_video_bufs_done(struct aspeed_video *video,
[ Upstream commit ee326fd01e79dfa42014d55931260b68b9fa3273 ]
Current dwmac4_flow_ctrl will not clear GMAC_RX_FLOW_CTRL_RFE/GMAC_RX_FLOW_CTRL_RFE bits, so MAC hw will keep flow control on although expecting flow control off by ethtool. Add codes to fix it.
Fixes: 477286b53f55 ("stmmac: add GMAC4 core support") Signed-off-by: Biao Huang biao.huang@mediatek.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c index 206170d0bf81..e3850938cf2f 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c @@ -474,8 +474,9 @@ static void dwmac4_flow_ctrl(struct mac_device_info *hw, unsigned int duplex, if (fc & FLOW_RX) { pr_debug("\tReceive Flow-Control ON\n"); flow |= GMAC_RX_FLOW_CTRL_RFE; - writel(flow, ioaddr + GMAC_RX_FLOW_CTRL); } + writel(flow, ioaddr + GMAC_RX_FLOW_CTRL); + if (fc & FLOW_TX) { pr_debug("\tTransmit Flow-Control ON\n");
@@ -483,7 +484,7 @@ static void dwmac4_flow_ctrl(struct mac_device_info *hw, unsigned int duplex, pr_debug("\tduplex mode: PAUSE %d\n", pause_time);
for (queue = 0; queue < tx_cnt; queue++) { - flow |= GMAC_TX_FLOW_CTRL_TFE; + flow = GMAC_TX_FLOW_CTRL_TFE;
if (duplex) flow |= @@ -491,6 +492,9 @@ static void dwmac4_flow_ctrl(struct mac_device_info *hw, unsigned int duplex,
writel(flow, ioaddr + GMAC_QX_TX_FLOW_CTRL(queue)); } + } else { + for (queue = 0; queue < tx_cnt; queue++) + writel(0, ioaddr + GMAC_QX_TX_FLOW_CTRL(queue)); } }
[ Upstream commit d2facb4b3983425f6776c24dd678a82dbe673773 ]
the default value of tx-frames is 25, it's too late when passing tstamp to stack, then the ptp4l will fail:
ptp4l -i eth0 -f gPTP.cfg -m ptp4l: selected /dev/ptp0 as PTP clock ptp4l: port 1: INITIALIZING to LISTENING on INITIALIZE ptp4l: port 0: INITIALIZING to LISTENING on INITIALIZE ptp4l: port 1: link up ptp4l: timed out while polling for tx timestamp ptp4l: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug ptp4l: port 1: send peer delay response failed ptp4l: port 1: LISTENING to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
ptp4l tests pass when changing the tx-frames from 25 to 1 with ethtool -C option. It should be fine to set tx-frames default value to 1, so ptp4l will pass by default.
Signed-off-by: Biao Huang biao.huang@mediatek.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/common.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h index ceb0d23f5041..c265cc5770e8 100644 --- a/drivers/net/ethernet/stmicro/stmmac/common.h +++ b/drivers/net/ethernet/stmicro/stmmac/common.h @@ -251,7 +251,7 @@ struct stmmac_safety_stats { #define STMMAC_COAL_TX_TIMER 1000 #define STMMAC_MAX_COAL_TX_TICK 100000 #define STMMAC_TX_MAX_FRAMES 256 -#define STMMAC_TX_FRAMES 25 +#define STMMAC_TX_FRAMES 1
/* Packets types */ enum packets_types {
[ Upstream commit 89332590427235680236b9470e851afc49b3caa1 ]
When performing a transformation the hardware is given result descriptors to save the result data. Those result descriptors are batched using a 'first' and a 'last' bit. There are cases were more descriptors than needed are given to the engine, leading to the engine only using some of them, and not setting the last bit on the last descriptor we gave. This causes issues were the driver and the hardware aren't in sync anymore about the number of result descriptors given (as the driver do not give a pool of descriptor to use for any transformation, but a pool of descriptors to use *per* transformation).
This patch fixes it by attaching the number of given result descriptors to the requests, and by using this number instead of the 'last' bit found on the descriptors to process them.
Signed-off-by: Antoine Tenart antoine.tenart@bootlin.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- .../crypto/inside-secure/safexcel_cipher.c | 24 ++++++++++++++----- 1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/drivers/crypto/inside-secure/safexcel_cipher.c b/drivers/crypto/inside-secure/safexcel_cipher.c index de4be10b172f..ccacdcf07ffc 100644 --- a/drivers/crypto/inside-secure/safexcel_cipher.c +++ b/drivers/crypto/inside-secure/safexcel_cipher.c @@ -51,6 +51,8 @@ struct safexcel_cipher_ctx {
struct safexcel_cipher_req { enum safexcel_cipher_direction direction; + /* Number of result descriptors associated to the request */ + unsigned int rdescs; bool needs_inv; };
@@ -333,7 +335,10 @@ static int safexcel_handle_req_result(struct safexcel_crypto_priv *priv, int rin
*ret = 0;
- do { + if (unlikely(!sreq->rdescs)) + return 0; + + while (sreq->rdescs--) { rdesc = safexcel_ring_next_rptr(priv, &priv->ring[ring].rdr); if (IS_ERR(rdesc)) { dev_err(priv->dev, @@ -346,7 +351,7 @@ static int safexcel_handle_req_result(struct safexcel_crypto_priv *priv, int rin *ret = safexcel_rdesc_check_errors(priv, rdesc);
ndesc++; - } while (!rdesc->last_seg); + }
safexcel_complete(priv, ring);
@@ -501,6 +506,7 @@ static int safexcel_send_req(struct crypto_async_request *base, int ring, static int safexcel_handle_inv_result(struct safexcel_crypto_priv *priv, int ring, struct crypto_async_request *base, + struct safexcel_cipher_req *sreq, bool *should_complete, int *ret) { struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(base->tfm); @@ -509,7 +515,10 @@ static int safexcel_handle_inv_result(struct safexcel_crypto_priv *priv,
*ret = 0;
- do { + if (unlikely(!sreq->rdescs)) + return 0; + + while (sreq->rdescs--) { rdesc = safexcel_ring_next_rptr(priv, &priv->ring[ring].rdr); if (IS_ERR(rdesc)) { dev_err(priv->dev, @@ -522,7 +531,7 @@ static int safexcel_handle_inv_result(struct safexcel_crypto_priv *priv, *ret = safexcel_rdesc_check_errors(priv, rdesc);
ndesc++; - } while (!rdesc->last_seg); + }
safexcel_complete(priv, ring);
@@ -564,7 +573,7 @@ static int safexcel_skcipher_handle_result(struct safexcel_crypto_priv *priv,
if (sreq->needs_inv) { sreq->needs_inv = false; - err = safexcel_handle_inv_result(priv, ring, async, + err = safexcel_handle_inv_result(priv, ring, async, sreq, should_complete, ret); } else { err = safexcel_handle_req_result(priv, ring, async, req->src, @@ -587,7 +596,7 @@ static int safexcel_aead_handle_result(struct safexcel_crypto_priv *priv,
if (sreq->needs_inv) { sreq->needs_inv = false; - err = safexcel_handle_inv_result(priv, ring, async, + err = safexcel_handle_inv_result(priv, ring, async, sreq, should_complete, ret); } else { err = safexcel_handle_req_result(priv, ring, async, req->src, @@ -633,6 +642,8 @@ static int safexcel_skcipher_send(struct crypto_async_request *async, int ring, ret = safexcel_send_req(async, ring, sreq, req->src, req->dst, req->cryptlen, 0, 0, req->iv, commands, results); + + sreq->rdescs = *results; return ret; }
@@ -655,6 +666,7 @@ static int safexcel_aead_send(struct crypto_async_request *async, int ring, req->cryptlen, req->assoclen, crypto_aead_authsize(tfm), req->iv, commands, results); + sreq->rdescs = *results; return ret; }
[ Upstream commit a19a0582363b9a5f8ba812f34f1b8df394898780 ]
When a valid MAC address is not found the current messages are shown:
fec 2188000.ethernet (unnamed net_device) (uninitialized): Invalid MAC address: 00:00:00:00:00:00 fec 2188000.ethernet (unnamed net_device) (uninitialized): Using random MAC address: aa:9f:25:eb:7e:aa
Since the network device has not been registered at this point, it is better to use dev_err()/dev_info() instead, which will provide cleaner log messages like these:
fec 2188000.ethernet: Invalid MAC address: 00:00:00:00:00:00 fec 2188000.ethernet: Using random MAC address: aa:9f:25:eb:7e:aa
Tested on a imx6dl-pico-pi board.
Signed-off-by: Fabio Estevam festevam@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/fec_main.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 38f10f7dcbc3..831bb709e783 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -1689,10 +1689,10 @@ static void fec_get_mac(struct net_device *ndev) */ if (!is_valid_ether_addr(iap)) { /* Report it and use a random ethernet address instead */ - netdev_err(ndev, "Invalid MAC address: %pM\n", iap); + dev_err(&fep->pdev->dev, "Invalid MAC address: %pM\n", iap); eth_hw_addr_random(ndev); - netdev_info(ndev, "Using random MAC address: %pM\n", - ndev->dev_addr); + dev_info(&fep->pdev->dev, "Using random MAC address: %pM\n", + ndev->dev_addr); return; }
[ Upstream commit 7de44285c1f69ccfbe8be1d6a16fcd956681fee6 ]
It is possible that the interrupt handler fires and frees up space in the TX ring in between checking for sufficient TX ring space and stopping the TX queue in axienet_start_xmit. If this happens, the queue wake from the interrupt handler will occur before the queue is stopped, causing a lost wakeup and the adapter's transmit hanging.
To avoid this, after stopping the queue, check again whether there is sufficient space in the TX ring. If so, wake up the queue again.
Signed-off-by: Robert Hancock hancock@sedsystems.ca Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/xilinx/xilinx_axienet_main.c | 20 ++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c index 831967f6eff8..65c16772e589 100644 --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c @@ -615,6 +615,10 @@ static void axienet_start_xmit_done(struct net_device *ndev)
ndev->stats.tx_packets += packets; ndev->stats.tx_bytes += size; + + /* Matches barrier in axienet_start_xmit */ + smp_mb(); + netif_wake_queue(ndev); }
@@ -670,9 +674,19 @@ axienet_start_xmit(struct sk_buff *skb, struct net_device *ndev) cur_p = &lp->tx_bd_v[lp->tx_bd_tail];
if (axienet_check_tx_bd_space(lp, num_frag)) { - if (!netif_queue_stopped(ndev)) - netif_stop_queue(ndev); - return NETDEV_TX_BUSY; + if (netif_queue_stopped(ndev)) + return NETDEV_TX_BUSY; + + netif_stop_queue(ndev); + + /* Matches barrier in axienet_start_xmit_done */ + smp_mb(); + + /* Space might have just been freed - check again */ + if (axienet_check_tx_bd_space(lp, num_frag)) + return NETDEV_TX_BUSY; + + netif_wake_queue(ndev); }
if (skb->ip_summed == CHECKSUM_PARTIAL) {
[ Upstream commit 04310324c6f482921c071444833e70fe861b73d9 ]
When a CQ-enabled device uses QEBSM for SBAL state inspection, get_buf_states() can return the PENDING state for an Output Queue. get_outbound_buffer_frontier() isn't prepared for this, and any PENDING buffer will permanently stall all further completion processing on this Queue.
This isn't a concern for non-QEBSM devices, as get_buf_states() for such devices will manually turn PENDING buffers into EMPTY ones.
Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks") Signed-off-by: Julian Wiedmann jwi@linux.ibm.com Signed-off-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/s390/cio/qdio_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/s390/cio/qdio_main.c b/drivers/s390/cio/qdio_main.c index 7b7620de2acd..730c4e68094b 100644 --- a/drivers/s390/cio/qdio_main.c +++ b/drivers/s390/cio/qdio_main.c @@ -736,6 +736,7 @@ static int get_outbound_buffer_frontier(struct qdio_q *q, unsigned int start)
switch (state) { case SLSB_P_OUTPUT_EMPTY: + case SLSB_P_OUTPUT_PENDING: /* the adapter got it */ DBF_DEV_EVENT(DBF_INFO, q->irq_ptr, "out empty:%1d %02x", q->nr, count);
[ Upstream commit 6d8e294bf5f0e85c34e8b14b064e2965f53f38b0 ]
When inserting random PFNs for debugging the CEC through (debugfs)/ras/cec/pfn, depending on the return value of pfn_set(), multiple values get inserted per a single write.
That is because simple_attr_write() interprets a retval of 0 as success and claims the whole input. However, pfn_set() returns the cec_add_elem() value, which, if > 0 and smaller than the whole input length, makes glibc continue issuing the write syscall until there's input left:
pfn_set simple_attr_write debugfs_attr_write full_proxy_write vfs_write ksys_write do_syscall_64 entry_SYSCALL_64_after_hwframe
leading to those repeated calls.
Return 0 to fix that.
Signed-off-by: Borislav Petkov bp@suse.de Cc: Tony Luck tony.luck@intel.com Cc: linux-edac linux-edac@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ras/cec.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c index 673f8a128397..f5795adc5a6e 100644 --- a/drivers/ras/cec.c +++ b/drivers/ras/cec.c @@ -369,7 +369,9 @@ static int pfn_set(void *data, u64 val) { *(u64 *)data = val;
- return cec_add_elem(val); + cec_add_elem(val); + + return 0; }
DEFINE_DEBUGFS_ATTRIBUTE(pfn_ops, u64_get, pfn_set, "0x%llx\n");
[ Upstream commit 2158e856f56bb762ef90f3ec244d41a519826f75 ]
sfp_check_state can potentially be called by both a threaded IRQ handler and delayed work. If it is concurrently called, it could result in incorrect state management. Add a st_mutex to protect the state - this lock gets taken outside of code that checks and handle state changes, and the existing sm_mutex nests inside of it.
Suggested-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Robert Hancock hancock@sedsystems.ca Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/sfp.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c index 71812be0ac64..b6efd2d41dce 100644 --- a/drivers/net/phy/sfp.c +++ b/drivers/net/phy/sfp.c @@ -186,10 +186,11 @@ struct sfp { struct gpio_desc *gpio[GPIO_MAX];
bool attached; + struct mutex st_mutex; /* Protects state */ unsigned int state; struct delayed_work poll; struct delayed_work timeout; - struct mutex sm_mutex; + struct mutex sm_mutex; /* Protects state machine */ unsigned char sm_mod_state; unsigned char sm_dev_state; unsigned short sm_state; @@ -1719,6 +1720,7 @@ static void sfp_check_state(struct sfp *sfp) { unsigned int state, i, changed;
+ mutex_lock(&sfp->st_mutex); state = sfp_get_state(sfp); changed = state ^ sfp->state; changed &= SFP_F_PRESENT | SFP_F_LOS | SFP_F_TX_FAULT; @@ -1744,6 +1746,7 @@ static void sfp_check_state(struct sfp *sfp) sfp_sm_event(sfp, state & SFP_F_LOS ? SFP_E_LOS_HIGH : SFP_E_LOS_LOW); rtnl_unlock(); + mutex_unlock(&sfp->st_mutex); }
static irqreturn_t sfp_irq(int irq, void *data) @@ -1774,6 +1777,7 @@ static struct sfp *sfp_alloc(struct device *dev) sfp->dev = dev;
mutex_init(&sfp->sm_mutex); + mutex_init(&sfp->st_mutex); INIT_DELAYED_WORK(&sfp->poll, sfp_poll); INIT_DELAYED_WORK(&sfp->timeout, sfp_timeout);
[ Upstream commit f4f5748bfec94cf418e49bf05f0c81a1b9ebc950 ]
When nla_parse fails, we should not use the results (the first argument). The fix checks if it fails, and if so, returns its error code upstream.
Signed-off-by: Aditya Pakki pakki001@umn.edu Signed-off-by: Jozsef Kadlecsik kadlec@blackhole.kfki.hu Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipset/ip_set_core.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c index 3cdf171cd468..16afa0df4004 100644 --- a/net/netfilter/ipset/ip_set_core.c +++ b/net/netfilter/ipset/ip_set_core.c @@ -1541,10 +1541,14 @@ call_ad(struct sock *ctnl, struct sk_buff *skb, struct ip_set *set, memcpy(&errmsg->msg, nlh, nlh->nlmsg_len); cmdattr = (void *)&errmsg->msg + min_len;
- nla_parse_deprecated(cda, IPSET_ATTR_CMD_MAX, cmdattr, - nlh->nlmsg_len - min_len, - ip_set_adt_policy, NULL); + ret = nla_parse_deprecated(cda, IPSET_ATTR_CMD_MAX, cmdattr, + nlh->nlmsg_len - min_len, + ip_set_adt_policy, NULL);
+ if (ret) { + nlmsg_free(skb2); + return ret; + } errline = nla_data(cda[IPSET_ATTR_LINENO]);
*errline = lineno;
[ Upstream commit 11921796f4799ca9c61c4b22cc54d84aa69f8a35 ]
If a fresh array block is allocated during resize, the current in-memory set size should be increased by the size of the block, not replaced by it.
Before the fix, adding entries to a hash set type, leading to a table resize, caused an inconsistent memory size to be reported. This becomes more obvious when swapping sets with similar sizes:
# cat hash_ip_size.sh #!/bin/sh FAIL_RETRIES=10
tries=0 while [ ${tries} -lt ${FAIL_RETRIES} ]; do ipset create t1 hash:ip for i in `seq 1 4345`; do ipset add t1 1.2.$((i / 255)).$((i % 255)) done t1_init="$(ipset list t1|sed -n 's/Size in memory: (.*)/\1/p')"
ipset create t2 hash:ip for i in `seq 1 4360`; do ipset add t2 1.2.$((i / 255)).$((i % 255)) done t2_init="$(ipset list t2|sed -n 's/Size in memory: (.*)/\1/p')"
ipset swap t1 t2 t1_swap="$(ipset list t1|sed -n 's/Size in memory: (.*)/\1/p')" t2_swap="$(ipset list t2|sed -n 's/Size in memory: (.*)/\1/p')"
ipset destroy t1 ipset destroy t2 tries=$((tries + 1))
if [ ${t1_init} -lt 10000 ] || [ ${t2_init} -lt 10000 ]; then echo "FAIL after ${tries} tries:" echo "T1 size ${t1_init}, after swap ${t1_swap}" echo "T2 size ${t2_init}, after swap ${t2_swap}" exit 1 fi done echo "PASS" # echo -n 'func hash_ip4_resize +p' > /sys/kernel/debug/dynamic_debug/control # ./hash_ip_size.sh [ 2035.018673] attempt to resize set t1 from 10 to 11, t 00000000fe6551fa [ 2035.078583] set t1 resized from 10 (00000000fe6551fa) to 11 (00000000172a0163) [ 2035.080353] Table destroy by resize 00000000fe6551fa FAIL after 4 tries: T1 size 9064, after swap 71128 T2 size 71128, after swap 9064
Reported-by: NOYB JunkYardMail1@Frontier.com Fixes: 9e41f26a505c ("netfilter: ipset: Count non-static extension memory for userspace") Signed-off-by: Stefano Brivio sbrivio@redhat.com Signed-off-by: Jozsef Kadlecsik kadlec@blackhole.kfki.hu Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipset/ip_set_hash_gen.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h index 10f619625abd..175f8fedcfaf 100644 --- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -622,7 +622,7 @@ mtype_resize(struct ip_set *set, bool retried) goto cleanup; } m->size = AHASH_INIT_SIZE; - extsize = ext_size(AHASH_INIT_SIZE, dsize); + extsize += ext_size(AHASH_INIT_SIZE, dsize); RCU_INIT_POINTER(hbucket(t, key), m); } else if (m->pos >= m->size) { struct hbucket *ht;
[ Upstream commit e45c48a9a4d20ebc7b639a62c3ef8f4b08007027 ]
This patch adds the necessary intelligence to properly compute the value of 'old' and 'head' when operating in snapshot mode. That way we can get the latest information in the AUX buffer and be compatible with the generic AUX ring buffer mechanic.
Tester notes:
Leo, have you had the chance to test/review this one? Suzuki?
Sure. I applied this patch on the perf/core branch (with latest commit 3e4fbf36c1e3 'perf augmented_raw_syscalls: Move reading filename to the loop') and passed testing with below steps:
# perf record -e cs_etm/@tmc_etr0/ -S -m,64 --per-thread ./sort & [1] 19097 Bubble sorting array of 30000 elements
# kill -USR2 19097 # kill -USR2 19097 # kill -USR2 19097 [ perf record: Woken up 4 times to write data ] [ perf record: Captured and wrote 0.753 MB perf.data ]
Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org Tested-by: Leo Yan leo.yan@linaro.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Peter Zijlstra peterz@infradead.org Cc: Suzuki Poulouse suzuki.poulose@arm.com Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190605161633.12245-1-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/arch/arm/util/cs-etm.c | 127 +++++++++++++++++++++++++++++- 1 file changed, 123 insertions(+), 4 deletions(-)
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 911426721170..0a278bbcaba6 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -31,6 +31,8 @@ struct cs_etm_recording { struct auxtrace_record itr; struct perf_pmu *cs_etm_pmu; struct perf_evlist *evlist; + int wrapped_cnt; + bool *wrapped; bool snapshot_mode; size_t snapshot_size; }; @@ -536,16 +538,131 @@ static int cs_etm_info_fill(struct auxtrace_record *itr, return 0; }
-static int cs_etm_find_snapshot(struct auxtrace_record *itr __maybe_unused, +static int cs_etm_alloc_wrapped_array(struct cs_etm_recording *ptr, int idx) +{ + bool *wrapped; + int cnt = ptr->wrapped_cnt; + + /* Make @ptr->wrapped as big as @idx */ + while (cnt <= idx) + cnt++; + + /* + * Free'ed in cs_etm_recording_free(). Using realloc() to avoid + * cross compilation problems where the host's system supports + * reallocarray() but not the target. + */ + wrapped = realloc(ptr->wrapped, cnt * sizeof(bool)); + if (!wrapped) + return -ENOMEM; + + wrapped[cnt - 1] = false; + ptr->wrapped_cnt = cnt; + ptr->wrapped = wrapped; + + return 0; +} + +static bool cs_etm_buffer_has_wrapped(unsigned char *buffer, + size_t buffer_size, u64 head) +{ + u64 i, watermark; + u64 *buf = (u64 *)buffer; + size_t buf_size = buffer_size; + + /* + * We want to look the very last 512 byte (chosen arbitrarily) in + * the ring buffer. + */ + watermark = buf_size - 512; + + /* + * @head is continuously increasing - if its value is equal or greater + * than the size of the ring buffer, it has wrapped around. + */ + if (head >= buffer_size) + return true; + + /* + * The value of @head is somewhere within the size of the ring buffer. + * This can be that there hasn't been enough data to fill the ring + * buffer yet or the trace time was so long that @head has numerically + * wrapped around. To find we need to check if we have data at the very + * end of the ring buffer. We can reliably do this because mmap'ed + * pages are zeroed out and there is a fresh mapping with every new + * session. + */ + + /* @head is less than 512 byte from the end of the ring buffer */ + if (head > watermark) + watermark = head; + + /* + * Speed things up by using 64 bit transactions (see "u64 *buf" above) + */ + watermark >>= 3; + buf_size >>= 3; + + /* + * If we find trace data at the end of the ring buffer, @head has + * been there and has numerically wrapped around at least once. + */ + for (i = watermark; i < buf_size; i++) + if (buf[i]) + return true; + + return false; +} + +static int cs_etm_find_snapshot(struct auxtrace_record *itr, int idx, struct auxtrace_mmap *mm, - unsigned char *data __maybe_unused, + unsigned char *data, u64 *head, u64 *old) { + int err; + bool wrapped; + struct cs_etm_recording *ptr = + container_of(itr, struct cs_etm_recording, itr); + + /* + * Allocate memory to keep track of wrapping if this is the first + * time we deal with this *mm. + */ + if (idx >= ptr->wrapped_cnt) { + err = cs_etm_alloc_wrapped_array(ptr, idx); + if (err) + return err; + } + + /* + * Check to see if *head has wrapped around. If it hasn't only the + * amount of data between *head and *old is snapshot'ed to avoid + * bloating the perf.data file with zeros. But as soon as *head has + * wrapped around the entire size of the AUX ring buffer it taken. + */ + wrapped = ptr->wrapped[idx]; + if (!wrapped && cs_etm_buffer_has_wrapped(data, mm->len, *head)) { + wrapped = true; + ptr->wrapped[idx] = true; + } + pr_debug3("%s: mmap index %d old head %zu new head %zu size %zu\n", __func__, idx, (size_t)*old, (size_t)*head, mm->len);
- *old = *head; - *head += mm->len; + /* No wrap has occurred, we can just use *head and *old. */ + if (!wrapped) + return 0; + + /* + * *head has wrapped around - adjust *head and *old to pickup the + * entire content of the AUX buffer. + */ + if (*head >= mm->len) { + *old = *head - mm->len; + } else { + *head += mm->len; + *old = *head - mm->len; + }
return 0; } @@ -586,6 +703,8 @@ static void cs_etm_recording_free(struct auxtrace_record *itr) { struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr); + + zfree(&ptr->wrapped); free(ptr); }
[ Upstream commit 53fe307dfd309e425b171f6272d64296a54f4dff ]
Command
# perf test -Fv 6
fails with error
running test 100 'kvm-s390:kvm_s390_create_vm' failed to parse event 'kvm-s390:kvm_s390_create_vm', err -1, str 'unknown tracepoint' event syntax error: 'kvm-s390:kvm_s390_create_vm' ___ unknown tracepoint
when the kvm module is not loaded or not built in.
Fix this by adding a valid function which tests if the module is loaded. Loaded modules (or builtin KVM support) have a directory named /sys/kernel/debug/tracing/events/kvm-s390 for this tracepoint.
Check for existence of this directory.
Signed-off-by: Thomas Richter tmricht@linux.ibm.com Reviewed-by: Christian Borntraeger borntraeger@de.ibm.com Cc: Heiko Carstens heiko.carstens@de.ibm.com Cc: Hendrik Brueckner brueckner@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/20190604053504.43073-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/parse-events.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c index 4a69c07f4101..8f3c80e13584 100644 --- a/tools/perf/tests/parse-events.c +++ b/tools/perf/tests/parse-events.c @@ -18,6 +18,32 @@ #define PERF_TP_SAMPLE_TYPE (PERF_SAMPLE_RAW | PERF_SAMPLE_TIME | \ PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD)
+#if defined(__s390x__) +/* Return true if kvm module is available and loaded. Test this + * and retun success when trace point kvm_s390_create_vm + * exists. Otherwise this test always fails. + */ +static bool kvm_s390_create_vm_valid(void) +{ + char *eventfile; + bool rc = false; + + eventfile = get_events_file("kvm-s390"); + + if (eventfile) { + DIR *mydir = opendir(eventfile); + + if (mydir) { + rc = true; + closedir(mydir); + } + put_events_file(eventfile); + } + + return rc; +} +#endif + static int test__checkevent_tracepoint(struct perf_evlist *evlist) { struct perf_evsel *evsel = perf_evlist__first(evlist); @@ -1642,6 +1668,7 @@ static struct evlist_test test__events[] = { { .name = "kvm-s390:kvm_s390_create_vm", .check = test__checkevent_tracepoint, + .valid = kvm_s390_create_vm_valid, .id = 100, }, #endif
[ Upstream commit 8a07aa4e9b7b0222129c07afff81634a884b2866 ]
Debugging a OOM error using the TUI interface revealed this issue on s390:
[tmricht@m83lp54 perf]$ cat /proc/kallsyms |sort .... 00000001119b7158 B radix_tree_node_cachep 00000001119b8000 B __bss_stop 00000001119b8000 B _end 000003ff80002850 t autofs_mount [autofs4] 000003ff80002868 t autofs_show_options [autofs4] 000003ff80002a98 t autofs_evict_inode [autofs4] ....
There is a huge gap between the last kernel symbol __bss_stop/_end and the first kernel module symbol autofs_mount (from autofs4 module).
After reading the kernel symbol table via functions:
dso__load() +--> dso__load_kernel_sym() +--> dso__load_kallsyms() +--> __dso_load_kallsyms() +--> symbols__fixup_end()
the symbol __bss_stop has a start address of 1119b8000 and an end address of 3ff80002850, as can be seen by this debug statement:
symbols__fixup_end __bss_stop start:0x1119b8000 end:0x3ff80002850
The size of symbol __bss_stop is 0x3fe6e64a850 bytes! It is the last kernel symbol and fills up the space until the first kernel module symbol.
This size kills the TUI interface when executing the following code:
process_sample_event() hist_entry_iter__add() hist_iter__report_callback() hist_entry__inc_addr_samples() symbol__inc_addr_samples(symbol = __bss_stop) symbol__cycles_hist() annotated_source__alloc_histograms(..., symbol__size(sym), ...)
This function allocates memory to save sample histograms. The symbol_size() marco is defined as sym->end - sym->start, which results in above value of 0x3fe6e64a850 bytes and the call to calloc() in annotated_source__alloc_histograms() fails.
The histgram memory allocation might fail, make this failure no-fatal and continue processing.
Output before: [tmricht@m83lp54 perf]$ ./perf --debug stderr=1 report -vvvvv \ -i ~/slow.data 2>/tmp/2 [tmricht@m83lp54 perf]$ tail -5 /tmp/2 __symbol__inc_addr_samples(875): ENOMEM! sym->name=__bss_stop, start=0x1119b8000, addr=0x2aa0005eb08, end=0x3ff80002850, func: 0 problem adding hist entry, skipping event 0x938b8 [0x8]: failed to process type: 68 [Cannot allocate memory] [tmricht@m83lp54 perf]$
Output after: [tmricht@m83lp54 perf]$ ./perf --debug stderr=1 report -vvvvv \ -i ~/slow.data 2>/tmp/2 [tmricht@m83lp54 perf]$ tail -5 /tmp/2 symbol__inc_addr_samples map:0x1597830 start:0x110730000 end:0x3ff80002850 symbol__hists notes->src:0x2aa2a70 nr_hists:1 symbol__inc_addr_samples sym:unlink_anon_vmas src:0x2aa2a70 __symbol__inc_addr_samples: addr=0x11094c69e 0x11094c670 unlink_anon_vmas: period++ [addr: 0x11094c69e, 0x2e, evidx=0] => nr_samples: 1, period: 526008 [tmricht@m83lp54 perf]$
There is no error about failed memory allocation and the TUI interface shows all entries.
Signed-off-by: Thomas Richter tmricht@linux.ibm.com Reviewed-by: Hendrik Brueckner brueckner@linux.ibm.com Cc: Heiko Carstens heiko.carstens@de.ibm.com Cc: Hendrik Brueckner brueckner@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/90cb5607-3e12-5167-682d-978eba7dafa8@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/annotate.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 79db038b56f2..c8ce13419d9b 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -931,9 +931,8 @@ static int symbol__inc_addr_samples(struct symbol *sym, struct map *map, if (sym == NULL) return 0; src = symbol__hists(sym, evsel->evlist->nr_entries); - if (src == NULL) - return -ENOMEM; - return __symbol__inc_addr_samples(sym, map, src, evsel->idx, addr, sample); + return (src) ? __symbol__inc_addr_samples(sym, map, src, evsel->idx, + addr, sample) : 0; }
static int symbol__account_cycles(u64 addr, u64 start,
[ Upstream commit 89cceaa939171fafa153d4bf637b39e396bbd785 ]
An error "implicit declaration of function 'reallocarray'" can be thrown with the following steps:
$ cd tools/testing/selftests/bpf $ make clean && make CC=<Path to GCC 4.8.5> $ make clean && make CC=<Path to GCC 7.x>
The cause is that the feature folder generated by GCC 4.8.5 is not removed, leaving feature-reallocarray being 1, which causes reallocarray not defined when re-compliing with GCC 7.x. This diff adds feature folder to EXTRA_CLEAN to avoid this problem.
v2: Rephrase the commit message.
Signed-off-by: Hechao Li hechaol@fb.com Acked-by: Andrii Nakryiko andriin@fb.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/bpf/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index e36356e2377e..1c9511262947 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -275,4 +275,5 @@ $(OUTPUT)/verifier/tests.h: $(VERIFIER_TESTS_DIR) $(VERIFIER_TEST_FILES) ) > $(VERIFIER_TESTS_H))
EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(ALU32_BUILD_DIR) \ - $(VERIFIER_TESTS_H) $(PROG_TESTS_H) $(MAP_TESTS_H) + $(VERIFIER_TESTS_H) $(PROG_TESTS_H) $(MAP_TESTS_H) \ + feature
[ Upstream commit c64a9e804ccf86eb202bfd1c6a8c5233c75a0431 ]
The Meson-G12A SoC uses the same GPIO interrupt controller IP block as the other Meson SoCs, A totle of 100 pins can be spied on, which is the sum of:
- 223:100 undefined (no interrupt) - 99:97 3 pins on bank GPIOE - 96:77 20 pins on bank GPIOX - 76:61 16 pins on bank GPIOA - 60:53 8 pins on bank GPIOC - 52:37 16 pins on bank BOOT - 36:28 9 pins on bank GPIOH - 27:12 16 pins on bank GPIOZ - 11:0 12 pins in the AO domain
Signed-off-by: Xingyu Chen xingyu.chen@amlogic.com Signed-off-by: Jianxin Pan jianxin.pan@amlogic.com Signed-off-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-meson-gpio.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/irqchip/irq-meson-gpio.c b/drivers/irqchip/irq-meson-gpio.c index 8eb92eb98f54..dcdc23b9dce6 100644 --- a/drivers/irqchip/irq-meson-gpio.c +++ b/drivers/irqchip/irq-meson-gpio.c @@ -60,6 +60,7 @@ static const struct of_device_id meson_irq_gpio_matches[] = { { .compatible = "amlogic,meson-gxbb-gpio-intc", .data = &gxbb_params }, { .compatible = "amlogic,meson-gxl-gpio-intc", .data = &gxl_params }, { .compatible = "amlogic,meson-axg-gpio-intc", .data = &axg_params }, + { .compatible = "amlogic,meson-g12a-gpio-intc", .data = &axg_params }, { } };
[ Upstream commit 11a087f484bf15ff65f0a9f277aa5a61fd07ed2a ]
We need to check whether this work we are canceling actually is initialized.
Signed-off-by: Oliver Neukum oneukum@suse.com Reported-by: syzbot+2e1ef9188251d9cc7944@syzkaller.appspotmail.com Signed-off-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/usb/uvc/uvc_ctrl.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c index 26163a5bde7d..e399b9fad757 100644 --- a/drivers/media/usb/uvc/uvc_ctrl.c +++ b/drivers/media/usb/uvc/uvc_ctrl.c @@ -2345,7 +2345,9 @@ void uvc_ctrl_cleanup_device(struct uvc_device *dev) struct uvc_entity *entity; unsigned int i;
- cancel_work_sync(&dev->async_ctrl.work); + /* Can be uninitialized if we are aborting on probe error. */ + if (dev->async_ctrl.work.func) + cancel_work_sync(&dev->async_ctrl.work);
/* Free controls and control mappings for all entities. */ list_for_each_entry(entity, &dev->entities, list) {
[ Upstream commit 4e8c120de9268fc26f583268b9d22e7d37c4595f ]
New Gen3 R-Car platforms incorporate the FDP1 with an updated version register. No code change is required to support these targets, but they will currently report an error stating that the device can not be identified.
Update the driver to match against the new device types.
Signed-off-by: Kieran Bingham kieran.bingham+renesas@ideasonboard.com Signed-off-by: Laurent Pinchart laurent.pinchart+renesas@ideasonboard.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/rcar_fdp1.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/media/platform/rcar_fdp1.c b/drivers/media/platform/rcar_fdp1.c index 6a90bc4c476e..b8615a288e2b 100644 --- a/drivers/media/platform/rcar_fdp1.c +++ b/drivers/media/platform/rcar_fdp1.c @@ -257,6 +257,8 @@ MODULE_PARM_DESC(debug, "activate debug info"); #define FD1_IP_H3_ES1 0x02010101 #define FD1_IP_M3W 0x02010202 #define FD1_IP_H3 0x02010203 +#define FD1_IP_M3N 0x02010204 +#define FD1_IP_E3 0x02010205
/* LUTs */ #define FD1_LUT_DIF_ADJ 0x1000 @@ -2365,6 +2367,12 @@ static int fdp1_probe(struct platform_device *pdev) case FD1_IP_H3: dprintk(fdp1, "FDP1 Version R-Car H3\n"); break; + case FD1_IP_M3N: + dprintk(fdp1, "FDP1 Version R-Car M3N\n"); + break; + case FD1_IP_E3: + dprintk(fdp1, "FDP1 Version R-Car E3\n"); + break; default: dev_err(fdp1->dev, "FDP1 Unidentifiable (0x%08x)\n", hw_version);
[ Upstream commit ad0834dedaa15c3a176f783c0373f836e44b4700 ]
In case we expand an existing region, we unlink this latter and insert the larger one. In that case we should free the original region after the insertion. Also we can immediately return.
Fixes: 6c65fb318e8b ("iommu: iommu_get_group_resv_regions")
Signed-off-by: Eric Auger eric.auger@redhat.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/iommu.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 9f0a2844371c..30db41e9f15c 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -225,18 +225,21 @@ static int iommu_insert_resv_region(struct iommu_resv_region *new, pos = pos->next; } else if ((start >= a) && (end <= b)) { if (new->type == type) - goto done; + return 0; else pos = pos->next; } else { if (new->type == type) { phys_addr_t new_start = min(a, start); phys_addr_t new_end = max(b, end); + int ret;
list_del(&entry->list); entry->start = new_start; entry->length = new_end - new_start + 1; - iommu_insert_resv_region(entry, regions); + ret = iommu_insert_resv_region(entry, regions); + kfree(entry); + return ret; } else { pos = pos->next; } @@ -249,7 +252,6 @@ static int iommu_insert_resv_region(struct iommu_resv_region *new, return -ENOMEM;
list_add_tail(®ion->list, pos); -done: return 0; }
[ Upstream commit 64ea3e9094a1f13b96c33244a3fb3a0f45690bd2 ]
Commit 384ebe1c2849 ("gpio/omap: Add DT support to GPIO driver") added the register definition tables to the gpio-omap driver. Subsequently to that commit, commit 4e962e8998cc ("gpio/omap: remove cpu_is_omapxxxx() checks from *_runtime_resume()") added definitions for irqstatus_raw* registers to the legacy OMAP4 definitions, but missed the DT definitions.
This causes an unintentional change of behaviour for the 1.101 errata workaround on OMAP4 platforms. Fix this oversight.
Fixes: 4e962e8998cc ("gpio/omap: remove cpu_is_omapxxxx() checks from *_runtime_resume()") Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Grygorii Strashko grygorii.strashko@ti.com Tested-by: Tony Lindgren tony@atomide.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-omap.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c index 9276ef616430..7632c98aa3a4 100644 --- a/drivers/gpio/gpio-omap.c +++ b/drivers/gpio/gpio-omap.c @@ -1453,6 +1453,8 @@ static struct omap_gpio_reg_offs omap4_gpio_regs = { .clr_dataout = OMAP4_GPIO_CLEARDATAOUT, .irqstatus = OMAP4_GPIO_IRQSTATUS0, .irqstatus2 = OMAP4_GPIO_IRQSTATUS1, + .irqstatus_raw0 = OMAP4_GPIO_IRQSTATUSRAW0, + .irqstatus_raw1 = OMAP4_GPIO_IRQSTATUSRAW1, .irqenable = OMAP4_GPIO_IRQSTATUSSET0, .irqenable2 = OMAP4_GPIO_IRQSTATUSSET1, .set_irqenable = OMAP4_GPIO_IRQSTATUSSET0,
[ Upstream commit c859e0d479b3b4f6132fc12637c51e01492f31f6 ]
Documentation states:
NOTE: There must be a correlation between the wake-up enable and interrupt-enable registers. If a GPIO pin has a wake-up configured on it, it must also have the corresponding interrupt enabled (on one of the two interrupt lines).
Ensure that this condition is always satisfied by enabling the detection events after enabling the interrupt, and disabling the detection before disabling the interrupt. This ensures interrupt/wakeup events can not happen until both the wakeup and interrupt enables correlate.
If we do any clearing, clear between the interrupt enable/disable and trigger setting.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Grygorii Strashko grygorii.strashko@ti.com Tested-by: Tony Lindgren tony@atomide.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-omap.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c index 7632c98aa3a4..746aa9caf934 100644 --- a/drivers/gpio/gpio-omap.c +++ b/drivers/gpio/gpio-omap.c @@ -829,9 +829,9 @@ static void omap_gpio_irq_shutdown(struct irq_data *d)
raw_spin_lock_irqsave(&bank->lock, flags); bank->irq_usage &= ~(BIT(offset)); - omap_set_gpio_irqenable(bank, offset, 0); - omap_clear_gpio_irqstatus(bank, offset); omap_set_gpio_triggering(bank, offset, IRQ_TYPE_NONE); + omap_clear_gpio_irqstatus(bank, offset); + omap_set_gpio_irqenable(bank, offset, 0); if (!LINE_USED(bank->mod_usage, offset)) omap_clear_gpio_debounce(bank, offset); omap_disable_gpio_module(bank, offset); @@ -867,8 +867,8 @@ static void omap_gpio_mask_irq(struct irq_data *d) unsigned long flags;
raw_spin_lock_irqsave(&bank->lock, flags); - omap_set_gpio_irqenable(bank, offset, 0); omap_set_gpio_triggering(bank, offset, IRQ_TYPE_NONE); + omap_set_gpio_irqenable(bank, offset, 0); raw_spin_unlock_irqrestore(&bank->lock, flags); }
@@ -880,9 +880,6 @@ static void omap_gpio_unmask_irq(struct irq_data *d) unsigned long flags;
raw_spin_lock_irqsave(&bank->lock, flags); - if (trigger) - omap_set_gpio_triggering(bank, offset, trigger); - omap_set_gpio_irqenable(bank, offset, 1);
/* @@ -890,9 +887,13 @@ static void omap_gpio_unmask_irq(struct irq_data *d) * is cleared, thus after the handler has run. OMAP4 needs this done * after enabing the interrupt to clear the wakeup status. */ - if (bank->level_mask & BIT(offset)) + if (bank->regs->leveldetect0 && bank->regs->wkup_en && + trigger & (IRQ_TYPE_LEVEL_HIGH | IRQ_TYPE_LEVEL_LOW)) omap_clear_gpio_irqstatus(bank, offset);
+ if (trigger) + omap_set_gpio_triggering(bank, offset, trigger); + raw_spin_unlock_irqrestore(&bank->lock, flags); }
[ Upstream commit db057679de3e9e6a03c1bcd5aee09b0d25fd9f5b ]
On buses like SlimBus and SoundWire which does not support gather_writes yet in regmap, A bulk write on paged register would be silently ignored after programming page. This is because local variable 'ret' value in regmap_raw_write_impl() gets reset to 0 once page register is written successfully and the code below checks for 'ret' value to be -ENOTSUPP before linearising the write buffer to send to bus->write().
Fix this by resetting the 'ret' value to -ENOTSUPP in cases where gather_writes() is not supported or single register write is not possible.
Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/regmap/regmap.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c index f1025452bb39..19f57ccfbe1d 100644 --- a/drivers/base/regmap/regmap.c +++ b/drivers/base/regmap/regmap.c @@ -1637,6 +1637,8 @@ static int _regmap_raw_write_impl(struct regmap *map, unsigned int reg, map->format.reg_bytes + map->format.pad_bytes, val, val_len); + else + ret = -ENOTSUPP;
/* If that didn't work fall back on linearising by hand. */ if (ret == -ENOTSUPP) {
[ Upstream commit a522f1d0c381c42f3ace13b8bbeeccabdd6d2e5c ]
If an edge interrupt triggers while entering idle just before we save GPIO datain register to saved_datain, the triggered GPIO will not be noticed on wake-up. This is because the saved_datain and GPIO datain are the same on wake-up in omap_gpio_unidle(). Let's fix this by ignoring any pending edge interrupts for saved_datain.
This issue affects only idle states where the GPIO module internal wake-up path is operational. For deeper idle states where the GPIO module gets powered off, Linux generic wakeirqs must be used for the padconf wake-up events with pinctrl-single driver. For examples, please see "interrupts-extended" dts usage in many drivers.
This issue can be somewhat easily reproduced by pinging an idle system with smsc911x Ethernet interface configured IRQ_TYPE_EDGE_FALLING. At some point the smsc911x interrupts will just stop triggering. Also if WLCORE WLAN is used with EDGE interrupt like it's documentation specifies, we can see lost interrupts without this patch.
Note that in the long run we may be able to cancel entering idle by returning an error in gpio_omap_cpu_notifier() on pending interrupts. But let's fix the bug first.
Also note that because of the recent clean-up efforts this patch does not apply directly to older kernels. This does fix a long term issue though, and can be backported as needed.
Cc: Aaro Koskinen aaro.koskinen@iki.fi Cc: Grygorii Strashko grygorii.strashko@ti.com Cc: Keerthy j-keerthy@ti.com Cc: Ladislav Michl ladis@linux-mips.org Cc: Peter Ujfalusi peter.ujfalusi@ti.com Cc: Russell King rmk+kernel@armlinux.org.uk Cc: Tero Kristo t-kristo@ti.com Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-omap.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c index 746aa9caf934..8591c410ecaa 100644 --- a/drivers/gpio/gpio-omap.c +++ b/drivers/gpio/gpio-omap.c @@ -1275,13 +1275,23 @@ static void omap_gpio_idle(struct gpio_bank *bank, bool may_lose_context) { struct device *dev = bank->chip.parent; void __iomem *base = bank->base; - u32 nowake; + u32 mask, nowake;
bank->saved_datain = readl_relaxed(base + bank->regs->datain);
if (!bank->enabled_non_wakeup_gpios) goto update_gpio_context_count;
+ /* Check for pending EDGE_FALLING, ignore EDGE_BOTH */ + mask = bank->enabled_non_wakeup_gpios & bank->context.fallingdetect; + mask &= ~bank->context.risingdetect; + bank->saved_datain |= mask; + + /* Check for pending EDGE_RISING, ignore EDGE_BOTH */ + mask = bank->enabled_non_wakeup_gpios & bank->context.risingdetect; + mask &= ~bank->context.fallingdetect; + bank->saved_datain &= ~mask; + if (!may_lose_context) goto update_gpio_context_count;
[ Upstream commit 64f883cd98c6d43013fb0cea788b63e50ebc068c ]
If vpif_probe() fails on v4l2_device_register() and vpif_probe_complete(), then memory allocated at initialize_vpif() for global vpif_obj.dev[i] become unreleased.
The patch adds deallocation of vpif_obj.dev[i] on the error path.
Signed-off-by: Young Xiao 92siuyang@gmail.com Acked-by: Lad, Prabhakar prabhakar.csengg@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/davinci/vpif_capture.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/media/platform/davinci/vpif_capture.c b/drivers/media/platform/davinci/vpif_capture.c index 61809d2050fa..f0f7ef638c56 100644 --- a/drivers/media/platform/davinci/vpif_capture.c +++ b/drivers/media/platform/davinci/vpif_capture.c @@ -1376,6 +1376,14 @@ static int initialize_vpif(void) return err; }
+static inline void free_vpif_objs(void) +{ + int i; + + for (i = 0; i < VPIF_CAPTURE_MAX_DEVICES; i++) + kfree(vpif_obj.dev[i]); +} + static int vpif_async_bound(struct v4l2_async_notifier *notifier, struct v4l2_subdev *subdev, struct v4l2_async_subdev *asd) @@ -1645,7 +1653,7 @@ static __init int vpif_probe(struct platform_device *pdev) err = v4l2_device_register(vpif_dev, &vpif_obj.v4l2_dev); if (err) { v4l2_err(vpif_dev->driver, "Error registering v4l2 device\n"); - goto cleanup; + goto vpif_free; }
while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, res_idx))) { @@ -1692,7 +1700,9 @@ static __init int vpif_probe(struct platform_device *pdev) "registered sub device %s\n", subdevdata->name); } - vpif_probe_complete(); + err = vpif_probe_complete(); + if (err) + goto probe_subdev_out; } else { vpif_obj.notifier.ops = &vpif_async_ops; err = v4l2_async_notifier_register(&vpif_obj.v4l2_dev, @@ -1711,6 +1721,8 @@ static __init int vpif_probe(struct platform_device *pdev) kfree(vpif_obj.sd); vpif_unregister: v4l2_device_unregister(&vpif_obj.v4l2_dev); +vpif_free: + free_vpif_objs(); cleanup: v4l2_async_notifier_cleanup(&vpif_obj.notifier);
[ Upstream commit aee450cbe482a8c2f6fa5b05b178ef8b8ff107ca ]
Compiling kernel/bpf/core.c with W=1 causes a flood of warnings:
kernel/bpf/core.c:1198:65: warning: initialized field overwritten [-Woverride-init] 1198 | #define BPF_INSN_3_TBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = true | ^~~~ kernel/bpf/core.c:1087:2: note: in expansion of macro 'BPF_INSN_3_TBL' 1087 | INSN_3(ALU, ADD, X), \ | ^~~~~~ kernel/bpf/core.c:1202:3: note: in expansion of macro 'BPF_INSN_MAP' 1202 | BPF_INSN_MAP(BPF_INSN_2_TBL, BPF_INSN_3_TBL), | ^~~~~~~~~~~~ kernel/bpf/core.c:1198:65: note: (near initialization for 'public_insntable[12]') 1198 | #define BPF_INSN_3_TBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = true | ^~~~ kernel/bpf/core.c:1087:2: note: in expansion of macro 'BPF_INSN_3_TBL' 1087 | INSN_3(ALU, ADD, X), \ | ^~~~~~ kernel/bpf/core.c:1202:3: note: in expansion of macro 'BPF_INSN_MAP' 1202 | BPF_INSN_MAP(BPF_INSN_2_TBL, BPF_INSN_3_TBL), | ^~~~~~~~~~~~
98 copies of the above.
The attached patch silences the warnings, because we *know* we're overwriting the default initializer. That leaves bpf/core.c with only 6 other warnings, which become more visible in comparison.
Signed-off-by: Valdis Kletnieks valdis.kletnieks@vt.edu Acked-by: Andrii Nakryiko andriin@fb.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/Makefile | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index 4c2fa3ac56f6..29d781061cd5 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -1,5 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 obj-y := core.o +CFLAGS_core.o += $(call cc-disable-warning, override-init)
obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o
[ Upstream commit be22203aec440c1761ce8542c2636ac6c8951e3a ]
MFC v6 and v7 has no register to read min scratch buffer size, so it has to be read conditionally only if hardware supports it. This fixes following NULL pointer exception on SoCs with MFC v6/v7:
8<--- cut here --- Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = f25837f9 [00000000] *pgd=bd93d835 Internal error: Oops: 17 [#1] PREEMPT SMP ARM Modules linked in: btmrvl_sdio btmrvl bluetooth mwifiex_sdio mwifiex ecdh_generic ecc Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) PC is at s5p_mfc_get_min_scratch_buf_size+0x30/0x3c LR is at s5p_mfc_get_min_scratch_buf_size+0x28/0x3c ... [<c074f998>] (s5p_mfc_get_min_scratch_buf_size) from [<c0745bc0>] (s5p_mfc_irq+0x814/0xa5c) [<c0745bc0>] (s5p_mfc_irq) from [<c019a218>] (__handle_irq_event_percpu+0x64/0x3f8) [<c019a218>] (__handle_irq_event_percpu) from [<c019a5d8>] (handle_irq_event_percpu+0x2c/0x7c) [<c019a5d8>] (handle_irq_event_percpu) from [<c019a660>] (handle_irq_event+0x38/0x5c) [<c019a660>] (handle_irq_event) from [<c019ebc4>] (handle_fasteoi_irq+0xc4/0x180) [<c019ebc4>] (handle_fasteoi_irq) from [<c0199270>] (generic_handle_irq+0x24/0x34) [<c0199270>] (generic_handle_irq) from [<c0199888>] (__handle_domain_irq+0x7c/0xec) [<c0199888>] (__handle_domain_irq) from [<c04ac298>] (gic_handle_irq+0x58/0x9c) [<c04ac298>] (gic_handle_irq) from [<c0101ab0>] (__irq_svc+0x70/0xb0) Exception stack(0xe73ddc60 to 0xe73ddca8) ... [<c0101ab0>] (__irq_svc) from [<c01967d8>] (console_unlock+0x5a8/0x6a8) [<c01967d8>] (console_unlock) from [<c01981d0>] (vprintk_emit+0x118/0x2d8) [<c01981d0>] (vprintk_emit) from [<c01983b0>] (vprintk_default+0x20/0x28) [<c01983b0>] (vprintk_default) from [<c01989b4>] (printk+0x30/0x54) [<c01989b4>] (printk) from [<c07500b8>] (s5p_mfc_init_decode_v6+0x1d4/0x284) [<c07500b8>] (s5p_mfc_init_decode_v6) from [<c07230d0>] (vb2_start_streaming+0x24/0x150) [<c07230d0>] (vb2_start_streaming) from [<c0724e4c>] (vb2_core_streamon+0x11c/0x15c) [<c0724e4c>] (vb2_core_streamon) from [<c07478b8>] (vidioc_streamon+0x64/0xa0) [<c07478b8>] (vidioc_streamon) from [<c0709640>] (__video_do_ioctl+0x28c/0x45c) [<c0709640>] (__video_do_ioctl) from [<c0709bc8>] (video_usercopy+0x260/0x8a4) [<c0709bc8>] (video_usercopy) from [<c02b3820>] (do_vfs_ioctl+0xb0/0x9fc) [<c02b3820>] (do_vfs_ioctl) from [<c02b41a0>] (ksys_ioctl+0x34/0x58) [<c02b41a0>] (ksys_ioctl) from [<c0101000>] (ret_fast_syscall+0x0/0x28) Exception stack(0xe73ddfa8 to 0xe73ddff0) ... ---[ end trace 376cf5ba6e0bee93 ]---
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/s5p-mfc/s5p_mfc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/media/platform/s5p-mfc/s5p_mfc.c b/drivers/media/platform/s5p-mfc/s5p_mfc.c index 4e936b95018a..481088a83212 100644 --- a/drivers/media/platform/s5p-mfc/s5p_mfc.c +++ b/drivers/media/platform/s5p-mfc/s5p_mfc.c @@ -523,7 +523,8 @@ static void s5p_mfc_handle_seq_done(struct s5p_mfc_ctx *ctx, dev); ctx->mv_count = s5p_mfc_hw_call(dev->mfc_ops, get_mv_count, dev); - ctx->scratch_buf_size = s5p_mfc_hw_call(dev->mfc_ops, + if (FW_HAS_E_MIN_SCRATCH_BUF(dev)) + ctx->scratch_buf_size = s5p_mfc_hw_call(dev->mfc_ops, get_min_scratch_buf_size, dev); if (ctx->img_width == 0 || ctx->img_height == 0) ctx->state = MFCINST_ERROR;
[ Upstream commit 464c258aa45b09f16aa0f05847ed8895873262d9 ]
When sid == 0 (we are resetting keycreate_sid to the default value), we should skip the KEY__CREATE check.
Before this patch, doing a zero-sized write to /proc/self/keycreate would check if the current task can create unlabeled keys (which would usually fail with -EACCESS and generate an AVC). Now it skips the check and correctly sets the task's keycreate_sid to 0.
Bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1719067
Tested using the reproducer from the report above.
Fixes: 4eb582cf1fbd ("[PATCH] keys: add a way to store the appropriate context for newly-created keys") Reported-by: Kir Kolyshkin kir@sacred.ru Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Sasha Levin sashal@kernel.org --- security/selinux/hooks.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 94de51628fdc..3ec7ac70c313 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -6351,11 +6351,12 @@ static int selinux_setprocattr(const char *name, void *value, size_t size) } else if (!strcmp(name, "fscreate")) { tsec->create_sid = sid; } else if (!strcmp(name, "keycreate")) { - error = avc_has_perm(&selinux_state, - mysid, sid, SECCLASS_KEY, KEY__CREATE, - NULL); - if (error) - goto abort_change; + if (sid) { + error = avc_has_perm(&selinux_state, mysid, sid, + SECCLASS_KEY, KEY__CREATE, NULL); + if (error) + goto abort_change; + } tsec->keycreate_sid = sid; } else if (!strcmp(name, "sockcreate")) { tsec->sockcreate_sid = sid;
[ Upstream commit e63e1b0dd0003dc31f73d875907432be3a2abe5d ]
Call cond_resched() after each fuzz test iteration. This avoids stall warnings if fuzz_iterations is set very high for testing purposes.
While we're at it, also call cond_resched() after finishing testing each test vector.
Signed-off-by: Eric Biggers ebiggers@google.com Acked-by: Ard Biesheuvel ard.biesheuvel@linaro.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- crypto/testmgr.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/crypto/testmgr.c b/crypto/testmgr.c index 658a7eeebab2..292d28caf00f 100644 --- a/crypto/testmgr.c +++ b/crypto/testmgr.c @@ -1279,6 +1279,7 @@ static int test_hash_vec(const char *driver, const struct hash_testvec *vec, req, tsgl, hashstate); if (err) return err; + cond_resched(); } } #endif @@ -1493,6 +1494,7 @@ static int __alg_test_hash(const struct hash_testvec *vecs, err = test_hash_vec(driver, &vecs[i], i, req, tsgl, hashstate); if (err) goto out; + cond_resched(); } err = test_hash_vs_generic_impl(driver, generic_driver, maxkeysize, req, tsgl, hashstate); @@ -1755,6 +1757,7 @@ static int test_aead_vec(const char *driver, int enc, &cfg, req, tsgls); if (err) return err; + cond_resched(); } } #endif @@ -1994,6 +1997,7 @@ static int test_aead(const char *driver, int enc, tsgls); if (err) return err; + cond_resched(); } return 0; } @@ -2336,6 +2340,7 @@ static int test_skcipher_vec(const char *driver, int enc, &cfg, req, tsgls); if (err) return err; + cond_resched(); } } #endif @@ -2535,6 +2540,7 @@ static int test_skcipher(const char *driver, int enc, tsgls); if (err) return err; + cond_resched(); } return 0; }
[ Upstream commit e32d045cd4ba06b59878323e434bad010e78e658 ]
Add the CPUID model number of Ice Lake Neural Network Processor for Deep Learning Inference (ICL-NNPI) to the Intel family list. Ice Lake NNPI uses model number 0x9D and this will be documented in a future version of Intel Software Development Manual.
Signed-off-by: Rajneesh Bhardwaj rajneesh.bhardwaj@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: bp@suse.de Cc: Borislav Petkov bp@alien8.de Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: Kan Liang kan.liang@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: platform-driver-x86@vger.kernel.org Cc: Qiuxu Zhuo qiuxu.zhuo@intel.com Cc: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com Cc: Len Brown lenb@kernel.org Cc: Linux PM linux-pm@vger.kernel.org Link: https://lkml.kernel.org/r/20190606012419.13250-1-rajneesh.bhardwaj@linux.int... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/intel-family.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h index 310118805f57..f60ddd655c78 100644 --- a/arch/x86/include/asm/intel-family.h +++ b/arch/x86/include/asm/intel-family.h @@ -56,6 +56,7 @@ #define INTEL_FAM6_ICELAKE_XEON_D 0x6C #define INTEL_FAM6_ICELAKE_DESKTOP 0x7D #define INTEL_FAM6_ICELAKE_MOBILE 0x7E +#define INTEL_FAM6_ICELAKE_NNPI 0x9D
/* "Small Core" Processors (Atom) */
[ Upstream commit cb36ff785e868992e96e8b9e5a0c2822b680a9e2 ]
The content of SND_SOC_DAIFMT_FORMAT_MASK is a number, not a bitfield, so the test to check if the format is i2s is wrong. Because of this the clock setting may be wrong. For example, the sample clock gets inverted in DSP B mode, when it should not.
Fix the lrclk invert helper function
Fixes: 1a11d88f499c ("ASoC: meson: add tdm formatter base driver") Signed-off-by: Jerome Brunet jbrunet@baylibre.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/meson/axg-tdm.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/meson/axg-tdm.h b/sound/soc/meson/axg-tdm.h index e578b6f40a07..5774ce0916d4 100644 --- a/sound/soc/meson/axg-tdm.h +++ b/sound/soc/meson/axg-tdm.h @@ -40,7 +40,7 @@ struct axg_tdm_iface {
static inline bool axg_tdm_lrclk_invert(unsigned int fmt) { - return (fmt & SND_SOC_DAIFMT_I2S) ^ + return ((fmt & SND_SOC_DAIFMT_FORMAT_MASK) == SND_SOC_DAIFMT_I2S) ^ !!(fmt & (SND_SOC_DAIFMT_IB_IF | SND_SOC_DAIFMT_NB_IF)); }
[ Upstream commit 6da9f775175e516fc7229ceaa9b54f8f56aa7924 ]
When debugging options are turned on, the rcu_read_lock() function might not be inlined. This results in lockdep's print_lock() function printing "rcu_read_lock+0x0/0x70" instead of rcu_read_lock()'s caller. For example:
[ 10.579995] ============================= [ 10.584033] WARNING: suspicious RCU usage [ 10.588074] 4.18.0.memcg_v2+ #1 Not tainted [ 10.593162] ----------------------------- [ 10.597203] include/linux/rcupdate.h:281 Illegal context switch in RCU read-side critical section! [ 10.606220] [ 10.606220] other info that might help us debug this: [ 10.606220] [ 10.614280] [ 10.614280] rcu_scheduler_active = 2, debug_locks = 1 [ 10.620853] 3 locks held by systemd/1: [ 10.624632] #0: (____ptrval____) (&type->i_mutex_dir_key#5){.+.+}, at: lookup_slow+0x42/0x70 [ 10.633232] #1: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 [ 10.640954] #2: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70
These "rcu_read_lock+0x0/0x70" strings are not providing any useful information. This commit therefore forces inlining of the rcu_read_lock() function so that rcu_read_lock()'s caller is instead shown.
Signed-off-by: Waiman Long longman@redhat.com Signed-off-by: Paul E. McKenney paulmck@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/rcupdate.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index b25d20822e75..3508f4508a11 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -586,7 +586,7 @@ static inline void rcu_preempt_sleep_check(void) { } * read-side critical sections may be preempted and they may also block, but * only when acquiring spinlocks that are subject to priority inheritance. */ -static inline void rcu_read_lock(void) +static __always_inline void rcu_read_lock(void) { __rcu_read_lock(); __acquire(RCU);
[ Upstream commit 5f4318c1b1d23a9290e4def78ee76017c288bf60 ]
Intel Ice Lake uncore support already included IMC PCI ID but ICL-NNPI CPUID is missing so add it to fix the probe function.
Fixes: e39875d15ad6 ("perf/x86: add Intel Icelake uncore support") Signed-off-by: Rajneesh Bhardwaj rajneesh.bhardwaj@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Peter Zijlstra peterz@infradead.org Cc: alexander.shishkin@linux.intel.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: Kan Liang kan.liang@linux.intel.com Cc: Qiuxu Zhuo qiuxu.zhuo@intel.com Cc: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com Cc: Len Brown lenb@kernel.org Cc: Linux PM linux-pm@vger.kernel.org Link: https://lkml.kernel.org/r/20190614081701.13828-1-rajneesh.bhardwaj@linux.int... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/intel/uncore.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c index 9e3fbd47cb56..089bfcdf2f7f 100644 --- a/arch/x86/events/intel/uncore.c +++ b/arch/x86/events/intel/uncore.c @@ -1400,6 +1400,7 @@ static const struct x86_cpu_id intel_uncore_match[] __initconst = { X86_UNCORE_MODEL_MATCH(INTEL_FAM6_KABYLAKE_MOBILE, skl_uncore_init), X86_UNCORE_MODEL_MATCH(INTEL_FAM6_KABYLAKE_DESKTOP, skl_uncore_init), X86_UNCORE_MODEL_MATCH(INTEL_FAM6_ICELAKE_MOBILE, icl_uncore_init), + X86_UNCORE_MODEL_MATCH(INTEL_FAM6_ICELAKE_NNPI, icl_uncore_init), {}, };
[ Upstream commit cbb99c0f588737ec98c333558922ce47e9a95827 ]
Add the CPUID enumeration for Intel's de-feature bits to accommodate passing these de-features through to kvm guests.
These de-features are (from SDM vol 1, section 8.1.8): - X86_FEATURE_FDP_EXCPTN_ONLY: If CPUID.(EAX=07H,ECX=0H):EBX[bit 6] = 1, the data pointer (FDP) is updated only for the x87 non-control instructions that incur unmasked x87 exceptions. - X86_FEATURE_ZERO_FCS_FDS: If CPUID.(EAX=07H,ECX=0H):EBX[bit 13] = 1, the processor deprecates FCS and FDS; it saves each as 0000H.
Signed-off-by: Aaron Lewis aaronlewis@google.com Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Jim Mattson jmattson@google.com Cc: Fenghua Yu fenghua.yu@intel.com Cc: Frederic Weisbecker frederic@kernel.org Cc: "H. Peter Anvin" hpa@zytor.com Cc: Ingo Molnar mingo@redhat.com Cc: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Cc: marcorr@google.com Cc: Peter Feiner pfeiner@google.com Cc: pshier@google.com Cc: Robert Hoo robert.hu@linux.intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Thomas Lendacky Thomas.Lendacky@amd.com Cc: x86-ml x86@kernel.org Link: https://lkml.kernel.org/r/20190605220252.103406-1-aaronlewis@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/cpufeatures.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 75f27ee2c263..1017b9c7dfe0 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -239,12 +239,14 @@ #define X86_FEATURE_BMI1 ( 9*32+ 3) /* 1st group bit manipulation extensions */ #define X86_FEATURE_HLE ( 9*32+ 4) /* Hardware Lock Elision */ #define X86_FEATURE_AVX2 ( 9*32+ 5) /* AVX2 instructions */ +#define X86_FEATURE_FDP_EXCPTN_ONLY ( 9*32+ 6) /* "" FPU data pointer updated only on x87 exceptions */ #define X86_FEATURE_SMEP ( 9*32+ 7) /* Supervisor Mode Execution Protection */ #define X86_FEATURE_BMI2 ( 9*32+ 8) /* 2nd group bit manipulation extensions */ #define X86_FEATURE_ERMS ( 9*32+ 9) /* Enhanced REP MOVSB/STOSB instructions */ #define X86_FEATURE_INVPCID ( 9*32+10) /* Invalidate Processor Context ID */ #define X86_FEATURE_RTM ( 9*32+11) /* Restricted Transactional Memory */ #define X86_FEATURE_CQM ( 9*32+12) /* Cache QoS Monitoring */ +#define X86_FEATURE_ZERO_FCS_FDS ( 9*32+13) /* "" Zero out FPU CS and FPU DS */ #define X86_FEATURE_MPX ( 9*32+14) /* Memory Protection Extension */ #define X86_FEATURE_RDT_A ( 9*32+15) /* Resource Director Technology Allocation */ #define X86_FEATURE_AVX512F ( 9*32+16) /* AVX-512 Foundation */
[ Upstream commit cb94d52b93c74fe1f2595734fabeda9f8ae891ee ]
The driver needs to assign a lossless traffic class for the MPA ll2 connection to ensure no packets are dropped when returning from the driver as they will never be re-transmitted by the peer.
Fixes: ae3488ff37dc ("qed: Add ll2 connection for processing unaligned MPA packets") Signed-off-by: Ariel Elior ariel.elior@marvell.com Signed-off-by: Michal Kalderon michal.kalderon@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_iwarp.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c index ded556b7bab5..eeea8683d99b 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c +++ b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c @@ -2708,6 +2708,8 @@ qed_iwarp_ll2_start(struct qed_hwfn *p_hwfn, data.input.rx_num_desc = n_ooo_bufs * 2; data.input.tx_num_desc = data.input.rx_num_desc; data.input.tx_max_bds_per_packet = QED_IWARP_MAX_BDS_PER_FPDU; + data.input.tx_tc = PKT_LB_TC; + data.input.tx_dest = QED_LL2_TX_DEST_LB; data.p_connection_handle = &iwarp_info->ll2_mpa_handle; data.input.secondary_queue = true; data.cbs = &cbs;
[ Upstream commit 757188005f905664b0186b88cf26a7e844190a63 ]
The netdev is dereferenced before null checking in the function hns3_setup_tc.
This patch moves the dereferencing after the null checking.
Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin linyunsheng@huawei.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index 5611b990ac34..d18ad7b48a31 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -1514,12 +1514,12 @@ static void hns3_nic_get_stats64(struct net_device *netdev, static int hns3_setup_tc(struct net_device *netdev, void *type_data) { struct tc_mqprio_qopt_offload *mqprio_qopt = type_data; - struct hnae3_handle *h = hns3_get_handle(netdev); - struct hnae3_knic_private_info *kinfo = &h->kinfo; u8 *prio_tc = mqprio_qopt->qopt.prio_tc_map; + struct hnae3_knic_private_info *kinfo; u8 tc = mqprio_qopt->qopt.num_tc; u16 mode = mqprio_qopt->mode; u8 hw = mqprio_qopt->qopt.hw; + struct hnae3_handle *h;
if (!((hw == TC_MQPRIO_HW_OFFLOAD_TCS && mode == TC_MQPRIO_MODE_CHANNEL) || (!hw && tc == 0))) @@ -1531,6 +1531,9 @@ static int hns3_setup_tc(struct net_device *netdev, void *type_data) if (!netdev) return -EINVAL;
+ h = hns3_get_handle(netdev); + kinfo = &h->kinfo; + return (kinfo->dcb_ops && kinfo->dcb_ops->setup_tc) ? kinfo->dcb_ops->setup_tc(h, tc, prio_tc) : -EOPNOTSUPP; }
[ Upstream commit 8f9eed1a8791b83eb1c54c261d68424717e4111e ]
If hns3_nic_net_xmit does not return NETDEV_TX_BUSY when doing a loopback selftest, the skb is not freed in hns3_clean_tx_ring or hns3_nic_net_xmit, which causes skb not freed problem.
This patch fixes it by freeing skb when hns3_nic_net_xmit does not return NETDEV_TX_OK.
Fixes: c39c4d98dc65 ("net: hns3: Add mac loopback selftest support in hns3 driver")
Signed-off-by: Yunsheng Lin linyunsheng@huawei.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c index d1588ea6132c..24fce343e7fc 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c @@ -243,11 +243,13 @@ static int hns3_lp_run_test(struct net_device *ndev, enum hnae3_loop mode)
skb_get(skb); tx_ret = hns3_nic_net_xmit(skb, ndev); - if (tx_ret == NETDEV_TX_OK) + if (tx_ret == NETDEV_TX_OK) { good_cnt++; - else + } else { + kfree_skb(skb); netdev_err(ndev, "hns3_lb_run_test xmit failed: %d\n", tx_ret); + } } if (good_cnt != HNS3_NIC_LB_TEST_PKT_NUM) { ret_val = HNS3_NIC_LB_TEST_TX_CNT_ERR;
[ Upstream commit 3a30964a2eef6aabd3ab18b979ea0eacf1147731 ]
The driver may not be able to disable the ring through firmware when downing the netdev during reset process, which may cause hardware accessing freed buffer problem.
This patch delays the ring buffer clearing to reset uninit process because hardware will not access the ring buffer after hardware reset is completed.
Fixes: bb6b94a896d4 ("net: hns3: Add reset interface implementation in client") Signed-off-by: Yunsheng Lin linyunsheng@huawei.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/hisilicon/hns3/hns3_enet.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index d18ad7b48a31..e0d3e2f9801d 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -28,7 +28,7 @@ #define hns3_tx_bd_count(S) DIV_ROUND_UP(S, HNS3_MAX_BD_SIZE)
static void hns3_clear_all_ring(struct hnae3_handle *h); -static void hns3_force_clear_all_rx_ring(struct hnae3_handle *h); +static void hns3_force_clear_all_ring(struct hnae3_handle *h); static void hns3_remove_hw_addr(struct net_device *netdev);
static const char hns3_driver_name[] = "hns3"; @@ -491,7 +491,12 @@ static void hns3_nic_net_down(struct net_device *netdev) /* free irq resources */ hns3_nic_uninit_irq(priv);
- hns3_clear_all_ring(priv->ae_handle); + /* delay ring buffer clearing to hns3_reset_notify_uninit_enet + * during reset process, because driver may not be able + * to disable the ring through firmware when downing the netdev. + */ + if (!hns3_nic_resetting(netdev)) + hns3_clear_all_ring(priv->ae_handle); }
static int hns3_nic_net_stop(struct net_device *netdev) @@ -3883,7 +3888,7 @@ static void hns3_client_uninit(struct hnae3_handle *handle, bool reset)
hns3_del_all_fd_rules(netdev, true);
- hns3_force_clear_all_rx_ring(handle); + hns3_force_clear_all_ring(handle);
hns3_uninit_phy(netdev);
@@ -4055,7 +4060,7 @@ static void hns3_force_clear_rx_ring(struct hns3_enet_ring *ring) } }
-static void hns3_force_clear_all_rx_ring(struct hnae3_handle *h) +static void hns3_force_clear_all_ring(struct hnae3_handle *h) { struct net_device *ndev = h->kinfo.netdev; struct hns3_nic_priv *priv = netdev_priv(ndev); @@ -4063,6 +4068,9 @@ static void hns3_force_clear_all_rx_ring(struct hnae3_handle *h) u32 i;
for (i = 0; i < h->kinfo.num_tqps; i++) { + ring = priv->ring_data[i].ring; + hns3_clear_tx_ring(ring); + ring = priv->ring_data[i + h->kinfo.num_tqps].ring; hns3_force_clear_rx_ring(ring); } @@ -4297,7 +4305,8 @@ static int hns3_reset_notify_uninit_enet(struct hnae3_handle *handle) return 0; }
- hns3_force_clear_all_rx_ring(handle); + hns3_clear_all_ring(handle); + hns3_force_clear_all_ring(handle);
hns3_nic_uninit_vector_data(priv);
[ Upstream commit 7602843fd873cae43a444b83b14dfdd114a9659c ]
Dulicate call of null_del_dev() will trigger null pointer error like below. The reason is a race condition between nullb_device_power_store() and nullb_group_drop_item().
CPU#0 CPU#1 ---------------- ----------------- do_rmdir()
configfs_rmdir()
>client_drop_item() >nullb_group_drop_item() nullb_device_power_store() >null_del_dev()
>test_and_clear_bit(NULLB_DEV_FL_UP >null_del_dev() ^^^^^ Duplicated null_dev_dev() triger null pointer error
>clear_bit(NULLB_DEV_FL_UP
The fix could be keep the sequnce of clear NULLB_DEV_FL_UP and null_del_dev().
[ 698.613600] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 698.613608] #PF error: [normal kernel read fault] [ 698.613611] PGD 0 P4D 0 [ 698.613619] Oops: 0000 [#1] SMP PTI [ 698.613627] CPU: 3 PID: 6382 Comm: rmdir Not tainted 5.0.0+ #35 [ 698.613631] Hardware name: LENOVO 20LJS2EV08/20LJS2EV08, BIOS R0SET33W (1.17 ) 07/18/2018 [ 698.613644] RIP: 0010:null_del_dev+0xc/0x110 [null_blk] [ 698.613649] Code: 00 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b eb 97 e8 47 bb 2a e8 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 53 <8b> 77 18 48 89 fb 4c 8b 27 48 c7 c7 40 57 1e c1 e8 bf c7 cb e8 48 [ 698.613654] RSP: 0018:ffffb887888bfde0 EFLAGS: 00010286 [ 698.613659] RAX: 0000000000000000 RBX: ffff9d436d92bc00 RCX: ffff9d43a9184681 [ 698.613663] RDX: ffffffffc11e5c30 RSI: 0000000068be6540 RDI: 0000000000000000 [ 698.613667] RBP: ffffb887888bfdf0 R08: 0000000000000001 R09: 0000000000000000 [ 698.613671] R10: ffffb887888bfdd8 R11: 0000000000000f16 R12: ffff9d436d92bc08 [ 698.613675] R13: ffff9d436d94e630 R14: ffffffffc11e5088 R15: ffffffffc11e5000 [ 698.613680] FS: 00007faa68be6540(0000) GS:ffff9d43d14c0000(0000) knlGS:0000000000000000 [ 698.613685] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 698.613689] CR2: 0000000000000018 CR3: 000000042f70c002 CR4: 00000000003606e0 [ 698.613693] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 698.613697] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 698.613700] Call Trace: [ 698.613712] nullb_group_drop_item+0x50/0x70 [null_blk] [ 698.613722] client_drop_item+0x29/0x40 [ 698.613728] configfs_rmdir+0x1ed/0x300 [ 698.613738] vfs_rmdir+0xb2/0x130 [ 698.613743] do_rmdir+0x1c7/0x1e0 [ 698.613750] __x64_sys_rmdir+0x17/0x20 [ 698.613759] do_syscall_64+0x5a/0x110 [ 698.613768] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: Bob Liu bob.liu@oracle.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/null_blk_main.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/block/null_blk_main.c b/drivers/block/null_blk_main.c index 447d635c79a2..2a4f8bc4f930 100644 --- a/drivers/block/null_blk_main.c +++ b/drivers/block/null_blk_main.c @@ -327,11 +327,12 @@ static ssize_t nullb_device_power_store(struct config_item *item, set_bit(NULLB_DEV_FL_CONFIGURED, &dev->flags); dev->power = newp; } else if (dev->power && !newp) { - mutex_lock(&lock); - dev->power = newp; - null_del_dev(dev->nullb); - mutex_unlock(&lock); - clear_bit(NULLB_DEV_FL_UP, &dev->flags); + if (test_and_clear_bit(NULLB_DEV_FL_UP, &dev->flags)) { + mutex_lock(&lock); + dev->power = newp; + null_del_dev(dev->nullb); + mutex_unlock(&lock); + } clear_bit(NULLB_DEV_FL_CONFIGURED, &dev->flags); }
[ Upstream commit 6631142229005e1b1c311a09efe9fb3cfdac8559 ]
wbc_account_io() collects information on cgroup ownership of writeback pages to determine which cgroup should own the inode. Pages can stay associated with dead memcgs but we want to avoid attributing IOs to dead blkcgs as much as possible as the association is likely to be stale. However, currently, pages associated with dead memcgs contribute to the accounting delaying and/or confusing the arbitration.
Fix it by ignoring pages associated with dead memcgs.
Signed-off-by: Tejun Heo tj@kernel.org Cc: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fs-writeback.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index e41cbe8e81b9..9ebfb1b28430 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -715,6 +715,7 @@ void wbc_detach_inode(struct writeback_control *wbc) void wbc_account_io(struct writeback_control *wbc, struct page *page, size_t bytes) { + struct cgroup_subsys_state *css; int id;
/* @@ -726,7 +727,12 @@ void wbc_account_io(struct writeback_control *wbc, struct page *page, if (!wbc->wb) return;
- id = mem_cgroup_css_from_page(page)->id; + css = mem_cgroup_css_from_page(page); + /* dead cgroups shouldn't contribute to inode ownership arbitration */ + if (!(css->flags & CSS_ONLINE)) + return; + + id = css->id;
if (id == wbc->wb_id) { wbc->wb_bytes += bytes;
[ Upstream commit b8d6d0079757cbd1b69724cfd1c08e2171c68cee ]
After commit b38ff4075a80, the following command does not work anymore: $ ip xfrm state add src 10.125.0.2 dst 10.125.0.1 proto esp spi 34 reqid 1 \ mode tunnel enc 'cbc(aes)' 0xb0abdba8b782ad9d364ec81e3a7d82a1 auth-trunc \ 'hmac(sha1)' 0xe26609ebd00acb6a4d51fca13e49ea78a72c73e6 96 flag align4
In fact, the selector is not mandatory, allow the user to provide an empty selector.
Fixes: b38ff4075a80 ("xfrm: Fix xfrm sel prefix length validation") CC: Anirudh Gupta anirudh.gupta@sophos.com Signed-off-by: Nicolas Dichtel nicolas.dichtel@6wind.com Acked-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/xfrm/xfrm_user.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 76ad7e201626..b88ba45ff1ac 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -167,6 +167,9 @@ static int verify_newsa_info(struct xfrm_usersa_info *p, }
switch (p->sel.family) { + case AF_UNSPEC: + break; + case AF_INET: if (p->sel.prefixlen_d > 32 || p->sel.prefixlen_s > 32) goto out;
[ Upstream commit e3b929b0a184edb35531153c5afcaebb09014f9d ]
Non-inline io_schedule() was introduced in:
commit 10ab56434f2f ("sched/core: Separate out io_schedule_prepare() and io_schedule_finish()")
Keep in line with io_schedule_timeout(), otherwise "/proc/<pid>/wchan" will report io_schedule() rather than its callers when waiting for IO.
Reported-by: Jilong Kou koujilong@huawei.com Signed-off-by: Gao Xiang gaoxiang25@huawei.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Tejun Heo tj@kernel.org Cc: Andrew Morton akpm@linux-foundation.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Miao Xie miaoxie@huawei.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Fixes: 10ab56434f2f ("sched/core: Separate out io_schedule_prepare() and io_schedule_finish()") Link: https://lkml.kernel.org/r/20190603091338.2695-1-gaoxiang25@huawei.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 874c427742a9..4d5962232a55 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -5123,7 +5123,7 @@ long __sched io_schedule_timeout(long timeout) } EXPORT_SYMBOL(io_schedule_timeout);
-void io_schedule(void) +void __sched io_schedule(void) { int token;
[ Upstream commit faaeff98666c24376cebd0b106504d05a36881d1 ]
Add new model number for Icelake desktop and server to perf.
The data source encoding for Icelake server is the same as Skylake server.
Signed-off-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: bp@alien8.de Cc: qiuxu.zhuo@intel.com Cc: rui.zhang@intel.com Cc: tony.luck@intel.com Link: https://lkml.kernel.org/r/20190603134122.13853-2-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/intel/core.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index a5436cee20b1..b6cae65aa7ef 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -4439,6 +4439,7 @@ __init int intel_pmu_init(void) struct event_constraint *c; unsigned int unused; struct extra_reg *er; + bool pmem = false; int version, i; char *name;
@@ -4890,9 +4891,10 @@ __init int intel_pmu_init(void) name = "knights-landing"; break;
+ case INTEL_FAM6_SKYLAKE_X: + pmem = true; case INTEL_FAM6_SKYLAKE_MOBILE: case INTEL_FAM6_SKYLAKE_DESKTOP: - case INTEL_FAM6_SKYLAKE_X: case INTEL_FAM6_KABYLAKE_MOBILE: case INTEL_FAM6_KABYLAKE_DESKTOP: x86_add_quirk(intel_pebs_isolation_quirk); @@ -4925,8 +4927,7 @@ __init int intel_pmu_init(void) x86_pmu.cpu_events = hsw_events_attrs; mem_attr = hsw_mem_events_attrs; tsx_attr = hsw_tsx_events_attrs; - intel_pmu_pebs_data_source_skl( - boot_cpu_data.x86_model == INTEL_FAM6_SKYLAKE_X); + intel_pmu_pebs_data_source_skl(pmem);
if (boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) { x86_pmu.flags |= PMU_FL_TFA; @@ -4940,7 +4941,11 @@ __init int intel_pmu_init(void) name = "skylake"; break;
+ case INTEL_FAM6_ICELAKE_X: + case INTEL_FAM6_ICELAKE_XEON_D: + pmem = true; case INTEL_FAM6_ICELAKE_MOBILE: + case INTEL_FAM6_ICELAKE_DESKTOP: x86_pmu.late_ack = true; memcpy(hw_cache_event_ids, skl_hw_cache_event_ids, sizeof(hw_cache_event_ids)); memcpy(hw_cache_extra_regs, skl_hw_cache_extra_regs, sizeof(hw_cache_extra_regs)); @@ -4963,7 +4968,7 @@ __init int intel_pmu_init(void) x86_pmu.cpu_events = get_icl_events_attrs(); x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xca, .umask=0x02); x86_pmu.lbr_pt_coexist = true; - intel_pmu_pebs_data_source_skl(false); + intel_pmu_pebs_data_source_skl(pmem); pr_cont("Icelake events, "); name = "icelake"; break;
[ Upstream commit 509466b7d480bc5d22e90b9fbe6122ae0e2fbe39 ]
runnable_avg_yN_inv[] is only used in kernel/sched/pelt.c but was included in several other places because they need other macros all came from kernel/sched/sched-pelt.h which was generated by Documentation/scheduler/sched-pelt. As the result, it causes compilation a lot of warnings,
kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=] kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=] kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=] ...
Silence it by appending the __maybe_unused attribute for it, so all generated variables and macros can still be kept in the same file.
Signed-off-by: Qian Cai cai@lca.pw Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/1559596304-31581-1-git-send-email-cai@lca.pw Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/scheduler/sched-pelt.c | 3 ++- kernel/sched/sched-pelt.h | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/Documentation/scheduler/sched-pelt.c b/Documentation/scheduler/sched-pelt.c index e4219139386a..7238b355919c 100644 --- a/Documentation/scheduler/sched-pelt.c +++ b/Documentation/scheduler/sched-pelt.c @@ -20,7 +20,8 @@ void calc_runnable_avg_yN_inv(void) int i; unsigned int x;
- printf("static const u32 runnable_avg_yN_inv[] = {"); + /* To silence -Wunused-but-set-variable warnings. */ + printf("static const u32 runnable_avg_yN_inv[] __maybe_unused = {"); for (i = 0; i < HALFLIFE; i++) { x = ((1UL<<32)-1)*pow(y, i);
diff --git a/kernel/sched/sched-pelt.h b/kernel/sched/sched-pelt.h index a26473674fb7..c529706bed11 100644 --- a/kernel/sched/sched-pelt.h +++ b/kernel/sched/sched-pelt.h @@ -1,7 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 */ /* Generated by Documentation/scheduler/sched-pelt; do not modify. */
-static const u32 runnable_avg_yN_inv[] = { +static const u32 runnable_avg_yN_inv[] __maybe_unused = { 0xffffffff, 0xfa83b2da, 0xf5257d14, 0xefe4b99a, 0xeac0c6e6, 0xe5b906e6, 0xe0ccdeeb, 0xdbfbb796, 0xd744fcc9, 0xd2a81d91, 0xce248c14, 0xc9b9bd85, 0xc5672a10, 0xc12c4cc9, 0xbd08a39e, 0xb8fbaf46, 0xb504f333, 0xb123f581,
[ Upstream commit d0e1a507bdc761a14906f03399d933ea639a1756 ]
Tom Vaden reported false failure of the check_msr() function, because some servers can do POST tracing and enable LBR tracing during bootup.
Kan confirmed that check_msr patch was to fix a bug report in guest, so it's ok to disable it for real HW.
Reported-by: Tom Vaden tom.vaden@hpe.com Signed-off-by: Jiri Olsa jolsa@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Tom Vaden tom.vaden@hpe.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@kernel.org Cc: Liang Kan kan.liang@linux.intel.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/20190616141313.GD2500@krava [ Readability edits. ] Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/intel/core.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index b6cae65aa7ef..f0c14665893b 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -20,6 +20,7 @@ #include <asm/intel-family.h> #include <asm/apic.h> #include <asm/cpu_device_id.h> +#include <asm/hypervisor.h>
#include "../perf_event.h"
@@ -4054,6 +4055,13 @@ static bool check_msr(unsigned long msr, u64 mask) { u64 val_old, val_new, val_tmp;
+ /* + * Disable the check for real HW, so we don't + * mess with potentionaly enabled registers: + */ + if (hypervisor_is_type(X86_HYPER_NATIVE)) + return true; + /* * Read the current value, change it and read it back to see if it * matches, this is needed to detect certain hardware emulators
[ Upstream commit 543ac280b3576c0009e8c0fcd4d6bfc9978d7bd0 ]
Counting with invalid event coding for free-running counter may cause OOPs, e.g. uncore_iio_free_running_0/event=1/.
Current code only validate the event with free-running event format, event=0xff,umask=0xXY. Non-free-running event format never be checked for the PMU with free-running counters.
Add generic hw_config() to check and reject the invalid event coding for free-running PMU.
Signed-off-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: acme@kernel.org Cc: eranian@google.com Fixes: 0f519f0352e3 ("perf/x86/intel/uncore: Support IIO free-running counters on SKX") Link: https://lkml.kernel.org/r/1556672028-119221-2-git-send-email-kan.liang@linux... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/intel/uncore.h | 10 ++++++++++ arch/x86/events/intel/uncore_snbep.c | 1 + 2 files changed, 11 insertions(+)
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h index 79eb2e21e4f0..28499e39679f 100644 --- a/arch/x86/events/intel/uncore.h +++ b/arch/x86/events/intel/uncore.h @@ -419,6 +419,16 @@ static inline bool is_freerunning_event(struct perf_event *event) (((cfg >> 8) & 0xff) >= UNCORE_FREERUNNING_UMASK_START); }
+/* Check and reject invalid config */ +static inline int uncore_freerunning_hw_config(struct intel_uncore_box *box, + struct perf_event *event) +{ + if (is_freerunning_event(event)) + return 0; + + return -EINVAL; +} + static inline void uncore_disable_box(struct intel_uncore_box *box) { if (box->pmu->type->ops->disable_box) diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c index b10e04387f38..8e4e8e423839 100644 --- a/arch/x86/events/intel/uncore_snbep.c +++ b/arch/x86/events/intel/uncore_snbep.c @@ -3585,6 +3585,7 @@ static struct uncore_event_desc skx_uncore_iio_freerunning_events[] = {
static struct intel_uncore_ops skx_uncore_iio_freerunning_ops = { .read_counter = uncore_msr_read_counter, + .hw_config = uncore_freerunning_hw_config, };
static struct attribute *skx_uncore_iio_freerunning_formats_attr[] = {
[ Upstream commit 8c655784e2cf59cb6140759b8b546d98261d1ad9 ]
With gcc-4.6.3:
WARNING: vmlinux.o(.text.unlikely+0x24c64): Section mismatch in reference from the function __integrity_init_keyring() to the function .init.text:set_platform_trusted_keys() The function __integrity_init_keyring() references the function __init set_platform_trusted_keys(). This is often because __integrity_init_keyring lacks a __init annotation or the annotation of set_platform_trusted_keys is wrong.
Indeed, if the compiler decides not to inline __integrity_init_keyring(), a warning is issued.
Fix this by adding the missing __init annotation.
Fixes: 9dc92c45177ab70e ("integrity: Define a trusted platform keyring") Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org Reviewed-by: Nayna Jain nayna@linux.ibm.com Reviewed-by: James Morris jamorris@linux.microsoft.com Signed-off-by: Mimi Zohar zohar@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- security/integrity/digsig.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/security/integrity/digsig.c b/security/integrity/digsig.c index 4582bc26770a..868ade3e8970 100644 --- a/security/integrity/digsig.c +++ b/security/integrity/digsig.c @@ -69,8 +69,9 @@ int integrity_digsig_verify(const unsigned int id, const char *sig, int siglen, return -EOPNOTSUPP; }
-static int __integrity_init_keyring(const unsigned int id, key_perm_t perm, - struct key_restriction *restriction) +static int __init __integrity_init_keyring(const unsigned int id, + key_perm_t perm, + struct key_restriction *restriction) { const struct cred *cred = current_cred(); int err = 0;
[ Upstream commit 69d927bba39517d0980462efc051875b7f4db185 ]
Recent probing at the Linux Kernel Memory Model uncovered a 'surprise'. Strongly ordered architectures where the atomic RmW primitive implies full memory ordering and smp_mb__{before,after}_atomic() are a simple barrier() (such as x86) fail for:
*x = 1; atomic_inc(u); smp_mb__after_atomic(); r0 = *y;
Because, while the atomic_inc() implies memory order, it (surprisingly) does not provide a compiler barrier. This then allows the compiler to re-order like so:
atomic_inc(u); *x = 1; smp_mb__after_atomic(); r0 = *y;
Which the CPU is then allowed to re-order (under TSO rules) like:
atomic_inc(u); r0 = *y; *x = 1;
And this very much was not intended. Therefore strengthen the atomic RmW ops to include a compiler barrier.
NOTE: atomic_{or,and,xor} and the bitops already had the compiler barrier.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/atomic_t.txt | 3 +++ arch/x86/include/asm/atomic.h | 8 ++++---- arch/x86/include/asm/atomic64_64.h | 8 ++++---- arch/x86/include/asm/barrier.h | 4 ++-- 4 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/Documentation/atomic_t.txt b/Documentation/atomic_t.txt index dca3fb0554db..65bb09a29324 100644 --- a/Documentation/atomic_t.txt +++ b/Documentation/atomic_t.txt @@ -194,6 +194,9 @@ These helper barriers exist because architectures have varying implicit ordering on their SMP atomic primitives. For example our TSO architectures provide full ordered atomics and these barriers are no-ops.
+NOTE: when the atomic RmW ops are fully ordered, they should also imply a +compiler barrier. + Thus:
atomic_fetch_add(); diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h index ea3d95275b43..115127c7ad28 100644 --- a/arch/x86/include/asm/atomic.h +++ b/arch/x86/include/asm/atomic.h @@ -54,7 +54,7 @@ static __always_inline void arch_atomic_add(int i, atomic_t *v) { asm volatile(LOCK_PREFIX "addl %1,%0" : "+m" (v->counter) - : "ir" (i)); + : "ir" (i) : "memory"); }
/** @@ -68,7 +68,7 @@ static __always_inline void arch_atomic_sub(int i, atomic_t *v) { asm volatile(LOCK_PREFIX "subl %1,%0" : "+m" (v->counter) - : "ir" (i)); + : "ir" (i) : "memory"); }
/** @@ -95,7 +95,7 @@ static __always_inline bool arch_atomic_sub_and_test(int i, atomic_t *v) static __always_inline void arch_atomic_inc(atomic_t *v) { asm volatile(LOCK_PREFIX "incl %0" - : "+m" (v->counter)); + : "+m" (v->counter) :: "memory"); } #define arch_atomic_inc arch_atomic_inc
@@ -108,7 +108,7 @@ static __always_inline void arch_atomic_inc(atomic_t *v) static __always_inline void arch_atomic_dec(atomic_t *v) { asm volatile(LOCK_PREFIX "decl %0" - : "+m" (v->counter)); + : "+m" (v->counter) :: "memory"); } #define arch_atomic_dec arch_atomic_dec
diff --git a/arch/x86/include/asm/atomic64_64.h b/arch/x86/include/asm/atomic64_64.h index dadc20adba21..5e86c0d68ac1 100644 --- a/arch/x86/include/asm/atomic64_64.h +++ b/arch/x86/include/asm/atomic64_64.h @@ -45,7 +45,7 @@ static __always_inline void arch_atomic64_add(long i, atomic64_t *v) { asm volatile(LOCK_PREFIX "addq %1,%0" : "=m" (v->counter) - : "er" (i), "m" (v->counter)); + : "er" (i), "m" (v->counter) : "memory"); }
/** @@ -59,7 +59,7 @@ static inline void arch_atomic64_sub(long i, atomic64_t *v) { asm volatile(LOCK_PREFIX "subq %1,%0" : "=m" (v->counter) - : "er" (i), "m" (v->counter)); + : "er" (i), "m" (v->counter) : "memory"); }
/** @@ -87,7 +87,7 @@ static __always_inline void arch_atomic64_inc(atomic64_t *v) { asm volatile(LOCK_PREFIX "incq %0" : "=m" (v->counter) - : "m" (v->counter)); + : "m" (v->counter) : "memory"); } #define arch_atomic64_inc arch_atomic64_inc
@@ -101,7 +101,7 @@ static __always_inline void arch_atomic64_dec(atomic64_t *v) { asm volatile(LOCK_PREFIX "decq %0" : "=m" (v->counter) - : "m" (v->counter)); + : "m" (v->counter) : "memory"); } #define arch_atomic64_dec arch_atomic64_dec
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h index 14de0432d288..84f848c2541a 100644 --- a/arch/x86/include/asm/barrier.h +++ b/arch/x86/include/asm/barrier.h @@ -80,8 +80,8 @@ do { \ })
/* Atomic operations are already serializing on x86 */ -#define __smp_mb__before_atomic() barrier() -#define __smp_mb__after_atomic() barrier() +#define __smp_mb__before_atomic() do { } while (0) +#define __smp_mb__after_atomic() do { } while (0)
#include <asm-generic/barrier.h>
[ Upstream commit fdbdd7e8580eac9bdafa532746c865644d125e34 ]
In which case it simply returns "unknown", like when it can't figure out the evsel->name value.
This makes this code more robust and fixes a problem in 'perf trace' where a NULL evsel was being passed to a routine that only used the evsel for printing its name when a invalid syscall id was passed.
Reported-by: Leo Yan leo.yan@linaro.org Cc: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Link: https://lkml.kernel.org/n/tip-f30ztaasku3z935cn3ak3h53@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/evsel.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 4a5947625c5c..2c46f9aa416c 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -589,6 +589,9 @@ const char *perf_evsel__name(struct perf_evsel *evsel) { char bf[128];
+ if (!evsel) + goto out_unknown; + if (evsel->name) return evsel->name;
@@ -628,7 +631,10 @@ const char *perf_evsel__name(struct perf_evsel *evsel)
evsel->name = strdup(bf);
- return evsel->name ?: "unknown"; + if (evsel->name) + return evsel->name; +out_unknown: + return "unknown"; }
const char *perf_evsel__group_name(struct perf_evsel *evsel)
[ Upstream commit 098eadce3c622c07b328d0a43dda379b38cf7c5e ]
Vhost_net was known to suffer from HOL[1] issues which is not easy to fix. Several downstream disable the feature by default. What's more, the datapath was split and datacopy path got the support of batching and XDP support recently which makes it faster than zerocopy part for small packets transmission.
It looks to me that disable zerocopy by default is more appropriate. It cold be enabled by default again in the future if we fix the above issues.
[1] https://patchwork.kernel.org/patch/3787671/
Signed-off-by: Jason Wang jasowang@redhat.com Acked-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vhost/net.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index d57ebdd616d9..247e5585af5d 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -35,7 +35,7 @@
#include "vhost.h"
-static int experimental_zcopytx = 1; +static int experimental_zcopytx = 0; module_param(experimental_zcopytx, int, 0444); MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;" " 1 -Enable; 0 - Disable");
[ Upstream commit efa14c3985828da3163f5372137cb64d992b0f79 ]
In some circumstances, the hardware can hand us a null receive descriptor, with no data attached but otherwise valid. Unfortunately, the driver was ill-equipped to handle such an event, and would stop processing packets at that point.
To fix this, use the Descriptor Done bit instead of the size to determine whether or not a descriptor is ready to be processed. Add some checks to allow for unused buffers.
Signed-off-by: Mitch Williams mitch.a.williams@intel.com Tested-by: Andrew Bowers andrewx.bowers@intel.com Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf_txrx.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c b/drivers/net/ethernet/intel/iavf/iavf_txrx.c index 06d1509d57f7..c97b9ecf026a 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c +++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c @@ -1236,6 +1236,9 @@ static void iavf_add_rx_frag(struct iavf_ring *rx_ring, unsigned int truesize = SKB_DATA_ALIGN(size + iavf_rx_offset(rx_ring)); #endif
+ if (!size) + return; + skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, rx_buffer->page, rx_buffer->page_offset, size, truesize);
@@ -1260,6 +1263,9 @@ static struct iavf_rx_buffer *iavf_get_rx_buffer(struct iavf_ring *rx_ring, { struct iavf_rx_buffer *rx_buffer;
+ if (!size) + return NULL; + rx_buffer = &rx_ring->rx_bi[rx_ring->next_to_clean]; prefetchw(rx_buffer->page);
@@ -1299,6 +1305,8 @@ static struct sk_buff *iavf_construct_skb(struct iavf_ring *rx_ring, unsigned int headlen; struct sk_buff *skb;
+ if (!rx_buffer) + return NULL; /* prefetch first cache line of first page */ prefetch(va); #if L1_CACHE_BYTES < 128 @@ -1363,6 +1371,8 @@ static struct sk_buff *iavf_build_skb(struct iavf_ring *rx_ring, #endif struct sk_buff *skb;
+ if (!rx_buffer) + return NULL; /* prefetch first cache line of first page */ prefetch(va); #if L1_CACHE_BYTES < 128 @@ -1398,6 +1408,9 @@ static struct sk_buff *iavf_build_skb(struct iavf_ring *rx_ring, static void iavf_put_rx_buffer(struct iavf_ring *rx_ring, struct iavf_rx_buffer *rx_buffer) { + if (!rx_buffer) + return; + if (iavf_can_reuse_rx_page(rx_buffer)) { /* hand second half of page back to the ring */ iavf_reuse_rx_page(rx_ring, rx_buffer); @@ -1496,11 +1509,12 @@ static int iavf_clean_rx_irq(struct iavf_ring *rx_ring, int budget) * verified the descriptor has been written back. */ dma_rmb(); +#define IAVF_RXD_DD BIT(IAVF_RX_DESC_STATUS_DD_SHIFT) + if (!iavf_test_staterr(rx_desc, IAVF_RXD_DD)) + break;
size = (qword & IAVF_RXD_QW1_LENGTH_PBUF_MASK) >> IAVF_RXD_QW1_LENGTH_PBUF_SHIFT; - if (!size) - break;
iavf_trace(clean_rx_irq, rx_ring, rx_desc, skb); rx_buffer = iavf_get_rx_buffer(rx_ring, size); @@ -1516,7 +1530,8 @@ static int iavf_clean_rx_irq(struct iavf_ring *rx_ring, int budget) /* exit if we failed to retrieve a buffer */ if (!skb) { rx_ring->rx_stats.alloc_buff_failed++; - rx_buffer->pagecnt_bias++; + if (rx_buffer) + rx_buffer->pagecnt_bias++; break; }
[ Upstream commit 64d701c608fea362881e823b666327f5d28d7ffd ]
in the case of IPoIB with SRIOV enabled hardware ip link show command incorrecly prints 0 instead of a VF hardware address.
Before: 11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP mode DEFAULT group default qlen 256 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state disable, trust off, query_rss off ... After: 11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP mode DEFAULT group default qlen 256 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
v1->v2: just copy an address without modifing ifla_vf_mac v2->v3: update the changelog
Signed-off-by: Denis Kirjanov kda@linux-powerpc.org Acked-by: Doug Ledford dledford@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/ulp/ipoib/ipoib_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index 9b5e11d3fb85..04ea7db08e87 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1998,6 +1998,7 @@ static int ipoib_get_vf_config(struct net_device *dev, int vf, return err;
ivf->vf = vf; + memcpy(ivf->mac, dev->dev_addr, dev->addr_len);
return 0; }
[ Upstream commit ac28ec07ae1c5c1e18ed6855eb105a328418da88 ]
commit c16015f36cc1 ("ASoC: rsnd: add .get_id/.get_id_sub") introduces rsnd_ctu_id which calcualates and gives the main Device id of the CTU by dividing the id by 4. rsnd_mod_id uses this interface to get the CTU main Device id. But this commit forgets to revert the main Device id calcution previously done in rsnd_ctu_probe_ which also divides the id by 4. This path corrects the same to get the correct main Device id.
The issue is observered when rsnd_ctu_probe_ is done for CTU1
Fixes: c16015f36cc1 ("ASoC: rsnd: add .get_id/.get_id_sub")
Signed-off-by: Nilkanth Ahirrao anilkanth@jp.adit-jv.com Signed-off-by: Suresh Udipi sudipi@jp.adit-jv.com Signed-off-by: Jiada Wang jiada_wang@mentor.com Acked-by: Kuninori Morimoto kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sh/rcar/ctu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/sh/rcar/ctu.c b/sound/soc/sh/rcar/ctu.c index 8cb06dab234e..7647b3d4c0ba 100644 --- a/sound/soc/sh/rcar/ctu.c +++ b/sound/soc/sh/rcar/ctu.c @@ -108,7 +108,7 @@ static int rsnd_ctu_probe_(struct rsnd_mod *mod, struct rsnd_dai_stream *io, struct rsnd_priv *priv) { - return rsnd_cmd_attach(io, rsnd_mod_id(mod) / 4); + return rsnd_cmd_attach(io, rsnd_mod_id(mod)); }
static void rsnd_ctu_value_init(struct rsnd_dai_stream *io,
[ Upstream commit 3469fa84c1631face938efc42b3f488a2c2504e0 ]
We were renanimg 'main' to 'main_zstd' but then using 'main_libzstd();' in the main() for test-all.c, causing this:
$ cat /tmp/build/perf/feature/test-all.make.output test-all.c: In function ‘main’: test-all.c:236:2: error: implicit declaration of function ‘main_test_libzstd’; did you mean ‘main_test_zstd’? [-Werror=implicit-function-declaration] main_test_libzstd(); ^~~~~~~~~~~~~~~~~ main_test_zstd cc1: all warnings being treated as errors $
I.e. what was supposed to be the fast path feature test was _always_ failing, duh, fix it.
Cc: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Alexey Budankov alexey.budankov@linux.intel.com Fixes: 3b1c5d965971 ("tools build: Implement libzstd feature check, LIBZSTD_DIR and NO_LIBZSTD defines") Link: https://lkml.kernel.org/n/tip-ma4abk0utroiw4mwpmvnjlru@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/build/feature/test-all.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c index a59c53705093..939ac2fcc783 100644 --- a/tools/build/feature/test-all.c +++ b/tools/build/feature/test-all.c @@ -182,7 +182,7 @@ # include "test-disassembler-four-args.c" #undef main
-#define main main_test_zstd +#define main main_test_libzstd # include "test-libzstd.c" #undef main
[ Upstream commit eea1c227b9e9bad295e8ef984004a9acf12bb68c ]
The commit 7640ead93924 partially resolved the issue of callees incorrectly pruning the callers. With introduction of bounded loops and jmps_processed heuristic single verifier state may contain multiple branches and calls. It's possible that new verifier state (for future pruning) will be allocated inside callee. Then callee will exit (still within the same verifier state). It will go back to the caller and there R6-R9 registers will be read and will trigger mark_reg_read. But the reg->live for all frames but the top frame is not set to LIVE_NONE. Hence mark_reg_read will fail to propagate liveness into parent and future walking will incorrectly conclude that the states are equivalent because LIVE_READ is not set. In other words the rule for parent/live should be: whenever register parentage chain is set the reg->live should be set to LIVE_NONE. is_state_visited logic already follows this rule for spilled registers.
Fixes: 7640ead93924 ("bpf: verifier: make sure callees don't prune with caller differences") Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/verifier.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index a5c369e60343..11528bdaa9dc 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -6456,17 +6456,18 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) * the state of the call instruction (with WRITTEN set), and r0 comes * from callee with its full parentage chain, anyway. */ - for (j = 0; j <= cur->curframe; j++) - for (i = j < cur->curframe ? BPF_REG_6 : 0; i < BPF_REG_FP; i++) - cur->frame[j]->regs[i].parent = &new->frame[j]->regs[i]; /* clear write marks in current state: the writes we did are not writes * our child did, so they don't screen off its reads from us. * (There are no read marks in current state, because reads always mark * their parent and current state never has children yet. Only * explored_states can get read marks.) */ - for (i = 0; i < BPF_REG_FP; i++) - cur->frame[cur->curframe]->regs[i].live = REG_LIVE_NONE; + for (j = 0; j <= cur->curframe; j++) { + for (i = j < cur->curframe ? BPF_REG_6 : 0; i < BPF_REG_FP; i++) + cur->frame[j]->regs[i].parent = &new->frame[j]->regs[i]; + for (i = 0; i < BPF_REG_FP; i++) + cur->frame[j]->regs[i].live = REG_LIVE_NONE; + }
/* all stack frames are accessible from callee, clear them all */ for (j = 0; j <= cur->curframe; j++) {
[ Upstream commit 78d6ccce03e86de34c7000bcada493ed0679e350 ]
In some distros slang.h may be in a /usr/include 'slang' subdir, so use the if slang is not explicitely disabled (by using NO_SLANG=1) and its feature test for the common case (having /usr/include/slang.h) failed, use the results for the test that checks if it is in slang/slang.h.
Change the only file in perf that includes slang.h to use HAVE_SLANG_INCLUDE_SUBDIR and forget about this for good.
On a rhel6 system now we have:
$ /tmp/build/perf/perf -vv | grep slang libslang: [ on ] # HAVE_SLANG_SUPPORT $ ldd /tmp/build/perf/perf | grep libslang libslang.so.2 => /usr/lib64/libslang.so.2 (0x00007fa2d5a8d000) $ grep slang /tmp/build/perf/FEATURE-DUMP feature-libslang=0 feature-libslang-include-subdir=1 $ cat /etc/redhat-release CentOS release 6.10 (Final) $
While on fedora:29:
$ /tmp/build/perf/perf -vv | grep slang libslang: [ on ] # HAVE_SLANG_SUPPORT $ ldd /tmp/build/perf/perf | grep slang libslang.so.2 => /lib64/libslang.so.2 (0x00007f8eb11a7000) $ grep slang /tmp/build/perf/FEATURE-DUMP feature-libslang=1 feature-libslang-include-subdir=1 $ $ cat /etc/fedora-release Fedora release 29 (Twenty Nine) $
The feature-libslang-include-subdir=1 line is because the 'gettid()' test was added to test-all.c as the new glibc has an implementation for that, so we soon should have it not failing, i.e. should be the common case soon. Perhaps I should move it out till it becomes the norm...
Cc: Adrian Hunter adrian.hunter@intel.com Cc: Florian Fainelli f.fainelli@gmail.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Fixes: 1955c8cf5e26 ("perf tools: Don't hardcode host include path for libslang") Link: https://lkml.kernel.org/n/tip-bkgtpsu3uit821fuwsdhj9gd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/Makefile.config | 11 ++++++++--- tools/perf/ui/libslang.h | 5 +++++ 2 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index 85fbcd265351..17b81bc403e4 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -637,9 +637,14 @@ endif
ifndef NO_SLANG ifneq ($(feature-libslang), 1) - msg := $(warning slang not found, disables TUI support. Please install slang-devel, libslang-dev or libslang2-dev); - NO_SLANG := 1 - else + ifneq ($(feature-libslang-include-subdir), 1) + msg := $(warning slang not found, disables TUI support. Please install slang-devel, libslang-dev or libslang2-dev); + NO_SLANG := 1 + else + CFLAGS += -DHAVE_SLANG_INCLUDE_SUBDIR + endif + endif + ifndef NO_SLANG # Fedora has /usr/include/slang/slang.h, but ubuntu /usr/include/slang.h CFLAGS += -I/usr/include/slang CFLAGS += -DHAVE_SLANG_SUPPORT diff --git a/tools/perf/ui/libslang.h b/tools/perf/ui/libslang.h index c0686cda39a5..991e692b9b46 100644 --- a/tools/perf/ui/libslang.h +++ b/tools/perf/ui/libslang.h @@ -10,7 +10,12 @@ #ifndef HAVE_LONG_LONG #define HAVE_LONG_LONG __GLIBC_HAVE_LONG_LONG #endif + +#ifdef HAVE_SLANG_INCLUDE_SUBDIR +#include <slang/slang.h> +#else #include <slang.h> +#endif
#if SLANG_VERSION < 20104 #define slsmg_printf(msg, args...) \
[ Upstream commit c2bf1fc212f7e6f25ace1af8f0b3ac061ea48ba5 ]
Currently Linux does not follow PCIe spec regarding the required delays after reset. A concrete example is a Thunderbolt add-in-card that consists of a PCIe switch and two PCIe endpoints:
+-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller +-01.0-[04-36]-- DS hotplug port +-02.0-[37]----00.0 xHCI controller -04.0-[38-6b]-- DS hotplug port
The root port (1b.0) and the PCIe switch downstream ports are all PCIe gen3 so they support 8GT/s link speeds.
We wait for the PCIe hierarchy to enter D3cold (runtime):
pcieport 0000:00:1b.0: power state changed by ACPI to D3cold
When it wakes up from D3cold, according to the PCIe 4.0 section 5.8 the PCIe switch is put to reset and its power is re-applied. This means that we must follow the rules in PCIe 4.0 section 6.6.1.
For the PCIe gen3 ports we are dealing with here, the following applies:
With a Downstream Port that supports Link speeds greater than 5.0 GT/s, software must wait a minimum of 100 ms after Link training completes before sending a Configuration Request to the device immediately below that Port. Software can determine when Link training completes by polling the Data Link Layer Link Active bit or by setting up an associated interrupt (see Section 6.7.3.3).
Translating this into the above topology we would need to do this (DLLLA stands for Data Link Layer Link Active):
pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0 pcieport 0000:02:00.0: wait for 100ms after DLLLA is set before access to 0000:03:00.0 pcieport 0000:02:02.0: wait for 100ms after DLLLA is set before access to 0000:37:00.0
I've instrumented the kernel with additional logging so we can see the actual delays the kernel performs:
pcieport 0000:00:1b.0: power state changed by ACPI to D0 pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms pcieport 0000:00:1b.0: waking up bus pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60) ... pcieport 0000:00:1b.0: PME# disabled pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:01:00.0: PME# disabled pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:00.0: PME# disabled pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:01.0: PME# disabled pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:02.0: PME# disabled pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:04.0: PME# disabled pcieport 0000:02:01.0: PME# enabled pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms pcieport 0000:02:04.0: PME# enabled pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000) ... thunderbolt 0000:03:00.0: PME# disabled xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000) ... xhci_hcd 0000:37:00.0: PME# disabled
For the switch upstream port (01:00.0) we wait for 100ms but not taking into account the DLLLA requirement. We then wait 10ms for D3hot -> D0 transition of the root port and the two downstream hotplug ports. This means that we deviate from what the spec requires.
Performing the same check for system sleep (s2idle) transitions we can see following when resuming from s2idle:
pcieport 0000:00:1b.0: power state changed by ACPI to D0 pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60) ... pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x2c (was 0x0, writing 0x0) pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x28 (was 0x0, writing 0x0) pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1) pcieport 0000:02:01.0: restoring config space at offset 0x2c (was 0x0, writing 0x60) pcieport 0000:02:02.0: restoring config space at offset 0x20 (was 0x0, writing 0x73f073f0) pcieport 0000:02:04.0: restoring config space at offset 0x2c (was 0x0, writing 0x60) pcieport 0000:02:01.0: restoring config space at offset 0x28 (was 0x0, writing 0x60) pcieport 0000:02:00.0: restoring config space at offset 0x2c (was 0x0, writing 0x0) pcieport 0000:02:02.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1) pcieport 0000:02:04.0: restoring config space at offset 0x28 (was 0x0, writing 0x60) pcieport 0000:02:01.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1ff10001) pcieport 0000:02:00.0: restoring config space at offset 0x28 (was 0x0, writing 0x0) pcieport 0000:02:02.0: restoring config space at offset 0x18 (was 0x0, writing 0x373702) pcieport 0000:02:04.0: restoring config space at offset 0x24 (was 0x10001, writing 0x49f12001) pcieport 0000:02:01.0: restoring config space at offset 0x20 (was 0x0, writing 0x73e05c00) pcieport 0000:02:00.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1) pcieport 0000:02:04.0: restoring config space at offset 0x20 (was 0x0, writing 0x89f07400) pcieport 0000:02:01.0: restoring config space at offset 0x1c (was 0x101, writing 0x5151) pcieport 0000:02:00.0: restoring config space at offset 0x20 (was 0x0, writing 0x8a008a00) pcieport 0000:02:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:04.0: restoring config space at offset 0x1c (was 0x101, writing 0x6161) pcieport 0000:02:01.0: restoring config space at offset 0x18 (was 0x0, writing 0x360402) pcieport 0000:02:00.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1) pcieport 0000:02:04.0: restoring config space at offset 0x18 (was 0x0, writing 0x6b3802) pcieport 0000:02:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:00.0: restoring config space at offset 0x18 (was 0x0, writing 0x30302) pcieport 0000:02:01.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:04.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:00.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:04.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000) ... thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
This is even worse. None of the mandatory delays are performed. If this would be S3 instead of s2idle then according to PCI FW spec 3.2 section 4.6.8. there is a specific _DSM that allows the OS to skip the delays but this platform does not provide the _DSM and does not go to S3 anyway so no firmware is involved that could already handle these delays.
In this particular Intel Coffee Lake platform these delays are not actually needed because there is an additional delay as part of the ACPI power resource that is used to turn on power to the hierarchy but since that additional delay is not required by any of standards (PCIe, ACPI) it is not present in the Intel Ice Lake, for example where missing the mandatory delays causes pciehp to start tearing down the stack too early (links are not yet trained).
For this reason, change the PCIe portdrv PM resume hooks so that they perform the mandatory delays before the downstream component gets resumed. We perform the delays before port services are resumed because otherwise pciehp might find that the link is not up (even if it is just training) and tears-down the hierarchy.
Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci.c | 29 ++++++++++----- drivers/pci/pci.h | 1 + drivers/pci/pcie/portdrv_core.c | 66 +++++++++++++++++++++++++++++++++ 3 files changed, 86 insertions(+), 10 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 8abc843b1615..87a1f902fa8e 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1004,15 +1004,10 @@ static void __pci_start_power_transition(struct pci_dev *dev, pci_power_t state) if (state == PCI_D0) { pci_platform_power_transition(dev, PCI_D0); /* - * Mandatory power management transition delays, see - * PCI Express Base Specification Revision 2.0 Section - * 6.6.1: Conventional Reset. Do not delay for - * devices powered on/off by corresponding bridge, - * because have already delayed for the bridge. + * Mandatory power management transition delays are + * handled in the PCIe portdrv resume hooks. */ if (dev->runtime_d3cold) { - if (dev->d3cold_delay && !dev->imm_ready) - msleep(dev->d3cold_delay); /* * When powering on a bridge from D3cold, the * whole hierarchy may be powered on into @@ -4568,14 +4563,16 @@ static int pci_pm_reset(struct pci_dev *dev, int probe)
return pci_dev_wait(dev, "PM D3->D0", PCIE_RESET_READY_POLL_MS); } + /** - * pcie_wait_for_link - Wait until link is active or inactive + * pcie_wait_for_link_delay - Wait until link is active or inactive * @pdev: Bridge device * @active: waiting for active or inactive? + * @delay: Delay to wait after link has become active (in ms) * * Use this to wait till link becomes active or inactive. */ -bool pcie_wait_for_link(struct pci_dev *pdev, bool active) +bool pcie_wait_for_link_delay(struct pci_dev *pdev, bool active, int delay) { int timeout = 1000; bool ret; @@ -4612,13 +4609,25 @@ bool pcie_wait_for_link(struct pci_dev *pdev, bool active) timeout -= 10; } if (active && ret) - msleep(100); + msleep(delay); else if (ret != active) pci_info(pdev, "Data Link Layer Link Active not %s in 1000 msec\n", active ? "set" : "cleared"); return ret == active; }
+/** + * pcie_wait_for_link - Wait until link is active or inactive + * @pdev: Bridge device + * @active: waiting for active or inactive? + * + * Use this to wait till link becomes active or inactive. + */ +bool pcie_wait_for_link(struct pci_dev *pdev, bool active) +{ + return pcie_wait_for_link_delay(pdev, active, 100); +} + void pci_reset_secondary_bus(struct pci_dev *dev) { u16 ctrl; diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index 9cb99380c61e..59802b3def4b 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -493,6 +493,7 @@ static inline int pci_dev_specific_disable_acs_redir(struct pci_dev *dev) void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state, u32 service);
+bool pcie_wait_for_link_delay(struct pci_dev *pdev, bool active, int delay); bool pcie_wait_for_link(struct pci_dev *pdev, bool active); #ifdef CONFIG_PCIEASPM void pcie_aspm_init_link_state(struct pci_dev *pdev); diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c index 1b330129089f..308c3e0c4a34 100644 --- a/drivers/pci/pcie/portdrv_core.c +++ b/drivers/pci/pcie/portdrv_core.c @@ -9,6 +9,7 @@ #include <linux/module.h> #include <linux/pci.h> #include <linux/kernel.h> +#include <linux/delay.h> #include <linux/errno.h> #include <linux/pm.h> #include <linux/pm_runtime.h> @@ -378,6 +379,67 @@ static int pm_iter(struct device *dev, void *data) return 0; }
+static int get_downstream_delay(struct pci_bus *bus) +{ + struct pci_dev *pdev; + int min_delay = 100; + int max_delay = 0; + + list_for_each_entry(pdev, &bus->devices, bus_list) { + if (!pdev->imm_ready) + min_delay = 0; + else if (pdev->d3cold_delay < min_delay) + min_delay = pdev->d3cold_delay; + if (pdev->d3cold_delay > max_delay) + max_delay = pdev->d3cold_delay; + } + + return max(min_delay, max_delay); +} + +/* + * wait_for_downstream_link - Wait for downstream link to establish + * @pdev: PCIe port whose downstream link is waited + * + * Handle delays according to PCIe 4.0 section 6.6.1 before configuration + * access to the downstream component is permitted. + * + * This blocks PCI core resume of the hierarchy below this port until the + * link is trained. Should be called before resuming port services to + * prevent pciehp from starting to tear-down the hierarchy too soon. + */ +static void wait_for_downstream_link(struct pci_dev *pdev) +{ + int delay; + + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT && + pci_pcie_type(pdev) != PCI_EXP_TYPE_DOWNSTREAM) + return; + + if (pci_dev_is_disconnected(pdev)) + return; + + if (!pdev->subordinate || list_empty(&pdev->subordinate->devices) || + !pdev->bridge_d3) + return; + + delay = get_downstream_delay(pdev->subordinate); + if (!delay) + return; + + dev_dbg(&pdev->dev, "waiting downstream link for %d ms\n", delay); + + /* + * If downstream port does not support speeds greater than 5 GT/s + * need to wait 100ms. For higher speeds (gen3) we need to wait + * first for the data link layer to become active. + */ + if (pcie_get_speed_cap(pdev) <= PCIE_SPEED_5_0GT) + msleep(delay); + else + pcie_wait_for_link_delay(pdev, true, delay); +} + /** * pcie_port_device_suspend - suspend port services associated with a PCIe port * @dev: PCI Express port to handle @@ -391,6 +453,8 @@ int pcie_port_device_suspend(struct device *dev) int pcie_port_device_resume_noirq(struct device *dev) { size_t off = offsetof(struct pcie_port_service_driver, resume_noirq); + + wait_for_downstream_link(to_pci_dev(dev)); return device_for_each_child(dev, &off, pm_iter); }
@@ -421,6 +485,8 @@ int pcie_port_device_runtime_suspend(struct device *dev) int pcie_port_device_runtime_resume(struct device *dev) { size_t off = offsetof(struct pcie_port_service_driver, runtime_resume); + + wait_for_downstream_link(to_pci_dev(dev)); return device_for_each_child(dev, &off, pm_iter); } #endif /* PM */
On Wed, Jul 24, 2019 at 3:31 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
[ Upstream commit c2bf1fc212f7e6f25ace1af8f0b3ac061ea48ba5 ]
Currently Linux does not follow PCIe spec regarding the required delays after reset. A concrete example is a Thunderbolt add-in-card that consists of a PCIe switch and two PCIe endpoints:
+-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller +-01.0-[04-36]-- DS hotplug port +-02.0-[37]----00.0 xHCI controller -04.0-[38-6b]-- DS hotplug port
The root port (1b.0) and the PCIe switch downstream ports are all PCIe gen3 so they support 8GT/s link speeds.
We wait for the PCIe hierarchy to enter D3cold (runtime):
pcieport 0000:00:1b.0: power state changed by ACPI to D3cold
When it wakes up from D3cold, according to the PCIe 4.0 section 5.8 the PCIe switch is put to reset and its power is re-applied. This means that we must follow the rules in PCIe 4.0 section 6.6.1.
For the PCIe gen3 ports we are dealing with here, the following applies:
With a Downstream Port that supports Link speeds greater than 5.0 GT/s, software must wait a minimum of 100 ms after Link training completes before sending a Configuration Request to the device immediately below that Port. Software can determine when Link training completes by polling the Data Link Layer Link Active bit or by setting up an associated interrupt (see Section 6.7.3.3).
Translating this into the above topology we would need to do this (DLLLA stands for Data Link Layer Link Active):
pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0 pcieport 0000:02:00.0: wait for 100ms after DLLLA is set before access to 0000:03:00.0 pcieport 0000:02:02.0: wait for 100ms after DLLLA is set before access to 0000:37:00.0
I've instrumented the kernel with additional logging so we can see the actual delays the kernel performs:
pcieport 0000:00:1b.0: power state changed by ACPI to D0 pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms pcieport 0000:00:1b.0: waking up bus pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60) ... pcieport 0000:00:1b.0: PME# disabled pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:01:00.0: PME# disabled pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:00.0: PME# disabled pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:01.0: PME# disabled pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:02.0: PME# disabled pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:04.0: PME# disabled pcieport 0000:02:01.0: PME# enabled pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms pcieport 0000:02:04.0: PME# enabled pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000) ... thunderbolt 0000:03:00.0: PME# disabled xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000) ... xhci_hcd 0000:37:00.0: PME# disabled
For the switch upstream port (01:00.0) we wait for 100ms but not taking into account the DLLLA requirement. We then wait 10ms for D3hot -> D0 transition of the root port and the two downstream hotplug ports. This means that we deviate from what the spec requires.
Performing the same check for system sleep (s2idle) transitions we can see following when resuming from s2idle:
pcieport 0000:00:1b.0: power state changed by ACPI to D0 pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60) ... pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x2c (was 0x0, writing 0x0) pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x28 (was 0x0, writing 0x0) pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1) pcieport 0000:02:01.0: restoring config space at offset 0x2c (was 0x0, writing 0x60) pcieport 0000:02:02.0: restoring config space at offset 0x20 (was 0x0, writing 0x73f073f0) pcieport 0000:02:04.0: restoring config space at offset 0x2c (was 0x0, writing 0x60) pcieport 0000:02:01.0: restoring config space at offset 0x28 (was 0x0, writing 0x60) pcieport 0000:02:00.0: restoring config space at offset 0x2c (was 0x0, writing 0x0) pcieport 0000:02:02.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1) pcieport 0000:02:04.0: restoring config space at offset 0x28 (was 0x0, writing 0x60) pcieport 0000:02:01.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1ff10001) pcieport 0000:02:00.0: restoring config space at offset 0x28 (was 0x0, writing 0x0) pcieport 0000:02:02.0: restoring config space at offset 0x18 (was 0x0, writing 0x373702) pcieport 0000:02:04.0: restoring config space at offset 0x24 (was 0x10001, writing 0x49f12001) pcieport 0000:02:01.0: restoring config space at offset 0x20 (was 0x0, writing 0x73e05c00) pcieport 0000:02:00.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1) pcieport 0000:02:04.0: restoring config space at offset 0x20 (was 0x0, writing 0x89f07400) pcieport 0000:02:01.0: restoring config space at offset 0x1c (was 0x101, writing 0x5151) pcieport 0000:02:00.0: restoring config space at offset 0x20 (was 0x0, writing 0x8a008a00) pcieport 0000:02:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:04.0: restoring config space at offset 0x1c (was 0x101, writing 0x6161) pcieport 0000:02:01.0: restoring config space at offset 0x18 (was 0x0, writing 0x360402) pcieport 0000:02:00.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1) pcieport 0000:02:04.0: restoring config space at offset 0x18 (was 0x0, writing 0x6b3802) pcieport 0000:02:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:00.0: restoring config space at offset 0x18 (was 0x0, writing 0x30302) pcieport 0000:02:01.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:04.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:00.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:04.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000) ... thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
This is even worse. None of the mandatory delays are performed. If this would be S3 instead of s2idle then according to PCI FW spec 3.2 section 4.6.8. there is a specific _DSM that allows the OS to skip the delays but this platform does not provide the _DSM and does not go to S3 anyway so no firmware is involved that could already handle these delays.
In this particular Intel Coffee Lake platform these delays are not actually needed because there is an additional delay as part of the ACPI power resource that is used to turn on power to the hierarchy but since that additional delay is not required by any of standards (PCIe, ACPI) it is not present in the Intel Ice Lake, for example where missing the mandatory delays causes pciehp to start tearing down the stack too early (links are not yet trained).
For this reason, change the PCIe portdrv PM resume hooks so that they perform the mandatory delays before the downstream component gets resumed. We perform the delays before port services are resumed because otherwise pciehp might find that the link is not up (even if it is just training) and tears-down the hierarchy.
We have gotten multiple reports in Fedora that this patch has broken suspend for users of 5.1.20 and 5.2 stable kernels.
Justin
On Fri, Aug 02, 2019 at 12:06:39PM -0500, Justin Forbes wrote:
On Wed, Jul 24, 2019 at 3:31 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
[ Upstream commit c2bf1fc212f7e6f25ace1af8f0b3ac061ea48ba5 ]
Currently Linux does not follow PCIe spec regarding the required delays after reset. A concrete example is a Thunderbolt add-in-card that consists of a PCIe switch and two PCIe endpoints:
+-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller +-01.0-[04-36]-- DS hotplug port +-02.0-[37]----00.0 xHCI controller -04.0-[38-6b]-- DS hotplug port
The root port (1b.0) and the PCIe switch downstream ports are all PCIe gen3 so they support 8GT/s link speeds.
We wait for the PCIe hierarchy to enter D3cold (runtime):
pcieport 0000:00:1b.0: power state changed by ACPI to D3cold
When it wakes up from D3cold, according to the PCIe 4.0 section 5.8 the PCIe switch is put to reset and its power is re-applied. This means that we must follow the rules in PCIe 4.0 section 6.6.1.
For the PCIe gen3 ports we are dealing with here, the following applies:
With a Downstream Port that supports Link speeds greater than 5.0 GT/s, software must wait a minimum of 100 ms after Link training completes before sending a Configuration Request to the device immediately below that Port. Software can determine when Link training completes by polling the Data Link Layer Link Active bit or by setting up an associated interrupt (see Section 6.7.3.3).
Translating this into the above topology we would need to do this (DLLLA stands for Data Link Layer Link Active):
pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0 pcieport 0000:02:00.0: wait for 100ms after DLLLA is set before access to 0000:03:00.0 pcieport 0000:02:02.0: wait for 100ms after DLLLA is set before access to 0000:37:00.0
I've instrumented the kernel with additional logging so we can see the actual delays the kernel performs:
pcieport 0000:00:1b.0: power state changed by ACPI to D0 pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms pcieport 0000:00:1b.0: waking up bus pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60) ... pcieport 0000:00:1b.0: PME# disabled pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:01:00.0: PME# disabled pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:00.0: PME# disabled pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:01.0: PME# disabled pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:02.0: PME# disabled pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:04.0: PME# disabled pcieport 0000:02:01.0: PME# enabled pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms pcieport 0000:02:04.0: PME# enabled pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000) ... thunderbolt 0000:03:00.0: PME# disabled xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000) ... xhci_hcd 0000:37:00.0: PME# disabled
For the switch upstream port (01:00.0) we wait for 100ms but not taking into account the DLLLA requirement. We then wait 10ms for D3hot -> D0 transition of the root port and the two downstream hotplug ports. This means that we deviate from what the spec requires.
Performing the same check for system sleep (s2idle) transitions we can see following when resuming from s2idle:
pcieport 0000:00:1b.0: power state changed by ACPI to D0 pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60) ... pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x2c (was 0x0, writing 0x0) pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x28 (was 0x0, writing 0x0) pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1) pcieport 0000:02:01.0: restoring config space at offset 0x2c (was 0x0, writing 0x60) pcieport 0000:02:02.0: restoring config space at offset 0x20 (was 0x0, writing 0x73f073f0) pcieport 0000:02:04.0: restoring config space at offset 0x2c (was 0x0, writing 0x60) pcieport 0000:02:01.0: restoring config space at offset 0x28 (was 0x0, writing 0x60) pcieport 0000:02:00.0: restoring config space at offset 0x2c (was 0x0, writing 0x0) pcieport 0000:02:02.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1) pcieport 0000:02:04.0: restoring config space at offset 0x28 (was 0x0, writing 0x60) pcieport 0000:02:01.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1ff10001) pcieport 0000:02:00.0: restoring config space at offset 0x28 (was 0x0, writing 0x0) pcieport 0000:02:02.0: restoring config space at offset 0x18 (was 0x0, writing 0x373702) pcieport 0000:02:04.0: restoring config space at offset 0x24 (was 0x10001, writing 0x49f12001) pcieport 0000:02:01.0: restoring config space at offset 0x20 (was 0x0, writing 0x73e05c00) pcieport 0000:02:00.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1) pcieport 0000:02:04.0: restoring config space at offset 0x20 (was 0x0, writing 0x89f07400) pcieport 0000:02:01.0: restoring config space at offset 0x1c (was 0x101, writing 0x5151) pcieport 0000:02:00.0: restoring config space at offset 0x20 (was 0x0, writing 0x8a008a00) pcieport 0000:02:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:04.0: restoring config space at offset 0x1c (was 0x101, writing 0x6161) pcieport 0000:02:01.0: restoring config space at offset 0x18 (was 0x0, writing 0x360402) pcieport 0000:02:00.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1) pcieport 0000:02:04.0: restoring config space at offset 0x18 (was 0x0, writing 0x6b3802) pcieport 0000:02:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:00.0: restoring config space at offset 0x18 (was 0x0, writing 0x30302) pcieport 0000:02:01.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:04.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:00.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:04.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000) ... thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
This is even worse. None of the mandatory delays are performed. If this would be S3 instead of s2idle then according to PCI FW spec 3.2 section 4.6.8. there is a specific _DSM that allows the OS to skip the delays but this platform does not provide the _DSM and does not go to S3 anyway so no firmware is involved that could already handle these delays.
In this particular Intel Coffee Lake platform these delays are not actually needed because there is an additional delay as part of the ACPI power resource that is used to turn on power to the hierarchy but since that additional delay is not required by any of standards (PCIe, ACPI) it is not present in the Intel Ice Lake, for example where missing the mandatory delays causes pciehp to start tearing down the stack too early (links are not yet trained).
For this reason, change the PCIe portdrv PM resume hooks so that they perform the mandatory delays before the downstream component gets resumed. We perform the delays before port services are resumed because otherwise pciehp might find that the link is not up (even if it is just training) and tears-down the hierarchy.
We have gotten multiple reports in Fedora that this patch has broken suspend for users of 5.1.20 and 5.2 stable kernels.
And is the issue also in 5.3-rcX kernels? If so, can we either get this reverted there, or find the fix for it?
thanks,
greg k-h
Hi,
On Sat, Aug 03, 2019 at 08:50:00AM +0200, Greg Kroah-Hartman wrote:
On Fri, Aug 02, 2019 at 12:06:39PM -0500, Justin Forbes wrote:
On Wed, Jul 24, 2019 at 3:31 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
[ Upstream commit c2bf1fc212f7e6f25ace1af8f0b3ac061ea48ba5 ]
Currently Linux does not follow PCIe spec regarding the required delays after reset. A concrete example is a Thunderbolt add-in-card that consists of a PCIe switch and two PCIe endpoints:
+-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller +-01.0-[04-36]-- DS hotplug port +-02.0-[37]----00.0 xHCI controller -04.0-[38-6b]-- DS hotplug port
The root port (1b.0) and the PCIe switch downstream ports are all PCIe gen3 so they support 8GT/s link speeds.
We wait for the PCIe hierarchy to enter D3cold (runtime):
pcieport 0000:00:1b.0: power state changed by ACPI to D3cold
When it wakes up from D3cold, according to the PCIe 4.0 section 5.8 the PCIe switch is put to reset and its power is re-applied. This means that we must follow the rules in PCIe 4.0 section 6.6.1.
For the PCIe gen3 ports we are dealing with here, the following applies:
With a Downstream Port that supports Link speeds greater than 5.0 GT/s, software must wait a minimum of 100 ms after Link training completes before sending a Configuration Request to the device immediately below that Port. Software can determine when Link training completes by polling the Data Link Layer Link Active bit or by setting up an associated interrupt (see Section 6.7.3.3).
Translating this into the above topology we would need to do this (DLLLA stands for Data Link Layer Link Active):
pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0 pcieport 0000:02:00.0: wait for 100ms after DLLLA is set before access to 0000:03:00.0 pcieport 0000:02:02.0: wait for 100ms after DLLLA is set before access to 0000:37:00.0
I've instrumented the kernel with additional logging so we can see the actual delays the kernel performs:
pcieport 0000:00:1b.0: power state changed by ACPI to D0 pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms pcieport 0000:00:1b.0: waking up bus pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60) ... pcieport 0000:00:1b.0: PME# disabled pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:01:00.0: PME# disabled pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:00.0: PME# disabled pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:01.0: PME# disabled pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:02.0: PME# disabled pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:04.0: PME# disabled pcieport 0000:02:01.0: PME# enabled pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms pcieport 0000:02:04.0: PME# enabled pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000) ... thunderbolt 0000:03:00.0: PME# disabled xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000) ... xhci_hcd 0000:37:00.0: PME# disabled
For the switch upstream port (01:00.0) we wait for 100ms but not taking into account the DLLLA requirement. We then wait 10ms for D3hot -> D0 transition of the root port and the two downstream hotplug ports. This means that we deviate from what the spec requires.
Performing the same check for system sleep (s2idle) transitions we can see following when resuming from s2idle:
pcieport 0000:00:1b.0: power state changed by ACPI to D0 pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60) ... pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x2c (was 0x0, writing 0x0) pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x28 (was 0x0, writing 0x0) pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1) pcieport 0000:02:01.0: restoring config space at offset 0x2c (was 0x0, writing 0x60) pcieport 0000:02:02.0: restoring config space at offset 0x20 (was 0x0, writing 0x73f073f0) pcieport 0000:02:04.0: restoring config space at offset 0x2c (was 0x0, writing 0x60) pcieport 0000:02:01.0: restoring config space at offset 0x28 (was 0x0, writing 0x60) pcieport 0000:02:00.0: restoring config space at offset 0x2c (was 0x0, writing 0x0) pcieport 0000:02:02.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1) pcieport 0000:02:04.0: restoring config space at offset 0x28 (was 0x0, writing 0x60) pcieport 0000:02:01.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1ff10001) pcieport 0000:02:00.0: restoring config space at offset 0x28 (was 0x0, writing 0x0) pcieport 0000:02:02.0: restoring config space at offset 0x18 (was 0x0, writing 0x373702) pcieport 0000:02:04.0: restoring config space at offset 0x24 (was 0x10001, writing 0x49f12001) pcieport 0000:02:01.0: restoring config space at offset 0x20 (was 0x0, writing 0x73e05c00) pcieport 0000:02:00.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1) pcieport 0000:02:04.0: restoring config space at offset 0x20 (was 0x0, writing 0x89f07400) pcieport 0000:02:01.0: restoring config space at offset 0x1c (was 0x101, writing 0x5151) pcieport 0000:02:00.0: restoring config space at offset 0x20 (was 0x0, writing 0x8a008a00) pcieport 0000:02:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:04.0: restoring config space at offset 0x1c (was 0x101, writing 0x6161) pcieport 0000:02:01.0: restoring config space at offset 0x18 (was 0x0, writing 0x360402) pcieport 0000:02:00.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1) pcieport 0000:02:04.0: restoring config space at offset 0x18 (was 0x0, writing 0x6b3802) pcieport 0000:02:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:00.0: restoring config space at offset 0x18 (was 0x0, writing 0x30302) pcieport 0000:02:01.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:04.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:00.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:04.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000) ... thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
This is even worse. None of the mandatory delays are performed. If this would be S3 instead of s2idle then according to PCI FW spec 3.2 section 4.6.8. there is a specific _DSM that allows the OS to skip the delays but this platform does not provide the _DSM and does not go to S3 anyway so no firmware is involved that could already handle these delays.
In this particular Intel Coffee Lake platform these delays are not actually needed because there is an additional delay as part of the ACPI power resource that is used to turn on power to the hierarchy but since that additional delay is not required by any of standards (PCIe, ACPI) it is not present in the Intel Ice Lake, for example where missing the mandatory delays causes pciehp to start tearing down the stack too early (links are not yet trained).
For this reason, change the PCIe portdrv PM resume hooks so that they perform the mandatory delays before the downstream component gets resumed. We perform the delays before port services are resumed because otherwise pciehp might find that the link is not up (even if it is just training) and tears-down the hierarchy.
We have gotten multiple reports in Fedora that this patch has broken suspend for users of 5.1.20 and 5.2 stable kernels.
And is the issue also in 5.3-rcX kernels? If so, can we either get this reverted there, or find the fix for it?
AFAIK the issue is also in v5.3-rcX.
I started looking at the issue now. Hopefully there is a better solution than revert but let's see.
On Sat, Aug 3, 2019 at 1:50 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Fri, Aug 02, 2019 at 12:06:39PM -0500, Justin Forbes wrote:
On Wed, Jul 24, 2019 at 3:31 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
[ Upstream commit c2bf1fc212f7e6f25ace1af8f0b3ac061ea48ba5 ]
Currently Linux does not follow PCIe spec regarding the required delays after reset. A concrete example is a Thunderbolt add-in-card that consists of a PCIe switch and two PCIe endpoints:
+-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller +-01.0-[04-36]-- DS hotplug port +-02.0-[37]----00.0 xHCI controller -04.0-[38-6b]-- DS hotplug port
The root port (1b.0) and the PCIe switch downstream ports are all PCIe gen3 so they support 8GT/s link speeds.
We wait for the PCIe hierarchy to enter D3cold (runtime):
pcieport 0000:00:1b.0: power state changed by ACPI to D3cold
When it wakes up from D3cold, according to the PCIe 4.0 section 5.8 the PCIe switch is put to reset and its power is re-applied. This means that we must follow the rules in PCIe 4.0 section 6.6.1.
For the PCIe gen3 ports we are dealing with here, the following applies:
With a Downstream Port that supports Link speeds greater than 5.0 GT/s, software must wait a minimum of 100 ms after Link training completes before sending a Configuration Request to the device immediately below that Port. Software can determine when Link training completes by polling the Data Link Layer Link Active bit or by setting up an associated interrupt (see Section 6.7.3.3).
Translating this into the above topology we would need to do this (DLLLA stands for Data Link Layer Link Active):
pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0 pcieport 0000:02:00.0: wait for 100ms after DLLLA is set before access to 0000:03:00.0 pcieport 0000:02:02.0: wait for 100ms after DLLLA is set before access to 0000:37:00.0
I've instrumented the kernel with additional logging so we can see the actual delays the kernel performs:
pcieport 0000:00:1b.0: power state changed by ACPI to D0 pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms pcieport 0000:00:1b.0: waking up bus pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60) ... pcieport 0000:00:1b.0: PME# disabled pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:01:00.0: PME# disabled pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:00.0: PME# disabled pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:01.0: PME# disabled pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:02.0: PME# disabled pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:04.0: PME# disabled pcieport 0000:02:01.0: PME# enabled pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms pcieport 0000:02:04.0: PME# enabled pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000) ... thunderbolt 0000:03:00.0: PME# disabled xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000) ... xhci_hcd 0000:37:00.0: PME# disabled
For the switch upstream port (01:00.0) we wait for 100ms but not taking into account the DLLLA requirement. We then wait 10ms for D3hot -> D0 transition of the root port and the two downstream hotplug ports. This means that we deviate from what the spec requires.
Performing the same check for system sleep (s2idle) transitions we can see following when resuming from s2idle:
pcieport 0000:00:1b.0: power state changed by ACPI to D0 pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60) ... pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) ... pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x2c (was 0x0, writing 0x0) pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x28 (was 0x0, writing 0x0) pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff) pcieport 0000:02:02.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1) pcieport 0000:02:01.0: restoring config space at offset 0x2c (was 0x0, writing 0x60) pcieport 0000:02:02.0: restoring config space at offset 0x20 (was 0x0, writing 0x73f073f0) pcieport 0000:02:04.0: restoring config space at offset 0x2c (was 0x0, writing 0x60) pcieport 0000:02:01.0: restoring config space at offset 0x28 (was 0x0, writing 0x60) pcieport 0000:02:00.0: restoring config space at offset 0x2c (was 0x0, writing 0x0) pcieport 0000:02:02.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1) pcieport 0000:02:04.0: restoring config space at offset 0x28 (was 0x0, writing 0x60) pcieport 0000:02:01.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1ff10001) pcieport 0000:02:00.0: restoring config space at offset 0x28 (was 0x0, writing 0x0) pcieport 0000:02:02.0: restoring config space at offset 0x18 (was 0x0, writing 0x373702) pcieport 0000:02:04.0: restoring config space at offset 0x24 (was 0x10001, writing 0x49f12001) pcieport 0000:02:01.0: restoring config space at offset 0x20 (was 0x0, writing 0x73e05c00) pcieport 0000:02:00.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1) pcieport 0000:02:04.0: restoring config space at offset 0x20 (was 0x0, writing 0x89f07400) pcieport 0000:02:01.0: restoring config space at offset 0x1c (was 0x101, writing 0x5151) pcieport 0000:02:00.0: restoring config space at offset 0x20 (was 0x0, writing 0x8a008a00) pcieport 0000:02:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:04.0: restoring config space at offset 0x1c (was 0x101, writing 0x6161) pcieport 0000:02:01.0: restoring config space at offset 0x18 (was 0x0, writing 0x360402) pcieport 0000:02:00.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1) pcieport 0000:02:04.0: restoring config space at offset 0x18 (was 0x0, writing 0x6b3802) pcieport 0000:02:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:00.0: restoring config space at offset 0x18 (was 0x0, writing 0x30302) pcieport 0000:02:01.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:04.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:00.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:04.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) pcieport 0000:02:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407) xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000) ... thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
This is even worse. None of the mandatory delays are performed. If this would be S3 instead of s2idle then according to PCI FW spec 3.2 section 4.6.8. there is a specific _DSM that allows the OS to skip the delays but this platform does not provide the _DSM and does not go to S3 anyway so no firmware is involved that could already handle these delays.
In this particular Intel Coffee Lake platform these delays are not actually needed because there is an additional delay as part of the ACPI power resource that is used to turn on power to the hierarchy but since that additional delay is not required by any of standards (PCIe, ACPI) it is not present in the Intel Ice Lake, for example where missing the mandatory delays causes pciehp to start tearing down the stack too early (links are not yet trained).
For this reason, change the PCIe portdrv PM resume hooks so that they perform the mandatory delays before the downstream component gets resumed. We perform the delays before port services are resumed because otherwise pciehp might find that the link is not up (even if it is just training) and tears-down the hierarchy.
We have gotten multiple reports in Fedora that this patch has broken suspend for users of 5.1.20 and 5.2 stable kernels.
And is the issue also in 5.3-rcX kernels? If so, can we either get this reverted there, or find the fix for it?
Yes, testers have reported the issue is still present in 5.3-rc2 (vanilla upstream) and a Fedora snapshot from Thursday. https://bugzilla.kernel.org/show_bug.cgi?id=204413 was also opened.
Justin
[ Upstream commit 39e3622edeffa63c2871153d8743c5825b139968 ]
Since we changed the Tx ring handling and now depends on bit31 to figure out the owner of the descriptor, we should initialize this every time the device goes down-up instead of doing it once on driver init. If the value is not correctly initialized the device won't have any available descriptors
Changes since v1: - Typo fixes
Fixes: 35e07d234739 ("net: socionext: remove mmio reads on Tx") Signed-off-by: Ilias Apalodimas ilias.apalodimas@linaro.org Acked-by: Ard Biesheuvel ard.biesheuvel@linaro.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/socionext/netsec.c | 32 ++++++++++++++----------- 1 file changed, 18 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c index cba5881b2746..a10ef700f16d 100644 --- a/drivers/net/ethernet/socionext/netsec.c +++ b/drivers/net/ethernet/socionext/netsec.c @@ -1029,7 +1029,6 @@ static void netsec_free_dring(struct netsec_priv *priv, int id) static int netsec_alloc_dring(struct netsec_priv *priv, enum ring_id id) { struct netsec_desc_ring *dring = &priv->desc_ring[id]; - int i;
dring->vaddr = dma_alloc_coherent(priv->dev, DESC_SZ * DESC_NUM, &dring->desc_dma, GFP_KERNEL); @@ -1040,19 +1039,6 @@ static int netsec_alloc_dring(struct netsec_priv *priv, enum ring_id id) if (!dring->desc) goto err;
- if (id == NETSEC_RING_TX) { - for (i = 0; i < DESC_NUM; i++) { - struct netsec_de *de; - - de = dring->vaddr + (DESC_SZ * i); - /* de->attr is not going to be accessed by the NIC - * until netsec_set_tx_de() is called. - * No need for a dma_wmb() here - */ - de->attr = 1U << NETSEC_TX_SHIFT_OWN_FIELD; - } - } - return 0; err: netsec_free_dring(priv, id); @@ -1060,6 +1046,23 @@ static int netsec_alloc_dring(struct netsec_priv *priv, enum ring_id id) return -ENOMEM; }
+static void netsec_setup_tx_dring(struct netsec_priv *priv) +{ + struct netsec_desc_ring *dring = &priv->desc_ring[NETSEC_RING_TX]; + int i; + + for (i = 0; i < DESC_NUM; i++) { + struct netsec_de *de; + + de = dring->vaddr + (DESC_SZ * i); + /* de->attr is not going to be accessed by the NIC + * until netsec_set_tx_de() is called. + * No need for a dma_wmb() here + */ + de->attr = 1U << NETSEC_TX_SHIFT_OWN_FIELD; + } +} + static int netsec_setup_rx_dring(struct netsec_priv *priv) { struct netsec_desc_ring *dring = &priv->desc_ring[NETSEC_RING_RX]; @@ -1361,6 +1364,7 @@ static int netsec_netdev_open(struct net_device *ndev)
pm_runtime_get_sync(priv->dev);
+ netsec_setup_tx_dring(priv); ret = netsec_setup_rx_dring(priv); if (ret) { netif_err(priv, probe, priv->ndev,
[ Upstream commit 1b7aebf0487613033aff26420e32fa2076d52846 ]
cpuinfo_x86.x86_model is an unsigned type, so comparing against zero will generate a compilation warning:
arch/x86/kernel/cpu/cacheinfo.c: In function 'cacheinfo_amd_init_llc_id': arch/x86/kernel/cpu/cacheinfo.c:662:19: warning: comparison is always true \ due to limited range of data type [-Wtype-limits]
Remove the unnecessary lower bound check.
[ bp: Massage. ]
Fixes: 68091ee7ac3c ("x86/CPU/AMD: Calculate last level cache ID from number of sharing threads") Signed-off-by: Qian Cai cai@lca.pw Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Sean Christopherson sean.j.christopherson@intel.com Cc: "Gustavo A. R. Silva" gustavo@embeddedor.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: Ingo Molnar mingo@redhat.com Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Pu Wen puwen@hygon.cn Cc: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Cc: Thomas Gleixner tglx@linutronix.de Cc: x86-ml x86@kernel.org Link: https://lkml.kernel.org/r/1560954773-11967-1-git-send-email-cai@lca.pw Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/cpu/cacheinfo.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c index 395d46f78582..c7503be92f35 100644 --- a/arch/x86/kernel/cpu/cacheinfo.c +++ b/arch/x86/kernel/cpu/cacheinfo.c @@ -658,8 +658,7 @@ void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id) if (c->x86 < 0x17) { /* LLC is at the node level. */ per_cpu(cpu_llc_id, cpu) = node_id; - } else if (c->x86 == 0x17 && - c->x86_model >= 0 && c->x86_model <= 0x1F) { + } else if (c->x86 == 0x17 && c->x86_model <= 0x1F) { /* * LLC is at the core complex level. * Core complex ID is ApicId[3] for these processors.
[ Upstream commit a3fb01ba5af066521f3f3421839e501bb2c71805 ]
As is, iolatency recognizes done_bio and cleanup as ending paths. If a request is marked REQ_NOWAIT and fails to get a request, the bio is cleaned up via rq_qos_cleanup() and ended in bio_wouldblock_error(). This results in underflowing the inflight counter. Fix this by only accounting bios that were actually submitted.
Signed-off-by: Dennis Zhou dennis@kernel.org Cc: Josef Bacik josef@toxicpanda.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-iolatency.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c index d22e61bced86..c91b84bb9d0a 100644 --- a/block/blk-iolatency.c +++ b/block/blk-iolatency.c @@ -600,6 +600,10 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio) if (!blkg || !bio_flagged(bio, BIO_TRACKED)) return;
+ /* We didn't actually submit this bio, don't account it. */ + if (bio->bi_status == BLK_STS_AGAIN) + return; + iolat = blkg_to_lat(bio->bi_blkg); if (!iolat) return;
[ Upstream commit 44758bafa53602f2581a6857bb20b55d4d8ad5b2 ]
ACPI GPEs (other than the EC one) can be enabled in two situations. First, the GPEs with existing _Lxx and _Exx methods are enabled implicitly by ACPICA during system initialization. Second, the GPEs without these methods (like GPEs listed by _PRW objects for wakeup devices) need to be enabled directly by the code that is going to use them (e.g. ACPI power management or device drivers).
In the former case, if the status of a given GPE is set to start with, its handler method (either _Lxx or _Exx) needs to be invoked to take care of the events (possibly) signaled before the GPE was enabled. In the latter case, however, the first caller of acpi_enable_gpe() for a given GPE should not be expected to care about any events that might be signaled through it earlier. In that case, it is better to clear the status of the GPE before enabling it, to prevent stale events from triggering unwanted actions (like spurious system resume, for example).
For this reason, modify acpi_ev_add_gpe_reference() to take an additional boolean argument indicating whether or not the GPE status needs to be cleared when its reference counter changes from zero to one and make acpi_enable_gpe() pass TRUE to it through that new argument.
Fixes: 18996f2db918 ("ACPICA: Events: Stop unconditionally clearing ACPI IRQs during suspend/resume") Reported-by: Furquan Shaikh furquan@google.com Tested-by: Furquan Shaikh furquan@google.com Tested-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/acpica/acevents.h | 3 ++- drivers/acpi/acpica/evgpe.c | 8 +++++++- drivers/acpi/acpica/evgpeblk.c | 2 +- drivers/acpi/acpica/evxface.c | 2 +- drivers/acpi/acpica/evxfgpe.c | 2 +- 5 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/acpi/acpica/acevents.h b/drivers/acpi/acpica/acevents.h index 831660179662..c8652f91054e 100644 --- a/drivers/acpi/acpica/acevents.h +++ b/drivers/acpi/acpica/acevents.h @@ -69,7 +69,8 @@ acpi_status acpi_ev_mask_gpe(struct acpi_gpe_event_info *gpe_event_info, u8 is_masked);
acpi_status -acpi_ev_add_gpe_reference(struct acpi_gpe_event_info *gpe_event_info); +acpi_ev_add_gpe_reference(struct acpi_gpe_event_info *gpe_event_info, + u8 clear_on_enable);
acpi_status acpi_ev_remove_gpe_reference(struct acpi_gpe_event_info *gpe_event_info); diff --git a/drivers/acpi/acpica/evgpe.c b/drivers/acpi/acpica/evgpe.c index 62d3aa74277b..344feba29063 100644 --- a/drivers/acpi/acpica/evgpe.c +++ b/drivers/acpi/acpica/evgpe.c @@ -146,6 +146,7 @@ acpi_ev_mask_gpe(struct acpi_gpe_event_info *gpe_event_info, u8 is_masked) * FUNCTION: acpi_ev_add_gpe_reference * * PARAMETERS: gpe_event_info - Add a reference to this GPE + * clear_on_enable - Clear GPE status before enabling it * * RETURN: Status * @@ -155,7 +156,8 @@ acpi_ev_mask_gpe(struct acpi_gpe_event_info *gpe_event_info, u8 is_masked) ******************************************************************************/
acpi_status -acpi_ev_add_gpe_reference(struct acpi_gpe_event_info *gpe_event_info) +acpi_ev_add_gpe_reference(struct acpi_gpe_event_info *gpe_event_info, + u8 clear_on_enable) { acpi_status status = AE_OK;
@@ -170,6 +172,10 @@ acpi_ev_add_gpe_reference(struct acpi_gpe_event_info *gpe_event_info)
/* Enable on first reference */
+ if (clear_on_enable) { + (void)acpi_hw_clear_gpe(gpe_event_info); + } + status = acpi_ev_update_gpe_enable_mask(gpe_event_info); if (ACPI_SUCCESS(status)) { status = acpi_ev_enable_gpe(gpe_event_info); diff --git a/drivers/acpi/acpica/evgpeblk.c b/drivers/acpi/acpica/evgpeblk.c index 328d1d6123ad..fb15e9e2373b 100644 --- a/drivers/acpi/acpica/evgpeblk.c +++ b/drivers/acpi/acpica/evgpeblk.c @@ -453,7 +453,7 @@ acpi_ev_initialize_gpe_block(struct acpi_gpe_xrupt_info *gpe_xrupt_info, continue; }
- status = acpi_ev_add_gpe_reference(gpe_event_info); + status = acpi_ev_add_gpe_reference(gpe_event_info, FALSE); if (ACPI_FAILURE(status)) { ACPI_EXCEPTION((AE_INFO, status, "Could not enable GPE 0x%02X", diff --git a/drivers/acpi/acpica/evxface.c b/drivers/acpi/acpica/evxface.c index 3df00eb6621b..279ef0557aa3 100644 --- a/drivers/acpi/acpica/evxface.c +++ b/drivers/acpi/acpica/evxface.c @@ -971,7 +971,7 @@ acpi_remove_gpe_handler(acpi_handle gpe_device, ACPI_GPE_DISPATCH_METHOD) || (ACPI_GPE_DISPATCH_TYPE(handler->original_flags) == ACPI_GPE_DISPATCH_NOTIFY)) && handler->originally_enabled) { - (void)acpi_ev_add_gpe_reference(gpe_event_info); + (void)acpi_ev_add_gpe_reference(gpe_event_info, FALSE); if (ACPI_GPE_IS_POLLING_NEEDED(gpe_event_info)) {
/* Poll edge triggered GPEs to handle existing events */ diff --git a/drivers/acpi/acpica/evxfgpe.c b/drivers/acpi/acpica/evxfgpe.c index 30a083902f52..710488ec59e9 100644 --- a/drivers/acpi/acpica/evxfgpe.c +++ b/drivers/acpi/acpica/evxfgpe.c @@ -108,7 +108,7 @@ acpi_status acpi_enable_gpe(acpi_handle gpe_device, u32 gpe_number) if (gpe_event_info) { if (ACPI_GPE_DISPATCH_TYPE(gpe_event_info->flags) != ACPI_GPE_DISPATCH_NONE) { - status = acpi_ev_add_gpe_reference(gpe_event_info); + status = acpi_ev_add_gpe_reference(gpe_event_info, TRUE); if (ACPI_SUCCESS(status) && ACPI_GPE_IS_POLLING_NEEDED(gpe_event_info)) {
[ Upstream commit f9481b08220d7dc1ff21e296a330ee8b721b44e4 ]
at91sam9g25ek showed the following error at probe: atmel_spi f0000000.spi: Using dma0chan2 (tx) and dma0chan3 (rx) for DMA transfers atmel_spi: probe of f0000000.spi failed with error -22
Commit 0a919ae49223 ("spi: Don't call spi_get_gpio_descs() before device name is set") moved the calling of spi_get_gpio_descs() after ctrl->dev is set, but didn't move the !ctrl->num_chipselect check. When there are chip selects in the device tree, the spi-atmel driver lets the SPI core discover them when registering the SPI master. The ctrl->num_chipselect is thus expected to be set by spi_get_gpio_descs().
Move the !ctlr->num_chipselect after spi_get_gpio_descs() as it was before the aforementioned commit. While touching this block, get rid of the explicit comparison with 0 and update the commenting style.
Fixes: 0a919ae49223 ("spi: Don't call spi_get_gpio_descs() before device name is set") Signed-off-by: Tudor Ambarus tudor.ambarus@microchip.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index 5e4654032bfa..29916e446143 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -2286,11 +2286,6 @@ int spi_register_controller(struct spi_controller *ctlr) if (status) return status;
- /* even if it's just one always-selected device, there must - * be at least one chipselect - */ - if (ctlr->num_chipselect == 0) - return -EINVAL; if (ctlr->bus_num >= 0) { /* devices with a fixed bus num must check-in with the num */ mutex_lock(&board_lock); @@ -2361,6 +2356,13 @@ int spi_register_controller(struct spi_controller *ctlr) } }
+ /* + * Even if it's just one always-selected device, there must + * be at least one chipselect. + */ + if (!ctlr->num_chipselect) + return -EINVAL; + status = device_add(&ctlr->dev); if (status < 0) { /* free bus id */
[ Upstream commit 7adc05d2dc3af95e4e1534841d58f736262142cd ]
Do put_device() if device_add() fails.
[ bp: do device_del() for the successfully created devices in edac_create_csrow_objects(), on the unwind path. ]
Signed-off-by: Greg KH gregkh@linuxfoundation.org Signed-off-by: Borislav Petkov bp@suse.de Link: https://lkml.kernel.org/r/20190427214925.GE16338@kroah.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/edac/edac_mc_sysfs.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c index 464174685589..bf9273437e3f 100644 --- a/drivers/edac/edac_mc_sysfs.c +++ b/drivers/edac/edac_mc_sysfs.c @@ -443,7 +443,8 @@ static int edac_create_csrow_objects(struct mem_ctl_info *mci) csrow = mci->csrows[i]; if (!nr_pages_per_csrow(csrow)) continue; - put_device(&mci->csrows[i]->dev); + + device_del(&mci->csrows[i]->dev); }
return err; @@ -645,9 +646,11 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci, dev_set_drvdata(&dimm->dev, dimm); pm_runtime_forbid(&mci->dev);
- err = device_add(&dimm->dev); + err = device_add(&dimm->dev); + if (err) + put_device(&dimm->dev);
- edac_dbg(0, "creating rank/dimm device %s\n", dev_name(&dimm->dev)); + edac_dbg(0, "created rank/dimm device %s\n", dev_name(&dimm->dev));
return err; } @@ -928,6 +931,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci, err = device_add(&mci->dev); if (err < 0) { edac_dbg(1, "failure: create device %s\n", dev_name(&mci->dev)); + put_device(&mci->dev); goto out; }
[ Upstream commit 585fb3d93d32dbe89e718b85009f9c322cc554cd ]
In edac_create_csrow_object(), the reference to the object is not released when adding the device to the device hierarchy fails (device_add()). This may result in a memory leak.
Signed-off-by: Pan Bian bianpan2016@163.com Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: James Morse james.morse@arm.com Cc: Mauro Carvalho Chehab mchehab@kernel.org Cc: linux-edac linux-edac@vger.kernel.org Link: https://lkml.kernel.org/r/1555554438-103953-1-git-send-email-bianpan2016@163... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/edac/edac_mc_sysfs.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c index bf9273437e3f..7c01e1cc030c 100644 --- a/drivers/edac/edac_mc_sysfs.c +++ b/drivers/edac/edac_mc_sysfs.c @@ -404,6 +404,8 @@ static inline int nr_pages_per_csrow(struct csrow_info *csrow) static int edac_create_csrow_object(struct mem_ctl_info *mci, struct csrow_info *csrow, int index) { + int err; + csrow->dev.type = &csrow_attr_type; csrow->dev.groups = csrow_dev_groups; device_initialize(&csrow->dev); @@ -415,7 +417,11 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci, edac_dbg(0, "creating (virtual) csrow node %s\n", dev_name(&csrow->dev));
- return device_add(&csrow->dev); + err = device_add(&csrow->dev); + if (err) + put_device(&csrow->dev); + + return err; }
/* Create a CSROW object under specifed edac_mc_device */
[ Upstream commit 2181e455612a8db2761eabbf126640552a451e96 ]
When a shared namespace is removed, we call blk_cleanup_queue() when the device can still be accessed as the current path and this can result in submission to a dying queue. Hence, direct_make_request() called by our mpath device may fail (propagating the failure to userspace). Instead, we want to failover this I/O to a different path if one exists. Thus, before we cleanup the request queue, we make sure that the device is cleared from the current path nor it can be selected again as such.
Fix this by: - clear the ns from the head->list and synchronize rcu to make sure there is no concurrent path search that restores it as the current path - clear the mpath current path in order to trigger a subsequent path search and sync srcu to wait for any ongoing request submissions - safely continue to namespace removal and blk_cleanup_queue
Signed-off-by: Anton Eidelman anton@lightbitslabs.com Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 120fb593d1da..22c68e3b71d5 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -3344,6 +3344,14 @@ static void nvme_ns_remove(struct nvme_ns *ns) return;
nvme_fault_inject_fini(ns); + + mutex_lock(&ns->ctrl->subsys->lock); + list_del_rcu(&ns->siblings); + mutex_unlock(&ns->ctrl->subsys->lock); + synchronize_rcu(); /* guarantee not available in head->list */ + nvme_mpath_clear_current_path(ns); + synchronize_srcu(&ns->head->srcu); /* wait for concurrent submissions */ + if (ns->disk && ns->disk->flags & GENHD_FL_UP) { del_gendisk(ns->disk); blk_cleanup_queue(ns->queue); @@ -3351,16 +3359,10 @@ static void nvme_ns_remove(struct nvme_ns *ns) blk_integrity_unregister(ns->disk); }
- mutex_lock(&ns->ctrl->subsys->lock); - list_del_rcu(&ns->siblings); - nvme_mpath_clear_current_path(ns); - mutex_unlock(&ns->ctrl->subsys->lock); - down_write(&ns->ctrl->namespaces_rwsem); list_del_init(&ns->list); up_write(&ns->ctrl->namespaces_rwsem);
- synchronize_srcu(&ns->head->srcu); nvme_mpath_check_last_path(ns); nvme_put_ns(ns); }
[ Upstream commit cee6c269b016ba89c62e34d6bccb103ee2c7de4f ]
If the state change to NVME_CTRL_CONNECTING fails, the dmesg is going to be like:
[ 293.689160] nvme nvme0: failed to mark controller CONNECTING [ 293.689160] nvme nvme0: Removing after probe failure status: 0
Even it prints the first line to indicate the situation, the second line is not proper because the status is 0 which means normally success of the previous operation.
This patch makes it indicate the proper error value when it fails. [ 25.932367] nvme nvme0: failed to mark controller CONNECTING [ 25.932369] nvme nvme0: Removing after probe failure status: -16
This situation is able to be easily reproduced by: root@target:~# rmmod nvme && modprobe nvme && rmmod nvme
Signed-off-by: Minwoo Im minwoo.im.dev@gmail.com Reviewed-by: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/pci.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 524d6bd6d095..385ba7a1e23b 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2528,6 +2528,7 @@ static void nvme_reset_work(struct work_struct *work) if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_CONNECTING)) { dev_warn(dev->ctrl.device, "failed to mark controller CONNECTING\n"); + result = -EBUSY; goto out; }
[ Upstream commit e71afda49335620e3d9adf56015676db33a3bd86 ]
This patch removes the confusing assignment of the variable result at the time of declaration and sets the value in error cases next to the places where the actual error is happening.
Here we also set the result value to -ENODEV when we fail at the final ctrl state transition in nvme_reset_work(). Without this assignment result will hold 0 from nvme_setup_io_queue() and on failure 0 will be passed to he nvme_remove_dead_ctrl() from final state transition.
Signed-off-by: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/pci.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 385ba7a1e23b..544d095d44e5 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2480,11 +2480,13 @@ static void nvme_reset_work(struct work_struct *work) struct nvme_dev *dev = container_of(work, struct nvme_dev, ctrl.reset_work); bool was_suspend = !!(dev->ctrl.ctrl_config & NVME_CC_SHN_NORMAL); - int result = -ENODEV; + int result; enum nvme_ctrl_state new_state = NVME_CTRL_LIVE;
- if (WARN_ON(dev->ctrl.state != NVME_CTRL_RESETTING)) + if (WARN_ON(dev->ctrl.state != NVME_CTRL_RESETTING)) { + result = -ENODEV; goto out; + }
/* * If we're called to reset a live controller first shut it down before @@ -2589,6 +2591,7 @@ static void nvme_reset_work(struct work_struct *work) if (!nvme_change_ctrl_state(&dev->ctrl, new_state)) { dev_warn(dev->ctrl.device, "failed to mark controller state %d\n", new_state); + result = -ENODEV; goto out; }
[ Upstream commit 510fd8ea98fcb586c01aef93d87c060a159ac30a ]
bio_add_pc_page() may merge pages when a bio is padded due to a flush. Fix iteration over the bio to free the correct pages in case of a merge.
Signed-off-by: Heiner Litz hlitz@ucsc.edu Reviewed-by: Javier González javier@javigon.com Signed-off-by: Matias Bjørling mb@lightnvm.io Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/lightnvm/pblk-core.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c index 773537804319..f546e6f28b8a 100644 --- a/drivers/lightnvm/pblk-core.c +++ b/drivers/lightnvm/pblk-core.c @@ -323,14 +323,16 @@ void pblk_free_rqd(struct pblk *pblk, struct nvm_rq *rqd, int type) void pblk_bio_free_pages(struct pblk *pblk, struct bio *bio, int off, int nr_pages) { - struct bio_vec bv; - int i; - - WARN_ON(off + nr_pages != bio->bi_vcnt); - - for (i = off; i < nr_pages + off; i++) { - bv = bio->bi_io_vec[i]; - mempool_free(bv.bv_page, &pblk->page_bio_pool); + struct bio_vec *bv; + struct page *page; + int i, e, nbv = 0; + + for (i = 0; i < bio->bi_vcnt; i++) { + bv = &bio->bi_io_vec[i]; + page = bv->bv_page; + for (e = 0; e < bv->bv_len; e += PBLK_EXPOSED_PAGE_SIZE, nbv++) + if (nbv >= off) + mempool_free(page++, &pblk->page_bio_pool); } }
[ Upstream commit 2f5af4ab7de14bd35f3435e6a47300276bbb6c17 ]
With gcc 4.1:
drivers/lightnvm/core.c: In function ‘nvm_remove_tgt’: drivers/lightnvm/core.c:510: warning: ‘t’ is used uninitialized in this function
Indeed, if no NVM devices have been registered, t will be an uninitialized pointer, and may be dereferenced later. A call to nvm_remove_tgt() can be triggered from userspace by issuing the NVM_DEV_REMOVE ioctl on the lightnvm control device.
Fix this by preinitializing t to NULL.
Fixes: 843f2edbdde085b4 ("lightnvm: do not remove instance under global lock") Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Matias Bjørling mb@lightnvm.io Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/lightnvm/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c index 7d555b110ecd..a600934fdd9c 100644 --- a/drivers/lightnvm/core.c +++ b/drivers/lightnvm/core.c @@ -478,7 +478,7 @@ static void __nvm_remove_target(struct nvm_target *t, bool graceful) */ static int nvm_remove_tgt(struct nvm_ioctl_remove *remove) { - struct nvm_target *t; + struct nvm_target *t = NULL; struct nvm_dev *dev;
down_read(&nvm_lock);
[ Upstream commit dad77d63903e91a2e97a0c984cabe5d36e91ba60 ]
If the "irq_queues" are greater than num_possible_cpus(), nvme_calc_irq_sets() can have irq set_size for HCTX_TYPE_DEFAULT greater than it can be afforded. 2039 affd->set_size[HCTX_TYPE_DEFAULT] = nrirqs - nr_read_queues;
It might cause a WARN() from the irq_build_affinity_masks() like [1]: 220 if (nr_present < numvecs) 221 WARN_ON(nr_present + nr_others < numvecs);
This patch prevents it from the WARN() by adjusting the max_vector value from the nvme_setup_irqs().
[1] WARN messages when modprobe nvme write_queues=32 poll_queues=0: root@target:~/nvme# nproc 8 root@target:~/nvme# modprobe nvme write_queues=32 poll_queues=0 [ 17.925326] nvme nvme0: pci function 0000:00:04.0 [ 17.940601] WARNING: CPU: 3 PID: 1030 at kernel/irq/affinity.c:221 irq_create_affinity_masks+0x222/0x330 [ 17.940602] Modules linked in: nvme nvme_core [last unloaded: nvme] [ 17.940605] CPU: 3 PID: 1030 Comm: kworker/u17:4 Tainted: G W 5.1.0+ #156 [ 17.940605] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 [ 17.940608] Workqueue: nvme-reset-wq nvme_reset_work [nvme] [ 17.940609] RIP: 0010:irq_create_affinity_masks+0x222/0x330 [ 17.940611] Code: 4c 8d 4c 24 28 4c 8d 44 24 30 e8 c9 fa ff ff 89 44 24 18 e8 c0 38 fa ff 8b 44 24 18 44 8b 54 24 1c 5a 44 01 d0 41 39 c4 76 02 <0f> 0b 48 89 df 44 01 e5 e8 f1 ce 10 00 48 8b 34 24 44 89 f0 44 01 [ 17.940611] RSP: 0018:ffffc90002277c50 EFLAGS: 00010216 [ 17.940612] RAX: 0000000000000008 RBX: ffff88807ca48860 RCX: 0000000000000000 [ 17.940612] RDX: ffff88807bc03800 RSI: 0000000000000020 RDI: 0000000000000000 [ 17.940613] RBP: 0000000000000001 R08: ffffc90002277c78 R09: ffffc90002277c70 [ 17.940613] R10: 0000000000000008 R11: 0000000000000001 R12: 0000000000000020 [ 17.940614] R13: 0000000000025d08 R14: 0000000000000001 R15: ffff88807bc03800 [ 17.940614] FS: 0000000000000000(0000) GS:ffff88807db80000(0000) knlGS:0000000000000000 [ 17.940616] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 17.940617] CR2: 00005635e583f790 CR3: 000000000240a000 CR4: 00000000000006e0 [ 17.940617] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 17.940618] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 17.940618] Call Trace: [ 17.940622] __pci_enable_msix_range+0x215/0x540 [ 17.940623] ? kernfs_put+0x117/0x160 [ 17.940625] pci_alloc_irq_vectors_affinity+0x74/0x110 [ 17.940626] nvme_reset_work+0xc30/0x1397 [nvme] [ 17.940628] ? __switch_to_asm+0x34/0x70 [ 17.940628] ? __switch_to_asm+0x40/0x70 [ 17.940629] ? __switch_to_asm+0x34/0x70 [ 17.940630] ? __switch_to_asm+0x40/0x70 [ 17.940630] ? __switch_to_asm+0x34/0x70 [ 17.940631] ? __switch_to_asm+0x40/0x70 [ 17.940632] ? nvme_irq_check+0x30/0x30 [nvme] [ 17.940633] process_one_work+0x20b/0x3e0 [ 17.940634] worker_thread+0x1f9/0x3d0 [ 17.940635] ? cancel_delayed_work+0xa0/0xa0 [ 17.940636] kthread+0x117/0x120 [ 17.940637] ? kthread_stop+0xf0/0xf0 [ 17.940638] ret_from_fork+0x3a/0x50 [ 17.940639] ---[ end trace aca8a131361cd42a ]--- [ 17.942124] nvme nvme0: 7/1/0 default/read/poll queues
Signed-off-by: Minwoo Im minwoo.im.dev@gmail.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/pci.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 544d095d44e5..f5bc1c30cef5 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2068,6 +2068,7 @@ static int nvme_setup_irqs(struct nvme_dev *dev, unsigned int nr_io_queues) .priv = dev, }; unsigned int irq_queues, this_p_queues; + unsigned int nr_cpus = num_possible_cpus();
/* * Poll queues don't need interrupts, but we need at least one IO @@ -2078,7 +2079,10 @@ static int nvme_setup_irqs(struct nvme_dev *dev, unsigned int nr_io_queues) this_p_queues = nr_io_queues - 1; irq_queues = 1; } else { - irq_queues = nr_io_queues - this_p_queues + 1; + if (nr_cpus < nr_io_queues - this_p_queues) + irq_queues = nr_cpus + 1; + else + irq_queues = nr_io_queues - this_p_queues + 1; } dev->io_queues[HCTX_TYPE_POLL] = this_p_queues;
[ Upstream commit 9034f6251572a4744597c51dea5ab73a55f2b938 ]
For el0_dbg and el0_error, DAIF bits get explicitly cleared before calling ct_user_exit.
When context tracking is disabled, DAIF gets set (almost) immediately after. When context tracking is enabled, among the first things done is disabling IRQs.
What is actually needed is: - PSR.D = 0 so the system can be debugged (should be already the case) - PSR.A = 0 so async error can be handled during context tracking
Do not clear PSR.I in those two locations.
Reviewed-by: Marc Zyngier marc.zyngier@arm.com Acked-by: Mark Rutland mark.rutland@arm.com Reviewed-by: James Morse james.morse@arm.com Cc: Will Deacon will.deacon@arm.com Signed-off-by: Julien Thierry julien.thierry@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/entry.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 2df8d0a1d980..dbe467686332 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -859,7 +859,7 @@ el0_dbg: mov x1, x25 mov x2, sp bl do_debug_exception - enable_daif + enable_da_f ct_user_exit b ret_to_user el0_inv: @@ -911,7 +911,7 @@ el0_error_naked: enable_dbg mov x0, sp bl do_serror - enable_daif + enable_da_f ct_user_exit b ret_to_user ENDPROC(el0_error)
[ Upstream commit 597179b0ba550bd83fab1a9d57c42a9343c58514 ]
kernelci.org reports failed builds on arc because of what looks like an old missed 'select' statement:
net/xfrm/xfrm_algo.o: In function `xfrm_probe_algs': xfrm_algo.c:(.text+0x1e8): undefined reference to `crypto_has_ahash'
I don't see this in randconfig builds on other architectures, but it's fairly clear we want to select the hash code for it, like we do for all its other users. As Herbert points out, CRYPTO_BLKCIPHER is also required even though it has not popped up in build tests.
Fixes: 17bc19702221 ("ipsec: Use skcipher and ahash when probing algorithms") Signed-off-by: Arnd Bergmann arnd@arndb.de Acked-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/xfrm/Kconfig | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/xfrm/Kconfig b/net/xfrm/Kconfig index c967fc3c38c8..51bb6018f3bf 100644 --- a/net/xfrm/Kconfig +++ b/net/xfrm/Kconfig @@ -15,6 +15,8 @@ config XFRM_ALGO tristate select XFRM select CRYPTO + select CRYPTO_HASH + select CRYPTO_BLKCIPHER
if INET config XFRM_USER
[ Upstream commit a84e355ecd3ed9759d7aaa40170aab78e2a68a06 ]
There are three error return paths that don't kfree params causing a memory leak. Fix this by adding an error return path that kfree's params before returning. Also add a check to see params failed to be allocated.
Signed-off-by: Colin Ian King colin.king@canonical.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/media/davinci_vpfe/dm365_ipipe.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/staging/media/davinci_vpfe/dm365_ipipe.c b/drivers/staging/media/davinci_vpfe/dm365_ipipe.c index 30e2edc0cec5..b88855c7ffe8 100644 --- a/drivers/staging/media/davinci_vpfe/dm365_ipipe.c +++ b/drivers/staging/media/davinci_vpfe/dm365_ipipe.c @@ -1251,10 +1251,10 @@ static int ipipe_s_config(struct v4l2_subdev *sd, struct vpfe_ipipe_config *cfg) struct vpfe_ipipe_device *ipipe = v4l2_get_subdevdata(sd); unsigned int i; int rval = 0; + struct ipipe_module_params *params;
for (i = 0; i < ARRAY_SIZE(ipipe_modules); i++) { const struct ipipe_module_if *module_if; - struct ipipe_module_params *params; void *from, *to; size_t size;
@@ -1265,25 +1265,30 @@ static int ipipe_s_config(struct v4l2_subdev *sd, struct vpfe_ipipe_config *cfg) from = *(void **)((void *)cfg + module_if->config_offset);
params = kmalloc(sizeof(*params), GFP_KERNEL); + if (!params) + return -ENOMEM; to = (void *)params + module_if->param_offset; size = module_if->param_size;
if (to && from && size) { if (copy_from_user(to, (void __user *)from, size)) { rval = -EFAULT; - break; + goto error_free; } rval = module_if->set(ipipe, to); if (rval) - goto error; + goto error_free; } else if (to && !from && size) { rval = module_if->set(ipipe, NULL); if (rval) - goto error; + goto error_free; } kfree(params); } -error: + return rval; + +error_free: + kfree(params); return rval; }
[ Upstream commit cf47a0b882a4e5f6b34c7949d7b293e9287f1972 ]
syzkaller reports for memory leak when registering hooks [1]
As we moved the nf_unregister_net_hooks() call into __ip_vs_dev_cleanup(), defer the nf_register_net_hooks() call, so that hooks are allocated and freed from same pernet_operations (ipvs_core_dev_ops).
[1] BUG: memory leak unreferenced object 0xffff88810acd8a80 (size 96): comm "syz-executor073", pid 7254, jiffies 4294950560 (age 22.250s) hex dump (first 32 bytes): 02 00 00 00 00 00 00 00 50 8b bb 82 ff ff ff ff ........P....... 00 00 00 00 00 00 00 00 00 77 bb 82 ff ff ff ff .........w...... backtrace: [<0000000013db61f1>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<0000000013db61f1>] slab_post_alloc_hook mm/slab.h:439 [inline] [<0000000013db61f1>] slab_alloc_node mm/slab.c:3269 [inline] [<0000000013db61f1>] kmem_cache_alloc_node_trace+0x15b/0x2a0 mm/slab.c:3597 [<000000001a27307d>] __do_kmalloc_node mm/slab.c:3619 [inline] [<000000001a27307d>] __kmalloc_node+0x38/0x50 mm/slab.c:3627 [<0000000025054add>] kmalloc_node include/linux/slab.h:590 [inline] [<0000000025054add>] kvmalloc_node+0x4a/0xd0 mm/util.c:431 [<0000000050d1bc00>] kvmalloc include/linux/mm.h:637 [inline] [<0000000050d1bc00>] kvzalloc include/linux/mm.h:645 [inline] [<0000000050d1bc00>] allocate_hook_entries_size+0x3b/0x60 net/netfilter/core.c:61 [<00000000e8abe142>] nf_hook_entries_grow+0xae/0x270 net/netfilter/core.c:128 [<000000004b94797c>] __nf_register_net_hook+0x9a/0x170 net/netfilter/core.c:337 [<00000000d1545cbc>] nf_register_net_hook+0x34/0xc0 net/netfilter/core.c:464 [<00000000876c9b55>] nf_register_net_hooks+0x53/0xc0 net/netfilter/core.c:480 [<000000002ea868e0>] __ip_vs_init+0xe8/0x170 net/netfilter/ipvs/ip_vs_core.c:2280 [<000000002eb2d451>] ops_init+0x4c/0x140 net/core/net_namespace.c:130 [<000000000284ec48>] setup_net+0xde/0x230 net/core/net_namespace.c:316 [<00000000a70600fa>] copy_net_ns+0xf0/0x1e0 net/core/net_namespace.c:439 [<00000000ff26c15e>] create_new_namespaces+0x141/0x2a0 kernel/nsproxy.c:107 [<00000000b103dc79>] copy_namespaces+0xa1/0xe0 kernel/nsproxy.c:165 [<000000007cc008a2>] copy_process.part.0+0x11fd/0x2150 kernel/fork.c:2035 [<00000000c344af7c>] copy_process kernel/fork.c:1800 [inline] [<00000000c344af7c>] _do_fork+0x121/0x4f0 kernel/fork.c:2369
Reported-by: syzbot+722da59ccb264bc19910@syzkaller.appspotmail.com Fixes: 719c7d563c17 ("ipvs: Fix use-after-free in ip_vs_in") Signed-off-by: Julian Anastasov ja@ssi.bg Acked-by: Simon Horman horms@verge.net.au Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipvs/ip_vs_core.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c index 7138556b206b..d5103a9eb302 100644 --- a/net/netfilter/ipvs/ip_vs_core.c +++ b/net/netfilter/ipvs/ip_vs_core.c @@ -2245,7 +2245,6 @@ static const struct nf_hook_ops ip_vs_ops[] = { static int __net_init __ip_vs_init(struct net *net) { struct netns_ipvs *ipvs; - int ret;
ipvs = net_generic(net, ip_vs_net_id); if (ipvs == NULL) @@ -2277,17 +2276,11 @@ static int __net_init __ip_vs_init(struct net *net) if (ip_vs_sync_net_init(ipvs) < 0) goto sync_fail;
- ret = nf_register_net_hooks(net, ip_vs_ops, ARRAY_SIZE(ip_vs_ops)); - if (ret < 0) - goto hook_fail; - return 0; /* * Error handling */
-hook_fail: - ip_vs_sync_net_cleanup(ipvs); sync_fail: ip_vs_conn_net_cleanup(ipvs); conn_fail: @@ -2317,6 +2310,19 @@ static void __net_exit __ip_vs_cleanup(struct net *net) net->ipvs = NULL; }
+static int __net_init __ip_vs_dev_init(struct net *net) +{ + int ret; + + ret = nf_register_net_hooks(net, ip_vs_ops, ARRAY_SIZE(ip_vs_ops)); + if (ret < 0) + goto hook_fail; + return 0; + +hook_fail: + return ret; +} + static void __net_exit __ip_vs_dev_cleanup(struct net *net) { struct netns_ipvs *ipvs = net_ipvs(net); @@ -2336,6 +2342,7 @@ static struct pernet_operations ipvs_core_ops = { };
static struct pernet_operations ipvs_core_dev_ops = { + .init = __ip_vs_dev_init, .exit = __ip_vs_dev_cleanup, };
[ Upstream commit e08efef8fe7db87206314c19b341612c719f891a ]
Since the beginning the second clock ('special', 'sclk') was optional and it is not available on some variants of Exynos SoCs (i.e. Exynos5420 with v7 of MFC hardware).
However commit 1bce6fb3edf1 ("[media] s5p-mfc: Rework clock handling") made handling of all specified clocks mandatory. This patch restores original behavior of the driver and fixes its operation on Exynos5420 SoCs.
Fixes: 1bce6fb3edf1 ("[media] s5p-mfc: Rework clock handling") Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/s5p-mfc/s5p_mfc_pm.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/media/platform/s5p-mfc/s5p_mfc_pm.c b/drivers/media/platform/s5p-mfc/s5p_mfc_pm.c index 2e62f8721fa5..7d52431c2c83 100644 --- a/drivers/media/platform/s5p-mfc/s5p_mfc_pm.c +++ b/drivers/media/platform/s5p-mfc/s5p_mfc_pm.c @@ -34,6 +34,11 @@ int s5p_mfc_init_pm(struct s5p_mfc_dev *dev) for (i = 0; i < pm->num_clocks; i++) { pm->clocks[i] = devm_clk_get(pm->device, pm->clk_names[i]); if (IS_ERR(pm->clocks[i])) { + /* additional clocks are optional */ + if (i && PTR_ERR(pm->clocks[i]) == -ENOENT) { + pm->clocks[i] = NULL; + continue; + } mfc_err("Failed to get clock: %s\n", pm->clk_names[i]); return PTR_ERR(pm->clocks[i]);
[ Upstream commit b2ce5617dad254230551feda3599f2cc68e53ad8 ]
When building with CONFIG_VIDEO_ADV7511 and CONFIG_DRM_I2C_ADV7511 enabled as loadable modules, we see the following warning:
drivers/gpu/drm/bridge/adv7511/adv7511.ko drivers/media/i2c/adv7511.ko
Rework so that the file is named adv7511-v4l2.c.
Signed-off-by: Anders Roxell anders.roxell@linaro.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/i2c/Makefile | 2 +- drivers/media/i2c/{adv7511.c => adv7511-v4l2.c} | 5 +++++ 2 files changed, 6 insertions(+), 1 deletion(-) rename drivers/media/i2c/{adv7511.c => adv7511-v4l2.c} (99%)
diff --git a/drivers/media/i2c/Makefile b/drivers/media/i2c/Makefile index d8ad9dad495d..fd4ea86dedd5 100644 --- a/drivers/media/i2c/Makefile +++ b/drivers/media/i2c/Makefile @@ -35,7 +35,7 @@ obj-$(CONFIG_VIDEO_ADV748X) += adv748x/ obj-$(CONFIG_VIDEO_ADV7604) += adv7604.o obj-$(CONFIG_VIDEO_ADV7842) += adv7842.o obj-$(CONFIG_VIDEO_AD9389B) += ad9389b.o -obj-$(CONFIG_VIDEO_ADV7511) += adv7511.o +obj-$(CONFIG_VIDEO_ADV7511) += adv7511-v4l2.o obj-$(CONFIG_VIDEO_VPX3220) += vpx3220.o obj-$(CONFIG_VIDEO_VS6624) += vs6624.o obj-$(CONFIG_VIDEO_BT819) += bt819.o diff --git a/drivers/media/i2c/adv7511.c b/drivers/media/i2c/adv7511-v4l2.c similarity index 99% rename from drivers/media/i2c/adv7511.c rename to drivers/media/i2c/adv7511-v4l2.c index cec5ebb1c9e6..2ad6bdf1a9fc 100644 --- a/drivers/media/i2c/adv7511.c +++ b/drivers/media/i2c/adv7511-v4l2.c @@ -5,6 +5,11 @@ * Copyright 2013 Cisco Systems, Inc. and/or its affiliates. All rights reserved. */
+/* + * This file is named adv7511-v4l2.c so it doesn't conflict with the Analog + * Device ADV7511 (config fragment CONFIG_DRM_I2C_ADV7511). + */ +
#include <linux/kernel.h> #include <linux/module.h>
[ Upstream commit d897a4ab11dc8a9fda50d2eccc081a96a6385998 ]
Don't allow the TAI-UTC offset of the system clock to be set by adjtimex() to a value larger than 100000 seconds.
This prevents an overflow in the conversion to int, prevents the CLOCK_TAI clock from getting too far ahead of the CLOCK_REALTIME clock, and it is still large enough to allow leap seconds to be inserted at the maximum rate currently supported by the kernel (once per day) for the next ~270 years, however unlikely it is that someone can survive a catastrophic event which slowed down the rotation of the Earth so much.
Reported-by: Weikang shi swkhack@gmail.com Signed-off-by: Miroslav Lichvar mlichvar@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: John Stultz john.stultz@linaro.org Cc: Prarit Bhargava prarit@redhat.com Cc: Richard Cochran richardcochran@gmail.com Cc: Stephen Boyd sboyd@kernel.org Link: https://lkml.kernel.org/r/20190618154713.20929-1-mlichvar@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/time/ntp.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c index 8de4f789dc1b..65eb796610dc 100644 --- a/kernel/time/ntp.c +++ b/kernel/time/ntp.c @@ -43,6 +43,7 @@ static u64 tick_length_base; #define MAX_TICKADJ 500LL /* usecs */ #define MAX_TICKADJ_SCALED \ (((MAX_TICKADJ * NSEC_PER_USEC) << NTP_SCALE_SHIFT) / NTP_INTERVAL_FREQ) +#define MAX_TAI_OFFSET 100000
/* * phase-lock loop variables @@ -691,7 +692,8 @@ static inline void process_adjtimex_modes(const struct __kernel_timex *txc, time_constant = max(time_constant, 0l); }
- if (txc->modes & ADJ_TAI && txc->constant >= 0) + if (txc->modes & ADJ_TAI && + txc->constant >= 0 && txc->constant <= MAX_TAI_OFFSET) *time_tai = txc->constant;
if (txc->modes & ADJ_OFFSET)
[ Upstream commit a9314773a91a1d3b36270085246a6715a326ff00 ]
With CONFIG_PROC_FS=n the following warning is emitted:
kernel/time/timer_list.c:361:36: warning: unused variable 'timer_list_sops' [-Wunused-const-variable] static const struct seq_operations timer_list_sops = {
Add #ifdef guard around procfs specific code.
Signed-off-by: Nathan Huckleberry nhuck@google.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Nick Desaulniers ndesaulniers@google.com Cc: john.stultz@linaro.org Cc: sboyd@kernel.org Cc: clang-built-linux@googlegroups.com Link: https://github.com/ClangBuiltLinux/linux/issues/534 Link: https://lkml.kernel.org/r/20190614181604.112297-1-nhuck@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/time/timer_list.c | 36 +++++++++++++++++++----------------- 1 file changed, 19 insertions(+), 17 deletions(-)
diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c index 98ba50dcb1b2..acb326f5f50a 100644 --- a/kernel/time/timer_list.c +++ b/kernel/time/timer_list.c @@ -282,23 +282,6 @@ static inline void timer_list_header(struct seq_file *m, u64 now) SEQ_printf(m, "\n"); }
-static int timer_list_show(struct seq_file *m, void *v) -{ - struct timer_list_iter *iter = v; - - if (iter->cpu == -1 && !iter->second_pass) - timer_list_header(m, iter->now); - else if (!iter->second_pass) - print_cpu(m, iter->cpu, iter->now); -#ifdef CONFIG_GENERIC_CLOCKEVENTS - else if (iter->cpu == -1 && iter->second_pass) - timer_list_show_tickdevices_header(m); - else - print_tickdevice(m, tick_get_device(iter->cpu), iter->cpu); -#endif - return 0; -} - void sysrq_timer_list_show(void) { u64 now = ktime_to_ns(ktime_get()); @@ -317,6 +300,24 @@ void sysrq_timer_list_show(void) return; }
+#ifdef CONFIG_PROC_FS +static int timer_list_show(struct seq_file *m, void *v) +{ + struct timer_list_iter *iter = v; + + if (iter->cpu == -1 && !iter->second_pass) + timer_list_header(m, iter->now); + else if (!iter->second_pass) + print_cpu(m, iter->cpu, iter->now); +#ifdef CONFIG_GENERIC_CLOCKEVENTS + else if (iter->cpu == -1 && iter->second_pass) + timer_list_show_tickdevices_header(m); + else + print_tickdevice(m, tick_get_device(iter->cpu), iter->cpu); +#endif + return 0; +} + static void *move_iter(struct timer_list_iter *iter, loff_t offset) { for (; offset; offset--) { @@ -376,3 +377,4 @@ static int __init init_timer_list_procfs(void) return 0; } __initcall(init_timer_list_procfs); +#endif
[ Upstream commit 8d4e29a51a954b43e06d916772fa4f50b7e5bbd6 ]
In the patch refactoring the fw-node, the mt9m111 was broken for all platform_data based platforms, which were the first aim of this driver. Only the devicetree platform are still functional, probably because the testing was done on these.
The result is that -EINVAL is systematically return for such platforms, what this patch fixes.
[Sakari Ailus: Rework this to resolve a merge conflict and use dev_fwnode]
Fixes: 98480d65c48c ("media: mt9m111: allow to setup pixclk polarity") Signed-off-by: Robert Jarzmik robert.jarzmik@free.fr Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/i2c/mt9m111.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/media/i2c/mt9m111.c b/drivers/media/i2c/mt9m111.c index 362c3b93636e..5a642b5ad076 100644 --- a/drivers/media/i2c/mt9m111.c +++ b/drivers/media/i2c/mt9m111.c @@ -1245,9 +1245,11 @@ static int mt9m111_probe(struct i2c_client *client, if (!mt9m111) return -ENOMEM;
- ret = mt9m111_probe_fw(client, mt9m111); - if (ret) - return ret; + if (dev_fwnode(&client->dev)) { + ret = mt9m111_probe_fw(client, mt9m111); + if (ret) + return ret; + }
mt9m111->clk = v4l2_clk_get(&client->dev, "mclk"); if (IS_ERR(mt9m111->clk))
[ Upstream commit b545542a0b866f7975254e41c595836e9bc0ff2f ]
commit 34ac3c3eb8f0c07 ("ASoC: core: lock client_mutex while removing link components") added mutex_lock() at soc_remove_link_components().
Is is called from snd_soc_unbind_card()
snd_soc_unbind_card() => soc_remove_link_components() soc_cleanup_card_resources() soc_remove_dai_links() => soc_remove_link_components()
And, there are 2 way to call it.
(1) snd_soc_unregister_component() ** mutex_lock() snd_soc_component_del_unlocked() => snd_soc_unbind_card() ** mutex_unlock()
(2) snd_soc_unregister_card() => snd_soc_unbind_card()
(1) case is already using mutex_lock() when it calles snd_soc_unbind_card(), thus, we will get lockdep warning.
commit 495f926c68ddb90 ("ASoC: core: Fix deadlock in snd_soc_instantiate_card()") tried to fixup it, but still not enough. We still have lockdep warning when we try unbind/bind.
We need mutex_lock() under snd_soc_unregister_card() instead of snd_remove_link_components()/snd_soc_unbind_card().
Fixes: 34ac3c3eb8f0c07 ("ASoC: core: lock client_mutex while removing link components") Fixes: 495f926c68ddb90 ("ASoC: core: Fix deadlock in snd_soc_instantiate_card()") Signed-off-by: Kuninori Morimoto kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/soc-core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c index 41c0cfaf2db5..9138fcb15cd3 100644 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -2837,14 +2837,12 @@ static void snd_soc_unbind_card(struct snd_soc_card *card, bool unregister) snd_soc_dapm_shutdown(card); snd_soc_flush_all_delayed_work(card);
- mutex_lock(&client_mutex); /* remove all components used by DAI links on this card */ for_each_comp_order(order) { for_each_card_rtds(card, rtd) { soc_remove_link_components(card, rtd, order); } } - mutex_unlock(&client_mutex);
soc_cleanup_card_resources(card); if (!unregister) @@ -2863,7 +2861,9 @@ static void snd_soc_unbind_card(struct snd_soc_card *card, bool unregister) */ int snd_soc_unregister_card(struct snd_soc_card *card) { + mutex_lock(&client_mutex); snd_soc_unbind_card(card, true); + mutex_unlock(&client_mutex); dev_dbg(card->dev, "ASoC: Unregistered card '%s'\n", card->name);
return 0;
[ Upstream commit 2af22f3ec3ca452f1e79b967f634708ff01ced8a ]
Some Qualcomm Snapdragon based laptops built to run Microsoft Windows are clearly ACPI 5.1 based, given that that is the first ACPI revision that supports ARM, and introduced the FADT 'arm_boot_flags' field, which has a non-zero field on those systems.
So in these cases, infer from the ARM boot flags that the FADT must be 5.1 or later, and treat it as 5.1.
Acked-by: Sudeep Holla sudeep.holla@arm.com Tested-by: Lee Jones lee.jones@linaro.org Reviewed-by: Graeme Gregory graeme.gregory@linaro.org Acked-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Acked-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/acpi.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c index 2804330c95dc..3a58e9db5cfe 100644 --- a/arch/arm64/kernel/acpi.c +++ b/arch/arm64/kernel/acpi.c @@ -152,10 +152,14 @@ static int __init acpi_fadt_sanity_check(void) */ if (table->revision < 5 || (table->revision == 5 && fadt->minor_revision < 1)) { - pr_err("Unsupported FADT revision %d.%d, should be 5.1+\n", + pr_err(FW_BUG "Unsupported FADT revision %d.%d, should be 5.1+\n", table->revision, fadt->minor_revision); - ret = -EINVAL; - goto out; + + if (!fadt->arm_boot_flags) { + ret = -EINVAL; + goto out; + } + pr_err("FADT has ARM boot flags set, assuming 5.1\n"); }
if (!(fadt->flags & ACPI_FADT_HW_REDUCED)) {
[ Upstream commit 56d159a4ec6d8da7313aac6fcbb95d8fffe689ba ]
Sequence number handling assumed that the BIT processor frame number starts counting at 1, but this is not true for the MPEG-2 decoder, which starts at 0. Fix the sequence counter offset detection to handle this.
Signed-off-by: Philipp Zabel p.zabel@pengutronix.de Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/coda/coda-bit.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/media/platform/coda/coda-bit.c b/drivers/media/platform/coda/coda-bit.c index 976f6aa69f41..1eeed34f300d 100644 --- a/drivers/media/platform/coda/coda-bit.c +++ b/drivers/media/platform/coda/coda-bit.c @@ -1739,6 +1739,7 @@ static int __coda_start_decoding(struct coda_ctx *ctx) v4l2_err(&dev->v4l2_dev, "CODA_COMMAND_SEQ_INIT timeout\n"); return ret; } + ctx->sequence_offset = ~0U; ctx->initialized = 1;
/* Update kfifo out pointer from coda bitstream read pointer */ @@ -2151,7 +2152,9 @@ static void coda_finish_decode(struct coda_ctx *ctx) v4l2_err(&dev->v4l2_dev, "decoded frame index out of range: %d\n", decoded_idx); } else { - val = coda_read(dev, CODA_RET_DEC_PIC_FRAME_NUM) - 1; + val = coda_read(dev, CODA_RET_DEC_PIC_FRAME_NUM); + if (ctx->sequence_offset == -1) + ctx->sequence_offset = val; val -= ctx->sequence_offset; spin_lock(&ctx->buffer_meta_lock); if (!list_empty(&ctx->buffer_meta_list)) {
[ Upstream commit f3775f89852d167990b0d718587774cf00d22ac2 ]
coda_encoder_cmd() is racy, as the last scheduled picture run worker can still be in-flight while the ENC_CMD_STOP command is issued. Depending on the exact timing the sequence numbers might already be changed, but the last buffer might not have been put on the destination queue yet.
In this case the current implementation would prematurely wake the destination queue with last_buffer_dequeued=true, causing userspace to call streamoff before the last buffer is handled.
Close this race window by synchronizing with the pic_run_worker before doing the sequence check.
Signed-off-by: Marco Felsch m.felsch@pengutronix.de [l.stach@pengutronix.de: switch to flush_work, reword commit message] Signed-off-by: Lucas Stach l.stach@pengutronix.de Signed-off-by: Philipp Zabel p.zabel@pengutronix.de Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/coda/coda-common.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/media/platform/coda/coda-common.c b/drivers/media/platform/coda/coda-common.c index 6238047273f2..68a585d3af91 100644 --- a/drivers/media/platform/coda/coda-common.c +++ b/drivers/media/platform/coda/coda-common.c @@ -1024,6 +1024,8 @@ static int coda_encoder_cmd(struct file *file, void *fh, /* Set the stream-end flag on this context */ ctx->bit_stream_param |= CODA_BIT_STREAM_END_FLAG;
+ flush_work(&ctx->pic_run_work); + /* If there is no buffer in flight, wake up */ if (!ctx->streamon_out || ctx->qsequence == ctx->osequence) { dst_vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx,
[ Upstream commit b3b7d96817cdb8b6fc353867705275dce8f41ccc ]
If no more frames are decoded in bitstream end mode, and a previously decoded frame has been returned, the firmware still increments the frame number. To avoid a sequence number mismatch after decoder restart, increment the sequence_offset correction parameter.
Signed-off-by: Philipp Zabel p.zabel@pengutronix.de Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/coda/coda-bit.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/media/platform/coda/coda-bit.c b/drivers/media/platform/coda/coda-bit.c index 1eeed34f300d..8c9743e067cf 100644 --- a/drivers/media/platform/coda/coda-bit.c +++ b/drivers/media/platform/coda/coda-bit.c @@ -2147,6 +2147,9 @@ static void coda_finish_decode(struct coda_ctx *ctx) else if (ctx->display_idx < 0) ctx->hold = true; } else if (decoded_idx == -2) { + if (ctx->display_idx >= 0 && + ctx->display_idx < ctx->num_internal_frames) + ctx->sequence_offset++; /* no frame was decoded, we still return remaining buffers */ } else if (decoded_idx < 0 || decoded_idx >= ctx->num_internal_frames) { v4l2_err(&dev->v4l2_dev,
[ Upstream commit 77ae46e11df5c96bb4582633851f838f5d954df4 ]
v4l2_fill_pixfmt() returns -EINVAL if the pixelformat used as parameter is invalid or if the user is trying to use a multiplanar format with the singleplanar API. Currently, the vimc_cap_try_fmt_vid_cap() returns such value, but vimc_cap_s_fmt_vid_cap() is ignoring it. Fix that and returns an error value if vimc_cap_try_fmt_vid_cap() has failed.
Signed-off-by: André Almeida andrealmeid@collabora.com Suggested-by: Helen Koike helen.koike@collabora.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/vimc/vimc-capture.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/media/platform/vimc/vimc-capture.c b/drivers/media/platform/vimc/vimc-capture.c index 946dc0908566..664855708fdf 100644 --- a/drivers/media/platform/vimc/vimc-capture.c +++ b/drivers/media/platform/vimc/vimc-capture.c @@ -142,12 +142,15 @@ static int vimc_cap_s_fmt_vid_cap(struct file *file, void *priv, struct v4l2_format *f) { struct vimc_cap_device *vcap = video_drvdata(file); + int ret;
/* Do not change the format while stream is on */ if (vb2_is_busy(&vcap->queue)) return -EBUSY;
- vimc_cap_try_fmt_vid_cap(file, priv, f); + ret = vimc_cap_try_fmt_vid_cap(file, priv, f); + if (ret) + return ret;
dev_dbg(vcap->dev, "%s: format update: " "old:%dx%d (0x%x, %d, %d, %d, %d) "
[ Upstream commit 6bc5a4a1927556ff9adce1aa95ea408c95453225 ]
This driver has three locking issues:
- The wait_event_interruptible() condition calls hdpvr_get_next_buffer(dev) which uses a mutex, which is not allowed. Rewrite with list_empty_careful() that doesn't need locking.
- In hdpvr_read() the call to hdpvr_stop_streaming() didn't lock io_mutex, but it should have since stop_streaming expects that.
- In hdpvr_device_release() io_mutex was locked when calling flush_work(), but there it shouldn't take that mutex since the work done by flush_work() also wants to lock that mutex.
There are also two other changes (suggested by Keith):
- msecs_to_jiffies(4000); (a NOP) should have been msleep(4000). - Change v4l2_dbg to v4l2_info to always log if streaming had to be restarted.
Reported-by: Keith Pyle kpyle@austin.rr.com Suggested-by: Keith Pyle kpyle@austin.rr.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/usb/hdpvr/hdpvr-video.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/drivers/media/usb/hdpvr/hdpvr-video.c b/drivers/media/usb/hdpvr/hdpvr-video.c index 7580fc5f2f12..6a6405b80797 100644 --- a/drivers/media/usb/hdpvr/hdpvr-video.c +++ b/drivers/media/usb/hdpvr/hdpvr-video.c @@ -435,7 +435,7 @@ static ssize_t hdpvr_read(struct file *file, char __user *buffer, size_t count, /* wait for the first buffer */ if (!(file->f_flags & O_NONBLOCK)) { if (wait_event_interruptible(dev->wait_data, - hdpvr_get_next_buffer(dev))) + !list_empty_careful(&dev->rec_buff_list))) return -ERESTARTSYS; }
@@ -461,10 +461,17 @@ static ssize_t hdpvr_read(struct file *file, char __user *buffer, size_t count, goto err; } if (!err) { - v4l2_dbg(MSG_INFO, hdpvr_debug, &dev->v4l2_dev, - "timeout: restart streaming\n"); + v4l2_info(&dev->v4l2_dev, + "timeout: restart streaming\n"); + mutex_lock(&dev->io_mutex); hdpvr_stop_streaming(dev); - msecs_to_jiffies(4000); + mutex_unlock(&dev->io_mutex); + /* + * The FW needs about 4 seconds after streaming + * stopped before it is ready to restart + * streaming. + */ + msleep(4000); err = hdpvr_start_streaming(dev); if (err) { ret = err; @@ -1127,9 +1134,7 @@ static void hdpvr_device_release(struct video_device *vdev) struct hdpvr_device *dev = video_get_drvdata(vdev);
hdpvr_delete(dev); - mutex_lock(&dev->io_mutex); flush_work(&dev->worker); - mutex_unlock(&dev->io_mutex);
v4l2_device_unregister(&dev->v4l2_dev); v4l2_ctrl_handler_free(&dev->hdl);
[ Upstream commit 0fec7e72ae1391bb2d7527efb54fe6ae88acabce ]
The PHY selection bit also exists on SoCs without an internal PHY; if it's set to 1 (internal PHY, default value) then the MAC will not make use of any PHY on such SoCs.
This problem appears when adapting for H6, which has no real internal PHY (the "internal PHY" on H6 is not on-die, but on a co-packaged AC200 chip, connected via RMII interface at GPIO bank A).
Force the PHY selection bit to 0 when the SOC doesn't have an internal PHY, to address the problem of a wrong default value.
Signed-off-by: Icenowy Zheng icenowy@aosc.io Signed-off-by: Ondrej Jirman megous@megous.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c index a69c34f605b1..98a15ba8be9f 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c @@ -884,6 +884,11 @@ static int sun8i_dwmac_set_syscon(struct stmmac_priv *priv) * address. No need to mask it again. */ reg |= 1 << H3_EPHY_ADDR_SHIFT; + } else { + /* For SoCs without internal PHY the PHY selection bit should be + * set to 0 (external PHY). + */ + reg &= ~H3_EPHY_SELECT; }
if (!of_property_read_u32(node, "allwinner,tx-delay-ps", &val)) {
[ Upstream commit 6c0ed66f1a5b84e2a812c7c2d6571a5621bf3396 ]
rtl_usb_probe() must do error handle rtl_deinit_core() only if rtl_init_core() is done, otherwise goto error_out2.
| usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0 | rtl_usb: reg 0xf0, usbctrl_vendorreq TimeOut! status:0xffffffb9 value=0x0 | rtl8192cu: Chip version 0x10 | rtl_usb: reg 0xa, usbctrl_vendorreq TimeOut! status:0xffffffb9 value=0x0 | rtl_usb: Too few input end points found | INFO: trying to register non-static key. | the code is fine but needs lockdep annotation. | turning off the locking correctness validator. | CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.1.0-rc4-319354-g9a33b36 #3 | Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS | Google 01/01/2011 | Workqueue: usb_hub_wq hub_event | Call Trace: | __dump_stack lib/dump_stack.c:77 [inline] | dump_stack+0xe8/0x16e lib/dump_stack.c:113 | assign_lock_key kernel/locking/lockdep.c:786 [inline] | register_lock_class+0x11b8/0x1250 kernel/locking/lockdep.c:1095 | __lock_acquire+0xfb/0x37c0 kernel/locking/lockdep.c:3582 | lock_acquire+0x10d/0x2f0 kernel/locking/lockdep.c:4211 | __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] | _raw_spin_lock_irqsave+0x44/0x60 kernel/locking/spinlock.c:152 | rtl_c2hcmd_launcher+0xd1/0x390 | drivers/net/wireless/realtek/rtlwifi/base.c:2344 | rtl_deinit_core+0x25/0x2d0 drivers/net/wireless/realtek/rtlwifi/base.c:574 | rtl_usb_probe.cold+0x861/0xa70 | drivers/net/wireless/realtek/rtlwifi/usb.c:1093 | usb_probe_interface+0x31d/0x820 drivers/usb/core/driver.c:361 | really_probe+0x2da/0xb10 drivers/base/dd.c:509 | driver_probe_device+0x21d/0x350 drivers/base/dd.c:671 | __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778 | bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454 | __device_attach+0x223/0x3a0 drivers/base/dd.c:844 | bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514 | device_add+0xad2/0x16e0 drivers/base/core.c:2106 | usb_set_configuration+0xdf7/0x1740 drivers/usb/core/message.c:2021 | generic_probe+0xa2/0xda drivers/usb/core/generic.c:210 | usb_probe_device+0xc0/0x150 drivers/usb/core/driver.c:266 | really_probe+0x2da/0xb10 drivers/base/dd.c:509 | driver_probe_device+0x21d/0x350 drivers/base/dd.c:671 | __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778 | bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454 | __device_attach+0x223/0x3a0 drivers/base/dd.c:844 | bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514 | device_add+0xad2/0x16e0 drivers/base/core.c:2106 | usb_new_device.cold+0x537/0xccf drivers/usb/core/hub.c:2534 | hub_port_connect drivers/usb/core/hub.c:5089 [inline] | hub_port_connect_change drivers/usb/core/hub.c:5204 [inline] | port_event drivers/usb/core/hub.c:5350 [inline] | hub_event+0x138e/0x3b00 drivers/usb/core/hub.c:5432 | process_one_work+0x90f/0x1580 kernel/workqueue.c:2269 | worker_thread+0x9b/0xe20 kernel/workqueue.c:2415 | kthread+0x313/0x420 kernel/kthread.c:253 | ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
Reported-by: syzbot+1fcc5ef45175fc774231@syzkaller.appspotmail.com Signed-off-by: Ping-Ke Shih pkshih@realtek.com Acked-by: Larry Finger Larry.Finger@lwfinger.net Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/realtek/rtlwifi/usb.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtlwifi/usb.c b/drivers/net/wireless/realtek/rtlwifi/usb.c index e24fda5e9087..34d68dbf4b4c 100644 --- a/drivers/net/wireless/realtek/rtlwifi/usb.c +++ b/drivers/net/wireless/realtek/rtlwifi/usb.c @@ -1064,13 +1064,13 @@ int rtl_usb_probe(struct usb_interface *intf, rtlpriv->cfg->ops->read_eeprom_info(hw); err = _rtl_usb_init(hw); if (err) - goto error_out; + goto error_out2; rtl_usb_init_sw(hw); /* Init mac80211 sw */ err = rtl_init_core(hw); if (err) { pr_err("Can't allocate sw for mac80211\n"); - goto error_out; + goto error_out2; } if (rtlpriv->cfg->ops->init_sw_vars(hw)) { pr_err("Can't init_sw_vars\n"); @@ -1091,6 +1091,7 @@ int rtl_usb_probe(struct usb_interface *intf,
error_out: rtl_deinit_core(hw); +error_out2: _rtl_usb_io_handler_release(hw); usb_put_dev(udev); complete(&rtlpriv->firmware_loading_complete);
[ Upstream commit 4079e8ccabc3b6d1b503f2376123cb515d14921f ]
Do not schedule rx_tasklet when the usb dongle is disconnected. Moreover do not grub rx_lock in mt7601u_kill_rx since usb_poison_urb can run concurrently with urb completion and we can unlink urbs from rx ring in any order. This patch fixes the common kernel warning reported when the device is removed.
[ 24.921354] usb 3-14: USB disconnect, device number 7 [ 24.921593] ------------[ cut here ]------------ [ 24.921594] RX urb mismatch [ 24.921675] WARNING: CPU: 4 PID: 163 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0xcb/0xd0 [mt7601u] [ 24.921769] CPU: 4 PID: 163 Comm: kworker/4:2 Tainted: G OE 4.19.31-041931-generic #201903231635 [ 24.921770] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z97 Extreme4, BIOS P1.30 05/23/2014 [ 24.921782] Workqueue: usb_hub_wq hub_event [ 24.921797] RIP: 0010:mt7601u_complete_rx+0xcb/0xd0 [mt7601u] [ 24.921800] RSP: 0018:ffff9bd9cfd03d08 EFLAGS: 00010086 [ 24.921802] RAX: 0000000000000000 RBX: ffff9bd9bf043540 RCX: 0000000000000006 [ 24.921803] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff9bd9cfd16420 [ 24.921804] RBP: ffff9bd9cfd03d28 R08: 0000000000000002 R09: 00000000000003a8 [ 24.921805] R10: 0000002f485fca34 R11: 0000000000000000 R12: ffff9bd9bf043c1c [ 24.921806] R13: ffff9bd9c62fa3c0 R14: 0000000000000082 R15: 0000000000000000 [ 24.921807] FS: 0000000000000000(0000) GS:ffff9bd9cfd00000(0000) knlGS:0000000000000000 [ 24.921808] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 24.921808] CR2: 00007fb2648b0000 CR3: 0000000142c0a004 CR4: 00000000001606e0 [ 24.921809] Call Trace: [ 24.921812] <IRQ> [ 24.921819] __usb_hcd_giveback_urb+0x8b/0x140 [ 24.921821] usb_hcd_giveback_urb+0xca/0xe0 [ 24.921828] xhci_giveback_urb_in_irq.isra.42+0x82/0xf0 [ 24.921834] handle_cmd_completion+0xe02/0x10d0 [ 24.921837] xhci_irq+0x274/0x4a0 [ 24.921838] xhci_msi_irq+0x11/0x20 [ 24.921851] __handle_irq_event_percpu+0x44/0x190 [ 24.921856] handle_irq_event_percpu+0x32/0x80 [ 24.921861] handle_irq_event+0x3b/0x5a [ 24.921867] handle_edge_irq+0x80/0x190 [ 24.921874] handle_irq+0x20/0x30 [ 24.921889] do_IRQ+0x4e/0xe0 [ 24.921891] common_interrupt+0xf/0xf [ 24.921892] </IRQ> [ 24.921900] RIP: 0010:usb_hcd_flush_endpoint+0x78/0x180 [ 24.921354] usb 3-14: USB disconnect, device number 7
Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mediatek/mt7601u/dma.c | 33 +++++++++++---------- 1 file changed, 18 insertions(+), 15 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt7601u/dma.c b/drivers/net/wireless/mediatek/mt7601u/dma.c index 66d60283e456..0faa3db6fde4 100644 --- a/drivers/net/wireless/mediatek/mt7601u/dma.c +++ b/drivers/net/wireless/mediatek/mt7601u/dma.c @@ -185,10 +185,23 @@ static void mt7601u_complete_rx(struct urb *urb) struct mt7601u_rx_queue *q = &dev->rx_q; unsigned long flags;
- spin_lock_irqsave(&dev->rx_lock, flags); + /* do no schedule rx tasklet if urb has been unlinked + * or the device has been removed + */ + switch (urb->status) { + case -ECONNRESET: + case -ESHUTDOWN: + case -ENOENT: + return; + default: + dev_err_ratelimited(dev->dev, "rx urb failed: %d\n", + urb->status); + /* fall through */ + case 0: + break; + }
- if (mt7601u_urb_has_error(urb)) - dev_err(dev->dev, "Error: RX urb failed:%d\n", urb->status); + spin_lock_irqsave(&dev->rx_lock, flags); if (WARN_ONCE(q->e[q->end].urb != urb, "RX urb mismatch")) goto out;
@@ -355,19 +368,9 @@ int mt7601u_dma_enqueue_tx(struct mt7601u_dev *dev, struct sk_buff *skb, static void mt7601u_kill_rx(struct mt7601u_dev *dev) { int i; - unsigned long flags;
- spin_lock_irqsave(&dev->rx_lock, flags); - - for (i = 0; i < dev->rx_q.entries; i++) { - int next = dev->rx_q.end; - - spin_unlock_irqrestore(&dev->rx_lock, flags); - usb_poison_urb(dev->rx_q.e[next].urb); - spin_lock_irqsave(&dev->rx_lock, flags); - } - - spin_unlock_irqrestore(&dev->rx_lock, flags); + for (i = 0; i < dev->rx_q.entries; i++) + usb_poison_urb(dev->rx_q.e[i].urb); }
static int mt7601u_submit_rx_buf(struct mt7601u_dev *dev,
[ Upstream commit bc53d3d777f81385c1bb08b07bd1c06450ecc2c1 ]
Without 'set -e', shell scripts continue running even after any error occurs. The missed 'set -e' is a typical bug in shell scripting.
For example, when a disk space shortage occurs while this script is running, it actually ends up with generating a truncated capflags.c.
Yet, mkcapflags.sh continues running and exits with 0. So, the build system assumes it has succeeded.
It will not be re-generated in the next invocation of Make since its timestamp is newer than that of any of the source files.
Add 'set -e' so that any error in this script is caught and propagated to the build system.
Since 9c2af1c7377a ("kbuild: add .DELETE_ON_ERROR special target"), make automatically deletes the target on any failure. So, the broken capflags.c will be deleted automatically.
Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: "H. Peter Anvin" hpa@zytor.com Cc: Borislav Petkov bp@alien8.de Link: https://lkml.kernel.org/r/20190625072622.17679-1-yamada.masahiro@socionext.c... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/cpu/mkcapflags.sh | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/x86/kernel/cpu/mkcapflags.sh b/arch/x86/kernel/cpu/mkcapflags.sh index d0dfb892c72f..aed45b8895d5 100644 --- a/arch/x86/kernel/cpu/mkcapflags.sh +++ b/arch/x86/kernel/cpu/mkcapflags.sh @@ -4,6 +4,8 @@ # Generate the x86_cap/bug_flags[] arrays from include/asm/cpufeatures.h #
+set -e + IN=$1 OUT=$2
[ Upstream commit 23377c200b2eb48a60d0f228b2a2e75ed6ee6060 ]
When the device is disconnected while passing traffic it is possible to receive out of order urbs causing a memory leak since the skb linked to the current tx urb is not removed. Fix the issue deallocating the skb cleaning up the tx ring. Moreover this patch fixes the following kernel warning
[ 57.480771] usb 1-1: USB disconnect, device number 2 [ 57.483451] ------------[ cut here ]------------ [ 57.483462] TX urb mismatch [ 57.483481] WARNING: CPU: 1 PID: 32 at drivers/net/wireless/mediatek/mt7601u/dma.c:245 mt7601u_complete_tx+0x165/00 [ 57.483483] Modules linked in: [ 57.483496] CPU: 1 PID: 32 Comm: kworker/1:1 Not tainted 5.2.0-rc1+ #72 [ 57.483498] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-2.fc30 04/01/2014 [ 57.483502] Workqueue: usb_hub_wq hub_event [ 57.483507] RIP: 0010:mt7601u_complete_tx+0x165/0x1e0 [ 57.483510] Code: 8b b5 10 04 00 00 8b 8d 14 04 00 00 eb 8b 80 3d b1 cb e1 00 00 75 9e 48 c7 c7 a4 ea 05 82 c6 05 f [ 57.483513] RSP: 0000:ffffc900000a0d28 EFLAGS: 00010092 [ 57.483516] RAX: 000000000000000f RBX: ffff88802c0a62c0 RCX: ffffc900000a0c2c [ 57.483518] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff810a8371 [ 57.483520] RBP: ffff88803ced6858 R08: 0000000000000000 R09: 0000000000000001 [ 57.483540] R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000046 [ 57.483542] R13: ffff88802c0a6c88 R14: ffff88803baab540 R15: ffff88803a0cc078 [ 57.483548] FS: 0000000000000000(0000) GS:ffff88803eb00000(0000) knlGS:0000000000000000 [ 57.483550] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 57.483552] CR2: 000055e7f6780100 CR3: 0000000028c86000 CR4: 00000000000006a0 [ 57.483554] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 57.483556] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 57.483559] Call Trace: [ 57.483561] <IRQ> [ 57.483565] __usb_hcd_giveback_urb+0x77/0xe0 [ 57.483570] xhci_giveback_urb_in_irq.isra.0+0x8b/0x140 [ 57.483574] handle_cmd_completion+0xf5b/0x12c0 [ 57.483577] xhci_irq+0x1f6/0x1810 [ 57.483581] ? lockdep_hardirqs_on+0x9e/0x180 [ 57.483584] ? _raw_spin_unlock_irq+0x24/0x30 [ 57.483588] __handle_irq_event_percpu+0x3a/0x260 [ 57.483592] handle_irq_event_percpu+0x1c/0x60 [ 57.483595] handle_irq_event+0x2f/0x4c [ 57.483599] handle_edge_irq+0x7e/0x1a0 [ 57.483603] handle_irq+0x17/0x20 [ 57.483607] do_IRQ+0x54/0x110 [ 57.483610] common_interrupt+0xf/0xf [ 57.483612] </IRQ>
Acked-by: Jakub Kicinski kubakici@wp.pl Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mediatek/mt7601u/dma.c | 21 ++++++++++++++++----- drivers/net/wireless/mediatek/mt7601u/tx.c | 4 ++-- 2 files changed, 18 insertions(+), 7 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt7601u/dma.c b/drivers/net/wireless/mediatek/mt7601u/dma.c index 0faa3db6fde4..f6a0454abe04 100644 --- a/drivers/net/wireless/mediatek/mt7601u/dma.c +++ b/drivers/net/wireless/mediatek/mt7601u/dma.c @@ -233,14 +233,25 @@ static void mt7601u_complete_tx(struct urb *urb) struct sk_buff *skb; unsigned long flags;
- spin_lock_irqsave(&dev->tx_lock, flags); + switch (urb->status) { + case -ECONNRESET: + case -ESHUTDOWN: + case -ENOENT: + return; + default: + dev_err_ratelimited(dev->dev, "tx urb failed: %d\n", + urb->status); + /* fall through */ + case 0: + break; + }
- if (mt7601u_urb_has_error(urb)) - dev_err(dev->dev, "Error: TX urb failed:%d\n", urb->status); + spin_lock_irqsave(&dev->tx_lock, flags); if (WARN_ONCE(q->e[q->start].urb != urb, "TX urb mismatch")) goto out;
skb = q->e[q->start].skb; + q->e[q->start].skb = NULL; trace_mt_tx_dma_done(dev, skb);
__skb_queue_tail(&dev->tx_skb_done, skb); @@ -440,10 +451,10 @@ static void mt7601u_free_tx_queue(struct mt7601u_tx_queue *q) { int i;
- WARN_ON(q->used); - for (i = 0; i < q->entries; i++) { usb_poison_urb(q->e[i].urb); + if (q->e[i].skb) + mt7601u_tx_status(q->dev, q->e[i].skb); usb_free_urb(q->e[i].urb); } } diff --git a/drivers/net/wireless/mediatek/mt7601u/tx.c b/drivers/net/wireless/mediatek/mt7601u/tx.c index 906e19c5f628..f3dff8319a4c 100644 --- a/drivers/net/wireless/mediatek/mt7601u/tx.c +++ b/drivers/net/wireless/mediatek/mt7601u/tx.c @@ -109,9 +109,9 @@ void mt7601u_tx_status(struct mt7601u_dev *dev, struct sk_buff *skb) info->status.rates[0].idx = -1; info->flags |= IEEE80211_TX_STAT_ACK;
- spin_lock(&dev->mac_lock); + spin_lock_bh(&dev->mac_lock); ieee80211_tx_status(dev->hw, skb); - spin_unlock(&dev->mac_lock); + spin_unlock_bh(&dev->mac_lock); }
static int mt7601u_skb_rooms(struct mt7601u_dev *dev, struct sk_buff *skb)
[ Upstream commit 5db7c8b9f9fc2aeec671ae3ca6375752c162e0e7 ]
syzkaller reports for memory leak in start_sync_thread [1]
As Eric points out, kthread may start and stop before the threadfn function is called, so there is no chance the data (tinfo in our case) to be released in thread.
Fix this by releasing tinfo in the controlling code instead.
[1] BUG: memory leak unreferenced object 0xffff8881206bf700 (size 32): comm "syz-executor761", pid 7268, jiffies 4294943441 (age 20.470s) hex dump (first 32 bytes): 00 40 7c 09 81 88 ff ff 80 45 b8 21 81 88 ff ff .@|......E.!.... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000057619e23>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<0000000057619e23>] slab_post_alloc_hook mm/slab.h:439 [inline] [<0000000057619e23>] slab_alloc mm/slab.c:3326 [inline] [<0000000057619e23>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553 [<0000000086ce5479>] kmalloc include/linux/slab.h:547 [inline] [<0000000086ce5479>] start_sync_thread+0x5d2/0xe10 net/netfilter/ipvs/ip_vs_sync.c:1862 [<000000001a9229cc>] do_ip_vs_set_ctl+0x4c5/0x780 net/netfilter/ipvs/ip_vs_ctl.c:2402 [<00000000ece457c8>] nf_sockopt net/netfilter/nf_sockopt.c:106 [inline] [<00000000ece457c8>] nf_setsockopt+0x4c/0x80 net/netfilter/nf_sockopt.c:115 [<00000000942f62d4>] ip_setsockopt net/ipv4/ip_sockglue.c:1258 [inline] [<00000000942f62d4>] ip_setsockopt+0x9b/0xb0 net/ipv4/ip_sockglue.c:1238 [<00000000a56a8ffd>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616 [<00000000fa895401>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3130 [<0000000095eef4cf>] __sys_setsockopt+0x98/0x120 net/socket.c:2078 [<000000009747cf88>] __do_sys_setsockopt net/socket.c:2089 [inline] [<000000009747cf88>] __se_sys_setsockopt net/socket.c:2086 [inline] [<000000009747cf88>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086 [<00000000ded8ba80>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<00000000893b4ac8>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported-by: syzbot+7e2e50c8adfccd2e5041@syzkaller.appspotmail.com Suggested-by: Eric Biggers ebiggers@kernel.org Fixes: 998e7a76804b ("ipvs: Use kthread_run() instead of doing a double-fork via kernel_thread()") Signed-off-by: Julian Anastasov ja@ssi.bg Acked-by: Simon Horman horms@verge.net.au Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/ip_vs.h | 6 +- net/netfilter/ipvs/ip_vs_ctl.c | 4 - net/netfilter/ipvs/ip_vs_sync.c | 134 +++++++++++++++++--------------- 3 files changed, 76 insertions(+), 68 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h index 2ac40135b576..b36a1df93e7c 100644 --- a/include/net/ip_vs.h +++ b/include/net/ip_vs.h @@ -808,11 +808,12 @@ struct ipvs_master_sync_state { struct ip_vs_sync_buff *sync_buff; unsigned long sync_queue_len; unsigned int sync_queue_delay; - struct task_struct *master_thread; struct delayed_work master_wakeup_work; struct netns_ipvs *ipvs; };
+struct ip_vs_sync_thread_data; + /* How much time to keep dests in trash */ #define IP_VS_DEST_TRASH_PERIOD (120 * HZ)
@@ -943,7 +944,8 @@ struct netns_ipvs { spinlock_t sync_lock; struct ipvs_master_sync_state *ms; spinlock_t sync_buff_lock; - struct task_struct **backup_threads; + struct ip_vs_sync_thread_data *master_tinfo; + struct ip_vs_sync_thread_data *backup_tinfo; int threads_mask; volatile int sync_state; struct mutex sync_mutex; diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index 776c87ed4813..741d91aa4a8d 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -2396,9 +2396,7 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len) cfg.syncid = dm->syncid; ret = start_sync_thread(ipvs, &cfg, dm->state); } else { - mutex_lock(&ipvs->sync_mutex); ret = stop_sync_thread(ipvs, dm->state); - mutex_unlock(&ipvs->sync_mutex); } goto out_dec; } @@ -3515,10 +3513,8 @@ static int ip_vs_genl_del_daemon(struct netns_ipvs *ipvs, struct nlattr **attrs) if (!attrs[IPVS_DAEMON_ATTR_STATE]) return -EINVAL;
- mutex_lock(&ipvs->sync_mutex); ret = stop_sync_thread(ipvs, nla_get_u32(attrs[IPVS_DAEMON_ATTR_STATE])); - mutex_unlock(&ipvs->sync_mutex); return ret; }
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c index 2526be6b3d90..a4a78c4b06de 100644 --- a/net/netfilter/ipvs/ip_vs_sync.c +++ b/net/netfilter/ipvs/ip_vs_sync.c @@ -195,6 +195,7 @@ union ip_vs_sync_conn { #define IPVS_OPT_F_PARAM (1 << (IPVS_OPT_PARAM-1))
struct ip_vs_sync_thread_data { + struct task_struct *task; struct netns_ipvs *ipvs; struct socket *sock; char *buf; @@ -374,8 +375,11 @@ static inline void sb_queue_tail(struct netns_ipvs *ipvs, max(IPVS_SYNC_SEND_DELAY, 1)); ms->sync_queue_len++; list_add_tail(&sb->list, &ms->sync_queue); - if ((++ms->sync_queue_delay) == IPVS_SYNC_WAKEUP_RATE) - wake_up_process(ms->master_thread); + if ((++ms->sync_queue_delay) == IPVS_SYNC_WAKEUP_RATE) { + int id = (int)(ms - ipvs->ms); + + wake_up_process(ipvs->master_tinfo[id].task); + } } else ip_vs_sync_buff_release(sb); spin_unlock(&ipvs->sync_lock); @@ -1636,8 +1640,10 @@ static void master_wakeup_work_handler(struct work_struct *work) spin_lock_bh(&ipvs->sync_lock); if (ms->sync_queue_len && ms->sync_queue_delay < IPVS_SYNC_WAKEUP_RATE) { + int id = (int)(ms - ipvs->ms); + ms->sync_queue_delay = IPVS_SYNC_WAKEUP_RATE; - wake_up_process(ms->master_thread); + wake_up_process(ipvs->master_tinfo[id].task); } spin_unlock_bh(&ipvs->sync_lock); } @@ -1703,10 +1709,6 @@ static int sync_thread_master(void *data) if (sb) ip_vs_sync_buff_release(sb);
- /* release the sending multicast socket */ - sock_release(tinfo->sock); - kfree(tinfo); - return 0; }
@@ -1740,11 +1742,6 @@ static int sync_thread_backup(void *data) } }
- /* release the sending multicast socket */ - sock_release(tinfo->sock); - kfree(tinfo->buf); - kfree(tinfo); - return 0; }
@@ -1752,8 +1749,8 @@ static int sync_thread_backup(void *data) int start_sync_thread(struct netns_ipvs *ipvs, struct ipvs_sync_daemon_cfg *c, int state) { - struct ip_vs_sync_thread_data *tinfo = NULL; - struct task_struct **array = NULL, *task; + struct ip_vs_sync_thread_data *ti = NULL, *tinfo; + struct task_struct *task; struct net_device *dev; char *name; int (*threadfn)(void *data); @@ -1822,7 +1819,7 @@ int start_sync_thread(struct netns_ipvs *ipvs, struct ipvs_sync_daemon_cfg *c, threadfn = sync_thread_master; } else if (state == IP_VS_STATE_BACKUP) { result = -EEXIST; - if (ipvs->backup_threads) + if (ipvs->backup_tinfo) goto out_early;
ipvs->bcfg = *c; @@ -1849,28 +1846,22 @@ int start_sync_thread(struct netns_ipvs *ipvs, struct ipvs_sync_daemon_cfg *c, master_wakeup_work_handler); ms->ipvs = ipvs; } - } else { - array = kcalloc(count, sizeof(struct task_struct *), - GFP_KERNEL); - result = -ENOMEM; - if (!array) - goto out; } + result = -ENOMEM; + ti = kcalloc(count, sizeof(struct ip_vs_sync_thread_data), + GFP_KERNEL); + if (!ti) + goto out;
for (id = 0; id < count; id++) { - result = -ENOMEM; - tinfo = kmalloc(sizeof(*tinfo), GFP_KERNEL); - if (!tinfo) - goto out; + tinfo = &ti[id]; tinfo->ipvs = ipvs; - tinfo->sock = NULL; if (state == IP_VS_STATE_BACKUP) { + result = -ENOMEM; tinfo->buf = kmalloc(ipvs->bcfg.sync_maxlen, GFP_KERNEL); if (!tinfo->buf) goto out; - } else { - tinfo->buf = NULL; } tinfo->id = id; if (state == IP_VS_STATE_MASTER) @@ -1885,17 +1876,15 @@ int start_sync_thread(struct netns_ipvs *ipvs, struct ipvs_sync_daemon_cfg *c, result = PTR_ERR(task); goto out; } - tinfo = NULL; - if (state == IP_VS_STATE_MASTER) - ipvs->ms[id].master_thread = task; - else - array[id] = task; + tinfo->task = task; }
/* mark as active */
- if (state == IP_VS_STATE_BACKUP) - ipvs->backup_threads = array; + if (state == IP_VS_STATE_MASTER) + ipvs->master_tinfo = ti; + else + ipvs->backup_tinfo = ti; spin_lock_bh(&ipvs->sync_buff_lock); ipvs->sync_state |= state; spin_unlock_bh(&ipvs->sync_buff_lock); @@ -1910,29 +1899,31 @@ int start_sync_thread(struct netns_ipvs *ipvs, struct ipvs_sync_daemon_cfg *c,
out: /* We do not need RTNL lock anymore, release it here so that - * sock_release below and in the kthreads can use rtnl_lock - * to leave the mcast group. + * sock_release below can use rtnl_lock to leave the mcast group. */ rtnl_unlock(); - count = id; - while (count-- > 0) { - if (state == IP_VS_STATE_MASTER) - kthread_stop(ipvs->ms[count].master_thread); - else - kthread_stop(array[count]); + id = min(id, count - 1); + if (ti) { + for (tinfo = ti + id; tinfo >= ti; tinfo--) { + if (tinfo->task) + kthread_stop(tinfo->task); + } } if (!(ipvs->sync_state & IP_VS_STATE_MASTER)) { kfree(ipvs->ms); ipvs->ms = NULL; } mutex_unlock(&ipvs->sync_mutex); - if (tinfo) { - if (tinfo->sock) - sock_release(tinfo->sock); - kfree(tinfo->buf); - kfree(tinfo); + + /* No more mutexes, release socks */ + if (ti) { + for (tinfo = ti + id; tinfo >= ti; tinfo--) { + if (tinfo->sock) + sock_release(tinfo->sock); + kfree(tinfo->buf); + } + kfree(ti); } - kfree(array); return result;
out_early: @@ -1944,15 +1935,18 @@ int start_sync_thread(struct netns_ipvs *ipvs, struct ipvs_sync_daemon_cfg *c,
int stop_sync_thread(struct netns_ipvs *ipvs, int state) { - struct task_struct **array; + struct ip_vs_sync_thread_data *ti, *tinfo; int id; int retc = -EINVAL;
IP_VS_DBG(7, "%s(): pid %d\n", __func__, task_pid_nr(current));
+ mutex_lock(&ipvs->sync_mutex); if (state == IP_VS_STATE_MASTER) { + retc = -ESRCH; if (!ipvs->ms) - return -ESRCH; + goto err; + ti = ipvs->master_tinfo;
/* * The lock synchronizes with sb_queue_tail(), so that we don't @@ -1971,38 +1965,56 @@ int stop_sync_thread(struct netns_ipvs *ipvs, int state) struct ipvs_master_sync_state *ms = &ipvs->ms[id]; int ret;
+ tinfo = &ti[id]; pr_info("stopping master sync thread %d ...\n", - task_pid_nr(ms->master_thread)); + task_pid_nr(tinfo->task)); cancel_delayed_work_sync(&ms->master_wakeup_work); - ret = kthread_stop(ms->master_thread); + ret = kthread_stop(tinfo->task); if (retc >= 0) retc = ret; } kfree(ipvs->ms); ipvs->ms = NULL; + ipvs->master_tinfo = NULL; } else if (state == IP_VS_STATE_BACKUP) { - if (!ipvs->backup_threads) - return -ESRCH; + retc = -ESRCH; + if (!ipvs->backup_tinfo) + goto err; + ti = ipvs->backup_tinfo;
ipvs->sync_state &= ~IP_VS_STATE_BACKUP; - array = ipvs->backup_threads; retc = 0; for (id = ipvs->threads_mask; id >= 0; id--) { int ret;
+ tinfo = &ti[id]; pr_info("stopping backup sync thread %d ...\n", - task_pid_nr(array[id])); - ret = kthread_stop(array[id]); + task_pid_nr(tinfo->task)); + ret = kthread_stop(tinfo->task); if (retc >= 0) retc = ret; } - kfree(array); - ipvs->backup_threads = NULL; + ipvs->backup_tinfo = NULL; + } else { + goto err; } + id = ipvs->threads_mask; + mutex_unlock(&ipvs->sync_mutex); + + /* No more mutexes, release socks */ + for (tinfo = ti + id; tinfo >= ti; tinfo--) { + if (tinfo->sock) + sock_release(tinfo->sock); + kfree(tinfo->buf); + } + kfree(ti);
/* decrease the module use count */ ip_vs_use_count_dec(); + return retc;
+err: + mutex_unlock(&ipvs->sync_mutex); return retc; }
@@ -2021,7 +2033,6 @@ void ip_vs_sync_net_cleanup(struct netns_ipvs *ipvs) { int retc;
- mutex_lock(&ipvs->sync_mutex); retc = stop_sync_thread(ipvs, IP_VS_STATE_MASTER); if (retc && retc != -ESRCH) pr_err("Failed to stop Master Daemon\n"); @@ -2029,5 +2040,4 @@ void ip_vs_sync_net_cleanup(struct netns_ipvs *ipvs) retc = stop_sync_thread(ipvs, IP_VS_STATE_BACKUP); if (retc && retc != -ESRCH) pr_err("Failed to stop Backup Daemon\n"); - mutex_unlock(&ipvs->sync_mutex); }
[ Upstream commit 2dcb79cde6129d948a237ef7b48a73a0c82f1e01 ]
Fix following crash that occurs when the driver is processing rx packets while the device is not initialized yet
$ rmmod mt7615e [ 67.210261] mt7615e 0000:01:00.0: Message -239 (seq 2) timeout $ modprobe mt7615e [ 72.406937] bus=0x1, slot = 0x0, irq=0x16 [ 72.436590] CPU 0 Unable to handle kernel paging request at virtual address 00000004, epc == 8eec4240, ra == 8eec41e0 [ 72.450291] mt7615e 0000:01:00.0: Firmware is not ready for download [ 72.457724] Oops[#1]: [ 72.470494] mt7615e: probe of 0000:01:00.0 failed with error -5 [ 72.474829] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.114 #0 [ 72.498702] task: 805769e0 task.stack: 80564000 [ 72.507709] $ 0 : 00000000 00000001 00000000 00000001 [ 72.518106] $ 4 : 8f704dbc 00000000 00000000 8f7046c0 [ 72.528500] $ 8 : 00000024 8045e98c 81210008 11000000 [ 72.538895] $12 : 8fc09f60 00000008 00000019 00000033 [ 72.549289] $16 : 8f704d80 e00000ff 8f0c7800 3c182406 [ 72.559684] $20 : 00000006 8ee615a0 4e000108 00000000 [ 72.570078] $24 : 0000004c 8000cf94 [ 72.580474] $28 : 80564000 8fc09e38 00000001 8eec41e0 [ 72.590869] Hi : 00000001 [ 72.596582] Lo : 00000000 [ 72.602319] epc : 8eec4240 mt7615_mac_fill_rx+0xac/0x494 [mt7615e] [ 72.614953] ra : 8eec41e0 mt7615_mac_fill_rx+0x4c/0x494 [mt7615e] [ 72.627580] Status: 11008403 KERNEL EXL IE [ 72.635899] Cause : 40800008 (ExcCode 02) [ 72.643860] BadVA : 00000004 [ 72.649573] PrId : 0001992f (MIPS 1004Kc) [ 72.657704] Modules linked in: mt7615e pppoe ppp_async pppox ppp_generic nf_conntrack_ipv6 mt76x2e mt76x2_common mt76x02_lib mt7603e mt76 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_mu] [ 72.792717] Process swapper/0 (pid: 0, threadinfo=80564000, task=805769e0, tls=00000000) [ 72.808799] Stack : 8f0c7800 00000800 8f0c7800 8032b874 00000000 40000000 8f704d80 8ee615a0 [ 72.825428] 8dc88010 00000001 8ee615e0 8eec09b0 8dc88010 8032b914 8f3aee80 80567d20 [ 72.842055] 00000000 8ee615e0 40000000 8f0c7800 00000108 8eec9944 00000000 00000000 [ 72.858682] 80508f10 80510000 00000001 80567d20 8ee615a0 00000000 00000000 8ee61c00 [ 72.875308] 8ee61c40 00000040 80610000 80580000 00000000 8ee615dc 8ee61a68 00000001 [ 72.891936] ... [ 72.896793] Call Trace: [ 72.901649] [<8eec4240>] mt7615_mac_fill_rx+0xac/0x494 [mt7615e] [ 72.913602] [<8eec09b0>] mt7615_queue_rx_skb+0xe4/0x12c [mt7615e] [ 72.925734] [<8eec9944>] mt76_dma_cleanup+0x390/0x42c [mt76] [ 72.936988] Code: ae020018 8ea20004 24030001 <94420004> a602002a 8ea20004 90420000 14430003 a2020034 [ 72.956390] [ 72.959676] ---[ end trace f176967739edb19f ]---
Fixes: 04b8e65922f6 ("mt76: add mac80211 driver for MT7615 PCIe-based chipsets") Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mediatek/mt76/mt7615/mac.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/mac.c b/drivers/net/wireless/mediatek/mt76/mt7615/mac.c index b8f48d10f27a..a27bc6791aa7 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7615/mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt7615/mac.c @@ -96,6 +96,9 @@ int mt7615_mac_fill_rx(struct mt7615_dev *dev, struct sk_buff *skb) bool unicast, remove_pad, insert_ccmp_hdr = false; int i, idx;
+ if (!test_bit(MT76_STATE_RUNNING, &dev->mt76.state)) + return -EINVAL; + memset(status, 0, sizeof(*status));
unicast = (rxd1 & MT_RXD1_NORMAL_ADDR_TYPE) == MT_RXD1_NORMAL_U2M;
[ Upstream commit 4b553f3ca4cbde67399aa3a756c37eb92145b8a1 ]
In function ath10k_sdio_mbox_rx_alloc() [sdio.c], ath10k_sdio_mbox_alloc_rx_pkt() is called without handling the error cases. This will make the driver think the allocation for skb is successful and try to access the skb. If we enable failslab, system will easily crash with NULL pointer dereferencing.
Call trace of CONFIG_FAILSLAB: ath10k_sdio_irq_handler+0x570/0xa88 [ath10k_sdio] process_sdio_pending_irqs+0x4c/0x174 sdio_run_irqs+0x3c/0x64 sdio_irq_work+0x1c/0x28
Fixes: d96db25d2025 ("ath10k: add initial SDIO support") Signed-off-by: Claire Chang tientzu@chromium.org Reviewed-by: Brian Norris briannorris@chromium.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/sdio.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c index fae56c67766f..73ef3e75d199 100644 --- a/drivers/net/wireless/ath/ath10k/sdio.c +++ b/drivers/net/wireless/ath/ath10k/sdio.c @@ -602,6 +602,10 @@ static int ath10k_sdio_mbox_rx_alloc(struct ath10k *ar, full_len, last_in_bundle, last_in_bundle); + if (ret) { + ath10k_warn(ar, "alloc_rx_pkt error %d\n", ret); + goto err; + } }
ar_sdio->n_rx_pkts = i;
[ Upstream commit 08d80e4cd27ba19f9bee9e5f788f9a9fc440a22f ]
On SMP platform, when continuously running wifi up/down, the napi poll can be scheduled during chip reset, which will call ath10k_pci_has_fw_crashed() to check the fw status. But in the reset period, the value from FW_INDICATOR_ADDRESS register will return 0xdeadbeef, which also be treated as fw crash. Fix the issue by moving chip reset after napi disabled.
ath10k_pci 0000:01:00.0: firmware crashed! (guid 73b30611-5b1e-4bdd-90b4-64c81eb947b6) ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe ath10k_pci 0000:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal otp max-sta 512 raw 0 hwcrypto 1 ath10k_pci 0000:01:00.0: failed to get memcpy hi address for firmware address 4: -16 ath10k_pci 0000:01:00.0: failed to read firmware dump area: -16 ath10k_pci 0000:01:00.0: Copy Engine register dump: ath10k_pci 0000:01:00.0: [00]: 0x0004a000 0 0 0 0 ath10k_pci 0000:01:00.0: [01]: 0x0004a400 0 0 0 0 ath10k_pci 0000:01:00.0: [02]: 0x0004a800 0 0 0 0 ath10k_pci 0000:01:00.0: [03]: 0x0004ac00 0 0 0 0 ath10k_pci 0000:01:00.0: [04]: 0x0004b000 0 0 0 0 ath10k_pci 0000:01:00.0: [05]: 0x0004b400 0 0 0 0 ath10k_pci 0000:01:00.0: [06]: 0x0004b800 0 0 0 0 ath10k_pci 0000:01:00.0: [07]: 0x0004bc00 1 0 1 0 ath10k_pci 0000:01:00.0: [08]: 0x0004c000 0 0 0 0 ath10k_pci 0000:01:00.0: [09]: 0x0004c400 0 0 0 0 ath10k_pci 0000:01:00.0: [10]: 0x0004c800 0 0 0 0 ath10k_pci 0000:01:00.0: [11]: 0x0004cc00 0 0 0 0
Tested HW: QCA9984,QCA9887,WCN3990
Signed-off-by: Miaoqing Pan miaoqing@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/pci.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c index 2c27f407a851..6e5f7ae00253 100644 --- a/drivers/net/wireless/ath/ath10k/pci.c +++ b/drivers/net/wireless/ath/ath10k/pci.c @@ -2059,6 +2059,11 @@ static void ath10k_pci_hif_stop(struct ath10k *ar)
ath10k_dbg(ar, ATH10K_DBG_BOOT, "boot hif stop\n");
+ ath10k_pci_irq_disable(ar); + ath10k_pci_irq_sync(ar); + napi_synchronize(&ar->napi); + napi_disable(&ar->napi); + /* Most likely the device has HTT Rx ring configured. The only way to * prevent the device from accessing (and possible corrupting) host * memory is to reset the chip now. @@ -2072,10 +2077,6 @@ static void ath10k_pci_hif_stop(struct ath10k *ar) */ ath10k_pci_safe_chip_reset(ar);
- ath10k_pci_irq_disable(ar); - ath10k_pci_irq_sync(ar); - napi_synchronize(&ar->napi); - napi_disable(&ar->napi); ath10k_pci_flush(ar);
spin_lock_irqsave(&ar_pci->ps_lock, flags);
[ Upstream commit 011d4111c8c602ea829fa4917af1818eb0500a90 ]
Observed PCIE device wake up failed after ~120 iterations of soft-reboot test. The error message is "ath10k_pci 0000:01:00.0: failed to wake up device : -110"
The call trace as below: ath10k_pci_probe -> ath10k_pci_force_wake -> ath10k_pci_wake_wait -> ath10k_pci_is_awake
Once trigger the device to wake up, we will continuously check the RTC state until it returns RTC_STATE_V_ON or timeout.
But for QCA99x0 chips, we use wrong value for RTC_STATE_V_ON. Occasionally, we get 0x7 on the fist read, we thought as a failure case, but actually is the right value, also verified with the spec. So fix the issue by changing RTC_STATE_V_ON from 0x5 to 0x7, passed ~2000 iterations.
Tested HW: QCA9984
Signed-off-by: Miaoqing Pan miaoqing@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/hw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/ath10k/hw.c b/drivers/net/wireless/ath/ath10k/hw.c index ad082b7d7643..b242085c3c16 100644 --- a/drivers/net/wireless/ath/ath10k/hw.c +++ b/drivers/net/wireless/ath/ath10k/hw.c @@ -158,7 +158,7 @@ const struct ath10k_hw_values qca6174_values = { };
const struct ath10k_hw_values qca99x0_values = { - .rtc_state_val_on = 5, + .rtc_state_val_on = 7, .ce_count = 12, .msi_assign_ce_max = 12, .num_target_ce_config_wlan = 10,
[ Upstream commit 8a5b0177a7f6099ff534a4d9ce72673af5c3cade ]
Currently on each driver reload internal counter is being increased. It causes failure to enumerate driver devices, as they have hardcoded: .codec_name = "ehdaudio0D2", As there is currently no devices with multiple hda codecs and there is currently no established way to reliably differentiate, between them, always assign bus->idx = 0;
This fixes a problem when we unload and reload machine driver idx gets incremented, so .codec_name would've needed to be set to "ehdaudio1D2" after first reload and so on.
Signed-off-by: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com Acked-by: Takashi Iwai tiwai@suse.de Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/hda/ext/hdac_ext_bus.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/sound/hda/ext/hdac_ext_bus.c b/sound/hda/ext/hdac_ext_bus.c index a3a113ef5d56..4f9f1d2a2ec5 100644 --- a/sound/hda/ext/hdac_ext_bus.c +++ b/sound/hda/ext/hdac_ext_bus.c @@ -85,7 +85,6 @@ int snd_hdac_ext_bus_init(struct hdac_bus *bus, struct device *dev, const struct hdac_ext_bus_ops *ext_ops) { int ret; - static int idx;
/* check if io ops are provided, if not load the defaults */ if (io_ops == NULL) @@ -96,7 +95,12 @@ int snd_hdac_ext_bus_init(struct hdac_bus *bus, struct device *dev, return ret;
bus->ext_ops = ext_ops; - bus->idx = idx++; + /* FIXME: + * Currently only one bus is supported, if there is device with more + * buses, bus->idx should be greater than 0, but there needs to be a + * reliable way to always assign same number. + */ + bus->idx = 0; bus->cmd_dma_state = true;
return 0;
[ Upstream commit 9f94c7f947e919c343b30f080285af53d0fa9902 ]
Attempting to profile 1024 or more CPUs with perf causes two errors:
perf record -a [ perf record: Woken up X times to write data ] way too many cpu caches.. [ perf record: Captured and wrote X MB perf.data (X samples) ]
perf report -C 1024 Error: failed to set cpu bitmap Requested CPU 1024 too large. Consider raising MAX_NR_CPUS
Increasing MAX_NR_CPUS from 1024 to 2048 and redefining MAX_CACHES as MAX_NR_CPUS * 4 returns normal functionality to perf:
perf record -a [ perf record: Woken up X times to write data ] [ perf record: Captured and wrote X MB perf.data (X samples) ]
perf report -C 1024 ...
Signed-off-by: Kyle Meyer kyle.meyer@hpe.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lkml.kernel.org/r/20190620193630.154025-1-meyerk@stormcage.eag.rdlabs... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/perf.h | 2 +- tools/perf/util/header.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/perf.h b/tools/perf/perf.h index d59dee61b64d..a26555baf692 100644 --- a/tools/perf/perf.h +++ b/tools/perf/perf.h @@ -26,7 +26,7 @@ static inline unsigned long long rdclock(void) }
#ifndef MAX_NR_CPUS -#define MAX_NR_CPUS 1024 +#define MAX_NR_CPUS 2048 #endif
extern const char *input_name; diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c index fb0aa661644b..b82d4577d969 100644 --- a/tools/perf/util/header.c +++ b/tools/perf/util/header.c @@ -1100,7 +1100,7 @@ static int build_caches(struct cpu_cache_level caches[], u32 size, u32 *cntp) return 0; }
-#define MAX_CACHES 2000 +#define MAX_CACHES (MAX_NR_CPUS * 4)
static int write_cache(struct feat_fd *ff, struct perf_evlist *evlist __maybe_unused)
[ Upstream commit 0f6ff78540bd1b4df1e0f17806b0ce2e1dff0d78 ]
When we unload Skylake driver we may end up calling hdac_component_master_unbind(), it uses acomp->audio_ops, which we set in hdmi_codec_probe(), so we need to set it to NULL in hdmi_codec_remove(), otherwise we will dereference no longer existing pointer.
Signed-off-by: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/hdac_hdmi.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/sound/soc/codecs/hdac_hdmi.c b/sound/soc/codecs/hdac_hdmi.c index 1f57126708e7..c9f9820968bb 100644 --- a/sound/soc/codecs/hdac_hdmi.c +++ b/sound/soc/codecs/hdac_hdmi.c @@ -1859,6 +1859,12 @@ static void hdmi_codec_remove(struct snd_soc_component *component) { struct hdac_hdmi_priv *hdmi = snd_soc_component_get_drvdata(component); struct hdac_device *hdev = hdmi->hdev; + int ret; + + ret = snd_hdac_acomp_register_notifier(hdev->bus, NULL); + if (ret < 0) + dev_err(&hdev->dev, "notifier unregister failed: err: %d\n", + ret);
pm_runtime_disable(&hdev->dev); }
[ Upstream commit db599f9ed9bd31b018b6c48ad7c6b21d5b790ecf ]
One of the cases where the parameters for injection may be updated is when there are no more in-flight I/O requests. The number of in-flight requests is stored in the field bfqd->rq_in_driver of the descriptor bfqd of the device. So, the controlled condition is bfqd->rq_in_driver == 0.
Unfortunately, this is wrong because, the instruction that checks this condition is in the code path that handles the completion of a request, and, in particular, the instruction is executed before bfqd->rq_in_driver is decremented in such a code path.
This commit fixes this issue by just replacing 0 with 1 in the comparison.
Reported-by: Srivatsa S. Bhat (VMware) srivatsa@csail.mit.edu Tested-by: Srivatsa S. Bhat (VMware) srivatsa@csail.mit.edu Signed-off-by: Paolo Valente paolo.valente@linaro.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/bfq-iosched.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index e5db3856b194..404e776aa36d 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -5398,8 +5398,14 @@ static void bfq_update_inject_limit(struct bfq_data *bfqd, * total service time, and there seem to be the right * conditions to do it, or we can lower the last base value * computed. + * + * NOTE: (bfqd->rq_in_driver == 1) means that there is no I/O + * request in flight, because this function is in the code + * path that handles the completion of a request of bfqq, and, + * in particular, this function is executed before + * bfqd->rq_in_driver is decremented in such a code path. */ - if ((bfqq->last_serv_time_ns == 0 && bfqd->rq_in_driver == 0) || + if ((bfqq->last_serv_time_ns == 0 && bfqd->rq_in_driver == 1) || tot_time_ns < bfqq->last_serv_time_ns) { bfqq->last_serv_time_ns = tot_time_ns; /*
[ Upstream commit 7a3916706e858ad0bc3b5629c68168e1449de26a ]
Release all requested IRQ's on the request error to properly clean up allocated resources.
Signed-off-by: Dmitry Osipenko digetx@gmail.com Acked-By: Peter De Schrijver pdeschrijver@nvidia.com Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clocksource/timer-tegra20.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/clocksource/timer-tegra20.c b/drivers/clocksource/timer-tegra20.c index 1e7ece279730..fe5cc0963ac9 100644 --- a/drivers/clocksource/timer-tegra20.c +++ b/drivers/clocksource/timer-tegra20.c @@ -288,7 +288,7 @@ static int __init tegra_init_timer(struct device_node *np) pr_err("%s: can't map IRQ for CPU%d\n", __func__, cpu); ret = -EINVAL; - goto out; + goto out_irq; }
irq_set_status_flags(cpu_to->clkevt.irq, IRQ_NOAUTOEN); @@ -298,7 +298,8 @@ static int __init tegra_init_timer(struct device_node *np) if (ret) { pr_err("%s: cannot setup irq %d for CPU%d\n", __func__, cpu_to->clkevt.irq, cpu); - ret = -EINVAL; + irq_dispose_mapping(cpu_to->clkevt.irq); + cpu_to->clkevt.irq = 0; goto out_irq; } }
[ Upstream commit ca156e006add67e4beea7896be395160735e09b0 ]
ZAC support added sense data requesting on error for both ZAC and ATA devices. This seems to cause erratic error handling behaviors on some SSDs where the device reports sense data availability and then delivers the wrong content making EH take the wrong actions. The failure mode was sporadic on a LITE-ON ssd and couldn't be reliably reproduced.
There is no value in requesting sense data from non-ZAC ATA devices while there's a significant risk of introducing EH misbehaviors which are difficult to reproduce and fix. Let's do the sense data dancing only for ZAC devices.
Reviewed-by: Hannes Reinecke hare@suse.com Tested-by: Masato Suzuki masato.suzuki@wdc.com Reviewed-by: Damien Le Moal damien.lemoal@wdc.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/libata-eh.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c index 9d687e1d4325..3bfd9da58473 100644 --- a/drivers/ata/libata-eh.c +++ b/drivers/ata/libata-eh.c @@ -1469,7 +1469,7 @@ static int ata_eh_read_log_10h(struct ata_device *dev, tf->hob_lbah = buf[10]; tf->nsect = buf[12]; tf->hob_nsect = buf[13]; - if (ata_id_has_ncq_autosense(dev->id)) + if (dev->class == ATA_DEV_ZAC && ata_id_has_ncq_autosense(dev->id)) tf->auxiliary = buf[14] << 16 | buf[15] << 8 | buf[16];
return 0; @@ -1716,7 +1716,8 @@ void ata_eh_analyze_ncq_error(struct ata_link *link) memcpy(&qc->result_tf, &tf, sizeof(tf)); qc->result_tf.flags = ATA_TFLAG_ISADDR | ATA_TFLAG_LBA | ATA_TFLAG_LBA48; qc->err_mask |= AC_ERR_DEV | AC_ERR_NCQ; - if ((qc->result_tf.command & ATA_SENSE) || qc->result_tf.auxiliary) { + if (dev->class == ATA_DEV_ZAC && + ((qc->result_tf.command & ATA_SENSE) || qc->result_tf.auxiliary)) { char sense_key, asc, ascq;
sense_key = (qc->result_tf.auxiliary >> 16) & 0xff; @@ -1770,10 +1771,11 @@ static unsigned int ata_eh_analyze_tf(struct ata_queued_cmd *qc, }
switch (qc->dev->class) { - case ATA_DEV_ATA: case ATA_DEV_ZAC: if (stat & ATA_SENSE) ata_eh_request_sense(qc, qc->scsicmd); + /* fall through */ + case ATA_DEV_ATA: if (err & ATA_ICRC) qc->err_mask |= AC_ERR_ATA_BUS; if (err & (ATA_UNC | ATA_AMNF))
[ Upstream commit fc9babc2574691d3bbf0428f007b22261fed55c6 ]
We're adjusting the timer's base for each per-CPU timer to point to the actual start of the timer since device-tree defines a compound registers range that includes all of the timers. In this case the original base need to be restore before calling iounmap to unmap the proper address.
Signed-off-by: Dmitry Osipenko digetx@gmail.com Acked-by: Jon Hunter jonathanh@nvidia.com Acked-by: Thierry Reding treding@nvidia.com Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clocksource/timer-tegra20.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/clocksource/timer-tegra20.c b/drivers/clocksource/timer-tegra20.c index fe5cc0963ac9..462be34b41c4 100644 --- a/drivers/clocksource/timer-tegra20.c +++ b/drivers/clocksource/timer-tegra20.c @@ -319,6 +319,8 @@ static int __init tegra_init_timer(struct device_node *np) irq_dispose_mapping(cpu_to->clkevt.irq); } } + + to->of_base.base = timer_reg_base; out: timer_of_cleanup(to); return ret;
[ Upstream commit 6282edb72bed5324352522d732080d4c1b9dfed6 ]
Exynos SoCs based on CA7/CA15 have 2 timer interfaces: custom Exynos MCT (Multi Core Timer) and standard ARM Architected Timers.
There are use cases, where both timer interfaces are used simultanously. One of such examples is using Exynos MCT for the main system timer and ARM Architected Timers for the KVM and virtualized guests (KVM requires arch timers).
Exynos Multi-Core Timer driver (exynos_mct) must be however started before ARM Architected Timers (arch_timer), because they both share some common hardware blocks (global system counter) and turning on MCT is needed to get ARM Architected Timer working properly.
To ensure selecting Exynos MCT as the main system timer, increase MCT timer rating. To ensure proper starting order of both timers during suspend/resume cycle, increase MCT hotplug priority over ARM Archictected Timers.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com Reviewed-by: Krzysztof Kozlowski krzk@kernel.org Reviewed-by: Chanwoo Choi cw00.choi@samsung.com Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clocksource/exynos_mct.c | 4 ++-- include/linux/cpuhotplug.h | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/clocksource/exynos_mct.c b/drivers/clocksource/exynos_mct.c index e8eab16b154b..74cb299f5089 100644 --- a/drivers/clocksource/exynos_mct.c +++ b/drivers/clocksource/exynos_mct.c @@ -206,7 +206,7 @@ static void exynos4_frc_resume(struct clocksource *cs)
static struct clocksource mct_frc = { .name = "mct-frc", - .rating = 400, + .rating = 450, /* use value higher than ARM arch timer */ .read = exynos4_frc_read, .mask = CLOCKSOURCE_MASK(32), .flags = CLOCK_SOURCE_IS_CONTINUOUS, @@ -461,7 +461,7 @@ static int exynos4_mct_starting_cpu(unsigned int cpu) evt->set_state_oneshot_stopped = set_state_shutdown; evt->tick_resume = set_state_shutdown; evt->features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT; - evt->rating = 450; + evt->rating = 500; /* use value higher than ARM arch timer */
exynos4_mct_write(TICK_BASE_CNT, mevt->base + MCT_L_TCNTB_OFFSET);
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 52ec0d9fa1f7..068793a619ca 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -116,10 +116,10 @@ enum cpuhp_state { CPUHP_AP_PERF_ARM_ACPI_STARTING, CPUHP_AP_PERF_ARM_STARTING, CPUHP_AP_ARM_L2X0_STARTING, + CPUHP_AP_EXYNOS4_MCT_TIMER_STARTING, CPUHP_AP_ARM_ARCH_TIMER_STARTING, CPUHP_AP_ARM_GLOBAL_TIMER_STARTING, CPUHP_AP_JCORE_TIMER_STARTING, - CPUHP_AP_EXYNOS4_MCT_TIMER_STARTING, CPUHP_AP_ARM_TWD_STARTING, CPUHP_AP_QCOM_TIMER_STARTING, CPUHP_AP_TEGRA_TIMER_STARTING,
[ Upstream commit e7600865db32b69deb0109b8254244dca592adcf ]
Commit f8e608982022 ("netfilter: ctnetlink: Resolve conntrack L3-protocol flush regression") introduced a regression in which deletion of conntrack entries would fail because the L3 protocol information is replaced by AF_UNSPEC. As a result the search for the entry to be deleted would turn up empty due to the tuple used to perform the search is now different from the tuple used to initially set up the entry.
For flushing the conntrack table we do however want to keep the option for nfgenmsg->version to have a non-zero value to allow for newer user-space tools to request treatment under the new behavior. With that it is possible to independently flush tables for a defined L3 protocol. This was introduced with the enhancements in in commit 59c08c69c278 ("netfilter: ctnetlink: Support L3 protocol-filter on flush").
Older user-space tools will retain the behavior of flushing all tables regardless of defined L3 protocol.
Fixes: f8e608982022 ("netfilter: ctnetlink: Resolve conntrack L3-protocol flush regression") Suggested-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Felix Kaechele felix@kaechele.ca Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_conntrack_netlink.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index 7db79c1b8084..1b77444d5b52 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -1256,7 +1256,6 @@ static int ctnetlink_del_conntrack(struct net *net, struct sock *ctnl, struct nf_conntrack_tuple tuple; struct nf_conn *ct; struct nfgenmsg *nfmsg = nlmsg_data(nlh); - u_int8_t u3 = nfmsg->version ? nfmsg->nfgen_family : AF_UNSPEC; struct nf_conntrack_zone zone; int err;
@@ -1266,11 +1265,13 @@ static int ctnetlink_del_conntrack(struct net *net, struct sock *ctnl,
if (cda[CTA_TUPLE_ORIG]) err = ctnetlink_parse_tuple(cda, &tuple, CTA_TUPLE_ORIG, - u3, &zone); + nfmsg->nfgen_family, &zone); else if (cda[CTA_TUPLE_REPLY]) err = ctnetlink_parse_tuple(cda, &tuple, CTA_TUPLE_REPLY, - u3, &zone); + nfmsg->nfgen_family, &zone); else { + u_int8_t u3 = nfmsg->version ? nfmsg->nfgen_family : AF_UNSPEC; + return ctnetlink_flush_conntrack(net, cda, NETLINK_CB(skb).portid, nlmsg_report(nlh), u3);
[ Upstream commit f7019b7b0ad14bde732b8953161994edfc384953 ]
Clang warns:
In file included from net/xdp/xsk_queue.c:10: net/xdp/xsk_queue.h:292:2: warning: expression result unused [-Wunused-value] WRITE_ONCE(q->ring->producer, q->prod_tail); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/compiler.h:284:6: note: expanded from macro 'WRITE_ONCE' __u.__val; \ ~~~ ^~~~~ 1 warning generated.
The q->prod_tail assignment has a comma at the end, not a semi-colon. Fix that so clang no longer warns and everything works as expected.
Fixes: c497176cb2e4 ("xsk: add Rx receive functions and poll support") Link: https://github.com/ClangBuiltLinux/linux/issues/544 Signed-off-by: Nathan Chancellor natechancellor@gmail.com Acked-by: Nick Desaulniers ndesaulniers@google.com Acked-by: Jonathan Lemon jonathan.lemon@gmail.com Acked-by: Björn Töpel bjorn.topel@intel.com Acked-by: Song Liu songliubraving@fb.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/xdp/xsk_queue.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h index 88b9ae24658d..cba4a640d5e8 100644 --- a/net/xdp/xsk_queue.h +++ b/net/xdp/xsk_queue.h @@ -288,7 +288,7 @@ static inline void xskq_produce_flush_desc(struct xsk_queue *q) /* Order producer and data */ smp_wmb(); /* B, matches C */
- q->prod_tail = q->prod_head, + q->prod_tail = q->prod_head; WRITE_ONCE(q->ring->producer, q->prod_tail); }
[ Upstream commit 2034a42d1747fc1e1eeef2c6f1789c4d0762cb9c ]
The decoding of shortenend codes is broken. It only works as expected if there are no erasures.
When decoding with erasures, Lambda (the error and erasure locator polynomial) is initialized from the given erasure positions. The pad parameter is not accounted for by the initialisation code, and hence Lambda is initialized from incorrect erasure positions.
The fix is to adjust the erasure positions by the supplied pad.
Signed-off-by: Ferdinand Blomqvist ferdinand.blomqvist@gmail.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/20190620141039.9874-3-ferdinand.blomqvist@gmail.co... Signed-off-by: Sasha Levin sashal@kernel.org --- lib/reed_solomon/decode_rs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/reed_solomon/decode_rs.c b/lib/reed_solomon/decode_rs.c index 1db74eb098d0..3313bf944ff1 100644 --- a/lib/reed_solomon/decode_rs.c +++ b/lib/reed_solomon/decode_rs.c @@ -99,9 +99,9 @@ if (no_eras > 0) { /* Init lambda to be the erasure locator polynomial */ lambda[1] = alpha_to[rs_modnn(rs, - prim * (nn - 1 - eras_pos[0]))]; + prim * (nn - 1 - (eras_pos[0] + pad)))]; for (i = 1; i < no_eras; i++) { - u = rs_modnn(rs, prim * (nn - 1 - eras_pos[i])); + u = rs_modnn(rs, prim * (nn - 1 - (eras_pos[i] + pad))); for (j = i + 1; j > 0; j--) { tmp = index_of[lambda[j - 1]]; if (tmp != nn) {
[ Upstream commit 75672dda27bd00109a84cd975c17949ad9c45663 ]
Yauheni reported the following code do not work correctly on BE arches:
ALU_ARSH_X: DST = (u64) (u32) ((*(s32 *) &DST) >> SRC); CONT; ALU_ARSH_K: DST = (u64) (u32) ((*(s32 *) &DST) >> IMM); CONT;
and are causing failure of test_verifier test 'arsh32 on imm 2' on BE arches.
The code is taking address and interpreting memory directly, so is not endianness neutral. We should instead perform standard C type casting on the variable. A u64 to s32 conversion will drop the high 32-bit and reserve the low 32-bit as signed integer, this is all we want.
Fixes: 2dc6b100f928 ("bpf: interpreter support BPF_ALU | BPF_ARSH") Reported-by: Yauheni Kaliuta yauheni.kaliuta@redhat.com Reviewed-by: Jakub Kicinski jakub.kicinski@netronome.com Reviewed-by: Quentin Monnet quentin.monnet@netronome.com Signed-off-by: Jiong Wang jiong.wang@netronome.com Acked-by: Song Liu songliubraving@fb.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 080e2bb644cc..f2148db91439 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1364,10 +1364,10 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack) insn++; CONT; ALU_ARSH_X: - DST = (u64) (u32) ((*(s32 *) &DST) >> SRC); + DST = (u64) (u32) (((s32) DST) >> SRC); CONT; ALU_ARSH_K: - DST = (u64) (u32) ((*(s32 *) &DST) >> IMM); + DST = (u64) (u32) (((s32) DST) >> IMM); CONT; ALU64_ARSH_X: (*(s64 *) &DST) >>= SRC;
[ Upstream commit ef4d6a8556b637ad27c8c2a2cff1dda3da38e9a9 ]
Check if the syndrome provided by the caller is zero, and act accordingly.
Signed-off-by: Ferdinand Blomqvist ferdinand.blomqvist@gmail.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/20190620141039.9874-6-ferdinand.blomqvist@gmail.co... Signed-off-by: Sasha Levin sashal@kernel.org --- lib/reed_solomon/decode_rs.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/lib/reed_solomon/decode_rs.c b/lib/reed_solomon/decode_rs.c index 3313bf944ff1..121beb2f0930 100644 --- a/lib/reed_solomon/decode_rs.c +++ b/lib/reed_solomon/decode_rs.c @@ -42,8 +42,18 @@ BUG_ON(pad < 0 || pad >= nn);
/* Does the caller provide the syndrome ? */ - if (s != NULL) - goto decode; + if (s != NULL) { + for (i = 0; i < nroots; i++) { + /* The syndrome is in index form, + * so nn represents zero + */ + if (s[i] != nn) + goto decode; + } + + /* syndrome is zero, no errors to correct */ + return 0; + }
/* form the syndromes; i.e., evaluate data(x) at roots of * g(x) */
[ Upstream commit 025bf37725f1929542361eef2245df30badf242e ]
In case the requested gpio property is not found in the device tree, some callers of gpiod_get_from_of_node() expect a return value of NULL, others expect -ENOENT. In particular devm_fwnode_get_index_gpiod_from_child() expects -ENOENT. Currently it gets a NULL, which breaks the loop that tries all gpio_suffixes. The result is that a gpio property is not found, even though it is there.
This patch changes gpiod_get_from_of_node() to return -ENOENT instead of NULL when the requested gpio property is not found in the device tree. Additionally it modifies all calling functions to properly evaluate the return value.
Another approach would be to leave the return value of gpiod_get_from_of_node() as is and fix the bug in devm_fwnode_get_index_gpiod_from_child(). Other callers would still need to be reworked. The effort would be the same as with the chosen solution.
Signed-off-by: Georg Waibel georg.waibel@sensor-technik.de Reviewed-by: Krzysztof Kozlowski krzk@kernel.org Reviewed-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpiolib.c | 6 +----- drivers/regulator/da9211-regulator.c | 2 ++ drivers/regulator/s2mps11.c | 4 +++- drivers/regulator/s5m8767.c | 4 +++- drivers/regulator/tps65090-regulator.c | 7 ++++--- 5 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index e013d417a936..be1d1d2f8aaa 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -4244,8 +4244,7 @@ EXPORT_SYMBOL_GPL(gpiod_get_index); * * Returns: * On successful request the GPIO pin is configured in accordance with - * provided @dflags. If the node does not have the requested GPIO - * property, NULL is returned. + * provided @dflags. * * In case of error an ERR_PTR() is returned. */ @@ -4267,9 +4266,6 @@ struct gpio_desc *gpiod_get_from_of_node(struct device_node *node, index, &flags);
if (!desc || IS_ERR(desc)) { - /* If it is not there, just return NULL */ - if (PTR_ERR(desc) == -ENOENT) - return NULL; return desc; }
diff --git a/drivers/regulator/da9211-regulator.c b/drivers/regulator/da9211-regulator.c index da37b4ccd834..0309823d2c72 100644 --- a/drivers/regulator/da9211-regulator.c +++ b/drivers/regulator/da9211-regulator.c @@ -289,6 +289,8 @@ static struct da9211_pdata *da9211_parse_regulators_dt( 0, GPIOD_OUT_HIGH | GPIOD_FLAGS_BIT_NONEXCLUSIVE, "da9211-enable"); + if (IS_ERR(pdata->gpiod_ren[n])) + pdata->gpiod_ren[n] = NULL; n++; }
diff --git a/drivers/regulator/s2mps11.c b/drivers/regulator/s2mps11.c index 134c62db36c5..b518a81f75a3 100644 --- a/drivers/regulator/s2mps11.c +++ b/drivers/regulator/s2mps11.c @@ -821,7 +821,9 @@ static void s2mps14_pmic_dt_parse_ext_control_gpio(struct platform_device *pdev, 0, GPIOD_OUT_HIGH | GPIOD_FLAGS_BIT_NONEXCLUSIVE, "s2mps11-regulator"); - if (IS_ERR(gpio[reg])) { + if (PTR_ERR(gpio[reg]) == -ENOENT) + gpio[reg] = NULL; + else if (IS_ERR(gpio[reg])) { dev_err(&pdev->dev, "Failed to get control GPIO for %d/%s\n", reg, rdata[reg].name); continue; diff --git a/drivers/regulator/s5m8767.c b/drivers/regulator/s5m8767.c index bb9d1a083299..6ca27e9d5ef7 100644 --- a/drivers/regulator/s5m8767.c +++ b/drivers/regulator/s5m8767.c @@ -574,7 +574,9 @@ static int s5m8767_pmic_dt_parse_pdata(struct platform_device *pdev, 0, GPIOD_OUT_HIGH | GPIOD_FLAGS_BIT_NONEXCLUSIVE, "s5m8767"); - if (IS_ERR(rdata->ext_control_gpiod)) + if (PTR_ERR(rdata->ext_control_gpiod) == -ENOENT) + rdata->ext_control_gpiod = NULL; + else if (IS_ERR(rdata->ext_control_gpiod)) return PTR_ERR(rdata->ext_control_gpiod);
rdata->id = i; diff --git a/drivers/regulator/tps65090-regulator.c b/drivers/regulator/tps65090-regulator.c index ca39b3d55123..10ea4b5a0f55 100644 --- a/drivers/regulator/tps65090-regulator.c +++ b/drivers/regulator/tps65090-regulator.c @@ -371,11 +371,12 @@ static struct tps65090_platform_data *tps65090_parse_dt_reg_data( "dcdc-ext-control-gpios", 0, gflags, "tps65090"); - if (IS_ERR(rpdata->gpiod)) - return ERR_CAST(rpdata->gpiod); - if (!rpdata->gpiod) + if (PTR_ERR(rpdata->gpiod) == -ENOENT) { dev_err(&pdev->dev, "could not find DCDC external control GPIO\n"); + rpdata->gpiod = NULL; + } else if (IS_ERR(rpdata->gpiod)) + return ERR_CAST(rpdata->gpiod); }
if (of_property_read_u32(tps65090_matches[idx].of_node,
[ Upstream commit d736fc6c68a5f76e89a6c2c4100e3678706003a3 ]
When doing global reset, the MAC autoneg state of fibre port is set to default, which may cause user configuration lost. This patch fixes it by restore the MAC autoneg state after reset.
Fixes: 22f48e24a23d ("net: hns3: add autoneg and change speed support for fibre port") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index 4d9bcad26f06..645b9b3e0256 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -2389,6 +2389,15 @@ static int hclge_mac_init(struct hclge_dev *hdev) return ret; }
+ if (hdev->hw.mac.support_autoneg) { + ret = hclge_set_autoneg_en(hdev, hdev->hw.mac.autoneg); + if (ret) { + dev_err(&hdev->pdev->dev, + "Config mac autoneg fail ret=%d\n", ret); + return ret; + } + } + mac->link = 0;
if (mac->user_fec_mode & BIT(HNAE3_FEC_USER_DEF)) {
[ Upstream commit f53297d67800feb5fafd94abd926c889aefee690 ]
The ingress and egress ACL root namespaces are created per vport and stored into arrays. However, the vport number is not the same as the index. Passing the array index, instead of vport number, to get the correct ingress and egress acl namespace.
Fixes: 9b93ab981e3b ("net/mlx5: Separate ingress/egress namespaces for each vport") Signed-off-by: Jianbo Liu jianbol@mellanox.com Reviewed-by: Oz Shlomo ozsh@mellanox.com Reviewed-by: Eli Britstein elibr@mellanox.com Reviewed-by: Roi Dayan roid@mellanox.com Reviewed-by: Mark Bloch markb@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c index 6a921e24cd5e..acab26b88261 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -939,7 +939,7 @@ int esw_vport_enable_egress_acl(struct mlx5_eswitch *esw, vport->vport, MLX5_CAP_ESW_EGRESS_ACL(dev, log_max_ft_size));
root_ns = mlx5_get_flow_vport_acl_namespace(dev, MLX5_FLOW_NAMESPACE_ESW_EGRESS, - vport->vport); + mlx5_eswitch_vport_num_to_index(esw, vport->vport)); if (!root_ns) { esw_warn(dev, "Failed to get E-Switch egress flow namespace for vport (%d)\n", vport->vport); return -EOPNOTSUPP; @@ -1057,7 +1057,7 @@ int esw_vport_enable_ingress_acl(struct mlx5_eswitch *esw, vport->vport, MLX5_CAP_ESW_INGRESS_ACL(dev, log_max_ft_size));
root_ns = mlx5_get_flow_vport_acl_namespace(dev, MLX5_FLOW_NAMESPACE_ESW_INGRESS, - vport->vport); + mlx5_eswitch_vport_num_to_index(esw, vport->vport)); if (!root_ns) { esw_warn(dev, "Failed to get E-Switch ingress flow namespace for vport (%d)\n", vport->vport); return -EOPNOTSUPP;
[ Upstream commit 655c91414579d7bb115a4f7898ee726fc18e0984 ]
Some transceivers may comply with SFF-8472 but not implement the Digital Diagnostic Monitoring (DDM) interface described in it. The existence of such area is specified by bit 6 of byte 92, set to 1 if implemented.
Currently, due to not checking this bit ixgbe fails trying to read SFP module's eeprom with the follow message:
ethtool -m enP51p1s0f0 Cannot get Module EEPROM data: Input/output error
Because it fails to read the additional 256 bytes in which it was assumed to exist the DDM data.
This issue was noticed using a Mellanox Passive DAC PN 01FT738. The eeprom data was confirmed by Mellanox as correct and present in other Passive DACs in from other manufacturers.
Signed-off-by: "Mauro S. M. Rodrigues" maurosr@linux.vnet.ibm.com Reviewed-by: Jesse Brandeburg jesse.brandeburg@intel.com Tested-by: Andrew Bowers andrewx.bowers@intel.com Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 3 ++- drivers/net/ethernet/intel/ixgbe/ixgbe_phy.h | 1 + 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c index acba067cc15a..7c52ae8ac005 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c @@ -3226,7 +3226,8 @@ static int ixgbe_get_module_info(struct net_device *dev, page_swap = true; }
- if (sff8472_rev == IXGBE_SFF_SFF_8472_UNSUP || page_swap) { + if (sff8472_rev == IXGBE_SFF_SFF_8472_UNSUP || page_swap || + !(addr_mode & IXGBE_SFF_DDM_IMPLEMENTED)) { /* We have a SFP, but it does not support SFF-8472 */ modinfo->type = ETH_MODULE_SFF_8079; modinfo->eeprom_len = ETH_MODULE_SFF_8079_LEN; diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.h index 214b01085718..6544c4539c0d 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.h +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.h @@ -45,6 +45,7 @@ #define IXGBE_SFF_SOFT_RS_SELECT_10G 0x8 #define IXGBE_SFF_SOFT_RS_SELECT_1G 0x0 #define IXGBE_SFF_ADDRESSING_MODE 0x4 +#define IXGBE_SFF_DDM_IMPLEMENTED 0x40 #define IXGBE_SFF_QSFP_DA_ACTIVE_CABLE 0x1 #define IXGBE_SFF_QSFP_DA_PASSIVE_CABLE 0x8 #define IXGBE_SFF_QSFP_CONNECTOR_NOT_SEPARABLE 0x23
[ Upstream commit 473971187d6727609951858c63bf12b0307ef015 ]
The same bug that gcc hit in the past is apparently now showing up with clang, which decides to inline __serpent_setkey_sbox:
crypto/serpent_generic.c:268:5: error: stack frame size of 2112 bytes in function '__serpent_setkey' [-Werror,-Wframe-larger-than=]
Marking it 'noinline' reduces the stack usage from 2112 bytes to 192 and 96 bytes, respectively, and seems to generate more useful object code.
Fixes: c871c10e4ea7 ("crypto: serpent - improve __serpent_setkey with UBSAN") Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Eric Biggers ebiggers@kernel.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- crypto/serpent_generic.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/crypto/serpent_generic.c b/crypto/serpent_generic.c index 16f612b6dbca..a9cc0b2aa0d6 100644 --- a/crypto/serpent_generic.c +++ b/crypto/serpent_generic.c @@ -225,7 +225,13 @@ x4 ^= x2; \ })
-static void __serpent_setkey_sbox(u32 r0, u32 r1, u32 r2, u32 r3, u32 r4, u32 *k) +/* + * both gcc and clang have misoptimized this function in the past, + * producing horrible object code from spilling temporary variables + * on the stack. Forcing this part out of line avoids that. + */ +static noinline void __serpent_setkey_sbox(u32 r0, u32 r1, u32 r2, + u32 r3, u32 r4, u32 *k) { k += 100; S3(r3, r4, r0, r1, r2); store_and_load_keys(r1, r2, r4, r3, 28, 24);
[ Upstream commit 90acc0653d2bee203174e66d519fbaaa513502de ]
Build testing with some core crypto options disabled revealed a few modules that are missing CRYPTO_HASH:
crypto/asymmetric_keys/x509_public_key.o: In function `x509_get_sig_params': x509_public_key.c:(.text+0x4c7): undefined reference to `crypto_alloc_shash' x509_public_key.c:(.text+0x5e5): undefined reference to `crypto_shash_digest' crypto/asymmetric_keys/pkcs7_verify.o: In function `pkcs7_digest.isra.0': pkcs7_verify.c:(.text+0xab): undefined reference to `crypto_alloc_shash' pkcs7_verify.c:(.text+0x1b2): undefined reference to `crypto_shash_digest' pkcs7_verify.c:(.text+0x3c1): undefined reference to `crypto_shash_update' pkcs7_verify.c:(.text+0x411): undefined reference to `crypto_shash_finup'
This normally doesn't show up in randconfig tests because there is a large number of other options that select CRYPTO_HASH.
Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- crypto/asymmetric_keys/Kconfig | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/crypto/asymmetric_keys/Kconfig b/crypto/asymmetric_keys/Kconfig index be70ca6c85d3..1f1f004dc757 100644 --- a/crypto/asymmetric_keys/Kconfig +++ b/crypto/asymmetric_keys/Kconfig @@ -15,6 +15,7 @@ config ASYMMETRIC_PUBLIC_KEY_SUBTYPE select MPILIB select CRYPTO_HASH_INFO select CRYPTO_AKCIPHER + select CRYPTO_HASH help This option provides support for asymmetric public key type handling. If signature generation and/or verification are to be used, @@ -65,6 +66,7 @@ config TPM_KEY_PARSER config PKCS7_MESSAGE_PARSER tristate "PKCS#7 message parser" depends on X509_CERTIFICATE_PARSER + select CRYPTO_HASH select ASN1 select OID_REGISTRY help @@ -87,6 +89,7 @@ config SIGNED_PE_FILE_VERIFICATION bool "Support for PE file signature verification" depends on PKCS7_MESSAGE_PARSER=y depends on SYSTEM_DATA_VERIFICATION + select CRYPTO_HASH select ASN1 select OID_REGISTRY help
[ Upstream commit df5c4150501ee7e86383be88f6490d970adcf157 ]
In commit 3c0efb745a17 ("ath9k: discard undersized packets") the lower bound of RX packets was set to 10 (min ACK size) to filter those that would otherwise be treated as invalid at mac80211.
Alas, short radar pulses are reported as PHY_ERROR frames with length set to 3. Therefore their detection stopped working after that commit.
NOTE: ath9k drivers built thereafter will not pass DFS certification.
This extends the criteria for short packets to explicitly handle PHY_ERROR frames.
Fixes: 3c0efb745a17 ("ath9k: discard undersized packets") Signed-off-by: Zefir Kurtisi zefir.kurtisi@neratec.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath9k/recv.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c index 4e97f7f3b2a3..06e660858766 100644 --- a/drivers/net/wireless/ath/ath9k/recv.c +++ b/drivers/net/wireless/ath/ath9k/recv.c @@ -815,6 +815,7 @@ static int ath9k_rx_skb_preprocess(struct ath_softc *sc, struct ath_common *common = ath9k_hw_common(ah); struct ieee80211_hdr *hdr; bool discard_current = sc->rx.discard_next; + bool is_phyerr;
/* * Discard corrupt descriptors which are marked in @@ -827,8 +828,11 @@ static int ath9k_rx_skb_preprocess(struct ath_softc *sc,
/* * Discard zero-length packets and packets smaller than an ACK + * which are not PHY_ERROR (short radar pulses have a length of 3) */ - if (rx_stats->rs_datalen < 10) { + is_phyerr = rx_stats->rs_status & ATH9K_RXERR_PHY; + if (!rx_stats->rs_datalen || + (rx_stats->rs_datalen < 10 && !is_phyerr)) { RX_STAT_INC(sc, rx_len_err); goto corrupt; }
[ Upstream commit 1a276003111c0404f6bfeffe924c5a21f482428b ]
This change fixes a rare race condition of handling WMI events after wmi_call expires.
wmi_recv_cmd immediately handles an event when reply_buf is defined and a wmi_call is waiting for the event. However, in case the wmi_call has already timed-out, there will be no waiting/running wmi_call and the event will be queued in WMI queue and will be handled later in wmi_event_handle. Meanwhile, a new similar wmi_call for the same command and event may be issued. In this case, when handling the queued event we got WARN_ON printed.
Fixing this case as a valid timeout and drop the unexpected event.
Signed-off-by: Ahmad Masri amasri@codeaurora.org Signed-off-by: Maya Erez merez@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/wil6210/wmi.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/wil6210/wmi.c b/drivers/net/wireless/ath/wil6210/wmi.c index d89cd41e78ac..89a75ff29410 100644 --- a/drivers/net/wireless/ath/wil6210/wmi.c +++ b/drivers/net/wireless/ath/wil6210/wmi.c @@ -3220,7 +3220,18 @@ static void wmi_event_handle(struct wil6210_priv *wil, /* check if someone waits for this event */ if (wil->reply_id && wil->reply_id == id && wil->reply_mid == mid) { - WARN_ON(wil->reply_buf); + if (wil->reply_buf) { + /* event received while wmi_call is waiting + * with a buffer. Such event should be handled + * in wmi_recv_cmd function. Handling the event + * here means a previous wmi_call was timeout. + * Drop the event and do not handle it. + */ + wil_err(wil, + "Old event (%d, %s) while wmi_call is waiting. Drop it and Continue waiting\n", + id, eventid2name(id)); + return; + }
wmi_evt_call_handler(vif, id, evt_data, len - sizeof(*wmi));
[ Upstream commit d8655e7630dafa88bc37f101640e39c736399771 ]
Commit 9da21b1509d8 ("EDAC: Poll timeout cannot be zero, p2") assumes edac_mc_poll_msec to be unsigned long, but the type of the variable still remained as int. Setting edac_mc_poll_msec can trigger out-of-bounds write.
Reproducer:
# echo 1001 > /sys/module/edac_core/parameters/edac_mc_poll_msec
KASAN report:
BUG: KASAN: global-out-of-bounds in edac_set_poll_msec+0x140/0x150 Write of size 8 at addr ffffffffb91b2d00 by task bash/1996
CPU: 1 PID: 1996 Comm: bash Not tainted 5.2.0-rc6+ #23 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014 Call Trace: dump_stack+0xca/0x13e print_address_description.cold+0x5/0x246 __kasan_report.cold+0x75/0x9a ? edac_set_poll_msec+0x140/0x150 kasan_report+0xe/0x20 edac_set_poll_msec+0x140/0x150 ? dimmdev_location_show+0x30/0x30 ? vfs_lock_file+0xe0/0xe0 ? _raw_spin_lock+0x87/0xe0 param_attr_store+0x1b5/0x310 ? param_array_set+0x4f0/0x4f0 module_attr_store+0x58/0x80 ? module_attr_show+0x80/0x80 sysfs_kf_write+0x13d/0x1a0 kernfs_fop_write+0x2bc/0x460 ? sysfs_kf_bin_read+0x270/0x270 ? kernfs_notify+0x1f0/0x1f0 __vfs_write+0x81/0x100 vfs_write+0x1e1/0x560 ksys_write+0x126/0x250 ? __ia32_sys_read+0xb0/0xb0 ? do_syscall_64+0x1f/0x390 do_syscall_64+0xc1/0x390 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fa7caa5e970 Code: 73 01 c3 48 8b 0d 28 d5 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 99 2d 2c 00 00 75 10 b8 01 00 00 00 04 RSP: 002b:00007fff6acfdfe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fa7caa5e970 RDX: 0000000000000005 RSI: 0000000000e95c08 RDI: 0000000000000001 RBP: 0000000000e95c08 R08: 00007fa7cad1e760 R09: 00007fa7cb36a700 R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000005 R13: 0000000000000001 R14: 00007fa7cad1d600 R15: 0000000000000005
The buggy address belongs to the variable: edac_mc_poll_msec+0x0/0x40
Memory state around the buggy address: ffffffffb91b2c00: 00 00 00 00 fa fa fa fa 00 00 00 00 fa fa fa fa ffffffffb91b2c80: 00 00 00 00 fa fa fa fa 00 00 00 00 fa fa fa fa
ffffffffb91b2d00: 04 fa fa fa fa fa fa fa 04 fa fa fa fa fa fa fa
^ ffffffffb91b2d80: 04 fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00 ffffffffb91b2e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Fix it by changing the type of edac_mc_poll_msec to unsigned int. The reason why this patch adopts unsigned int rather than unsigned long is msecs_to_jiffies() assumes arg to be unsigned int. We can avoid integer conversion bugs and unsigned int will be large enough for edac_mc_poll_msec.
Reviewed-by: James Morse james.morse@arm.com Fixes: 9da21b1509d8 ("EDAC: Poll timeout cannot be zero, p2") Signed-off-by: Eiichi Tsukata devel@etsukata.com Signed-off-by: Tony Luck tony.luck@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/edac/edac_mc_sysfs.c | 16 ++++++++-------- drivers/edac/edac_module.h | 2 +- 2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c index 7c01e1cc030c..4386ea4b9b5a 100644 --- a/drivers/edac/edac_mc_sysfs.c +++ b/drivers/edac/edac_mc_sysfs.c @@ -26,7 +26,7 @@ static int edac_mc_log_ue = 1; static int edac_mc_log_ce = 1; static int edac_mc_panic_on_ue; -static int edac_mc_poll_msec = 1000; +static unsigned int edac_mc_poll_msec = 1000;
/* Getter functions for above */ int edac_mc_get_log_ue(void) @@ -45,30 +45,30 @@ int edac_mc_get_panic_on_ue(void) }
/* this is temporary */ -int edac_mc_get_poll_msec(void) +unsigned int edac_mc_get_poll_msec(void) { return edac_mc_poll_msec; }
static int edac_set_poll_msec(const char *val, const struct kernel_param *kp) { - unsigned long l; + unsigned int i; int ret;
if (!val) return -EINVAL;
- ret = kstrtoul(val, 0, &l); + ret = kstrtouint(val, 0, &i); if (ret) return ret;
- if (l < 1000) + if (i < 1000) return -EINVAL;
- *((unsigned long *)kp->arg) = l; + *((unsigned int *)kp->arg) = i;
/* notify edac_mc engine to reset the poll period */ - edac_mc_reset_delay_period(l); + edac_mc_reset_delay_period(i);
return 0; } @@ -82,7 +82,7 @@ MODULE_PARM_DESC(edac_mc_log_ue, module_param(edac_mc_log_ce, int, 0644); MODULE_PARM_DESC(edac_mc_log_ce, "Log correctable error to console: 0=off 1=on"); -module_param_call(edac_mc_poll_msec, edac_set_poll_msec, param_get_int, +module_param_call(edac_mc_poll_msec, edac_set_poll_msec, param_get_uint, &edac_mc_poll_msec, 0644); MODULE_PARM_DESC(edac_mc_poll_msec, "Polling period in milliseconds");
diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h index dd7d0b509aa3..75528f07abd5 100644 --- a/drivers/edac/edac_module.h +++ b/drivers/edac/edac_module.h @@ -36,7 +36,7 @@ extern int edac_mc_get_log_ue(void); extern int edac_mc_get_log_ce(void); extern int edac_mc_get_panic_on_ue(void); extern int edac_get_poll_msec(void); -extern int edac_mc_get_poll_msec(void); +extern unsigned int edac_mc_get_poll_msec(void);
unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf, unsigned len);
[ Upstream commit e18953240de8b46360a67090c87ee1ef8160b35d ]
When an XDP program is set, a full reopen of all channels happens in two cases:
1. When there was no program set, and a new one is being set.
2. When there was a program set, but it's being unset.
The full reopen is necessary, because the channel parameters may change if XDP is enabled or disabled. However, it's performed in an unsafe way: if the new channels fail to open, the old ones are already closed, and the interface goes down. Use the safe way to switch channels instead. The same way is already used for other configuration changes.
Signed-off-by: Maxim Mikityanskiy maximmi@mellanox.com Signed-off-by: Tariq Toukan tariqt@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/en_main.c | 31 ++++++++++++------- 1 file changed, 20 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index a8e8350b38aa..8db9fdbc03ea 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -4192,8 +4192,6 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog) /* no need for full reset when exchanging programs */ reset = (!priv->channels.params.xdp_prog || !prog);
- if (was_opened && reset) - mlx5e_close_locked(netdev); if (was_opened && !reset) { /* num_channels is invariant here, so we can take the * batched reference right upfront. @@ -4205,20 +4203,31 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog) } }
- /* exchange programs, extra prog reference we got from caller - * as long as we don't fail from this point onwards. - */ - old_prog = xchg(&priv->channels.params.xdp_prog, prog); + if (was_opened && reset) { + struct mlx5e_channels new_channels = {}; + + new_channels.params = priv->channels.params; + new_channels.params.xdp_prog = prog; + mlx5e_set_rq_type(priv->mdev, &new_channels.params); + old_prog = priv->channels.params.xdp_prog; + + err = mlx5e_safe_switch_channels(priv, &new_channels, NULL); + if (err) + goto unlock; + } else { + /* exchange programs, extra prog reference we got from caller + * as long as we don't fail from this point onwards. + */ + old_prog = xchg(&priv->channels.params.xdp_prog, prog); + } + if (old_prog) bpf_prog_put(old_prog);
- if (reset) /* change RQ type according to priv->xdp_prog */ + if (!was_opened && reset) /* change RQ type according to priv->xdp_prog */ mlx5e_set_rq_type(priv->mdev, &priv->channels.params);
- if (was_opened && reset) - err = mlx5e_open_locked(netdev); - - if (!test_bit(MLX5E_STATE_OPENED, &priv->state) || reset) + if (!was_opened || reset) goto unlock;
/* exchanging programs w/o reset, we update ref counts on behalf
[ Upstream commit 0ae49cb7aa005ed18fe8f4d6ccf73019b78ac7b2 ]
When everything is OK in bch_journal_read(), finally the return value is returned by, return ret; which assumes ret will be 0 here. This assumption is wrong when all journal buckets as are full and filled with valid journal entries. In such cache the last location referencess read_bucket() sets 'ret' to 1, which means new jset added into jset list. The jset list is list 'journal' in caller run_cache_set().
Return 1 to run_cache_set() means something wrong and the cache set won't start, but indeed everything is OK.
This patch changes the line at end of bch_journal_read() to directly return 0 since everything if verything is good. Then a bogus error is fixed.
Signed-off-by: Coly Li colyli@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/bcache/journal.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index 12dae9348147..4e5fc05720fc 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -268,7 +268,7 @@ int bch_journal_read(struct cache_set *c, struct list_head *list) struct journal_replay, list)->j.seq;
- return ret; + return 0; #undef read_bucket }
[ Upstream commit e775339e1ae1205b47d94881db124c11385e597c ]
If CACHE_SET_IO_DISABLE of a cache set flag is set by too many I/O errors, currently allocator routines can still continue allocate space which may introduce inconsistent metadata state.
This patch checkes CACHE_SET_IO_DISABLE bit in following allocator routines, - bch_bucket_alloc() - __bch_bucket_alloc_set() Once CACHE_SET_IO_DISABLE is set on cache set, the allocator routines may reject allocation request earlier to avoid potential inconsistent metadata.
Signed-off-by: Coly Li colyli@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/bcache/alloc.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c index f8986effcb50..6f776823b9ba 100644 --- a/drivers/md/bcache/alloc.c +++ b/drivers/md/bcache/alloc.c @@ -393,6 +393,11 @@ long bch_bucket_alloc(struct cache *ca, unsigned int reserve, bool wait) struct bucket *b; long r;
+ + /* No allocation if CACHE_SET_IO_DISABLE bit is set */ + if (unlikely(test_bit(CACHE_SET_IO_DISABLE, &ca->set->flags))) + return -1; + /* fastpath */ if (fifo_pop(&ca->free[RESERVE_NONE], r) || fifo_pop(&ca->free[reserve], r)) @@ -484,6 +489,10 @@ int __bch_bucket_alloc_set(struct cache_set *c, unsigned int reserve, { int i;
+ /* No allocation if CACHE_SET_IO_DISABLE bit is set */ + if (unlikely(test_bit(CACHE_SET_IO_DISABLE, &c->flags))) + return -1; + lockdep_assert_held(&c->bucket_lock); BUG_ON(!n || n > c->caches_loaded || n > MAX_CACHES_PER_SET);
[ Upstream commit 383ff2183ad16a8842d1fbd9dd3e1cbd66813e64 ]
When too many I/O errors happen on cache set and CACHE_SET_IO_DISABLE bit is set, bch_journal() may continue to work because the journaling bkey might be still in write set yet. The caller of bch_journal() may believe the journal still work but the truth is in-memory journal write set won't be written into cache device any more. This behavior may introduce potential inconsistent metadata status.
This patch checks CACHE_SET_IO_DISABLE bit at the head of bch_journal(), if the bit is set, bch_journal() returns NULL immediately to notice caller to know journal does not work.
Signed-off-by: Coly Li colyli@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/bcache/journal.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c index 4e5fc05720fc..54f8886b6177 100644 --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -811,6 +811,10 @@ atomic_t *bch_journal(struct cache_set *c, struct journal_write *w; atomic_t *ret;
+ /* No journaling if CACHE_SET_IO_DISABLE set already */ + if (unlikely(test_bit(CACHE_SET_IO_DISABLE, &c->flags))) + return NULL; + if (!CACHE_SYNC(&c->sb)) return NULL;
[ Upstream commit 80265d8dfd77792e133793cef44a21323aac2908 ]
When enable lockdep engine, a lockdep warning can be observed when reboot or shutdown system,
[ 3142.764557][ T1] bcache: bcache_reboot() Stopping all devices: [ 3142.776265][ T2649] [ 3142.777159][ T2649] ====================================================== [ 3142.780039][ T2649] WARNING: possible circular locking dependency detected [ 3142.782869][ T2649] 5.2.0-rc4-lp151.20-default+ #1 Tainted: G W [ 3142.785684][ T2649] ------------------------------------------------------ [ 3142.788479][ T2649] kworker/3:67/2649 is trying to acquire lock: [ 3142.790738][ T2649] 00000000aaf02291 ((wq_completion)bcache_writeback_wq){+.+.}, at: flush_workqueue+0x87/0x4c0 [ 3142.794678][ T2649] [ 3142.794678][ T2649] but task is already holding lock: [ 3142.797402][ T2649] 000000004fcf89c5 (&bch_register_lock){+.+.}, at: cached_dev_free+0x17/0x120 [bcache] [ 3142.801462][ T2649] [ 3142.801462][ T2649] which lock already depends on the new lock. [ 3142.801462][ T2649] [ 3142.805277][ T2649] [ 3142.805277][ T2649] the existing dependency chain (in reverse order) is: [ 3142.808902][ T2649] [ 3142.808902][ T2649] -> #2 (&bch_register_lock){+.+.}: [ 3142.812396][ T2649] __mutex_lock+0x7a/0x9d0 [ 3142.814184][ T2649] cached_dev_free+0x17/0x120 [bcache] [ 3142.816415][ T2649] process_one_work+0x2a4/0x640 [ 3142.818413][ T2649] worker_thread+0x39/0x3f0 [ 3142.820276][ T2649] kthread+0x125/0x140 [ 3142.822061][ T2649] ret_from_fork+0x3a/0x50 [ 3142.823965][ T2649] [ 3142.823965][ T2649] -> #1 ((work_completion)(&cl->work)#2){+.+.}: [ 3142.827244][ T2649] process_one_work+0x277/0x640 [ 3142.829160][ T2649] worker_thread+0x39/0x3f0 [ 3142.830958][ T2649] kthread+0x125/0x140 [ 3142.832674][ T2649] ret_from_fork+0x3a/0x50 [ 3142.834915][ T2649] [ 3142.834915][ T2649] -> #0 ((wq_completion)bcache_writeback_wq){+.+.}: [ 3142.838121][ T2649] lock_acquire+0xb4/0x1c0 [ 3142.840025][ T2649] flush_workqueue+0xae/0x4c0 [ 3142.842035][ T2649] drain_workqueue+0xa9/0x180 [ 3142.844042][ T2649] destroy_workqueue+0x17/0x250 [ 3142.846142][ T2649] cached_dev_free+0x52/0x120 [bcache] [ 3142.848530][ T2649] process_one_work+0x2a4/0x640 [ 3142.850663][ T2649] worker_thread+0x39/0x3f0 [ 3142.852464][ T2649] kthread+0x125/0x140 [ 3142.854106][ T2649] ret_from_fork+0x3a/0x50 [ 3142.855880][ T2649] [ 3142.855880][ T2649] other info that might help us debug this: [ 3142.855880][ T2649] [ 3142.859663][ T2649] Chain exists of: [ 3142.859663][ T2649] (wq_completion)bcache_writeback_wq --> (work_completion)(&cl->work)#2 --> &bch_register_lock [ 3142.859663][ T2649] [ 3142.865424][ T2649] Possible unsafe locking scenario: [ 3142.865424][ T2649] [ 3142.868022][ T2649] CPU0 CPU1 [ 3142.869885][ T2649] ---- ---- [ 3142.871751][ T2649] lock(&bch_register_lock); [ 3142.873379][ T2649] lock((work_completion)(&cl->work)#2); [ 3142.876399][ T2649] lock(&bch_register_lock); [ 3142.879727][ T2649] lock((wq_completion)bcache_writeback_wq); [ 3142.882064][ T2649] [ 3142.882064][ T2649] *** DEADLOCK *** [ 3142.882064][ T2649] [ 3142.885060][ T2649] 3 locks held by kworker/3:67/2649: [ 3142.887245][ T2649] #0: 00000000e774cdd0 ((wq_completion)events){+.+.}, at: process_one_work+0x21e/0x640 [ 3142.890815][ T2649] #1: 00000000f7df89da ((work_completion)(&cl->work)#2){+.+.}, at: process_one_work+0x21e/0x640 [ 3142.894884][ T2649] #2: 000000004fcf89c5 (&bch_register_lock){+.+.}, at: cached_dev_free+0x17/0x120 [bcache] [ 3142.898797][ T2649] [ 3142.898797][ T2649] stack backtrace: [ 3142.900961][ T2649] CPU: 3 PID: 2649 Comm: kworker/3:67 Tainted: G W 5.2.0-rc4-lp151.20-default+ #1 [ 3142.904789][ T2649] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018 [ 3142.909168][ T2649] Workqueue: events cached_dev_free [bcache] [ 3142.911422][ T2649] Call Trace: [ 3142.912656][ T2649] dump_stack+0x85/0xcb [ 3142.914181][ T2649] print_circular_bug+0x19a/0x1f0 [ 3142.916193][ T2649] __lock_acquire+0x16cd/0x1850 [ 3142.917936][ T2649] ? __lock_acquire+0x6a8/0x1850 [ 3142.919704][ T2649] ? lock_acquire+0xb4/0x1c0 [ 3142.921335][ T2649] ? find_held_lock+0x34/0xa0 [ 3142.923052][ T2649] lock_acquire+0xb4/0x1c0 [ 3142.924635][ T2649] ? flush_workqueue+0x87/0x4c0 [ 3142.926375][ T2649] flush_workqueue+0xae/0x4c0 [ 3142.928047][ T2649] ? flush_workqueue+0x87/0x4c0 [ 3142.929824][ T2649] ? drain_workqueue+0xa9/0x180 [ 3142.931686][ T2649] drain_workqueue+0xa9/0x180 [ 3142.933534][ T2649] destroy_workqueue+0x17/0x250 [ 3142.935787][ T2649] cached_dev_free+0x52/0x120 [bcache] [ 3142.937795][ T2649] process_one_work+0x2a4/0x640 [ 3142.939803][ T2649] worker_thread+0x39/0x3f0 [ 3142.941487][ T2649] ? process_one_work+0x640/0x640 [ 3142.943389][ T2649] kthread+0x125/0x140 [ 3142.944894][ T2649] ? kthread_create_worker_on_cpu+0x70/0x70 [ 3142.947744][ T2649] ret_from_fork+0x3a/0x50 [ 3142.970358][ T2649] bcache: bcache_device_free() bcache0 stopped
Here is how the deadlock happens. 1) bcache_reboot() calls bcache_device_stop(), then inside bcache_device_stop() BCACHE_DEV_CLOSING bit is set on d->flags. Then closure_queue(&d->cl) is called to invoke cached_dev_flush(). 2) In cached_dev_flush(), cached_dev_free() is called by continu_at(). 3) In cached_dev_free(), when stopping the writeback kthread of the cached device by kthread_stop(), dc->writeback_thread will be waken up to quite the kthread while-loop, then cached_dev_put() is called in bch_writeback_thread(). 4) Calling cached_dev_put() in writeback kthread may drop dc->count to 0, then dc->detach kworker is scheduled, which is initialized as cached_dev_detach_finish(). 5) Inside cached_dev_detach_finish(), the last line of code is to call closure_put(&dc->disk.cl), which drops the last reference counter of closrure dc->disk.cl, then the callback cached_dev_flush() gets called. Now cached_dev_flush() is called for second time in the code path, the first time is in step 2). And again bch_register_lock will be acquired again, and a A-A lock (lockdep terminology) is happening.
The root cause of the above A-A lock is in cached_dev_free(), mutex bch_register_lock is held before stopping writeback kthread and other kworkers. Fortunately now we have variable 'bcache_is_reboot', which may prevent device registration or unregistration during reboot/shutdown time, so it is unncessary to hold bch_register_lock such early now.
This is how this patch fixes the reboot/shutdown time A-A lock issue: After moving mutex_lock(&bch_register_lock) to a later location where before atomic_read(&dc->running) in cached_dev_free(), such A-A lock problem can be solved without any reboot time registration race.
Signed-off-by: Coly Li colyli@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/bcache/super.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 1b63ac876169..0a25774e175a 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1190,8 +1190,6 @@ static void cached_dev_free(struct closure *cl) { struct cached_dev *dc = container_of(cl, struct cached_dev, disk.cl);
- mutex_lock(&bch_register_lock); - if (test_and_clear_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags)) cancel_writeback_rate_update_dwork(dc);
@@ -1202,6 +1200,8 @@ static void cached_dev_free(struct closure *cl) if (!IS_ERR_OR_NULL(dc->status_update_thread)) kthread_stop(dc->status_update_thread);
+ mutex_lock(&bch_register_lock); + if (atomic_read(&dc->running)) bd_unlink_disk_holder(dc->bdev, dc->disk.disk); bcache_device_free(&dc->disk);
[ Upstream commit b387e9b58679c60f5b1e4313939bd4878204fc37 ]
When system memory is in heavy pressure, bch_gc_thread_start() from run_cache_set() may fail due to out of memory. In such condition, c->gc_thread is assigned to -ENOMEM, not NULL pointer. Then in following failure code path bch_cache_set_error(), when cache_set_flush() gets called, the code piece to stop c->gc_thread is broken, if (!IS_ERR_OR_NULL(c->gc_thread)) kthread_stop(c->gc_thread);
And KASAN catches such NULL pointer deference problem, with the warning information:
[ 561.207881] ================================================================== [ 561.207900] BUG: KASAN: null-ptr-deref in kthread_stop+0x3b/0x440 [ 561.207904] Write of size 4 at addr 000000000000001c by task kworker/15:1/313
[ 561.207913] CPU: 15 PID: 313 Comm: kworker/15:1 Tainted: G W 5.0.0-vanilla+ #3 [ 561.207916] Hardware name: Lenovo ThinkSystem SR650 -[7X05CTO1WW]-/-[7X05CTO1WW]-, BIOS -[IVE136T-2.10]- 03/22/2019 [ 561.207935] Workqueue: events cache_set_flush [bcache] [ 561.207940] Call Trace: [ 561.207948] dump_stack+0x9a/0xeb [ 561.207955] ? kthread_stop+0x3b/0x440 [ 561.207960] ? kthread_stop+0x3b/0x440 [ 561.207965] kasan_report+0x176/0x192 [ 561.207973] ? kthread_stop+0x3b/0x440 [ 561.207981] kthread_stop+0x3b/0x440 [ 561.207995] cache_set_flush+0xd4/0x6d0 [bcache] [ 561.208008] process_one_work+0x856/0x1620 [ 561.208015] ? find_held_lock+0x39/0x1d0 [ 561.208028] ? drain_workqueue+0x380/0x380 [ 561.208048] worker_thread+0x87/0xb80 [ 561.208058] ? __kthread_parkme+0xb6/0x180 [ 561.208067] ? process_one_work+0x1620/0x1620 [ 561.208072] kthread+0x326/0x3e0 [ 561.208079] ? kthread_create_worker_on_cpu+0xc0/0xc0 [ 561.208090] ret_from_fork+0x3a/0x50 [ 561.208110] ================================================================== [ 561.208113] Disabling lock debugging due to kernel taint [ 561.208115] irq event stamp: 11800231 [ 561.208126] hardirqs last enabled at (11800231): [<ffffffff83008538>] do_syscall_64+0x18/0x410 [ 561.208127] BUG: unable to handle kernel NULL pointer dereference at 000000000000001c [ 561.208129] #PF error: [WRITE] [ 561.312253] hardirqs last disabled at (11800230): [<ffffffff830052ff>] trace_hardirqs_off_thunk+0x1a/0x1c [ 561.312259] softirqs last enabled at (11799832): [<ffffffff850005c7>] __do_softirq+0x5c7/0x8c3 [ 561.405975] PGD 0 P4D 0 [ 561.442494] softirqs last disabled at (11799821): [<ffffffff831add2c>] irq_exit+0x1ac/0x1e0 [ 561.791359] Oops: 0002 [#1] SMP KASAN NOPTI [ 561.791362] CPU: 15 PID: 313 Comm: kworker/15:1 Tainted: G B W 5.0.0-vanilla+ #3 [ 561.791363] Hardware name: Lenovo ThinkSystem SR650 -[7X05CTO1WW]-/-[7X05CTO1WW]-, BIOS -[IVE136T-2.10]- 03/22/2019 [ 561.791371] Workqueue: events cache_set_flush [bcache] [ 561.791374] RIP: 0010:kthread_stop+0x3b/0x440 [ 561.791376] Code: 00 00 65 8b 05 26 d5 e0 7c 89 c0 48 0f a3 05 ec aa df 02 0f 82 dc 02 00 00 4c 8d 63 20 be 04 00 00 00 4c 89 e7 e8 65 c5 53 00 <f0> ff 43 20 48 8d 7b 24 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 [ 561.791377] RSP: 0018:ffff88872fc8fd10 EFLAGS: 00010286 [ 561.838895] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 561.838916] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 561.838934] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 561.838948] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 561.838966] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 561.838979] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 561.838996] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 563.067028] RAX: 0000000000000000 RBX: fffffffffffffffc RCX: ffffffff832dd314 [ 563.067030] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000297 [ 563.067032] RBP: ffff88872fc8fe88 R08: fffffbfff0b8213d R09: fffffbfff0b8213d [ 563.067034] R10: 0000000000000001 R11: fffffbfff0b8213c R12: 000000000000001c [ 563.408618] R13: ffff88dc61cc0f68 R14: ffff888102b94900 R15: ffff88dc61cc0f68 [ 563.408620] FS: 0000000000000000(0000) GS:ffff888f7dc00000(0000) knlGS:0000000000000000 [ 563.408622] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 563.408623] CR2: 000000000000001c CR3: 0000000f48a1a004 CR4: 00000000007606e0 [ 563.408625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 563.408627] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 563.904795] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 563.915796] PKRU: 55555554 [ 563.915797] Call Trace: [ 563.915807] cache_set_flush+0xd4/0x6d0 [bcache] [ 563.915812] process_one_work+0x856/0x1620 [ 564.001226] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 564.033563] ? find_held_lock+0x39/0x1d0 [ 564.033567] ? drain_workqueue+0x380/0x380 [ 564.033574] worker_thread+0x87/0xb80 [ 564.062823] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 564.118042] ? __kthread_parkme+0xb6/0x180 [ 564.118046] ? process_one_work+0x1620/0x1620 [ 564.118048] kthread+0x326/0x3e0 [ 564.118050] ? kthread_create_worker_on_cpu+0xc0/0xc0 [ 564.167066] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 564.252441] ret_from_fork+0x3a/0x50 [ 564.252447] Modules linked in: msr rpcrdma sunrpc rdma_ucm ib_iser ib_umad rdma_cm ib_ipoib i40iw configfs iw_cm ib_cm libiscsi scsi_transport_iscsi mlx4_ib ib_uverbs mlx4_en ib_core nls_iso8859_1 nls_cp437 vfat fat intel_rapl skx_edac x86_pkg_temp_thermal coretemp iTCO_wdt iTCO_vendor_support crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ses raid0 aesni_intel cdc_ether enclosure usbnet ipmi_ssif joydev aes_x86_64 i40e scsi_transport_sas mii bcache md_mod crypto_simd mei_me ioatdma crc64 ptp cryptd pcspkr i2c_i801 mlx4_core glue_helper pps_core mei lpc_ich dca wmi ipmi_si ipmi_devintf nd_pmem dax_pmem nd_btt ipmi_msghandler device_dax pcc_cpufreq button hid_generic usbhid mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect xhci_pci sysimgblt fb_sys_fops xhci_hcd ttm megaraid_sas drm usbcore nfit libnvdimm sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua efivarfs [ 564.299390] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree. [ 564.348360] CR2: 000000000000001c [ 564.348362] ---[ end trace b7f0e5cc7b2103b0 ]---
Therefore, it is not enough to only check whether c->gc_thread is NULL, we should use IS_ERR_OR_NULL() to check both NULL pointer and error value.
This patch changes the above buggy code piece in this way, if (!IS_ERR_OR_NULL(c->gc_thread)) kthread_stop(c->gc_thread);
Signed-off-by: Coly Li colyli@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/bcache/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 0a25774e175a..4cc8a300a557 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1564,7 +1564,7 @@ static void cache_set_flush(struct closure *cl) kobject_put(&c->internal); kobject_del(&c->kobj);
- if (c->gc_thread) + if (!IS_ERR_OR_NULL(c->gc_thread)) kthread_stop(c->gc_thread);
if (!IS_ERR_OR_NULL(c->root))
[ Upstream commit a59ff6ccc2bf2e2934b31bbf734f0bc04b5ec78a ]
It is quite frequently to observe deadlock in bcache_reboot() happens and hang the system reboot process. The reason is, in bcache_reboot() when calling bch_cache_set_stop() and bcache_device_stop() the mutex bch_register_lock is held. But in the process to stop cache set and bcache device, bch_register_lock will be acquired again. If this mutex is held here, deadlock will happen inside the stopping process. The aftermath of the deadlock is, whole system reboot gets hung.
The fix is to avoid holding bch_register_lock for the following loops in bcache_reboot(), list_for_each_entry_safe(c, tc, &bch_cache_sets, list) bch_cache_set_stop(c);
list_for_each_entry_safe(dc, tdc, &uncached_devices, list) bcache_device_stop(&dc->disk);
A module range variable 'bcache_is_reboot' is added, it sets to true in bcache_reboot(). In register_bcache(), if bcache_is_reboot is checked to be true, reject the registration by returning -EBUSY immediately.
Signed-off-by: Coly Li colyli@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/bcache/super.c | 40 ++++++++++++++++++++++++++++++++++++++- drivers/md/bcache/sysfs.c | 26 +++++++++++++++++++++++++ 2 files changed, 65 insertions(+), 1 deletion(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 4cc8a300a557..dcd8b319a01e 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -40,6 +40,7 @@ static const char invalid_uuid[] = {
static struct kobject *bcache_kobj; struct mutex bch_register_lock; +bool bcache_is_reboot; LIST_HEAD(bch_cache_sets); static LIST_HEAD(uncached_devices);
@@ -49,6 +50,7 @@ static wait_queue_head_t unregister_wait; struct workqueue_struct *bcache_wq; struct workqueue_struct *bch_journal_wq;
+ #define BTREE_MAX_PAGES (256 * 1024 / PAGE_SIZE) /* limitation of partitions number on single bcache device */ #define BCACHE_MINORS 128 @@ -2301,6 +2303,11 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr, if (!try_module_get(THIS_MODULE)) return -EBUSY;
+ /* For latest state of bcache_is_reboot */ + smp_mb(); + if (bcache_is_reboot) + return -EBUSY; + path = kstrndup(buffer, size, GFP_KERNEL); if (!path) goto err; @@ -2380,6 +2387,9 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
static int bcache_reboot(struct notifier_block *n, unsigned long code, void *x) { + if (bcache_is_reboot) + return NOTIFY_DONE; + if (code == SYS_DOWN || code == SYS_HALT || code == SYS_POWER_OFF) { @@ -2392,19 +2402,45 @@ static int bcache_reboot(struct notifier_block *n, unsigned long code, void *x)
mutex_lock(&bch_register_lock);
+ if (bcache_is_reboot) + goto out; + + /* New registration is rejected since now */ + bcache_is_reboot = true; + /* + * Make registering caller (if there is) on other CPU + * core know bcache_is_reboot set to true earlier + */ + smp_mb(); + if (list_empty(&bch_cache_sets) && list_empty(&uncached_devices)) goto out;
+ mutex_unlock(&bch_register_lock); + pr_info("Stopping all devices:");
+ /* + * The reason bch_register_lock is not held to call + * bch_cache_set_stop() and bcache_device_stop() is to + * avoid potential deadlock during reboot, because cache + * set or bcache device stopping process will acqurie + * bch_register_lock too. + * + * We are safe here because bcache_is_reboot sets to + * true already, register_bcache() will reject new + * registration now. bcache_is_reboot also makes sure + * bcache_reboot() won't be re-entered on by other thread, + * so there is no race in following list iteration by + * list_for_each_entry_safe(). + */ list_for_each_entry_safe(c, tc, &bch_cache_sets, list) bch_cache_set_stop(c);
list_for_each_entry_safe(dc, tdc, &uncached_devices, list) bcache_device_stop(&dc->disk);
- mutex_unlock(&bch_register_lock);
/* * Give an early chance for other kthreads and @@ -2531,6 +2567,8 @@ static int __init bcache_init(void) bch_debug_init(); closure_debug_init();
+ bcache_is_reboot = false; + return 0; err: bcache_exit(); diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c index bfb437ffb13c..327493f634bb 100644 --- a/drivers/md/bcache/sysfs.c +++ b/drivers/md/bcache/sysfs.c @@ -16,6 +16,8 @@ #include <linux/sort.h> #include <linux/sched/clock.h>
+extern bool bcache_is_reboot; + /* Default is 0 ("writethrough") */ static const char * const bch_cache_modes[] = { "writethrough", @@ -271,6 +273,10 @@ STORE(__cached_dev) struct cache_set *c; struct kobj_uevent_env *env;
+ /* no user space access if system is rebooting */ + if (bcache_is_reboot) + return -EBUSY; + #define d_strtoul(var) sysfs_strtoul(var, dc->var) #define d_strtoul_nonzero(var) sysfs_strtoul_clamp(var, dc->var, 1, INT_MAX) #define d_strtoi_h(var) sysfs_hatoi(var, dc->var) @@ -408,6 +414,10 @@ STORE(bch_cached_dev) struct cached_dev *dc = container_of(kobj, struct cached_dev, disk.kobj);
+ /* no user space access if system is rebooting */ + if (bcache_is_reboot) + return -EBUSY; + mutex_lock(&bch_register_lock); size = __cached_dev_store(kobj, attr, buf, size);
@@ -511,6 +521,10 @@ STORE(__bch_flash_dev) kobj); struct uuid_entry *u = &d->c->uuids[d->id];
+ /* no user space access if system is rebooting */ + if (bcache_is_reboot) + return -EBUSY; + sysfs_strtoul(data_csum, d->data_csum);
if (attr == &sysfs_size) { @@ -746,6 +760,10 @@ STORE(__bch_cache_set) struct cache_set *c = container_of(kobj, struct cache_set, kobj); ssize_t v;
+ /* no user space access if system is rebooting */ + if (bcache_is_reboot) + return -EBUSY; + if (attr == &sysfs_unregister) bch_cache_set_unregister(c);
@@ -865,6 +883,10 @@ STORE(bch_cache_set_internal) { struct cache_set *c = container_of(kobj, struct cache_set, internal);
+ /* no user space access if system is rebooting */ + if (bcache_is_reboot) + return -EBUSY; + return bch_cache_set_store(&c->kobj, attr, buf, size); }
@@ -1050,6 +1072,10 @@ STORE(__bch_cache) struct cache *ca = container_of(kobj, struct cache, kobj); ssize_t v;
+ /* no user space access if system is rebooting */ + if (bcache_is_reboot) + return -EBUSY; + if (attr == &sysfs_discard) { bool v = strtoul_or_return(buf);
[ Upstream commit 7e865eba00a3df2dc8c4746173a8ca1c1c7f042e ]
When enable lockdep and reboot system with a writeback mode bcache device, the following potential deadlock warning is reported by lockdep engine.
[ 101.536569][ T401] kworker/2:2/401 is trying to acquire lock: [ 101.538575][ T401] 00000000bbf6e6c7 ((wq_completion)bcache_writeback_wq){+.+.}, at: flush_workqueue+0x87/0x4c0 [ 101.542054][ T401] [ 101.542054][ T401] but task is already holding lock: [ 101.544587][ T401] 00000000f5f305b3 ((work_completion)(&cl->work)#2){+.+.}, at: process_one_work+0x21e/0x640 [ 101.548386][ T401] [ 101.548386][ T401] which lock already depends on the new lock. [ 101.548386][ T401] [ 101.551874][ T401] [ 101.551874][ T401] the existing dependency chain (in reverse order) is: [ 101.555000][ T401] [ 101.555000][ T401] -> #1 ((work_completion)(&cl->work)#2){+.+.}: [ 101.557860][ T401] process_one_work+0x277/0x640 [ 101.559661][ T401] worker_thread+0x39/0x3f0 [ 101.561340][ T401] kthread+0x125/0x140 [ 101.562963][ T401] ret_from_fork+0x3a/0x50 [ 101.564718][ T401] [ 101.564718][ T401] -> #0 ((wq_completion)bcache_writeback_wq){+.+.}: [ 101.567701][ T401] lock_acquire+0xb4/0x1c0 [ 101.569651][ T401] flush_workqueue+0xae/0x4c0 [ 101.571494][ T401] drain_workqueue+0xa9/0x180 [ 101.573234][ T401] destroy_workqueue+0x17/0x250 [ 101.575109][ T401] cached_dev_free+0x44/0x120 [bcache] [ 101.577304][ T401] process_one_work+0x2a4/0x640 [ 101.579357][ T401] worker_thread+0x39/0x3f0 [ 101.581055][ T401] kthread+0x125/0x140 [ 101.582709][ T401] ret_from_fork+0x3a/0x50 [ 101.584592][ T401] [ 101.584592][ T401] other info that might help us debug this: [ 101.584592][ T401] [ 101.588355][ T401] Possible unsafe locking scenario: [ 101.588355][ T401] [ 101.590974][ T401] CPU0 CPU1 [ 101.592889][ T401] ---- ---- [ 101.594743][ T401] lock((work_completion)(&cl->work)#2); [ 101.596785][ T401] lock((wq_completion)bcache_writeback_wq); [ 101.600072][ T401] lock((work_completion)(&cl->work)#2); [ 101.602971][ T401] lock((wq_completion)bcache_writeback_wq); [ 101.605255][ T401] [ 101.605255][ T401] *** DEADLOCK *** [ 101.605255][ T401] [ 101.608310][ T401] 2 locks held by kworker/2:2/401: [ 101.610208][ T401] #0: 00000000cf2c7d17 ((wq_completion)events){+.+.}, at: process_one_work+0x21e/0x640 [ 101.613709][ T401] #1: 00000000f5f305b3 ((work_completion)(&cl->work)#2){+.+.}, at: process_one_work+0x21e/0x640 [ 101.617480][ T401] [ 101.617480][ T401] stack backtrace: [ 101.619539][ T401] CPU: 2 PID: 401 Comm: kworker/2:2 Tainted: G W 5.2.0-rc4-lp151.20-default+ #1 [ 101.623225][ T401] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018 [ 101.627210][ T401] Workqueue: events cached_dev_free [bcache] [ 101.629239][ T401] Call Trace: [ 101.630360][ T401] dump_stack+0x85/0xcb [ 101.631777][ T401] print_circular_bug+0x19a/0x1f0 [ 101.633485][ T401] __lock_acquire+0x16cd/0x1850 [ 101.635184][ T401] ? __lock_acquire+0x6a8/0x1850 [ 101.636863][ T401] ? lock_acquire+0xb4/0x1c0 [ 101.638421][ T401] ? find_held_lock+0x34/0xa0 [ 101.640015][ T401] lock_acquire+0xb4/0x1c0 [ 101.641513][ T401] ? flush_workqueue+0x87/0x4c0 [ 101.643248][ T401] flush_workqueue+0xae/0x4c0 [ 101.644832][ T401] ? flush_workqueue+0x87/0x4c0 [ 101.646476][ T401] ? drain_workqueue+0xa9/0x180 [ 101.648303][ T401] drain_workqueue+0xa9/0x180 [ 101.649867][ T401] destroy_workqueue+0x17/0x250 [ 101.651503][ T401] cached_dev_free+0x44/0x120 [bcache] [ 101.653328][ T401] process_one_work+0x2a4/0x640 [ 101.655029][ T401] worker_thread+0x39/0x3f0 [ 101.656693][ T401] ? process_one_work+0x640/0x640 [ 101.658501][ T401] kthread+0x125/0x140 [ 101.660012][ T401] ? kthread_create_worker_on_cpu+0x70/0x70 [ 101.661985][ T401] ret_from_fork+0x3a/0x50 [ 101.691318][ T401] bcache: bcache_device_free() bcache0 stopped
Here is how the above potential deadlock may happen in reboot/shutdown code path, 1) bcache_reboot() is called firstly in the reboot/shutdown code path, then in bcache_reboot(), bcache_device_stop() is called. 2) bcache_device_stop() sets BCACHE_DEV_CLOSING on d->falgs, then call closure_queue(&d->cl) to invoke cached_dev_flush(). And in turn cached_dev_flush() calls cached_dev_free() via closure_at() 3) In cached_dev_free(), after stopped writebach kthread dc->writeback_thread, the kwork dc->writeback_write_wq is stopping by destroy_workqueue(). 4) Inside destroy_workqueue(), drain_workqueue() is called. Inside drain_workqueue(), flush_workqueue() is called. Then wq->lockdep_map is acquired by lock_map_acquire() in flush_workqueue(). After the lock acquired the rest part of flush_workqueue() just wait for the workqueue to complete. 5) Now we look back at writeback thread routine bch_writeback_thread(), in the main while-loop, write_dirty() is called via continue_at() in read_dirty_submit(), which is called via continue_at() in while-loop level called function read_dirty(). Inside write_dirty() it may be re-called on workqueeu dc->writeback_write_wq via continue_at(). It means when the writeback kthread is stopped in cached_dev_free() there might be still one kworker queued on dc->writeback_write_wq to execute write_dirty() again. 6) Now this kworker is scheduled on dc->writeback_write_wq to run by process_one_work() (which is called by worker_thread()). Before calling the kwork routine, wq->lockdep_map is acquired. 7) But wq->lockdep_map is acquired already in step 4), so a A-A lock (lockdep terminology) scenario happens.
Indeed on multiple cores syatem, the above deadlock is very rare to happen, just as the code comments in process_one_work() says, 2263 * AFAICT there is no possible deadlock scenario between the 2264 * flush_work() and complete() primitives (except for single-threaded 2265 * workqueues), so hiding them isn't a problem.
But it is still good to fix such lockdep warning, even no one running bcache on single core system.
The fix is simple. This patch solves the above potential deadlock by, - Do not destroy workqueue dc->writeback_write_wq in cached_dev_free(). - Flush and destroy dc->writeback_write_wq in writebach kthread routine bch_writeback_thread(), where after quit the thread main while-loop and before cached_dev_put() is called.
By this fix, dc->writeback_write_wq will be stopped and destroy before the writeback kthread stopped, so the chance for a A-A locking on wq->lockdep_map is disappeared, such A-A deadlock won't happen any more.
Signed-off-by: Coly Li colyli@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/bcache/super.c | 2 -- drivers/md/bcache/writeback.c | 4 ++++ 2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index dcd8b319a01e..4ccc5e5fe3a1 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1197,8 +1197,6 @@ static void cached_dev_free(struct closure *cl)
if (!IS_ERR_OR_NULL(dc->writeback_thread)) kthread_stop(dc->writeback_thread); - if (dc->writeback_write_wq) - destroy_workqueue(dc->writeback_write_wq); if (!IS_ERR_OR_NULL(dc->status_update_thread)) kthread_stop(dc->status_update_thread);
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c index 73f0efac2b9f..df0f4e5a051a 100644 --- a/drivers/md/bcache/writeback.c +++ b/drivers/md/bcache/writeback.c @@ -735,6 +735,10 @@ static int bch_writeback_thread(void *arg) } }
+ if (dc->writeback_write_wq) { + flush_workqueue(dc->writeback_write_wq); + destroy_workqueue(dc->writeback_write_wq); + } cached_dev_put(dc); wait_for_kthread_stop();
[ Upstream commit 18d219b783da61a6cc77581f55fc4af2fa16bc36 ]
When setting -Wformat=2, there is a compiler warning like this:
hclge_main.c:xxx:x: warning: format not a string literal and no format arguments [-Wformat-nonliteral] strs[i].desc); ^~~~
This patch adds missing format parameter "%s" to snprintf() to fix it.
Fixes: 46a3df9f9718 ("Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Yonglong Liu liuyonglong@huawei.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index 645b9b3e0256..f661281de36b 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -552,8 +552,7 @@ static u8 *hclge_comm_get_strings(u32 stringset, return buff;
for (i = 0; i < size; i++) { - snprintf(buff, ETH_GSTRING_LEN, - strs[i].desc); + snprintf(buff, ETH_GSTRING_LEN, "%s", strs[i].desc); buff = buff + ETH_GSTRING_LEN; }
[ Upstream commit 04f25edb48c441fc278ecc154c270f16966cbb90 ]
When hdev->tx_sch_mode is HCLGE_FLAG_VNET_BASE_SCH_MODE, the hclge_tm_schd_mode_vnet_base_cfg calls hclge_tm_pri_schd_mode_cfg with vport->vport_id as pri_id, which is used as index for hdev->tm_info.tc_info, it will cause out of bound access issue if vport_id is equal to or larger than HNAE3_MAX_TC.
Also hardware only support maximum speed of HCLGE_ETHER_MAX_RATE.
So this patch adds two checks for above cases.
Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver") Signed-off-by: Yunsheng Lin linyunsheng@huawei.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c index a7bbb6d3091a..0d53062f7bb5 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c @@ -54,7 +54,8 @@ static int hclge_shaper_para_calc(u32 ir, u8 shaper_level, u32 tick;
/* Calc tick */ - if (shaper_level >= HCLGE_SHAPER_LVL_CNT) + if (shaper_level >= HCLGE_SHAPER_LVL_CNT || + ir > HCLGE_ETHER_MAX_RATE) return -EINVAL;
tick = tick_array[shaper_level]; @@ -1124,6 +1125,9 @@ static int hclge_tm_schd_mode_vnet_base_cfg(struct hclge_vport *vport) int ret; u8 i;
+ if (vport->vport_id >= HNAE3_MAX_TC) + return -EINVAL; + ret = hclge_tm_pri_schd_mode_cfg(hdev, vport->vport_id); if (ret) return ret;
[ Upstream commit c709df58832c5f575f0255bea4b09ad477fc62ea ]
Currently the memory allocated for qmi handle is not being freed during de-init which leads to memory leak.
Free the allocated qmi memory in qmi deinit to avoid memory leak.
Tested HW: WCN3990 Tested FW: WLAN.HL.3.1-01040-QCAHLSWMTPLZ-1
Fixes: fda6fee0001e ("ath10k: add QMI message handshake for wcn3990 client") Signed-off-by: Dundi Raviteja dundi@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/qmi.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/wireless/ath/ath10k/qmi.c b/drivers/net/wireless/ath/ath10k/qmi.c index a7bc2c70d076..8f8f717a23ee 100644 --- a/drivers/net/wireless/ath/ath10k/qmi.c +++ b/drivers/net/wireless/ath/ath10k/qmi.c @@ -1002,6 +1002,7 @@ int ath10k_qmi_deinit(struct ath10k *ar) qmi_handle_release(&qmi->qmi_hdl); cancel_work_sync(&qmi->event_work); destroy_workqueue(qmi->event_wq); + kfree(qmi); ar_snoc->qmi = NULL;
return 0;
[ Upstream commit 3ed39f8e747a7aafeec07bb244f2c3a1bdca5730 ]
The workqueue need to flush and destory while remove sdio module, otherwise it will have thread which is not destory after remove sdio modules.
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00007-QCARMSWP-1.
Signed-off-by: Wen Gong wgong@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/sdio.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c index 73ef3e75d199..28bdf0212538 100644 --- a/drivers/net/wireless/ath/ath10k/sdio.c +++ b/drivers/net/wireless/ath/ath10k/sdio.c @@ -2081,6 +2081,9 @@ static void ath10k_sdio_remove(struct sdio_func *func) cancel_work_sync(&ar_sdio->wr_async_work); ath10k_core_unregister(ar); ath10k_core_destroy(ar); + + flush_workqueue(ar_sdio->workqueue); + destroy_workqueue(ar_sdio->workqueue); }
static const struct sdio_device_id ath10k_sdio_devices[] = {
[ Upstream commit 8ec3ede559956f8ad58db7b57d25ac724bab69e9 ]
The Header Parser allows identifying various fields in the packet headers, used for various kind of filtering and classification steps.
This is a re-entrant process, where the offset in the packet header depends on the previous lookup results. This offset is represented in the SRAM results of the TCAM, as a shift to be operated.
This shift can be negative in some cases, such as in IPv6 parsing.
This commit prevents overriding the sign bit when setting the shift value, which could cause instabilities when parsing IPv6 flows.
Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Suggested-by: Alan Winkowski walan@marvell.com Signed-off-by: Maxime Chevallier maxime.chevallier@bootlin.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c index ae2240074d8e..5692c6087bbb 100644 --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c @@ -312,7 +312,8 @@ static void mvpp2_prs_sram_shift_set(struct mvpp2_prs_entry *pe, int shift, }
/* Set value */ - pe->sram[MVPP2_BIT_TO_WORD(MVPP2_PRS_SRAM_SHIFT_OFFS)] = shift & MVPP2_PRS_SRAM_SHIFT_MASK; + pe->sram[MVPP2_BIT_TO_WORD(MVPP2_PRS_SRAM_SHIFT_OFFS)] |= + shift & MVPP2_PRS_SRAM_SHIFT_MASK;
/* Reset and set operation */ mvpp2_prs_sram_bits_clear(pe, MVPP2_PRS_SRAM_OP_SEL_SHIFT_OFFS,
[ Upstream commit 1e08511d5d01884a3c9070afd52a47799312074a ]
If a packet which is utilizing the launchtime feature (via SO_TXTIME socket option) also requests the hardware transmit timestamp, the hardware timestamp is not delivered to the userspace. This is because the value in skb->tstamp is mistaken as the software timestamp.
Applications, like ptp4l, request a hardware timestamp by setting the SOF_TIMESTAMPING_TX_HARDWARE socket option. Whenever a new timestamp is detected by the driver (this work is done in igb_ptp_tx_work() which calls igb_ptp_tx_hwtstamps() in igb_ptp.c[1]), it will queue the timestamp in the ERR_QUEUE for the userspace to read. When the userspace is ready, it will issue a recvmsg() call to collect this timestamp. The problem is in this recvmsg() call. If the skb->tstamp is not cleared out, it will be interpreted as a software timestamp and the hardware tx timestamp will not be successfully sent to the userspace. Look at skb_is_swtx_tstamp() and the callee function __sock_recv_timestamp() in net/socket.c for more details.
Signed-off-by: Vedang Patel vedang.patel@intel.com Tested-by: Aaron Brown aaron.f.brown@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igb/igb_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 39f33afc479c..005c1693efc8 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -5687,6 +5687,7 @@ static void igb_tx_ctxtdesc(struct igb_ring *tx_ring, */ if (tx_ring->launchtime_enable) { ts = ns_to_timespec64(first->skb->tstamp); + first->skb->tstamp = 0; context_desc->seqnum_seed = cpu_to_le32(ts.tv_nsec / 32); } else { context_desc->seqnum_seed = 0;
[ Upstream commit bc3781edcea017aa1a29abd953b776cdba298ce2 ]
Local device and link partner config auto-negotiation on both, local device config pause frame use as: rx on/tx off, link partner config pause frame use as: rx off/tx on.
We except the result is: Local device: Autonegotiate: on RX: on TX: off RX negotiated: on TX negotiated: off
Link partner: Autonegotiate: on RX: off TX: on RX negotiated: off TX negotiated: on
But actually, the result of Local device and link partner is both: Autonegotiate: on RX: off TX: off RX negotiated: off TX negotiated: off
The root cause is that the supported flag is has only Pause, reference to the function genphy_config_advert(): static int genphy_config_advert(struct phy_device *phydev) { ... linkmode_and(phydev->advertising, phydev->advertising, phydev->supported); ... } The pause frame use of link partner is rx off/tx on, so its advertising only set the bit Asym_Pause, and the supported is only set the bit Pause, so the result of linkmode_and(), is rx off/tx off.
This patch adds Asym_Pause to the supported flag to fix it.
Signed-off-by: Yonglong Liu liuyonglong@huawei.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 1 + drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c | 7 +++++++ 2 files changed, 8 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index f661281de36b..bab04d2d674a 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -1057,6 +1057,7 @@ static void hclge_parse_copper_link_mode(struct hclge_dev *hdev, linkmode_set_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, supported); linkmode_set_bit(ETHTOOL_LINK_MODE_TP_BIT, supported); linkmode_set_bit(ETHTOOL_LINK_MODE_Pause_BIT, supported); + linkmode_set_bit(ETHTOOL_LINK_MODE_Asym_Pause_BIT, supported); }
static void hclge_parse_link_mode(struct hclge_dev *hdev, u8 speed_ability) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c index 1e8134892d77..32d6a59b731a 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c @@ -224,6 +224,13 @@ int hclge_mac_connect_phy(struct hnae3_handle *handle) linkmode_and(phydev->supported, phydev->supported, mask); linkmode_copy(phydev->advertising, phydev->supported);
+ /* supported flag is Pause and Asym Pause, but default advertising + * should be rx on, tx on, so need clear Asym Pause in advertising + * flag + */ + linkmode_clear_bit(ETHTOOL_LINK_MODE_Asym_Pause_BIT, + phydev->advertising); + return 0; }
[ Upstream commit 337d1727a3895775b5e5ef67d3ca0fea2e2ae768 ]
Assign OF node to CPSW slave devices, otherwise it is not possible to bind e.g. DSA switch to them. Without this patch, the DSA code tries to find the ethernet device by OF match, but fails to do so because the slave device has NULL OF node.
Signed-off-by: Marek Vasut marex@denx.de Cc: David S. Miller davem@davemloft.net Cc: Ivan Khoronzhuk ivan.khoronzhuk@linaro.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ti/cpsw.c | 3 +++ drivers/net/ethernet/ti/cpsw_priv.h | 1 + 2 files changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c index 634fc484a0b3..4e3026f9abed 100644 --- a/drivers/net/ethernet/ti/cpsw.c +++ b/drivers/net/ethernet/ti/cpsw.c @@ -2179,6 +2179,7 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data, return ret; }
+ slave_data->slave_node = slave_node; slave_data->phy_node = of_parse_phandle(slave_node, "phy-handle", 0); parp = of_get_property(slave_node, "phy_id", &lenp); @@ -2330,6 +2331,7 @@ static int cpsw_probe_dual_emac(struct cpsw_priv *priv)
/* register the network device */ SET_NETDEV_DEV(ndev, cpsw->dev); + ndev->dev.of_node = cpsw->slaves[1].data->slave_node; ret = register_netdev(ndev); if (ret) dev_err(cpsw->dev, "cpsw: error registering net device\n"); @@ -2507,6 +2509,7 @@ static int cpsw_probe(struct platform_device *pdev)
/* register the network device */ SET_NETDEV_DEV(ndev, dev); + ndev->dev.of_node = cpsw->slaves[0].data->slave_node; ret = register_netdev(ndev); if (ret) { dev_err(dev, "error registering net device\n"); diff --git a/drivers/net/ethernet/ti/cpsw_priv.h b/drivers/net/ethernet/ti/cpsw_priv.h index 04795b97ee71..e32f11da2dce 100644 --- a/drivers/net/ethernet/ti/cpsw_priv.h +++ b/drivers/net/ethernet/ti/cpsw_priv.h @@ -272,6 +272,7 @@ struct cpsw_host_regs { };
struct cpsw_slave_data { + struct device_node *slave_node; struct device_node *phy_node; char phy_id[MII_BUS_ID_SIZE]; int phy_if;
[ Upstream commit 92924064106e410cdc015f1dbfc0499309f9f5b1 ]
An ipsec structure will not be allocated if the hardware does not support offload. Fixes the following Oops:
[ 191.045452] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 191.054232] Mem abort info: [ 191.057014] ESR = 0x96000004 [ 191.060057] Exception class = DABT (current EL), IL = 32 bits [ 191.065963] SET = 0, FnV = 0 [ 191.069004] EA = 0, S1PTW = 0 [ 191.072132] Data abort info: [ 191.074999] ISV = 0, ISS = 0x00000004 [ 191.078822] CM = 0, WnR = 0 [ 191.081780] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000043d9e467 [ 191.088382] [0000000000000000] pgd=0000000000000000 [ 191.093252] Internal error: Oops: 96000004 [#1] SMP [ 191.098119] Modules linked in: vhost_net vhost tap vfio_pci vfio_virqfd vfio_iommu_type1 vfio xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter devlink ebtables ip6table_filter ip6_tables iptable_filter bpfilter ipmi_ssif nls_iso8859_1 input_leds joydev ipmi_si hns_roce_hw_v2 ipmi_devintf hns_roce ipmi_msghandler cppc_cpufreq sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 ses enclosure btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor hid_generic usbhid hid raid6_pq libcrc32c raid1 raid0 multipath linear ixgbevf hibmc_drm ttm [ 191.168607] drm_kms_helper aes_ce_blk aes_ce_cipher syscopyarea crct10dif_ce sysfillrect ghash_ce qla2xxx sysimgblt sha2_ce sha256_arm64 hisi_sas_v3_hw fb_sys_fops sha1_ce uas nvme_fc mpt3sas ixgbe drm hisi_sas_main nvme_fabrics usb_storage hclge scsi_transport_fc ahci libsas hnae3 raid_class libahci xfrm_algo scsi_transport_sas mdio aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64 [ 191.202952] CPU: 94 PID: 0 Comm: swapper/94 Not tainted 4.19.0-rc1+ #11 [ 191.209553] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.20.01 04/26/2019 [ 191.218064] pstate: 20400089 (nzCv daIf +PAN -UAO) [ 191.222873] pc : ixgbe_ipsec_vf_clear+0x60/0xd0 [ixgbe] [ 191.228093] lr : ixgbe_msg_task+0x2d0/0x1088 [ixgbe] [ 191.233044] sp : ffff000009b3bcd0 [ 191.236346] x29: ffff000009b3bcd0 x28: 0000000000000000 [ 191.241647] x27: ffff000009628000 x26: 0000000000000000 [ 191.246946] x25: ffff803f652d7600 x24: 0000000000000004 [ 191.252246] x23: ffff803f6a718900 x22: 0000000000000000 [ 191.257546] x21: 0000000000000000 x20: 0000000000000000 [ 191.262845] x19: 0000000000000000 x18: 0000000000000000 [ 191.268144] x17: 0000000000000000 x16: 0000000000000000 [ 191.273443] x15: 0000000000000000 x14: 0000000100000026 [ 191.278742] x13: 0000000100000025 x12: ffff8a5f7fbe0df0 [ 191.284042] x11: 000000010000000b x10: 0000000000000040 [ 191.289341] x9 : 0000000000001100 x8 : ffff803f6a824fd8 [ 191.294640] x7 : ffff803f6a825098 x6 : 0000000000000001 [ 191.299939] x5 : ffff000000f0ffc0 x4 : 0000000000000000 [ 191.305238] x3 : ffff000028c00000 x2 : ffff803f652d7600 [ 191.310538] x1 : 0000000000000000 x0 : ffff000000f205f0 [ 191.315838] Process swapper/94 (pid: 0, stack limit = 0x00000000addfed5a) [ 191.322613] Call trace: [ 191.325055] ixgbe_ipsec_vf_clear+0x60/0xd0 [ixgbe] [ 191.329927] ixgbe_msg_task+0x2d0/0x1088 [ixgbe] [ 191.334536] ixgbe_msix_other+0x274/0x330 [ixgbe] [ 191.339233] __handle_irq_event_percpu+0x78/0x270 [ 191.343924] handle_irq_event_percpu+0x40/0x98 [ 191.348355] handle_irq_event+0x50/0xa8 [ 191.352180] handle_fasteoi_irq+0xbc/0x148 [ 191.356263] generic_handle_irq+0x34/0x50 [ 191.360259] __handle_domain_irq+0x68/0xc0 [ 191.364343] gic_handle_irq+0x84/0x180 [ 191.368079] el1_irq+0xe8/0x180 [ 191.371208] arch_cpu_idle+0x30/0x1a8 [ 191.374860] do_idle+0x1dc/0x2a0 [ 191.378077] cpu_startup_entry+0x2c/0x30 [ 191.381988] secondary_start_kernel+0x150/0x1e0 [ 191.386506] Code: 6b15003f 54000320 f1404a9f 54000060 (79400260)
Fixes: eda0333ac2930 ("ixgbe: add VF IPsec management") Signed-off-by: Dann Frazier dann.frazier@canonical.com Acked-by: Shannon Nelson snelson@pensando.io Tested-by: Andrew Bowers andrewx.bowers@intel.com Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c index ff85ce5791a3..31629fc7e820 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c @@ -842,6 +842,9 @@ void ixgbe_ipsec_vf_clear(struct ixgbe_adapter *adapter, u32 vf) struct ixgbe_ipsec *ipsec = adapter->ipsec; int i;
+ if (!ipsec) + return; + /* search rx sa table */ for (i = 0; i < IXGBE_IPSEC_MAX_SA_COUNT && ipsec->num_rx_sa; i++) { if (!ipsec->rx_tbl[i].used)
[ Upstream commit ac70499ee97231a418dc1a4d6c9dc102e8f64631 ]
In some buggy scenarios we could possible attempt to transmit frames larger than maximum MSDU size. Since our devices don't know how to handle this, it may result in asserts, hangs etc. This can happen, for example, when we receive a large multicast frame and try to transmit it back to the air in AP mode. Since in a legal scenario this should never happen, drop such frames and warn about it.
Signed-off-by: Andrei Otcheretianski andrei.otcheretianski@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/tx.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c index 0c2aabc842f9..96f8d38ea321 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c @@ -726,6 +726,9 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct sk_buff *skb)
memcpy(&info, skb->cb, sizeof(info));
+ if (WARN_ON_ONCE(skb->len > IEEE80211_MAX_DATA_LEN + hdrlen)) + return -1; + if (WARN_ON_ONCE(info.flags & IEEE80211_TX_CTL_AMPDU)) return -1;
[ Upstream commit 0472301a28f6cf53a6bc5783e48a2d0bbff4682f ]
Merge commit 1c8c5a9d38f60 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next") undid the fix from commit 36f9814a494 ("bpf: fix uapi hole for 32 bit compat applications") by taking the gpl_compatible 1-bit field definition from commit b85fab0e67b162 ("bpf: Add gpl_compatible flag to struct bpf_prog_info") as is. That breaks architectures with 16-bit alignment like m68k. Add 31-bit pad after gpl_compatible to restore alignment of following fields.
Thanks to Dmitry V. Levin his analysis of this bug history.
Signed-off-by: Baruch Siach baruch@tkos.co.il Acked-by: Song Liu songliubraving@fb.com Cc: Jiri Olsa jolsa@kernel.org Cc: Daniel Borkmann daniel@iogearbox.net Cc: Geert Uytterhoeven geert@linux-m68k.org Cc: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/uapi/linux/bpf.h | 1 + tools/include/uapi/linux/bpf.h | 1 + 2 files changed, 2 insertions(+)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index a8b823c30b43..29a5bc3d5c66 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3143,6 +3143,7 @@ struct bpf_prog_info { char name[BPF_OBJ_NAME_LEN]; __u32 ifindex; __u32 gpl_compatible:1; + __u32 :31; /* alignment pad */ __u64 netns_dev; __u64 netns_ino; __u32 nr_jited_ksyms; diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index a8b823c30b43..29a5bc3d5c66 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -3143,6 +3143,7 @@ struct bpf_prog_info { char name[BPF_OBJ_NAME_LEN]; __u32 ifindex; __u32 gpl_compatible:1; + __u32 :31; /* alignment pad */ __u64 netns_dev; __u64 netns_ino; __u32 nr_jited_ksyms;
[ Upstream commit 5d1549847c76b1ffcf8e388ef4d0f229bdd1d7e8 ]
Since v5.1-rc1, some types of packets do not get unreachable reply with the following iptables setting. Fox example,
$ iptables -A INPUT -p icmp --icmp-type 8 -j REJECT $ ping 127.0.0.1 -c 1 PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data. — 127.0.0.1 ping statistics — 1 packets transmitted, 0 received, 100% packet loss, time 0ms
We should have got the following reply from command line, but we did not.
From 127.0.0.1 icmp_seq=1 Destination Port Unreachable
Yi Zhao reported it and narrowed it down to: 7fc38225363d ("netfilter: reject: skip csum verification for protocols that don't support it"),
This is because nf_ip_checksum still expects pseudo-header protocol type 0 for packets that are of neither TCP or UDP, and thus ICMP packets are mistakenly treated as TCP/UDP.
This patch corrects the conditions in nf_ip_checksum and all other places that still call it with protocol 0.
Fixes: 7fc38225363d ("netfilter: reject: skip csum verification for protocols that don't support it") Reported-by: Yi Zhao yi.zhao@windriver.com Signed-off-by: He Zhe zhe.he@windriver.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_conntrack_proto_icmp.c | 2 +- net/netfilter/nf_nat_proto.c | 2 +- net/netfilter/utils.c | 5 +++-- 3 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/net/netfilter/nf_conntrack_proto_icmp.c b/net/netfilter/nf_conntrack_proto_icmp.c index a824367ed518..dd53e2b20f6b 100644 --- a/net/netfilter/nf_conntrack_proto_icmp.c +++ b/net/netfilter/nf_conntrack_proto_icmp.c @@ -218,7 +218,7 @@ int nf_conntrack_icmpv4_error(struct nf_conn *tmpl, /* See ip_conntrack_proto_tcp.c */ if (state->net->ct.sysctl_checksum && state->hook == NF_INET_PRE_ROUTING && - nf_ip_checksum(skb, state->hook, dataoff, 0)) { + nf_ip_checksum(skb, state->hook, dataoff, IPPROTO_ICMP)) { icmp_error_log(skb, state, "bad hw icmp checksum"); return -NF_ACCEPT; } diff --git a/net/netfilter/nf_nat_proto.c b/net/netfilter/nf_nat_proto.c index 07da07788f6b..83a24cc5753b 100644 --- a/net/netfilter/nf_nat_proto.c +++ b/net/netfilter/nf_nat_proto.c @@ -564,7 +564,7 @@ int nf_nat_icmp_reply_translation(struct sk_buff *skb,
if (!skb_make_writable(skb, hdrlen + sizeof(*inside))) return 0; - if (nf_ip_checksum(skb, hooknum, hdrlen, 0)) + if (nf_ip_checksum(skb, hooknum, hdrlen, IPPROTO_ICMP)) return 0;
inside = (void *)skb->data + hdrlen; diff --git a/net/netfilter/utils.c b/net/netfilter/utils.c index 06dc55590441..51b454d8fa9c 100644 --- a/net/netfilter/utils.c +++ b/net/netfilter/utils.c @@ -17,7 +17,8 @@ __sum16 nf_ip_checksum(struct sk_buff *skb, unsigned int hook, case CHECKSUM_COMPLETE: if (hook != NF_INET_PRE_ROUTING && hook != NF_INET_LOCAL_IN) break; - if ((protocol == 0 && !csum_fold(skb->csum)) || + if ((protocol != IPPROTO_TCP && protocol != IPPROTO_UDP && + !csum_fold(skb->csum)) || !csum_tcpudp_magic(iph->saddr, iph->daddr, skb->len - dataoff, protocol, skb->csum)) { @@ -26,7 +27,7 @@ __sum16 nf_ip_checksum(struct sk_buff *skb, unsigned int hook, } /* fall through */ case CHECKSUM_NONE: - if (protocol == 0) + if (protocol != IPPROTO_TCP && protocol != IPPROTO_UDP) skb->csum = 0; else skb->csum = csum_tcpudp_nofold(iph->saddr, iph->daddr,
[ Upstream commit fc838c775f35e272e5cc7ef43853f0b55babbe37 ]
The driver should delay only in recording stop flow between writing to DBGC_IN_SAMPLE register and DBGC_OUT_CTRL register. Any other delay is not needed.
Change the following: 1. Remove any unnecessary delays in the flow 2. Increase the delay in the stop recording flow since 100 micro is not enough 3. Use usleep_range instead of delay since the driver is allowed to sleep in this flow.
Signed-off-by: Shahar S Matityahu shahar.s.matityahu@intel.com Fixes: 5cfe79c8d92a ("iwlwifi: fw: stop and start debugging using host command") Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/fw/dbg.c | 2 -- drivers/net/wireless/intel/iwlwifi/fw/dbg.h | 6 ++++-- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/fw/dbg.c b/drivers/net/wireless/intel/iwlwifi/fw/dbg.c index 33d7bc5500db..c875e173771c 100644 --- a/drivers/net/wireless/intel/iwlwifi/fw/dbg.c +++ b/drivers/net/wireless/intel/iwlwifi/fw/dbg.c @@ -2303,8 +2303,6 @@ void iwl_fw_dbg_collect_sync(struct iwl_fw_runtime *fwrt) /* start recording again if the firmware is not crashed */ if (!test_bit(STATUS_FW_ERROR, &fwrt->trans->status) && fwrt->fw->dbg.dest_tlv) { - /* wait before we collect the data till the DBGC stop */ - udelay(500); iwl_fw_dbg_restart_recording(fwrt, ¶ms); } } diff --git a/drivers/net/wireless/intel/iwlwifi/fw/dbg.h b/drivers/net/wireless/intel/iwlwifi/fw/dbg.h index fd0ad220e961..c5c015a66106 100644 --- a/drivers/net/wireless/intel/iwlwifi/fw/dbg.h +++ b/drivers/net/wireless/intel/iwlwifi/fw/dbg.h @@ -294,7 +294,10 @@ _iwl_fw_dbg_stop_recording(struct iwl_trans *trans, }
iwl_write_umac_prph(trans, DBGC_IN_SAMPLE, 0); - udelay(100); + /* wait for the DBGC to finish writing the internal buffer to DRAM to + * avoid halting the HW while writing + */ + usleep_range(700, 1000); iwl_write_umac_prph(trans, DBGC_OUT_CTRL, 0); #ifdef CONFIG_IWLWIFI_DEBUGFS trans->dbg_rec_on = false; @@ -324,7 +327,6 @@ _iwl_fw_dbg_restart_recording(struct iwl_trans *trans, iwl_set_bits_prph(trans, MON_BUFF_SAMPLE_CTL, 0x1); } else { iwl_write_umac_prph(trans, DBGC_IN_SAMPLE, params->in_sample); - udelay(100); iwl_write_umac_prph(trans, DBGC_OUT_CTRL, params->out_ctrl); } }
[ Upstream commit c20dc142dd7b2884b8570eeab323bcd4a84294fa ]
Some chips with older firmware can continue to perform DMA read from context memory even after the memory has been freed. In the PCI shutdown method, we need to call pci_disable_device() to shutdown DMA to prevent this DMA before we put the device into D3hot. DMA memory request in D3hot state will generate PCI fatal error. Similarly, in the driver remove method, the context memory should only be freed after DMA has been shutdown for correctness.
Fixes: 98f04cf0f1fc ("bnxt_en: Check context memory requirements from firmware.") Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index f758b2e0591f..b9bc829aa9da 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -10262,10 +10262,10 @@ static void bnxt_remove_one(struct pci_dev *pdev) bnxt_dcb_free(bp); kfree(bp->edev); bp->edev = NULL; + bnxt_cleanup_pci(bp); bnxt_free_ctx_mem(bp); kfree(bp->ctx); bp->ctx = NULL; - bnxt_cleanup_pci(bp); bnxt_free_port_stats(bp); free_netdev(dev); } @@ -10859,6 +10859,7 @@ static void bnxt_shutdown(struct pci_dev *pdev)
if (system_state == SYSTEM_POWER_OFF) { bnxt_clear_int_mode(bp); + pci_disable_device(pdev); pci_wake_from_d3(pdev, bp->wol); pci_set_power_state(pdev, PCI_D3hot); }
[ Upstream commit d77b1ad8e87dc5a6cd0d9158b097a4817946ca3b ]
The current logic assumes that the RDMA driver uses one statistics context adjacent to the ones used by the network driver. This assumption is not true and the statistics context used by the RDMA driver is tied to its MSIX base vector. This wrong assumption can cause RDMA driver failure after changing ethtool rings on the network side. Fix the statistics reservation logic accordingly.
Fixes: 780baad44f0f ("bnxt_en: Reserve 1 stat_ctx for RDMA driver.") Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index b9bc829aa9da..9090c79387c1 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -5508,7 +5508,16 @@ static int bnxt_cp_rings_in_use(struct bnxt *bp)
static int bnxt_get_func_stat_ctxs(struct bnxt *bp) { - return bp->cp_nr_rings + bnxt_get_ulp_stat_ctxs(bp); + int ulp_stat = bnxt_get_ulp_stat_ctxs(bp); + int cp = bp->cp_nr_rings; + + if (!ulp_stat) + return cp; + + if (bnxt_nq_rings_in_use(bp) > cp + bnxt_get_ulp_msix_num(bp)) + return bnxt_get_ulp_msix_base(bp) + ulp_stat; + + return cp + ulp_stat; }
static bool bnxt_need_reserve_rings(struct bnxt *bp) @@ -7477,11 +7486,7 @@ unsigned int bnxt_get_avail_cp_rings_for_en(struct bnxt *bp)
unsigned int bnxt_get_avail_stat_ctxs_for_en(struct bnxt *bp) { - unsigned int stat; - - stat = bnxt_get_max_func_stat_ctxs(bp) - bnxt_get_ulp_stat_ctxs(bp); - stat -= bp->cp_nr_rings; - return stat; + return bnxt_get_max_func_stat_ctxs(bp) - bnxt_get_func_stat_ctxs(bp); }
int bnxt_get_avail_msix(struct bnxt *bp, int num)
[ Upstream commit 1dbc59fa4bbaa108b641cd65a54f662b75e4ed36 ]
In an earlier commit to improve NQ reservations on 57500 chips, we set the resv_irqs on the 57500 VFs to the fixed value assigned by the PF regardless of how many are actually used. The current code assumes that resv_irqs minus the ones used by the network driver must be the ones for the RDMA driver. This is no longer true and we may return more MSIX vectors than requested, causing inconsistency. Fix it by capping the value.
Fixes: 01989c6b69d9 ("bnxt_en: Improve NQ reservations.") Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c index bfa342a98d08..fc77caf0a076 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c @@ -157,8 +157,10 @@ static int bnxt_req_msix_vecs(struct bnxt_en_dev *edev, int ulp_id,
if (BNXT_NEW_RM(bp)) { struct bnxt_hw_resc *hw_resc = &bp->hw_resc; + int resv_msix;
- avail_msix = hw_resc->resv_irqs - bp->cp_nr_rings; + resv_msix = hw_resc->resv_irqs - bp->cp_nr_rings; + avail_msix = min_t(int, resv_msix, avail_msix); edev->ulp_tbl[ulp_id].msix_requested = avail_msix; } bnxt_fill_msix_vecs(bp, ent);
[ Upstream commit 7c2b3629d09ddec810dc4c1d3a6657c32def8f71 ]
To save power, the hda hdmi driver in ASoC invokes snd_hdac_ext_bus_link_put to disable CORB/RIRB buffers DMA if there is no user of bus and invokes snd_hdac_ext_bus_link_get to set up CORB/RIRB buffers when it is used. Unsolicited responses is disabled in snd_hdac_bus_stop_cmd_io called by snd_hdac_ext_bus_link_put , but it is not enabled in snd_hdac_bus_init_cmd_io called by snd_hdac_ext_bus_link_get. So for put-get sequence, Unsolicited responses is disabled and headphone can't be detected by hda codecs.
Now unsolicited responses is only enabled in snd_hdac_bus_reset_link which resets controller. The function is only called for setup of controller. This patch enables Unsolicited responses after RIRB is initialized in snd_hdac_bus_init_cmd_io which works together with snd_hdac_bus_reset_link to set up controller.
Tested legacy hda driver and SOF driver on intel whiskeylake.
Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Rander Wang rander.wang@linux.intel.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/hda/hdac_controller.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c index b02f74528b66..812dc144fb5b 100644 --- a/sound/hda/hdac_controller.c +++ b/sound/hda/hdac_controller.c @@ -79,6 +79,8 @@ void snd_hdac_bus_init_cmd_io(struct hdac_bus *bus) snd_hdac_chip_writew(bus, RINTCNT, 1); /* enable rirb dma and response irq */ snd_hdac_chip_writeb(bus, RIRBCTL, AZX_RBCTL_DMA_EN | AZX_RBCTL_IRQ_EN); + /* Accept unsolicited responses */ + snd_hdac_chip_updatel(bus, GCTL, AZX_GCTL_UNSOL, AZX_GCTL_UNSOL); spin_unlock_irq(&bus->reg_lock); } EXPORT_SYMBOL_GPL(snd_hdac_bus_init_cmd_io); @@ -415,9 +417,6 @@ int snd_hdac_bus_reset_link(struct hdac_bus *bus, bool full_reset) return -EBUSY; }
- /* Accept unsolicited responses */ - snd_hdac_chip_updatel(bus, GCTL, AZX_GCTL_UNSOL, AZX_GCTL_UNSOL); - /* detect codecs */ if (!bus->codec_mask) { bus->codec_mask = snd_hdac_chip_readw(bus, STATESTS);
[ Upstream commit 145c407c808352acd625be793396fd4f33c794f8 ]
After setting up metric groups through the event parser, the metricgroup code looks them up again in the event list.
Make sure we only look up events that haven't been used by some other metric. The data structures currently cannot handle more than one metric per event. This avoids problems with multiple events partially overlapping.
Signed-off-by: Andi Kleen ak@linux.intel.com Acked-by: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Link: http://lkml.kernel.org/r/20190624193711.35241-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/stat-shadow.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c index 83d8094be4fe..e545e2a8ae71 100644 --- a/tools/perf/util/stat-shadow.c +++ b/tools/perf/util/stat-shadow.c @@ -303,7 +303,7 @@ static struct perf_evsel *perf_stat__find_event(struct perf_evlist *evsel_list, struct perf_evsel *c2;
evlist__for_each_entry (evsel_list, c2) { - if (!strcasecmp(c2->name, name)) + if (!strcasecmp(c2->name, name) && !c2->collect_stat) return c2; } return NULL; @@ -342,7 +342,8 @@ void perf_stat__collect_metric_expr(struct perf_evlist *evsel_list) if (leader) { /* Search in group */ for_each_group_member (oc, leader) { - if (!strcasecmp(oc->name, metric_names[i])) { + if (!strcasecmp(oc->name, metric_names[i]) && + !oc->collect_stat) { found = true; break; }
[ Upstream commit e3a9427323a53ceee540276a74af7706f350d052 ]
Since Fixes: 8c5421c016a4 ("perf pmu: Display pmu name when printing unmerged events in stat") using --no-merge adds the PMU name to the evsel name.
This breaks the metric value lookup because the parser doesn't know about this.
Remove the extra postfixes for the metric evaluation.
Signed-off-by: Andi Kleen ak@linux.intel.com Acked-by: Jiri Olsa jolsa@kernel.org Cc: Agustin Vega-Frias agustinv@codeaurora.org Cc: Kan Liang kan.liang@linux.intel.com Fixes: 8c5421c016a4 ("perf pmu: Display pmu name when printing unmerged events in stat") Link: http://lkml.kernel.org/r/20190624193711.35241-5-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/stat-shadow.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c index e545e2a8ae71..0ef98e991ade 100644 --- a/tools/perf/util/stat-shadow.c +++ b/tools/perf/util/stat-shadow.c @@ -723,6 +723,7 @@ static void generic_metric(struct perf_stat_config *config, double ratio; int i; void *ctxp = out->ctx; + char *n, *pn;
expr__ctx_init(&pctx); expr__add_id(&pctx, name, avg); @@ -742,7 +743,19 @@ static void generic_metric(struct perf_stat_config *config, stats = &v->stats; scale = 1.0; } - expr__add_id(&pctx, metric_events[i]->name, avg_stats(stats)*scale); + + n = strdup(metric_events[i]->name); + if (!n) + return; + /* + * This display code with --no-merge adds [cpu] postfixes. + * These are not supported by the parser. Remove everything + * after the space. + */ + pn = strchr(n, ' '); + if (pn) + *pn = 0; + expr__add_id(&pctx, n, avg_stats(stats)*scale); } if (!metric_events[i]) { const char *p = metric_expr; @@ -759,6 +772,9 @@ static void generic_metric(struct perf_stat_config *config, (metric_name ? metric_name : name) : "", 0); } else print_metric(config, ctxp, NULL, NULL, "", 0); + + for (i = 1; i < pctx.num_ids; i++) + free((void *)pctx.ids[i].name); }
void perf_stat__print_shadow_stats(struct perf_stat_config *config,
[ Upstream commit 6c5f4e5cb35b4694dc035d91092d23f596ecd06a ]
Event merging is mainly to collapse similar events in lots of different duplicated PMUs.
It can break metric displaying. It's possible for two metrics to have the same event, and when the two events happen in a row the second wouldn't be displayed. This would also not show the second metric.
To avoid this don't merge events in the same PMU. This makes sense, if we have multiple events in the same PMU there is likely some reason for it (e.g. using multiple groups) and we better not merge them.
While in theory it would be possible to construct metrics that have events with the same name in different PMU no current metrics have this problem.
This is the fix for perf stat -M UPI,IPC (needs also another bug fix to completely work)
Signed-off-by: Andi Kleen ak@linux.intel.com Acked-by: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Fixes: 430daf2dc7af ("perf stat: Collapse identically named events") Link: http://lkml.kernel.org/r/20190624193711.35241-3-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/stat-display.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c index 4c53bae5644b..94bed4031def 100644 --- a/tools/perf/util/stat-display.c +++ b/tools/perf/util/stat-display.c @@ -542,7 +542,8 @@ static void collect_all_aliases(struct perf_stat_config *config, struct perf_evs alias->scale != counter->scale || alias->cgrp != counter->cgrp || strcmp(alias->unit, counter->unit) || - perf_evsel__is_clock(alias) != perf_evsel__is_clock(counter)) + perf_evsel__is_clock(alias) != perf_evsel__is_clock(counter) || + !strcmp(alias->pmu_name, counter->pmu_name)) break; alias->merged_stat = true; cb(config, alias, data, false);
[ Upstream commit 2f87f33f4226523df9c9cc28f9874ea02fcc3d3f ]
The metric group code tries to find a group it added earlier in the evlist. Fix the lookup to handle groups with partially overlaps correctly. When a sub string match fails and we reset the match, we have to compare the first element again.
I also renamed the find_evsel function to find_evsel_group to make its purpose clearer.
With the earlier changes this fixes:
Before:
% perf stat -M UPI,IPC sleep 1 ... 1,032,922 uops_retired.retire_slots # 1.1 UPI 1,896,096 inst_retired.any 1,896,096 inst_retired.any 1,177,254 cpu_clk_unhalted.thread
After:
% perf stat -M UPI,IPC sleep 1 ... 1,013,193 uops_retired.retire_slots # 1.1 UPI 932,033 inst_retired.any 932,033 inst_retired.any # 0.9 IPC 1,091,245 cpu_clk_unhalted.thread
Signed-off-by: Andi Kleen ak@linux.intel.com Acked-by: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Fixes: b18f3e365019 ("perf stat: Support JSON metrics in perf stat") Link: http://lkml.kernel.org/r/20190624193711.35241-4-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/metricgroup.c | 47 ++++++++++++++++++++++++++--------- 1 file changed, 35 insertions(+), 12 deletions(-)
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c index 699e020737d9..fabdb6dde88e 100644 --- a/tools/perf/util/metricgroup.c +++ b/tools/perf/util/metricgroup.c @@ -85,26 +85,49 @@ struct egroup { const char *metric_expr; };
-static struct perf_evsel *find_evsel(struct perf_evlist *perf_evlist, - const char **ids, - int idnum, - struct perf_evsel **metric_events) +static bool record_evsel(int *ind, struct perf_evsel **start, + int idnum, + struct perf_evsel **metric_events, + struct perf_evsel *ev) +{ + metric_events[*ind] = ev; + if (*ind == 0) + *start = ev; + if (++*ind == idnum) { + metric_events[*ind] = NULL; + return true; + } + return false; +} + +static struct perf_evsel *find_evsel_group(struct perf_evlist *perf_evlist, + const char **ids, + int idnum, + struct perf_evsel **metric_events) { struct perf_evsel *ev, *start = NULL; int ind = 0;
evlist__for_each_entry (perf_evlist, ev) { + if (ev->collect_stat) + continue; if (!strcmp(ev->name, ids[ind])) { - metric_events[ind] = ev; - if (ind == 0) - start = ev; - if (++ind == idnum) { - metric_events[ind] = NULL; + if (record_evsel(&ind, &start, idnum, + metric_events, ev)) return start; - } } else { + /* + * We saw some other event that is not + * in our list of events. Discard + * the whole match and start again. + */ ind = 0; start = NULL; + if (!strcmp(ev->name, ids[ind])) { + if (record_evsel(&ind, &start, idnum, + metric_events, ev)) + return start; + } } } /* @@ -134,8 +157,8 @@ static int metricgroup__setup_events(struct list_head *groups, ret = -ENOMEM; break; } - evsel = find_evsel(perf_evlist, eg->ids, eg->idnum, - metric_events); + evsel = find_evsel_group(perf_evlist, eg->ids, eg->idnum, + metric_events); if (!evsel) { pr_debug("Cannot resolve %s: %s\n", eg->metric_name, eg->metric_expr);
[ Upstream commit 7c31e54aeee517d1318dfc0bde9fa7de75893dc6 ]
__vxlan_dev_create() destroys FDB using specific pointer which indicates a fdb when error occurs. But that pointer should not be used when register_netdevice() fails because register_netdevice() internally destroys fdb when error occurs.
This patch makes vxlan_fdb_create() to do not link fdb entry to vxlan dev internally. Instead, a new function vxlan_fdb_insert() is added to link fdb to vxlan dev.
vxlan_fdb_insert() is called after calling register_netdevice(). This routine can avoid situation that ->ndo_uninit() destroys fdb entry in error path of register_netdevice(). Hence, error path of __vxlan_dev_create() routine can have an opportunity to destroy default fdb entry by hand.
Test command ip link add bonding_masters type vxlan id 0 group 239.1.1.1 \ dev enp0s9 dstport 4789
Splat looks like: [ 213.392816] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 213.401257] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 213.402178] CPU: 0 PID: 1414 Comm: ip Not tainted 5.2.0-rc5+ #256 [ 213.402178] RIP: 0010:vxlan_fdb_destroy+0x120/0x220 [vxlan] [ 213.402178] Code: df 48 8b 2b 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 06 01 00 00 4c 8b 63 08 48 b8 00 00 00 00 00 fc d [ 213.402178] RSP: 0018:ffff88810cb9f0a0 EFLAGS: 00010202 [ 213.402178] RAX: dffffc0000000000 RBX: ffff888101d4a8c8 RCX: 0000000000000000 [ 213.402178] RDX: 1bd5a00000000040 RSI: ffff888101d4a8c8 RDI: ffff888101d4a8d0 [ 213.402178] RBP: 0000000000000000 R08: fffffbfff22b72d9 R09: 0000000000000000 [ 213.402178] R10: 00000000ffffffef R11: 0000000000000000 R12: dead000000000200 [ 213.402178] R13: ffff88810cb9f1f8 R14: ffff88810efccda0 R15: ffff88810efccda0 [ 213.402178] FS: 00007f7f6621a0c0(0000) GS:ffff88811b000000(0000) knlGS:0000000000000000 [ 213.402178] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 213.402178] CR2: 000055746f0807d0 CR3: 00000001123e0000 CR4: 00000000001006f0 [ 213.402178] Call Trace: [ 213.402178] __vxlan_dev_create+0x3a9/0x7d0 [vxlan] [ 213.402178] ? vxlan_changelink+0x740/0x740 [vxlan] [ 213.402178] ? rcu_read_unlock+0x60/0x60 [vxlan] [ 213.402178] ? __kasan_kmalloc.constprop.3+0xa0/0xd0 [ 213.402178] vxlan_newlink+0x8d/0xc0 [vxlan] [ 213.402178] ? __vxlan_dev_create+0x7d0/0x7d0 [vxlan] [ 213.554119] ? __netlink_ns_capable+0xc3/0xf0 [ 213.554119] __rtnl_newlink+0xb75/0x1180 [ 213.554119] ? rtnl_link_unregister+0x230/0x230 [ ... ]
Fixes: 0241b836732f ("vxlan: fix default fdb entry netlink notify ordering during netdev create") Suggested-by: Roopa Prabhu roopa@cumulusnetworks.com Signed-off-by: Taehee Yoo ap420073@gmail.com Acked-by: Roopa Prabhu roopa@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/vxlan.c | 37 +++++++++++++++++++++++++++---------- 1 file changed, 27 insertions(+), 10 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index 083f3f0bf37f..b4283f52a09d 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -804,6 +804,14 @@ static struct vxlan_fdb *vxlan_fdb_alloc(struct vxlan_dev *vxlan, return f; }
+static void vxlan_fdb_insert(struct vxlan_dev *vxlan, const u8 *mac, + __be32 src_vni, struct vxlan_fdb *f) +{ + ++vxlan->addrcnt; + hlist_add_head_rcu(&f->hlist, + vxlan_fdb_head(vxlan, mac, src_vni)); +} + static int vxlan_fdb_create(struct vxlan_dev *vxlan, const u8 *mac, union vxlan_addr *ip, __u16 state, __be16 port, __be32 src_vni, @@ -829,18 +837,13 @@ static int vxlan_fdb_create(struct vxlan_dev *vxlan, return rc; }
- ++vxlan->addrcnt; - hlist_add_head_rcu(&f->hlist, - vxlan_fdb_head(vxlan, mac, src_vni)); - *fdb = f;
return 0; }
-static void vxlan_fdb_free(struct rcu_head *head) +static void __vxlan_fdb_free(struct vxlan_fdb *f) { - struct vxlan_fdb *f = container_of(head, struct vxlan_fdb, rcu); struct vxlan_rdst *rd, *nd;
list_for_each_entry_safe(rd, nd, &f->remotes, list) { @@ -850,6 +853,13 @@ static void vxlan_fdb_free(struct rcu_head *head) kfree(f); }
+static void vxlan_fdb_free(struct rcu_head *head) +{ + struct vxlan_fdb *f = container_of(head, struct vxlan_fdb, rcu); + + __vxlan_fdb_free(f); +} + static void vxlan_fdb_destroy(struct vxlan_dev *vxlan, struct vxlan_fdb *f, bool do_notify, bool swdev_notify) { @@ -977,6 +987,7 @@ static int vxlan_fdb_update_create(struct vxlan_dev *vxlan, if (rc < 0) return rc;
+ vxlan_fdb_insert(vxlan, mac, src_vni, f); rc = vxlan_fdb_notify(vxlan, f, first_remote_rtnl(f), RTM_NEWNEIGH, swdev_notify, extack); if (rc) @@ -3571,12 +3582,17 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev, if (err) goto errout;
- /* notify default fdb entry */ if (f) { + vxlan_fdb_insert(vxlan, all_zeros_mac, + vxlan->default_dst.remote_vni, f); + + /* notify default fdb entry */ err = vxlan_fdb_notify(vxlan, f, first_remote_rtnl(f), RTM_NEWNEIGH, true, extack); - if (err) - goto errout; + if (err) { + vxlan_fdb_destroy(vxlan, f, false, false); + goto unregister; + } }
list_add(&vxlan->next, &vn->vxlan_list); @@ -3588,7 +3604,8 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev, * destroy the entry by hand here. */ if (f) - vxlan_fdb_destroy(vxlan, f, false, false); + __vxlan_fdb_free(f); +unregister: if (unregister) unregister_netdevice(dev); return err;
[ Upstream commit 3c91f25c2f72ba6001775a5932857c1d2131c531 ]
Currently bnx2x ptp worker tries to read a register with timestamp information in case of TX packet timestamping and in case it fails, the routine reschedules itself indefinitely. This was reported as a kworker always at 100% of CPU usage, which was narrowed down to be bnx2x ptp_task.
By following the ioctl handler, we could narrow down the problem to an NTP tool (chrony) requesting HW timestamping from bnx2x NIC with RX filter zeroed; this isn't reproducible for example with ptp4l (from linuxptp) since this tool requests a supported RX filter. It seems NIC FW timestamp mechanism cannot work well with RX_FILTER_NONE - driver's PTP filter init routine skips a register write to the adapter if there's not a supported filter request.
This patch addresses the problem of bnx2x ptp thread's everlasting reschedule by retrying the register read 10 times; between the read attempts the thread sleeps for an increasing amount of time starting in 1ms to give FW some time to perform the timestamping. If it still fails after all retries, we bail out in order to prevent an unbound resource consumption from bnx2x.
The patch also adds an ethtool statistic for accounting the skipped TX timestamp packets and it reduces the priority of timestamping error messages to prevent log flooding. The code was tested using both linuxptp and chrony.
Reported-and-tested-by: Przemyslaw Hausman przemyslaw.hausman@canonical.com Suggested-by: Sudarsana Reddy Kalluru skalluru@marvell.com Signed-off-by: Guilherme G. Piccoli gpiccoli@canonical.com Acked-by: Sudarsana Reddy Kalluru skalluru@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 5 ++- .../ethernet/broadcom/bnx2x/bnx2x_ethtool.c | 4 ++- .../net/ethernet/broadcom/bnx2x/bnx2x_main.c | 33 ++++++++++++++----- .../net/ethernet/broadcom/bnx2x/bnx2x_stats.h | 3 ++ 4 files changed, 34 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c index 008ad0ca89ba..c12c1bab0fe4 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c @@ -3857,9 +3857,12 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) { if (!(bp->flags & TX_TIMESTAMPING_EN)) { + bp->eth_stats.ptp_skip_tx_ts++; BNX2X_ERR("Tx timestamping was not enabled, this packet will not be timestamped\n"); } else if (bp->ptp_tx_skb) { - BNX2X_ERR("The device supports only a single outstanding packet to timestamp, this packet will not be timestamped\n"); + bp->eth_stats.ptp_skip_tx_ts++; + netdev_err_once(bp->dev, + "Device supports only a single outstanding packet to timestamp, this packet won't be timestamped\n"); } else { skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS; /* schedule check for Tx timestamp */ diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c index 51fc845de31a..4a0ba6801c9e 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c @@ -182,7 +182,9 @@ static const struct { { STATS_OFFSET32(driver_filtered_tx_pkt), 4, false, "driver_filtered_tx_pkt" }, { STATS_OFFSET32(eee_tx_lpi), - 4, true, "Tx LPI entry count"} + 4, true, "Tx LPI entry count"}, + { STATS_OFFSET32(ptp_skip_tx_ts), + 4, false, "ptp_skipped_tx_tstamp" }, };
#define BNX2X_NUM_STATS ARRAY_SIZE(bnx2x_stats_arr) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c index 03ac10b1cd1e..2cc14db8f0ec 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c @@ -15214,11 +15214,24 @@ static void bnx2x_ptp_task(struct work_struct *work) u32 val_seq; u64 timestamp, ns; struct skb_shared_hwtstamps shhwtstamps; + bool bail = true; + int i; + + /* FW may take a while to complete timestamping; try a bit and if it's + * still not complete, may indicate an error state - bail out then. + */ + for (i = 0; i < 10; i++) { + /* Read Tx timestamp registers */ + val_seq = REG_RD(bp, port ? NIG_REG_P1_TLLH_PTP_BUF_SEQID : + NIG_REG_P0_TLLH_PTP_BUF_SEQID); + if (val_seq & 0x10000) { + bail = false; + break; + } + msleep(1 << i); + }
- /* Read Tx timestamp registers */ - val_seq = REG_RD(bp, port ? NIG_REG_P1_TLLH_PTP_BUF_SEQID : - NIG_REG_P0_TLLH_PTP_BUF_SEQID); - if (val_seq & 0x10000) { + if (!bail) { /* There is a valid timestamp value */ timestamp = REG_RD(bp, port ? NIG_REG_P1_TLLH_PTP_BUF_TS_MSB : NIG_REG_P0_TLLH_PTP_BUF_TS_MSB); @@ -15233,16 +15246,18 @@ static void bnx2x_ptp_task(struct work_struct *work) memset(&shhwtstamps, 0, sizeof(shhwtstamps)); shhwtstamps.hwtstamp = ns_to_ktime(ns); skb_tstamp_tx(bp->ptp_tx_skb, &shhwtstamps); - dev_kfree_skb_any(bp->ptp_tx_skb); - bp->ptp_tx_skb = NULL;
DP(BNX2X_MSG_PTP, "Tx timestamp, timestamp cycles = %llu, ns = %llu\n", timestamp, ns); } else { - DP(BNX2X_MSG_PTP, "There is no valid Tx timestamp yet\n"); - /* Reschedule to keep checking for a valid timestamp value */ - schedule_work(&bp->ptp_task); + DP(BNX2X_MSG_PTP, + "Tx timestamp is not recorded (register read=%u)\n", + val_seq); + bp->eth_stats.ptp_skip_tx_ts++; } + + dev_kfree_skb_any(bp->ptp_tx_skb); + bp->ptp_tx_skb = NULL; }
void bnx2x_set_rx_ts(struct bnx2x *bp, struct sk_buff *skb) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h index b2644ed13d06..d55e63692cf3 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h @@ -207,6 +207,9 @@ struct bnx2x_eth_stats { u32 driver_filtered_tx_pkt; /* src: Clear-on-Read register; Will not survive PMF Migration */ u32 eee_tx_lpi; + + /* PTP */ + u32 ptp_skip_tx_ts; };
struct bnx2x_eth_q_stats {
[ Upstream commit 78226f6eaac80bf30256a33a4926c194ceefdf36 ]
This is for fixing bug KMSAN: uninit-value in ax88772_bind
Tested by https://groups.google.com/d/msg/syzkaller-bugs/aFQurGotng4/eB_HlNhhCwAJ
Reported-by: syzbot+8a3fc6674bbc3978ed4e@syzkaller.appspotmail.com
syzbot found the following crash on:
HEAD commit: f75e4cfe kmsan: use kmsan_handle_urb() in urb.c git tree: kmsan console output: https://syzkaller.appspot.com/x/log.txt?x=136d720ea00000 kernel config: https://syzkaller.appspot.com/x/.config?x=602468164ccdc30a dashboard link: https://syzkaller.appspot.com/bug?extid=8a3fc6674bbc3978ed4e compiler: clang version 9.0.0 (/home/glider/llvm/clang 06d00afa61eef8f7f501ebdb4e8612ea43ec2d78) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12788316a00000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=120359aaa00000
================================================================== BUG: KMSAN: uninit-value in is_valid_ether_addr include/linux/etherdevice.h:200 [inline] BUG: KMSAN: uninit-value in asix_set_netdev_dev_addr drivers/net/usb/asix_devices.c:73 [inline] BUG: KMSAN: uninit-value in ax88772_bind+0x93d/0x11e0 drivers/net/usb/asix_devices.c:724 CPU: 0 PID: 3348 Comm: kworker/0:2 Not tainted 5.1.0+ #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: usb_hub_wq hub_event Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x191/0x1f0 lib/dump_stack.c:113 kmsan_report+0x130/0x2a0 mm/kmsan/kmsan.c:622 __msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:310 is_valid_ether_addr include/linux/etherdevice.h:200 [inline] asix_set_netdev_dev_addr drivers/net/usb/asix_devices.c:73 [inline] ax88772_bind+0x93d/0x11e0 drivers/net/usb/asix_devices.c:724 usbnet_probe+0x10f5/0x3940 drivers/net/usb/usbnet.c:1728 usb_probe_interface+0xd66/0x1320 drivers/usb/core/driver.c:361 really_probe+0xdae/0x1d80 drivers/base/dd.c:513 driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671 __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778 bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454 __device_attach+0x454/0x730 drivers/base/dd.c:844 device_initial_probe+0x4a/0x60 drivers/base/dd.c:891 bus_probe_device+0x137/0x390 drivers/base/bus.c:514 device_add+0x288d/0x30e0 drivers/base/core.c:2106 usb_set_configuration+0x30dc/0x3750 drivers/usb/core/message.c:2027 generic_probe+0xe7/0x280 drivers/usb/core/generic.c:210 usb_probe_device+0x14c/0x200 drivers/usb/core/driver.c:266 really_probe+0xdae/0x1d80 drivers/base/dd.c:513 driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671 __device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778 bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454 __device_attach+0x454/0x730 drivers/base/dd.c:844 device_initial_probe+0x4a/0x60 drivers/base/dd.c:891 bus_probe_device+0x137/0x390 drivers/base/bus.c:514 device_add+0x288d/0x30e0 drivers/base/core.c:2106 usb_new_device+0x23e5/0x2ff0 drivers/usb/core/hub.c:2534 hub_port_connect drivers/usb/core/hub.c:5089 [inline] hub_port_connect_change drivers/usb/core/hub.c:5204 [inline] port_event drivers/usb/core/hub.c:5350 [inline] hub_event+0x48d1/0x7290 drivers/usb/core/hub.c:5432 process_one_work+0x1572/0x1f00 kernel/workqueue.c:2269 process_scheduled_works kernel/workqueue.c:2331 [inline] worker_thread+0x189c/0x2460 kernel/workqueue.c:2417 kthread+0x4b5/0x4f0 kernel/kthread.c:254 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:355
Signed-off-by: Phong Tran tranmanphong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/asix_devices.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c index c9bc96310ed4..ef548beba684 100644 --- a/drivers/net/usb/asix_devices.c +++ b/drivers/net/usb/asix_devices.c @@ -226,7 +226,7 @@ static void asix_phy_reset(struct usbnet *dev, unsigned int reset_bits) static int ax88172_bind(struct usbnet *dev, struct usb_interface *intf) { int ret = 0; - u8 buf[ETH_ALEN]; + u8 buf[ETH_ALEN] = {0}; int i; unsigned long gpio_bits = dev->driver_info->data;
@@ -677,7 +677,7 @@ static int asix_resume(struct usb_interface *intf) static int ax88772_bind(struct usbnet *dev, struct usb_interface *intf) { int ret, i; - u8 buf[ETH_ALEN], chipcode = 0; + u8 buf[ETH_ALEN] = {0}, chipcode = 0; u32 phyid; struct asix_common_private *priv;
@@ -1061,7 +1061,7 @@ static const struct net_device_ops ax88178_netdev_ops = { static int ax88178_bind(struct usbnet *dev, struct usb_interface *intf) { int ret; - u8 buf[ETH_ALEN]; + u8 buf[ETH_ALEN] = {0};
usbnet_get_endpoints(dev,intf);
[ Upstream commit 99f0eae653b2db64917d0b58099eb51e300b311d ]
If the rxrpc_eproto tracepoint is enabled, an oops will be cause by the trace line that rxrpc_extract_header() tries to emit when a protocol error occurs (typically because the packet is short) because the call argument is NULL.
Fix this by using ?: to assume 0 as the debug_id if call is NULL.
This can then be induced by:
echo -e '\0\0\0\0\0\0\0\0' | ncat -4u --send-only <addr> 20001
where addr has the following program running on it:
#include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <sys/socket.h> #include <arpa/inet.h> #include <linux/rxrpc.h> int main(void) { struct sockaddr_rxrpc srx; int fd; memset(&srx, 0, sizeof(srx)); srx.srx_family = AF_RXRPC; srx.srx_service = 0; srx.transport_type = AF_INET; srx.transport_len = sizeof(srx.transport.sin); srx.transport.sin.sin_family = AF_INET; srx.transport.sin.sin_port = htons(0x4e21); fd = socket(AF_RXRPC, SOCK_DGRAM, AF_INET6); bind(fd, (struct sockaddr *)&srx, sizeof(srx)); sleep(20); return 0; }
It results in the following oops.
BUG: kernel NULL pointer dereference, address: 0000000000000340 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page ... RIP: 0010:trace_event_raw_event_rxrpc_rx_eproto+0x47/0xac ... Call Trace: <IRQ> rxrpc_extract_header+0x86/0x171 ? rcu_read_lock_sched_held+0x5d/0x63 ? rxrpc_new_skb+0xd4/0x109 rxrpc_input_packet+0xef/0x14fc ? rxrpc_input_data+0x986/0x986 udp_queue_rcv_one_skb+0xbf/0x3d0 udp_unicast_rcv_skb.isra.8+0x64/0x71 ip_protocol_deliver_rcu+0xe4/0x1b4 ip_local_deliver+0xf0/0x154 __netif_receive_skb_one_core+0x50/0x6c netif_receive_skb_internal+0x26b/0x2e9 napi_gro_receive+0xf8/0x1da rtl8169_poll+0x303/0x4c4 net_rx_action+0x10e/0x333 __do_softirq+0x1a5/0x38f irq_exit+0x54/0xc4 do_IRQ+0xda/0xf8 common_interrupt+0xf/0xf </IRQ> ... ? cpuidle_enter_state+0x23c/0x34d cpuidle_enter+0x2a/0x36 do_idle+0x163/0x1ea cpu_startup_entry+0x1d/0x1f start_secondary+0x157/0x172 secondary_startup_64+0xa4/0xb0
Fixes: a25e21f0bcd2 ("rxrpc, afs: Use debug_ids rather than pointers in traces") Signed-off-by: David Howells dhowells@redhat.com Reviewed-by: Marc Dionne marc.dionne@auristor.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/trace/events/rxrpc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h index d85816878a52..cc1d060cbf13 100644 --- a/include/trace/events/rxrpc.h +++ b/include/trace/events/rxrpc.h @@ -1379,7 +1379,7 @@ TRACE_EVENT(rxrpc_rx_eproto, ),
TP_fast_assign( - __entry->call = call->debug_id; + __entry->call = call ? call->debug_id : 0; __entry->serial = serial; __entry->why = why; ),
[ Upstream commit cdfc7f888c2a355b01308e97c6df108f1c2b64e8 ]
GCC8 started emitting warning about using strncpy with number of bytes exactly equal destination size, which is generally unsafe, as can lead to non-zero terminated string being copied. Use IFNAMSIZ - 1 as number of bytes to ensure name is always zero-terminated.
Signed-off-by: Andrii Nakryiko andriin@fb.com Cc: Magnus Karlsson magnus.karlsson@intel.com Acked-by: Yonghong Song yhs@fb.com Acked-by: Magnus Karlsson magnus.karlsson@intel.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/xsk.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c index 38667b62f1fe..8a7a05bc657d 100644 --- a/tools/lib/bpf/xsk.c +++ b/tools/lib/bpf/xsk.c @@ -337,7 +337,8 @@ static int xsk_get_max_queues(struct xsk_socket *xsk)
channels.cmd = ETHTOOL_GCHANNELS; ifr.ifr_data = (void *)&channels; - strncpy(ifr.ifr_name, xsk->ifname, IFNAMSIZ); + strncpy(ifr.ifr_name, xsk->ifname, IFNAMSIZ - 1); + ifr.ifr_name[IFNAMSIZ - 1] = '\0'; err = ioctl(fd, SIOCETHTOOL, &ifr); if (err && errno != EOPNOTSUPP) { ret = -errno;
[ Upstream commit 33bae185f74d49a0d7b1bfaafb8e959efce0f243 ]
Based on the following report from Smatch, fix the potential NULL pointer dereference check:
tools/lib/bpf/libbpf.c:3493 bpf_prog_load_xattr() warn: variable dereferenced before check 'attr' (see line 3483)
3479 int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr, 3480 struct bpf_object **pobj, int *prog_fd) 3481 { 3482 struct bpf_object_open_attr open_attr = { 3483 .file = attr->file, 3484 .prog_type = attr->prog_type, ^^^^^^ 3485 };
At the head of function, it directly access 'attr' without checking if it's NULL pointer. This patch moves the values assignment after validating 'attr' and 'attr->file'.
Signed-off-by: Leo Yan leo.yan@linaro.org Acked-by: Yonghong Song yhs@fb.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/libbpf.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 151f7ac1882e..3865a5d27251 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -3487,10 +3487,7 @@ int bpf_prog_load(const char *file, enum bpf_prog_type type, int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr, struct bpf_object **pobj, int *prog_fd) { - struct bpf_object_open_attr open_attr = { - .file = attr->file, - .prog_type = attr->prog_type, - }; + struct bpf_object_open_attr open_attr = {}; struct bpf_program *prog, *first_prog = NULL; enum bpf_attach_type expected_attach_type; enum bpf_prog_type prog_type; @@ -3503,6 +3500,9 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr, if (!attr->file) return -EINVAL;
+ open_attr.file = attr->file; + open_attr.prog_type = attr->prog_type; + obj = bpf_object__open_xattr(&open_attr); if (IS_ERR_OR_NULL(obj)) return -ENOENT;
[ Upstream commit 11aca65ec4db09527d3e9b6b41a0615b7da4386b ]
Selftests are reporting this failure in test_lwt_seg6local.sh:
+ ip netns exec ns2 ip -6 route add fb00::6 encap bpf in obj test_lwt_seg6local.o sec encap_srh dev veth2 Error fetching program/map! Failed to parse eBPF program: Operation not permitted
The problem is __attribute__((always_inline)) alone is not enough to prevent clang from inserting those functions in .text. In that case, .text is not marked as relocateable.
See the output of objdump -h test_lwt_seg6local.o:
Idx Name Size VMA LMA File off Algn 0 .text 00003530 0000000000000000 0000000000000000 00000040 2**3 CONTENTS, ALLOC, LOAD, READONLY, CODE
This causes the iproute bpf loader to fail in bpf_fetch_prog_sec: bpf_has_call_data returns true but bpf_fetch_prog_relo fails as there's no relocateable .text section in the file.
To fix this, convert to 'static __always_inline'.
v2: Use 'static __always_inline' instead of 'static inline __attribute__((always_inline))'
Fixes: c99a84eac026 ("selftests/bpf: test for seg6local End.BPF action") Signed-off-by: Jiri Benc jbenc@redhat.com Acked-by: Yonghong Song yhs@fb.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../testing/selftests/bpf/progs/test_lwt_seg6local.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/test_lwt_seg6local.c b/tools/testing/selftests/bpf/progs/test_lwt_seg6local.c index 0575751bc1bc..e2f6ed0a583d 100644 --- a/tools/testing/selftests/bpf/progs/test_lwt_seg6local.c +++ b/tools/testing/selftests/bpf/progs/test_lwt_seg6local.c @@ -61,7 +61,7 @@ struct sr6_tlv_t { unsigned char value[0]; } BPF_PACKET_HEADER;
-__attribute__((always_inline)) struct ip6_srh_t *get_srh(struct __sk_buff *skb) +static __always_inline struct ip6_srh_t *get_srh(struct __sk_buff *skb) { void *cursor, *data_end; struct ip6_srh_t *srh; @@ -95,7 +95,7 @@ __attribute__((always_inline)) struct ip6_srh_t *get_srh(struct __sk_buff *skb) return srh; }
-__attribute__((always_inline)) +static __always_inline int update_tlv_pad(struct __sk_buff *skb, uint32_t new_pad, uint32_t old_pad, uint32_t pad_off) { @@ -125,7 +125,7 @@ int update_tlv_pad(struct __sk_buff *skb, uint32_t new_pad, return 0; }
-__attribute__((always_inline)) +static __always_inline int is_valid_tlv_boundary(struct __sk_buff *skb, struct ip6_srh_t *srh, uint32_t *tlv_off, uint32_t *pad_size, uint32_t *pad_off) @@ -184,7 +184,7 @@ int is_valid_tlv_boundary(struct __sk_buff *skb, struct ip6_srh_t *srh, return 0; }
-__attribute__((always_inline)) +static __always_inline int add_tlv(struct __sk_buff *skb, struct ip6_srh_t *srh, uint32_t tlv_off, struct sr6_tlv_t *itlv, uint8_t tlv_size) { @@ -228,7 +228,7 @@ int add_tlv(struct __sk_buff *skb, struct ip6_srh_t *srh, uint32_t tlv_off, return update_tlv_pad(skb, new_pad, pad_size, pad_off); }
-__attribute__((always_inline)) +static __always_inline int delete_tlv(struct __sk_buff *skb, struct ip6_srh_t *srh, uint32_t tlv_off) { @@ -266,7 +266,7 @@ int delete_tlv(struct __sk_buff *skb, struct ip6_srh_t *srh, return update_tlv_pad(skb, new_pad, pad_size, pad_off); }
-__attribute__((always_inline)) +static __always_inline int has_egr_tlv(struct __sk_buff *skb, struct ip6_srh_t *srh) { int tlv_offset = sizeof(struct ip6_t) + sizeof(struct ip6_srh_t) +
[ Upstream commit 9d1bc24b52fb8c5d859f9a47084bf1179470e04c ]
bond_xmit_roundrobin() checks for IGMP packets but it parses the IP header even before checking skb->protocol.
We should validate the IP header with pskb_may_pull() before using iph->protocol.
Reported-and-tested-by: syzbot+e5be16aa39ad6e755391@syzkaller.appspotmail.com Fixes: a2fd940f4cff ("bonding: fix broken multicast with round-robin mode") Cc: Jay Vosburgh j.vosburgh@gmail.com Cc: Veaceslav Falico vfalico@gmail.com Cc: Andy Gospodarek andy@greyhouse.net Signed-off-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_main.c | 37 ++++++++++++++++++++------------- 1 file changed, 23 insertions(+), 14 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 799fc38c5c34..b0aab3a0a1bf 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -3866,8 +3866,8 @@ static netdev_tx_t bond_xmit_roundrobin(struct sk_buff *skb, struct net_device *bond_dev) { struct bonding *bond = netdev_priv(bond_dev); - struct iphdr *iph = ip_hdr(skb); struct slave *slave; + int slave_cnt; u32 slave_id;
/* Start with the curr_active_slave that joined the bond as the @@ -3876,23 +3876,32 @@ static netdev_tx_t bond_xmit_roundrobin(struct sk_buff *skb, * send the join/membership reports. The curr_active_slave found * will send all of this type of traffic. */ - if (iph->protocol == IPPROTO_IGMP && skb->protocol == htons(ETH_P_IP)) { - slave = rcu_dereference(bond->curr_active_slave); - if (slave) - bond_dev_queue_xmit(bond, skb, slave->dev); - else - bond_xmit_slave_id(bond, skb, 0); - } else { - int slave_cnt = READ_ONCE(bond->slave_cnt); + if (skb->protocol == htons(ETH_P_IP)) { + int noff = skb_network_offset(skb); + struct iphdr *iph;
- if (likely(slave_cnt)) { - slave_id = bond_rr_gen_slave_id(bond); - bond_xmit_slave_id(bond, skb, slave_id % slave_cnt); - } else { - bond_tx_drop(bond_dev, skb); + if (unlikely(!pskb_may_pull(skb, noff + sizeof(*iph)))) + goto non_igmp; + + iph = ip_hdr(skb); + if (iph->protocol == IPPROTO_IGMP) { + slave = rcu_dereference(bond->curr_active_slave); + if (slave) + bond_dev_queue_xmit(bond, skb, slave->dev); + else + bond_xmit_slave_id(bond, skb, 0); + return NETDEV_TX_OK; } }
+non_igmp: + slave_cnt = READ_ONCE(bond->slave_cnt); + if (likely(slave_cnt)) { + slave_id = bond_rr_gen_slave_id(bond); + bond_xmit_slave_id(bond, skb, slave_id % slave_cnt); + } else { + bond_tx_drop(bond_dev, skb); + } return NETDEV_TX_OK; }
[ Upstream commit 3285170f28a850638794cdfe712eb6d93e51e706 ]
Commit 372e722ea4dd4ca1 ("gpiolib: use descriptors internally") renamed the functions to use a "gpiod" prefix, and commit 79a9becda8940deb ("gpiolib: export descriptor-based GPIO interface") introduced the "raw" variants, but both changes forgot to update the comments.
Readd a similar reference to gpiod_set_value(), which was accidentally removed by commit 1e77fc82110ac36f ("gpio: Add missing open drain/source handling to gpiod_set_value_cansleep()").
Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Link: https://lore.kernel.org/r/20190701142738.25219-1-geert+renesas@glider.be Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpiolib.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index be1d1d2f8aaa..bb3104d2eb0c 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -3025,7 +3025,7 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep, int gpiod_get_raw_value(const struct gpio_desc *desc) { VALIDATE_DESC(desc); - /* Should be using gpio_get_value_cansleep() */ + /* Should be using gpiod_get_raw_value_cansleep() */ WARN_ON(desc->gdev->chip->can_sleep); return gpiod_get_raw_value_commit(desc); } @@ -3046,7 +3046,7 @@ int gpiod_get_value(const struct gpio_desc *desc) int value;
VALIDATE_DESC(desc); - /* Should be using gpio_get_value_cansleep() */ + /* Should be using gpiod_get_value_cansleep() */ WARN_ON(desc->gdev->chip->can_sleep);
value = gpiod_get_raw_value_commit(desc); @@ -3317,7 +3317,7 @@ int gpiod_set_array_value_complex(bool raw, bool can_sleep, void gpiod_set_raw_value(struct gpio_desc *desc, int value) { VALIDATE_DESC_VOID(desc); - /* Should be using gpiod_set_value_cansleep() */ + /* Should be using gpiod_set_raw_value_cansleep() */ WARN_ON(desc->gdev->chip->can_sleep); gpiod_set_raw_value_commit(desc, value); } @@ -3358,6 +3358,7 @@ static void gpiod_set_value_nocheck(struct gpio_desc *desc, int value) void gpiod_set_value(struct gpio_desc *desc, int value) { VALIDATE_DESC_VOID(desc); + /* Should be using gpiod_set_value_cansleep() */ WARN_ON(desc->gdev->chip->can_sleep); gpiod_set_value_nocheck(desc, value); }
[ Upstream commit 8dd8f005bdd45823fc153ef490239558caf6ff20 ]
We make the invalid assumption in arm_smmu_detach_dev() that the ATC is clear after calling pci_disable_ats(). For one thing, only enabling the PCIe ATS capability constitutes an implicit invalidation event, so the comment was wrong. More importantly, the ATS capability isn't necessarily disabled by pci_disable_ats() in a PF, if the associated VFs have ATS enabled. Explicitly invalidate all ATC entries in arm_smmu_detach_dev(). The endpoint cannot form new ATC entries because STE.EATS is clear.
Fixes: 9ce27afc0830 ("iommu/arm-smmu-v3: Add support for PCI ATS") Reported-by: Manoj Kumar Manoj.Kumar3@arm.com Reported-by: Robin Murphy Robin.Murphy@arm.com Signed-off-by: Jean-Philippe Brucker jean-philippe.brucker@arm.com Acked-by: Will Deacon will@kernel.org Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/arm-smmu-v3.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 4d5a694f02c2..0fee8f7957ec 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -1884,9 +1884,13 @@ static int arm_smmu_enable_ats(struct arm_smmu_master *master)
static void arm_smmu_disable_ats(struct arm_smmu_master *master) { + struct arm_smmu_cmdq_ent cmd; + if (!master->ats_enabled || !dev_is_pci(master->dev)) return;
+ arm_smmu_atc_inv_to_cmd(0, 0, 0, &cmd); + arm_smmu_atc_inv_master(master, &cmd); pci_disable_ats(to_pci_dev(master->dev)); master->ats_enabled = false; } @@ -1906,7 +1910,6 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master) master->domain = NULL; arm_smmu_install_ste_for_dev(master);
- /* Disabling ATS invalidates all ATC entries */ arm_smmu_disable_ats(master); }
[ Upstream commit 1bcc1fd64e4dd903f4d868a9e053986e3b102713 ]
After calling of_node_put() on the codec_ep and codec_port variables, they are still being used, which may result in use-after-free. We fix this issue by calling of_node_put() after the last usage.
Fixes: fce9b90c1ab7 ("ASoC: audio-graph-card: cleanup DAI link loop method - step2") Signed-off-by: Wen Yang wen.yang99@zte.com.cn Cc: Liam Girdwood lgirdwood@gmail.com Cc: Mark Brown broonie@kernel.org Cc: Jaroslav Kysela perex@perex.cz Cc: Takashi Iwai tiwai@suse.com Cc: Kuninori Morimoto kuninori.morimoto.gx@renesas.com Cc: alsa-devel@alsa-project.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/1562229530-8121-1-git-send-email-wen.yang99@zte.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/generic/audio-graph-card.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/sound/soc/generic/audio-graph-card.c b/sound/soc/generic/audio-graph-card.c index ec7e673ba475..70ed28d97d49 100644 --- a/sound/soc/generic/audio-graph-card.c +++ b/sound/soc/generic/audio-graph-card.c @@ -435,9 +435,6 @@ static int graph_for_each_link(struct asoc_simple_priv *priv, codec_ep = of_graph_get_remote_endpoint(cpu_ep); codec_port = of_get_parent(codec_ep);
- of_node_put(codec_ep); - of_node_put(codec_port); - /* get convert-xxx property */ memset(&adata, 0, sizeof(adata)); graph_parse_convert(dev, codec_ep, &adata); @@ -457,6 +454,9 @@ static int graph_for_each_link(struct asoc_simple_priv *priv, else ret = func_noml(priv, cpu_ep, codec_ep, li);
+ of_node_put(codec_ep); + of_node_put(codec_port); + if (ret < 0) return ret;
[ Upstream commit aa52bcbe0e72fac36b1862db08b9c09c4caefae3 ]
Michael reported crash with by bpf program in json mode on powerpc:
# bpftool prog -p dump jited id 14 [{ "name": "0xd00000000a9aa760", "insns": [{ "pc": "0x0", "operation": "nop", "operands": [null ] },{ "pc": "0x4", "operation": "nop", "operands": [null ] },{ "pc": "0x8", "operation": "mflr", Segmentation fault (core dumped)
The code is assuming char pointers in format, which is not always true at least for powerpc. Fixing this by dumping the whole string into buffer based on its format.
Please note that libopcodes code does not check return values from fprintf callback, but as per Jakub suggestion returning -1 on allocation failure so we do the best effort to propagate the error.
Fixes: 107f041212c1 ("tools: bpftool: add JSON output for `bpftool prog dump jited *` command") Reported-by: Michael Petlan mpetlan@redhat.com Signed-off-by: Jiri Olsa jolsa@kernel.org Reviewed-by: Quentin Monnet quentin.monnet@netronome.com Reviewed-by: Jakub Kicinski jakub.kicinski@netronome.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/bpf/bpftool/jit_disasm.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/tools/bpf/bpftool/jit_disasm.c b/tools/bpf/bpftool/jit_disasm.c index 3ef3093560ba..bfed711258ce 100644 --- a/tools/bpf/bpftool/jit_disasm.c +++ b/tools/bpf/bpftool/jit_disasm.c @@ -11,6 +11,8 @@ * Licensed under the GNU General Public License, version 2.0 (GPLv2) */
+#define _GNU_SOURCE +#include <stdio.h> #include <stdarg.h> #include <stdint.h> #include <stdio.h> @@ -44,11 +46,13 @@ static int fprintf_json(void *out, const char *fmt, ...) char *s;
va_start(ap, fmt); + if (vasprintf(&s, fmt, ap) < 0) + return -1; + va_end(ap); + if (!oper_count) { int i;
- s = va_arg(ap, char *); - /* Strip trailing spaces */ i = strlen(s) - 1; while (s[i] == ' ') @@ -61,11 +65,10 @@ static int fprintf_json(void *out, const char *fmt, ...) } else if (!strcmp(fmt, ",")) { /* Skip */ } else { - s = va_arg(ap, char *); jsonw_string(json_wtr, s); oper_count++; } - va_end(ap); + free(s); return 0; }
[ Upstream commit 2d5066fc175ea77a733d84df9ef414b34f311641 ]
For revision 0x20, the broadcast promisc is enabled by firmware, it's unnecessary to enable it when initializing VF.
For revision 0x21, it's necessary to enable broadcast promisc mode when initializing or re-initializing VF, otherwise, it will be unable to send and receive promisc packets.
Fixes: f01f5559cac8 ("net: hns3: don't allow vf to enable promisc mode") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c index 5d53467ee2d2..3b02745605d4 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c @@ -2512,6 +2512,12 @@ static int hclgevf_reset_hdev(struct hclgevf_dev *hdev) return ret; }
+ if (pdev->revision >= 0x21) { + ret = hclgevf_set_promisc_mode(hdev, true); + if (ret) + return ret; + } + dev_info(&hdev->pdev->dev, "Reset done\n");
return 0; @@ -2591,9 +2597,11 @@ static int hclgevf_init_hdev(struct hclgevf_dev *hdev) * firmware makes sure broadcast packets can be accepted. * For revision 0x21, default to enable broadcast promisc mode. */ - ret = hclgevf_set_promisc_mode(hdev, true); - if (ret) - goto err_config; + if (pdev->revision >= 0x21) { + ret = hclgevf_set_promisc_mode(hdev, true); + if (ret) + goto err_config; + }
/* Initialize RSS for this VF */ ret = hclgevf_rss_init_hw(hdev);
[ Upstream commit 49b1255603de5183c5e377200be3b3afe0dcdb86 ]
Currently, the driver queries the media port information, and updates the port capability periodically. But it sets an error mac->speed_type value, which stops update port capability.
Fixes: 88d10bd6f730 ("net: hns3: add support for multiple media type") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index bab04d2d674a..f2bffc05e902 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -2592,6 +2592,7 @@ static int hclge_get_sfp_info(struct hclge_dev *hdev, struct hclge_mac *mac) mac->speed_ability = le32_to_cpu(resp->speed_ability); mac->autoneg = resp->autoneg; mac->support_autoneg = resp->autoneg_ability; + mac->speed_type = QUERY_ACTIVE_SPEED; if (!resp->active_fec) mac->fec_mode = 0; else
[ Upstream commit 4ce9146e0370fcd573f0372d9b4e5a211112567c ]
Syzkaller found that it is possible to provoke a memory leak by never freeing rx_skb in struct bcsp_struct.
Fix by freeing in bcsp_close()
Signed-off-by: Tomas Bortoli tomasbortoli@gmail.com Reported-by: syzbot+98162c885993b72f19c4@syzkaller.appspotmail.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/hci_bcsp.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/bluetooth/hci_bcsp.c b/drivers/bluetooth/hci_bcsp.c index 82b13faa9422..fe2e307009f4 100644 --- a/drivers/bluetooth/hci_bcsp.c +++ b/drivers/bluetooth/hci_bcsp.c @@ -744,6 +744,11 @@ static int bcsp_close(struct hci_uart *hu) skb_queue_purge(&bcsp->rel); skb_queue_purge(&bcsp->unrel);
+ if (bcsp->rx_skb) { + kfree_skb(bcsp->rx_skb); + bcsp->rx_skb = NULL; + } + kfree(bcsp); return 0; }
[ Upstream commit 44d34af2e4cfd0c5357182f8b43f3e0a1fe30a2e ]
Without the QCA ROME setup routine this adapter fails to establish a SCO connection.
T: Bus=01 Lev=01 Prnt=01 Port=08 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=13d3 ProdID=3491 Rev=00.01 C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb I: If#=0x1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
Signed-off-by: João Paulo Rechi Vita jprvita@endlessm.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index 50aed5259c2b..21fa5c889857 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -264,6 +264,7 @@ static const struct usb_device_id blacklist_table[] = { { USB_DEVICE(0x04ca, 0x3015), .driver_info = BTUSB_QCA_ROME }, { USB_DEVICE(0x04ca, 0x3016), .driver_info = BTUSB_QCA_ROME }, { USB_DEVICE(0x04ca, 0x301a), .driver_info = BTUSB_QCA_ROME }, + { USB_DEVICE(0x13d3, 0x3491), .driver_info = BTUSB_QCA_ROME }, { USB_DEVICE(0x13d3, 0x3496), .driver_info = BTUSB_QCA_ROME },
/* Broadcom BCM2035 */
[ Upstream commit 881cec4f6b4da78e54b73c046a60f39315964c7d ]
Without the QCA ROME setup routine this adapter fails to establish a SCO connection.
T: Bus=01 Lev=01 Prnt=01 Port=04 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=13d3 ProdID=3501 Rev=00.01 C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb I: If#=0x1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
Signed-off-by: João Paulo Rechi Vita jprvita@endlessm.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index 21fa5c889857..6d61f5aafc78 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -266,6 +266,7 @@ static const struct usb_device_id blacklist_table[] = { { USB_DEVICE(0x04ca, 0x301a), .driver_info = BTUSB_QCA_ROME }, { USB_DEVICE(0x13d3, 0x3491), .driver_info = BTUSB_QCA_ROME }, { USB_DEVICE(0x13d3, 0x3496), .driver_info = BTUSB_QCA_ROME }, + { USB_DEVICE(0x13d3, 0x3501), .driver_info = BTUSB_QCA_ROME },
/* Broadcom BCM2035 */ { USB_DEVICE(0x0a5c, 0x2009), .driver_info = BTUSB_BCM92035 },
[ Upstream commit b188b03270b7f8568fc714101ce82fbf5e811c5a ]
Handle overlooked case where the target address is assigned to a peer and neither route nor gateway exist.
For one peer, no checks are performed to see if it is meant to receive packets for a given address.
As soon as there is a second peer however, checks are performed to deal with routes and gateways for handling complex setups with multiple hops to a target address. This logic assumed that no route and no gateway imply that the destination address can not be reached, which is false in case of a direct peer.
Acked-by: Jukka Rissanen jukka.rissanen@linux.intel.com Tested-by: Michael Scott mike@foundries.io Signed-off-by: Josua Mayer josua.mayer@jm0.eu Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/6lowpan.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/net/bluetooth/6lowpan.c b/net/bluetooth/6lowpan.c index 1555b0c6f7ec..9001bf331d56 100644 --- a/net/bluetooth/6lowpan.c +++ b/net/bluetooth/6lowpan.c @@ -180,10 +180,16 @@ static inline struct lowpan_peer *peer_lookup_dst(struct lowpan_btle_dev *dev, }
if (!rt) { - nexthop = &lowpan_cb(skb)->gw; - - if (ipv6_addr_any(nexthop)) - return NULL; + if (ipv6_addr_any(&lowpan_cb(skb)->gw)) { + /* There is neither route nor gateway, + * probably the destination is a direct peer. + */ + nexthop = daddr; + } else { + /* There is a known gateway + */ + nexthop = &lowpan_cb(skb)->gw; + } } else { nexthop = rt6_nexthop(rt, daddr);
[ Upstream commit c09cb1293523dd786ae54a12fd88001542cba2f6 ]
The NMI handlers handle_percpu_devid_fasteoi_nmi() and handle_fasteoi_nmi() do not update the interrupt counts. Due to that the NMI interrupt count does not show up correctly in /proc/interrupts.
Add the statistics and treat the NMI handlers in the same way as per cpu interrupts and prevent them from updating irq_desc::tot_count as this might be corrupted due to concurrency.
[ tglx: Massaged changelog ]
Fixes: 2dcf1fbcad35 ("genirq: Provide NMI handlers") Signed-off-by: Shijith Thotton sthotton@marvell.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/1562313336-11888-1-git-send-email-sthotton@marvell... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/irq/chip.c | 4 ++++ kernel/irq/irqdesc.c | 8 +++++++- 2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index 3ff4a1260885..b76703b2c0af 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -754,6 +754,8 @@ void handle_fasteoi_nmi(struct irq_desc *desc) unsigned int irq = irq_desc_get_irq(desc); irqreturn_t res;
+ __kstat_incr_irqs_this_cpu(desc); + trace_irq_handler_entry(irq, action); /* * NMIs cannot be shared, there is only one action. @@ -968,6 +970,8 @@ void handle_percpu_devid_fasteoi_nmi(struct irq_desc *desc) unsigned int irq = irq_desc_get_irq(desc); irqreturn_t res;
+ __kstat_incr_irqs_this_cpu(desc); + trace_irq_handler_entry(irq, action); res = action->handler(irq, raw_cpu_ptr(action->percpu_dev_id)); trace_irq_handler_exit(irq, action, res); diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c index c52b737ab8e3..9149dde5a7b0 100644 --- a/kernel/irq/irqdesc.c +++ b/kernel/irq/irqdesc.c @@ -946,6 +946,11 @@ unsigned int kstat_irqs_cpu(unsigned int irq, int cpu) *per_cpu_ptr(desc->kstat_irqs, cpu) : 0; }
+static bool irq_is_nmi(struct irq_desc *desc) +{ + return desc->istate & IRQS_NMI; +} + /** * kstat_irqs - Get the statistics for an interrupt * @irq: The interrupt number @@ -963,7 +968,8 @@ unsigned int kstat_irqs(unsigned int irq) if (!desc || !desc->kstat_irqs) return 0; if (!irq_settings_is_per_cpu_devid(desc) && - !irq_settings_is_per_cpu(desc)) + !irq_settings_is_per_cpu(desc) && + !irq_is_nmi(desc)) return desc->tot_count;
for_each_possible_cpu(cpu)
[ Upstream commit bff5a556c149804de29347a88a884d25e4e4e3a2 ]
'probe libc's inet_pton & backtrace it with ping' testcase sometimes fails on powerpc because distro ping binary does not have symbol information and thus it prints "[unknown]" function name in the backtrace.
Accept "[unknown]" as valid function name for powerpc as well.
# perf test -v "probe libc's inet_pton & backtrace it with ping"
Before:
59: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 79695 ping 79718 [077] 96483.787025: probe_libc:inet_pton: (7fff83a754c8) 7fff83a754c8 __GI___inet_pton+0x8 (/usr/lib64/power9/libc-2.28.so) 7fff83a2b7a0 gaih_inet.constprop.7+0x1020 (/usr/lib64/power9/libc-2.28.so) 7fff83a2c170 getaddrinfo+0x160 (/usr/lib64/power9/libc-2.28.so) 1171830f4 [unknown] (/usr/bin/ping) FAIL: expected backtrace entry ".*+0x[[:xdigit:]]+[[:space:]](.*/bin/ping.*)$" got "1171830f4 [unknown] (/usr/bin/ping)" test child finished with -1 ---- end ---- probe libc's inet_pton & backtrace it with ping: FAILED!
After:
59: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 79085 ping 79108 [045] 96400.214177: probe_libc:inet_pton: (7fffbb9654c8) 7fffbb9654c8 __GI___inet_pton+0x8 (/usr/lib64/power9/libc-2.28.so) 7fffbb91b7a0 gaih_inet.constprop.7+0x1020 (/usr/lib64/power9/libc-2.28.so) 7fffbb91c170 getaddrinfo+0x160 (/usr/lib64/power9/libc-2.28.so) 132e830f4 [unknown] (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok
Signed-off-by: Seeteena Thoufeek s1seetee@linux.vnet.ibm.com Reviewed-by: Kim Phillips kim.phillips@amd.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Hendrik Brueckner brueckner@linux.ibm.com Cc: Jiri Olsa jolsa@redhat.com Cc: Michael Petlan mpetlan@redhat.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Sandipan Das sandipan@linux.ibm.com Fixes: 1632936480a5 ("perf tests: Fix record+probe_libc_inet_pton.sh without ping's debuginfo") Link: http://lkml.kernel.org/r/1561630614-3216-1-git-send-email-s1seetee@linux.vne... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/shell/record+probe_libc_inet_pton.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh index 61c9f8fc6fa1..58a99a292930 100755 --- a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh +++ b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh @@ -44,7 +44,7 @@ trace_libc_inet_pton_backtrace() { eventattr='max-stack=4' echo "gaih_inet.*+0x[[:xdigit:]]+[[:space:]]($libc)$" >> $expected echo "getaddrinfo+0x[[:xdigit:]]+[[:space:]]($libc)$" >> $expected - echo ".*+0x[[:xdigit:]]+[[:space:]](.*/bin/ping.*)$" >> $expected + echo ".*(+0x[[:xdigit:]]+|[unknown])[[:space:]](.*/bin/ping.*)$" >> $expected ;; *) eventattr='max-stack=3'
[ Upstream commit 28261da8a26f4915aa257d12d506c6ba179d961f ]
Because of both sides doing L2CAP disconnection at the same time, it was possible to receive L2CAP Disconnection Response with CID that was already freed. That caused problems if CID was already reused and L2CAP Connection Request with same CID was sent out. Before this patch kernel deleted channel context regardless of the state of the channel.
Example where leftover Disconnection Response (frame #402) causes local device to delete L2CAP channel which was not yet connected. This in turn confuses remote device's stack because same CID is re-used without properly disconnecting.
Btmon capture before patch: ** snip **
ACL Data RX: Handle 43 flags 0x02 dlen 8 #394 [hci1] 10.748949
Channel: 65 len 4 [PSM 3 mode 0] {chan 2} RFCOMM: Disconnect (DISC) (0x43) Address: 0x03 cr 1 dlci 0x00 Control: 0x53 poll/final 1 Length: 0 FCS: 0xfd < ACL Data TX: Handle 43 flags 0x00 dlen 8 #395 [hci1] 10.749062 Channel: 65 len 4 [PSM 3 mode 0] {chan 2} RFCOMM: Unnumbered Ack (UA) (0x63) Address: 0x03 cr 1 dlci 0x00 Control: 0x73 poll/final 1 Length: 0 FCS: 0xd7 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #396 [hci1] 10.749073 L2CAP: Disconnection Request (0x06) ident 17 len 4 Destination CID: 65 Source CID: 65
HCI Event: Number of Completed Packets (0x13) plen 5 #397 [hci1] 10.752391
Num handles: 1 Handle: 43 Count: 1
HCI Event: Number of Completed Packets (0x13) plen 5 #398 [hci1] 10.753394
Num handles: 1 Handle: 43 Count: 1
ACL Data RX: Handle 43 flags 0x02 dlen 12 #399 [hci1] 10.756499
L2CAP: Disconnection Request (0x06) ident 26 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #400 [hci1] 10.756548 L2CAP: Disconnection Response (0x07) ident 26 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #401 [hci1] 10.757459 L2CAP: Connection Request (0x02) ident 18 len 4 PSM: 1 (0x0001) Source CID: 65
ACL Data RX: Handle 43 flags 0x02 dlen 12 #402 [hci1] 10.759148
L2CAP: Disconnection Response (0x07) ident 17 len 4 Destination CID: 65 Source CID: 65 = bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o.. 10.759447
HCI Event: Number of Completed Packets (0x13) plen 5 #403 [hci1] 10.759386
Num handles: 1 Handle: 43 Count: 1
ACL Data RX: Handle 43 flags 0x02 dlen 12 #404 [hci1] 10.760397
L2CAP: Connection Request (0x02) ident 27 len 4 PSM: 3 (0x0003) Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 16 #405 [hci1] 10.760441 L2CAP: Connection Response (0x03) ident 27 len 8 Destination CID: 65 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 43 flags 0x00 dlen 27 #406 [hci1] 10.760449 L2CAP: Configure Request (0x04) ident 19 len 19 Destination CID: 65 Flags: 0x0000 Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 1013 Option: Retransmission and Flow Control (0x04) [mandatory] Mode: Basic (0x00) TX window size: 0 Max transmit: 0 Retransmission timeout: 0 Monitor timeout: 0 Maximum PDU size: 0
HCI Event: Number of Completed Packets (0x13) plen 5 #407 [hci1] 10.761399
Num handles: 1 Handle: 43 Count: 1
ACL Data RX: Handle 43 flags 0x02 dlen 16 #408 [hci1] 10.762942
L2CAP: Connection Response (0x03) ident 18 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) *snip*
Similar case after the patch: *snip*
ACL Data RX: Handle 43 flags 0x02 dlen 8 #22702 [hci0] 1664.411056
Channel: 65 len 4 [PSM 3 mode 0] {chan 3} RFCOMM: Disconnect (DISC) (0x43) Address: 0x03 cr 1 dlci 0x00 Control: 0x53 poll/final 1 Length: 0 FCS: 0xfd < ACL Data TX: Handle 43 flags 0x00 dlen 8 #22703 [hci0] 1664.411136 Channel: 65 len 4 [PSM 3 mode 0] {chan 3} RFCOMM: Unnumbered Ack (UA) (0x63) Address: 0x03 cr 1 dlci 0x00 Control: 0x73 poll/final 1 Length: 0 FCS: 0xd7 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22704 [hci0] 1664.411143 L2CAP: Disconnection Request (0x06) ident 11 len 4 Destination CID: 65 Source CID: 65
HCI Event: Number of Completed Pac.. (0x13) plen 5 #22705 [hci0] 1664.414009
Num handles: 1 Handle: 43 Count: 1
HCI Event: Number of Completed Pac.. (0x13) plen 5 #22706 [hci0] 1664.415007
Num handles: 1 Handle: 43 Count: 1
ACL Data RX: Handle 43 flags 0x02 dlen 12 #22707 [hci0] 1664.418674
L2CAP: Disconnection Request (0x06) ident 17 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22708 [hci0] 1664.418762 L2CAP: Disconnection Response (0x07) ident 17 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22709 [hci0] 1664.421073 L2CAP: Connection Request (0x02) ident 12 len 4 PSM: 1 (0x0001) Source CID: 65
ACL Data RX: Handle 43 flags 0x02 dlen 12 #22710 [hci0] 1664.421371
L2CAP: Disconnection Response (0x07) ident 11 len 4 Destination CID: 65 Source CID: 65
HCI Event: Number of Completed Pac.. (0x13) plen 5 #22711 [hci0] 1664.424082
Num handles: 1 Handle: 43 Count: 1
HCI Event: Number of Completed Pac.. (0x13) plen 5 #22712 [hci0] 1664.425040
Num handles: 1 Handle: 43 Count: 1
ACL Data RX: Handle 43 flags 0x02 dlen 12 #22713 [hci0] 1664.426103
L2CAP: Connection Request (0x02) ident 18 len 4 PSM: 3 (0x0003) Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 16 #22714 [hci0] 1664.426186 L2CAP: Connection Response (0x03) ident 18 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 43 flags 0x00 dlen 27 #22715 [hci0] 1664.426196 L2CAP: Configure Request (0x04) ident 13 len 19 Destination CID: 65 Flags: 0x0000 Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 1013 Option: Retransmission and Flow Control (0x04) [mandatory] Mode: Basic (0x00) TX window size: 0 Max transmit: 0 Retransmission timeout: 0 Monitor timeout: 0 Maximum PDU size: 0
ACL Data RX: Handle 43 flags 0x02 dlen 16 #22716 [hci0] 1664.428804
L2CAP: Connection Response (0x03) ident 12 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) *snip*
Fix is to check that channel is in state BT_DISCONN before deleting the channel.
This bug was found while fuzzing Bluez's OBEX implementation using Synopsys Defensics.
Reported-by: Matti Kamunen matti.kamunen@synopsys.com Reported-by: Ari Timonen ari.timonen@synopsys.com Signed-off-by: Matias Karhumaa matias.karhumaa@gmail.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/l2cap_core.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c index 5406d7cd46ad..771e3e17bb6a 100644 --- a/net/bluetooth/l2cap_core.c +++ b/net/bluetooth/l2cap_core.c @@ -4394,6 +4394,12 @@ static inline int l2cap_disconnect_rsp(struct l2cap_conn *conn,
l2cap_chan_lock(chan);
+ if (chan->state != BT_DISCONN) { + l2cap_chan_unlock(chan); + mutex_unlock(&conn->chan_lock); + return 0; + } + l2cap_chan_hold(chan); l2cap_chan_del(chan, 0);
[ Upstream commit dcae9052ebb0c5b2614de620323d615fcbfda7f8 ]
This change is similar to commit a1616a5ac99e ("Bluetooth: hidp: fix buffer overflow") but for the compat ioctl. We take a string from the user and forgot to ensure that it's NUL terminated.
I have also changed the strncpy() in to strscpy() in hidp_setup_hid(). The difference is the strncpy() doesn't necessarily NUL terminate the destination string. Either change would fix the problem but it's nice to take a belt and suspenders approach and do both.
Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/hidp/core.c | 2 +- net/bluetooth/hidp/sock.c | 1 + 2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c index a442e21f3894..5abd423b55fa 100644 --- a/net/bluetooth/hidp/core.c +++ b/net/bluetooth/hidp/core.c @@ -775,7 +775,7 @@ static int hidp_setup_hid(struct hidp_session *session, hid->version = req->version; hid->country = req->country;
- strncpy(hid->name, req->name, sizeof(hid->name)); + strscpy(hid->name, req->name, sizeof(hid->name));
snprintf(hid->phys, sizeof(hid->phys), "%pMR", &l2cap_pi(session->ctrl_sock->sk)->chan->src); diff --git a/net/bluetooth/hidp/sock.c b/net/bluetooth/hidp/sock.c index 2151913892ce..03be6a4baef3 100644 --- a/net/bluetooth/hidp/sock.c +++ b/net/bluetooth/hidp/sock.c @@ -192,6 +192,7 @@ static int hidp_sock_compat_ioctl(struct socket *sock, unsigned int cmd, unsigne ca.version = ca32.version; ca.flags = ca32.flags; ca.idle_to = ca32.idle_to; + ca32.name[sizeof(ca32.name) - 1] = '\0'; memcpy(ca.name, ca32.name, 128);
csock = sockfd_lookup(ca.ctrl_sock, &err);
[ Upstream commit e30155fd23c9c141cbe7d99b786e10a83a328837 ]
If an invalid role is sent from user space, gtp_encap_enable() will fail. Then, it should call gtp_encap_disable_sock() but current code doesn't. It makes memory leak.
Fixes: 91ed81f9abc7 ("gtp: support SGSN-side tunnels") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/gtp.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index fc45b749db46..01fc51892e48 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -843,8 +843,13 @@ static int gtp_encap_enable(struct gtp_dev *gtp, struct nlattr *data[])
if (data[IFLA_GTP_ROLE]) { role = nla_get_u32(data[IFLA_GTP_ROLE]); - if (role > GTP_ROLE_SGSN) + if (role > GTP_ROLE_SGSN) { + if (sk0) + gtp_encap_disable_sock(sk0); + if (sk1u) + gtp_encap_disable_sock(sk1u); return -EINVAL; + } }
gtp->sk0 = sk0;
[ Upstream commit c49a8682fc5d298d44e8d911f4fa14690ea9485e ]
Problem: The Linux Bluetooth stack yields complete control over the BLE connection interval to the remote device.
The Linux Bluetooth stack provides access to the BLE connection interval min and max values through /sys/kernel/debug/bluetooth/hci0/ conn_min_interval and /sys/kernel/debug/bluetooth/hci0/conn_max_interval. These values are used for initial BLE connections, but the remote device has the ability to request a connection parameter update. In the event that the remote side requests to change the connection interval, the Linux kernel currently only validates that the desired value is within the acceptable range in the Bluetooth specification (6 - 3200, corresponding to 7.5ms - 4000ms). There is currently no validation that the desired value requested by the remote device is within the min/max limits specified in the conn_min_interval/conn_max_interval configurations. This essentially leads to Linux yielding complete control over the connection interval to the remote device.
The proposed patch adds a verification step to the connection parameter update mechanism, ensuring that the desired value is within the min/max bounds of the current connection. If the desired value is outside of the current connection min/max values, then the connection parameter update request is rejected and the negative response is returned to the remote device. Recall that the initial connection is established using the local conn_min_interval/conn_max_interval values, so this allows the Linux administrator to retain control over the BLE connection interval.
The one downside that I see is that the current default Linux values for conn_min_interval and conn_max_interval typically correspond to 30ms and 50ms respectively. If this change were accepted, then it is feasible that some devices would no longer be able to negotiate to their desired connection interval values. This might be remedied by setting the default Linux conn_min_interval and conn_max_interval values to the widest supported range (6 - 3200 / 7.5ms - 4000ms). This could lead to the same behavior as the current implementation, where the remote device could request to change the connection interval value to any value that is permitted by the Bluetooth specification, and Linux would accept the desired value.
Signed-off-by: Carey Sonsino csonsino@gmail.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/hci_event.c | 5 +++++ net/bluetooth/l2cap_core.c | 9 ++++++++- 2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index 9e4fcf406d9c..17c50a98e7f7 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -5588,6 +5588,11 @@ static void hci_le_remote_conn_param_req_evt(struct hci_dev *hdev, return send_conn_param_neg_reply(hdev, handle, HCI_ERROR_UNKNOWN_CONN_ID);
+ if (min < hcon->le_conn_min_interval || + max > hcon->le_conn_max_interval) + return send_conn_param_neg_reply(hdev, handle, + HCI_ERROR_INVALID_LL_PARAMS); + if (hci_check_conn_params(min, max, latency, timeout)) return send_conn_param_neg_reply(hdev, handle, HCI_ERROR_INVALID_LL_PARAMS); diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c index 771e3e17bb6a..32d2be9d6858 100644 --- a/net/bluetooth/l2cap_core.c +++ b/net/bluetooth/l2cap_core.c @@ -5297,7 +5297,14 @@ static inline int l2cap_conn_param_update_req(struct l2cap_conn *conn,
memset(&rsp, 0, sizeof(rsp));
- err = hci_check_conn_params(min, max, latency, to_multiplier); + if (min < hcon->le_conn_min_interval || + max > hcon->le_conn_max_interval) { + BT_DBG("requested connection interval exceeds current bounds."); + err = -EINVAL; + } else { + err = hci_check_conn_params(min, max, latency, to_multiplier); + } + if (err) rsp.result = cpu_to_le16(L2CAP_CONN_PARAM_REJECTED); else
[ Upstream commit e198987e7dd7d3645a53875151cd6f8fc425b706 ]
gtp_encap_enable_socket() and gtp_encap_destroy() are not protected by rcu_read_lock(). and it's not safe to write sk->sk_user_data. This patch make these functions to use lock_sock() instead of rcu_dereference_sk_user_data().
Test commands: gtp-link add gtp1
Splat looks like: [ 83.238315] ============================= [ 83.239127] WARNING: suspicious RCU usage [ 83.239702] 5.2.0-rc6+ #49 Not tainted [ 83.240268] ----------------------------- [ 83.241205] drivers/net/gtp.c:799 suspicious rcu_dereference_check() usage! [ 83.243828] [ 83.243828] other info that might help us debug this: [ 83.243828] [ 83.246325] [ 83.246325] rcu_scheduler_active = 2, debug_locks = 1 [ 83.247314] 1 lock held by gtp-link/1008: [ 83.248523] #0: 0000000017772c7f (rtnl_mutex){+.+.}, at: __rtnl_newlink+0x5f5/0x11b0 [ 83.251503] [ 83.251503] stack backtrace: [ 83.252173] CPU: 0 PID: 1008 Comm: gtp-link Not tainted 5.2.0-rc6+ #49 [ 83.253271] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 83.254562] Call Trace: [ 83.254995] dump_stack+0x7c/0xbb [ 83.255567] gtp_encap_enable_socket+0x2df/0x360 [gtp] [ 83.256415] ? gtp_find_dev+0x1a0/0x1a0 [gtp] [ 83.257161] ? memset+0x1f/0x40 [ 83.257843] gtp_newlink+0x90/0xa21 [gtp] [ 83.258497] ? __netlink_ns_capable+0xc3/0xf0 [ 83.259260] __rtnl_newlink+0xb9f/0x11b0 [ 83.260022] ? rtnl_link_unregister+0x230/0x230 [ ... ]
Fixes: 1e3a3abd8b28 ("gtp: make GTP sockets in gtp_newlink optional") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/gtp.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index 01fc51892e48..61f19e66be55 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -289,12 +289,14 @@ static void gtp_encap_destroy(struct sock *sk) { struct gtp_dev *gtp;
- gtp = rcu_dereference_sk_user_data(sk); + lock_sock(sk); + gtp = sk->sk_user_data; if (gtp) { udp_sk(sk)->encap_type = 0; rcu_assign_sk_user_data(sk, NULL); sock_put(sk); } + release_sock(sk); }
static void gtp_encap_disable_sock(struct sock *sk) @@ -796,7 +798,8 @@ static struct sock *gtp_encap_enable_socket(int fd, int type, goto out_sock; }
- if (rcu_dereference_sk_user_data(sock->sk)) { + lock_sock(sock->sk); + if (sock->sk->sk_user_data) { sk = ERR_PTR(-EBUSY); goto out_sock; } @@ -812,6 +815,7 @@ static struct sock *gtp_encap_enable_socket(int fd, int type, setup_udp_tunnel_sock(sock_net(sock->sk), sock, &tuncfg);
out_sock: + release_sock(sock->sk); sockfd_put(sock); return sk; }
[ Upstream commit 3f167e1921865b379a9becf03828e7202c7b4917 ]
ipv4_pdp_add() is called in RCU read-side critical section. So GFP_KERNEL should not be used in the function. This patch make ipv4_pdp_add() to use GFP_ATOMIC instead of GFP_KERNEL.
Test commands: gtp-link add gtp1 & gtp-tunnel add gtp1 v1 100 200 1.1.1.1 2.2.2.2
Splat looks like: [ 130.618881] ============================= [ 130.626382] WARNING: suspicious RCU usage [ 130.626994] 5.2.0-rc6+ #50 Not tainted [ 130.627622] ----------------------------- [ 130.628223] ./include/linux/rcupdate.h:266 Illegal context switch in RCU read-side critical section! [ 130.629684] [ 130.629684] other info that might help us debug this: [ 130.629684] [ 130.631022] [ 130.631022] rcu_scheduler_active = 2, debug_locks = 1 [ 130.632136] 4 locks held by gtp-tunnel/1025: [ 130.632925] #0: 000000002b93c8b7 (cb_lock){++++}, at: genl_rcv+0x15/0x40 [ 130.634159] #1: 00000000f17bc999 (genl_mutex){+.+.}, at: genl_rcv_msg+0xfb/0x130 [ 130.635487] #2: 00000000c644ed8e (rtnl_mutex){+.+.}, at: gtp_genl_new_pdp+0x18c/0x1150 [gtp] [ 130.636936] #3: 0000000007a1cde7 (rcu_read_lock){....}, at: gtp_genl_new_pdp+0x187/0x1150 [gtp] [ 130.638348] [ 130.638348] stack backtrace: [ 130.639062] CPU: 1 PID: 1025 Comm: gtp-tunnel Not tainted 5.2.0-rc6+ #50 [ 130.641318] Call Trace: [ 130.641707] dump_stack+0x7c/0xbb [ 130.642252] ___might_sleep+0x2c0/0x3b0 [ 130.642862] kmem_cache_alloc_trace+0x1cd/0x2b0 [ 130.643591] gtp_genl_new_pdp+0x6c5/0x1150 [gtp] [ 130.644371] genl_family_rcv_msg+0x63a/0x1030 [ 130.645074] ? mutex_lock_io_nested+0x1090/0x1090 [ 130.645845] ? genl_unregister_family+0x630/0x630 [ 130.646592] ? debug_show_all_locks+0x2d0/0x2d0 [ 130.647293] ? check_flags.part.40+0x440/0x440 [ 130.648099] genl_rcv_msg+0xa3/0x130 [ ... ]
Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/gtp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index 61f19e66be55..b770335c03c1 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -954,7 +954,7 @@ static int ipv4_pdp_add(struct gtp_dev *gtp, struct sock *sk,
}
- pctx = kmalloc(sizeof(struct pdp_ctx), GFP_KERNEL); + pctx = kmalloc(sizeof(*pctx), GFP_ATOMIC); if (pctx == NULL) return -ENOMEM;
[ Upstream commit 1788b8569f5de27da09087fa3f6580d2aa04cc75 ]
gtp_encap_destroy() is called twice. 1. When interface is deleted. 2. When udp socket is destroyed. either gtp->sk0 or gtp->sk1u could be freed by sock_put() in gtp_encap_destroy(). so, when gtp_encap_destroy() is called again, it would uses freed sk pointer.
patch makes gtp_encap_destroy() to set either gtp->sk0 or gtp->sk1u to null. in addition, both gtp->sk0 and gtp->sk1u pointer are protected by rtnl_lock. so, rtnl_lock() is added.
Test command: gtp-link add gtp1 & killall gtp-link ip link del gtp1
Splat looks like: [ 83.182767] BUG: KASAN: use-after-free in __lock_acquire+0x3a20/0x46a0 [ 83.184128] Read of size 8 at addr ffff8880cc7d5360 by task ip/1008 [ 83.185567] CPU: 1 PID: 1008 Comm: ip Not tainted 5.2.0-rc6+ #50 [ 83.188469] Call Trace: [ ... ] [ 83.200126] lock_acquire+0x141/0x380 [ 83.200575] ? lock_sock_nested+0x3a/0xf0 [ 83.201069] _raw_spin_lock_bh+0x38/0x70 [ 83.201551] ? lock_sock_nested+0x3a/0xf0 [ 83.202044] lock_sock_nested+0x3a/0xf0 [ 83.202520] gtp_encap_destroy+0x18/0xe0 [gtp] [ 83.203065] gtp_encap_disable.isra.14+0x13/0x50 [gtp] [ 83.203687] gtp_dellink+0x56/0x170 [gtp] [ 83.204190] rtnl_delete_link+0xb4/0x100 [ ... ] [ 83.236513] Allocated by task 976: [ 83.236925] save_stack+0x19/0x80 [ 83.237332] __kasan_kmalloc.constprop.3+0xa0/0xd0 [ 83.237894] kmem_cache_alloc+0xd8/0x280 [ 83.238360] sk_prot_alloc.isra.42+0x50/0x200 [ 83.238874] sk_alloc+0x32/0x940 [ 83.239264] inet_create+0x283/0xc20 [ 83.239684] __sock_create+0x2dd/0x540 [ 83.240136] __sys_socket+0xca/0x1a0 [ 83.240550] __x64_sys_socket+0x6f/0xb0 [ 83.240998] do_syscall_64+0x9c/0x450 [ 83.241466] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 83.242061] [ 83.242249] Freed by task 0: [ 83.242616] save_stack+0x19/0x80 [ 83.243013] __kasan_slab_free+0x111/0x150 [ 83.243498] kmem_cache_free+0x89/0x250 [ 83.244444] __sk_destruct+0x38f/0x5a0 [ 83.245366] rcu_core+0x7e9/0x1c20 [ 83.245766] __do_softirq+0x213/0x8fa
Fixes: 1e3a3abd8b28 ("gtp: make GTP sockets in gtp_newlink optional") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/gtp.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index b770335c03c1..5615cdb7202c 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -285,13 +285,17 @@ static int gtp1u_udp_encap_recv(struct gtp_dev *gtp, struct sk_buff *skb) return gtp_rx(pctx, skb, hdrlen, gtp->role); }
-static void gtp_encap_destroy(struct sock *sk) +static void __gtp_encap_destroy(struct sock *sk) { struct gtp_dev *gtp;
lock_sock(sk); gtp = sk->sk_user_data; if (gtp) { + if (gtp->sk0 == sk) + gtp->sk0 = NULL; + else + gtp->sk1u = NULL; udp_sk(sk)->encap_type = 0; rcu_assign_sk_user_data(sk, NULL); sock_put(sk); @@ -299,12 +303,19 @@ static void gtp_encap_destroy(struct sock *sk) release_sock(sk); }
+static void gtp_encap_destroy(struct sock *sk) +{ + rtnl_lock(); + __gtp_encap_destroy(sk); + rtnl_unlock(); +} + static void gtp_encap_disable_sock(struct sock *sk) { if (!sk) return;
- gtp_encap_destroy(sk); + __gtp_encap_destroy(sk); }
static void gtp_encap_disable(struct gtp_dev *gtp) @@ -1043,6 +1054,7 @@ static int gtp_genl_new_pdp(struct sk_buff *skb, struct genl_info *info) return -EINVAL; }
+ rtnl_lock(); rcu_read_lock();
gtp = gtp_find_dev(sock_net(skb->sk), info->attrs); @@ -1067,6 +1079,7 @@ static int gtp_genl_new_pdp(struct sk_buff *skb, struct genl_info *info)
out_unlock: rcu_read_unlock(); + rtnl_unlock(); return err; }
[ Upstream commit a2bed90704c68d3763bf24decb1b781a45395de8 ]
Current gtp_newlink() could be called after unregister_pernet_subsys(). gtp_newlink() uses gtp_net but it can be destroyed by unregister_pernet_subsys(). So unregister_pernet_subsys() should be called after rtnl_link_unregister().
Test commands: #SHELL 1 while : do for i in {1..5} do ./gtp-link add gtp$i & done killall gtp-link done
#SHELL 2 while : do modprobe -rv gtp done
Splat looks like: [ 753.176631] BUG: KASAN: use-after-free in gtp_newlink+0x9b4/0xa5c [gtp] [ 753.177722] Read of size 8 at addr ffff8880d48f2458 by task gtp-link/7126 [ 753.179082] CPU: 0 PID: 7126 Comm: gtp-link Tainted: G W 5.2.0-rc6+ #50 [ 753.185801] Call Trace: [ 753.186264] dump_stack+0x7c/0xbb [ 753.186863] ? gtp_newlink+0x9b4/0xa5c [gtp] [ 753.187583] print_address_description+0xc7/0x240 [ 753.188382] ? gtp_newlink+0x9b4/0xa5c [gtp] [ 753.189097] ? gtp_newlink+0x9b4/0xa5c [gtp] [ 753.189846] __kasan_report+0x12a/0x16f [ 753.190542] ? gtp_newlink+0x9b4/0xa5c [gtp] [ 753.191298] kasan_report+0xe/0x20 [ 753.191893] gtp_newlink+0x9b4/0xa5c [gtp] [ 753.192580] ? __netlink_ns_capable+0xc3/0xf0 [ 753.193370] __rtnl_newlink+0xb9f/0x11b0 [ ... ] [ 753.241201] Allocated by task 7186: [ 753.241844] save_stack+0x19/0x80 [ 753.242399] __kasan_kmalloc.constprop.3+0xa0/0xd0 [ 753.243192] __kmalloc+0x13e/0x300 [ 753.243764] ops_init+0xd6/0x350 [ 753.244314] register_pernet_operations+0x249/0x6f0 [ ... ] [ 753.251770] Freed by task 7178: [ 753.252288] save_stack+0x19/0x80 [ 753.252833] __kasan_slab_free+0x111/0x150 [ 753.253962] kfree+0xc7/0x280 [ 753.254509] ops_free_list.part.11+0x1c4/0x2d0 [ 753.255241] unregister_pernet_operations+0x262/0x390 [ ... ] [ 753.285883] list_add corruption. next->prev should be prev (ffff8880d48f2458), but was ffff8880d497d878. (next. [ 753.287241] ------------[ cut here ]------------ [ 753.287794] kernel BUG at lib/list_debug.c:25! [ 753.288364] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 753.289099] CPU: 0 PID: 7126 Comm: gtp-link Tainted: G B W 5.2.0-rc6+ #50 [ 753.291036] RIP: 0010:__list_add_valid+0x74/0xd0 [ 753.291589] Code: 48 39 da 75 27 48 39 f5 74 36 48 39 dd 74 31 48 83 c4 08 b8 01 00 00 00 5b 5d c3 48 89 d9 48b [ 753.293779] RSP: 0018:ffff8880cae8f398 EFLAGS: 00010286 [ 753.294401] RAX: 0000000000000075 RBX: ffff8880d497d878 RCX: 0000000000000000 [ 753.296260] RDX: 0000000000000075 RSI: 0000000000000008 RDI: ffffed10195d1e69 [ 753.297070] RBP: ffff8880cd250ae0 R08: ffffed101b4bff21 R09: ffffed101b4bff21 [ 753.297899] R10: 0000000000000001 R11: ffffed101b4bff20 R12: ffff8880d497d878 [ 753.298703] R13: 0000000000000000 R14: ffff8880cd250ae0 R15: ffff8880d48f2458 [ 753.299564] FS: 00007f5f79805740(0000) GS:ffff8880da400000(0000) knlGS:0000000000000000 [ 753.300533] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 753.301231] CR2: 00007fe8c7ef4f10 CR3: 00000000b71a6006 CR4: 00000000000606f0 [ 753.302183] Call Trace: [ 753.302530] gtp_newlink+0x5f6/0xa5c [gtp] [ 753.303037] ? __netlink_ns_capable+0xc3/0xf0 [ 753.303576] __rtnl_newlink+0xb9f/0x11b0 [ 753.304092] ? rtnl_link_unregister+0x230/0x230
Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/gtp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index 5615cdb7202c..607f38712b4e 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -1382,9 +1382,9 @@ late_initcall(gtp_init);
static void __exit gtp_fini(void) { - unregister_pernet_subsys(>p_net_ops); genl_unregister_family(>p_genl_family); rtnl_link_unregister(>p_link_ops); + unregister_pernet_subsys(>p_net_ops);
pr_info("GTP module unloaded\n"); }
[ Upstream commit bf0bdd1343efbbf65b4d53aef1fce14acbd79d50 ]
Unlike driver mode, generic xdp receive could be triggered by different threads on different CPU cores at the same time leading to the fill and rx queue breakage. For example, this could happen while sending packets from two processes to the first interface of veth pair while the second part of it is open with AF_XDP socket.
Need to take a lock for each generic receive to avoid race.
Fixes: c497176cb2e4 ("xsk: add Rx receive functions and poll support") Signed-off-by: Ilya Maximets i.maximets@samsung.com Acked-by: Magnus Karlsson magnus.karlsson@intel.com Tested-by: William Tu u9012063@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/xdp_sock.h | 2 ++ net/xdp/xsk.c | 31 ++++++++++++++++++++++--------- 2 files changed, 24 insertions(+), 9 deletions(-)
diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index d074b6d60f8a..ac3c047d058c 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -67,6 +67,8 @@ struct xdp_sock { * in the SKB destructor callback. */ spinlock_t tx_completion_lock; + /* Protects generic receive. */ + spinlock_t rx_lock; u64 rx_dropped; };
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index a14e8864e4fa..5e0637db92ea 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -123,13 +123,17 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp) u64 addr; int err;
- if (xs->dev != xdp->rxq->dev || xs->queue_id != xdp->rxq->queue_index) - return -EINVAL; + spin_lock_bh(&xs->rx_lock); + + if (xs->dev != xdp->rxq->dev || xs->queue_id != xdp->rxq->queue_index) { + err = -EINVAL; + goto out_unlock; + }
if (!xskq_peek_addr(xs->umem->fq, &addr) || len > xs->umem->chunk_size_nohr - XDP_PACKET_HEADROOM) { - xs->rx_dropped++; - return -ENOSPC; + err = -ENOSPC; + goto out_drop; }
addr += xs->umem->headroom; @@ -138,13 +142,21 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp) memcpy(buffer, xdp->data_meta, len + metalen); addr += metalen; err = xskq_produce_batch_desc(xs->rx, addr, len); - if (!err) { - xskq_discard_addr(xs->umem->fq); - xsk_flush(xs); - return 0; - } + if (err) + goto out_drop; + + xskq_discard_addr(xs->umem->fq); + xskq_produce_flush_desc(xs->rx);
+ spin_unlock_bh(&xs->rx_lock); + + xs->sk.sk_data_ready(&xs->sk); + return 0; + +out_drop: xs->rx_dropped++; +out_unlock: + spin_unlock_bh(&xs->rx_lock); return err; }
@@ -765,6 +777,7 @@ static int xsk_create(struct net *net, struct socket *sock, int protocol,
xs = xdp_sk(sk); mutex_init(&xs->mutex); + spin_lock_init(&xs->rx_lock); spin_lock_init(&xs->tx_completion_lock);
mutex_lock(&net->xdp.lock);
[ Upstream commit 433a06d7d74e677c40b1148c70c48677ff62fb6b ]
Defer probing of the orion-mdio interface when getting a clock returns EPROBE_DEFER. This avoids locking up the Armada 8k SoC when mdio is used before all clocks have been enabled.
Signed-off-by: Josua Mayer josua@solid-run.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/mvmdio.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/marvell/mvmdio.c b/drivers/net/ethernet/marvell/mvmdio.c index c5dac6bd2be4..903836e334d8 100644 --- a/drivers/net/ethernet/marvell/mvmdio.c +++ b/drivers/net/ethernet/marvell/mvmdio.c @@ -321,6 +321,10 @@ static int orion_mdio_probe(struct platform_device *pdev)
for (i = 0; i < ARRAY_SIZE(dev->clk); i++) { dev->clk[i] = of_clk_get(pdev->dev.of_node, i); + if (PTR_ERR(dev->clk[i]) == -EPROBE_DEFER) { + ret = -EPROBE_DEFER; + goto out_clk; + } if (IS_ERR(dev->clk[i])) break; clk_prepare_enable(dev->clk[i]); @@ -362,6 +366,7 @@ static int orion_mdio_probe(struct platform_device *pdev) if (dev->err_interrupt > 0) writel(0, dev->regs + MVMDIO_ERR_INT_MASK);
+out_clk: for (i = 0; i < ARRAY_SIZE(dev->clk); i++) { if (IS_ERR(dev->clk[i])) break;
[ Upstream commit f96315f2f17e7b2580d2fec7c4d6a706a131d904 ]
When change MTU or other operations, which just calling .reset_notify to do HNAE3_DOWN_CLIENT and HNAE3_UP_CLIENT, then the netdev_tx_reset_queue() in the hns3_clear_all_ring() will be ignored. So the dev_watchdog() may misdiagnose a TX timeout.
This patch separates netdev_tx_reset_queue() from hns3_clear_all_ring(), and unifies hns3_clear_all_ring() and hns3_force_clear_all_ring into one, since they are doing similar things.
Fixes: 3a30964a2eef ("net: hns3: delay ring buffer clearing during reset") Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/hisilicon/hns3/hns3_enet.c | 54 +++++++++---------- 1 file changed, 26 insertions(+), 28 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index e0d3e2f9801d..66b691b7221f 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -27,8 +27,7 @@ #define hns3_set_field(origin, shift, val) ((origin) |= ((val) << (shift))) #define hns3_tx_bd_count(S) DIV_ROUND_UP(S, HNS3_MAX_BD_SIZE)
-static void hns3_clear_all_ring(struct hnae3_handle *h); -static void hns3_force_clear_all_ring(struct hnae3_handle *h); +static void hns3_clear_all_ring(struct hnae3_handle *h, bool force); static void hns3_remove_hw_addr(struct net_device *netdev);
static const char hns3_driver_name[] = "hns3"; @@ -466,6 +465,20 @@ static int hns3_nic_net_open(struct net_device *netdev) return 0; }
+static void hns3_reset_tx_queue(struct hnae3_handle *h) +{ + struct net_device *ndev = h->kinfo.netdev; + struct hns3_nic_priv *priv = netdev_priv(ndev); + struct netdev_queue *dev_queue; + u32 i; + + for (i = 0; i < h->kinfo.num_tqps; i++) { + dev_queue = netdev_get_tx_queue(ndev, + priv->ring_data[i].queue_index); + netdev_tx_reset_queue(dev_queue); + } +} + static void hns3_nic_net_down(struct net_device *netdev) { struct hns3_nic_priv *priv = netdev_priv(netdev); @@ -496,7 +509,9 @@ static void hns3_nic_net_down(struct net_device *netdev) * to disable the ring through firmware when downing the netdev. */ if (!hns3_nic_resetting(netdev)) - hns3_clear_all_ring(priv->ae_handle); + hns3_clear_all_ring(priv->ae_handle, false); + + hns3_reset_tx_queue(priv->ae_handle); }
static int hns3_nic_net_stop(struct net_device *netdev) @@ -3888,7 +3903,7 @@ static void hns3_client_uninit(struct hnae3_handle *handle, bool reset)
hns3_del_all_fd_rules(netdev, true);
- hns3_force_clear_all_ring(handle); + hns3_clear_all_ring(handle, true);
hns3_uninit_phy(netdev);
@@ -4060,43 +4075,26 @@ static void hns3_force_clear_rx_ring(struct hns3_enet_ring *ring) } }
-static void hns3_force_clear_all_ring(struct hnae3_handle *h) -{ - struct net_device *ndev = h->kinfo.netdev; - struct hns3_nic_priv *priv = netdev_priv(ndev); - struct hns3_enet_ring *ring; - u32 i; - - for (i = 0; i < h->kinfo.num_tqps; i++) { - ring = priv->ring_data[i].ring; - hns3_clear_tx_ring(ring); - - ring = priv->ring_data[i + h->kinfo.num_tqps].ring; - hns3_force_clear_rx_ring(ring); - } -} - -static void hns3_clear_all_ring(struct hnae3_handle *h) +static void hns3_clear_all_ring(struct hnae3_handle *h, bool force) { struct net_device *ndev = h->kinfo.netdev; struct hns3_nic_priv *priv = netdev_priv(ndev); u32 i;
for (i = 0; i < h->kinfo.num_tqps; i++) { - struct netdev_queue *dev_queue; struct hns3_enet_ring *ring;
ring = priv->ring_data[i].ring; hns3_clear_tx_ring(ring); - dev_queue = netdev_get_tx_queue(ndev, - priv->ring_data[i].queue_index); - netdev_tx_reset_queue(dev_queue);
ring = priv->ring_data[i + h->kinfo.num_tqps].ring; /* Continue to clear other rings even if clearing some * rings failed. */ - hns3_clear_rx_ring(ring); + if (force) + hns3_force_clear_rx_ring(ring); + else + hns3_clear_rx_ring(ring); } }
@@ -4305,8 +4303,8 @@ static int hns3_reset_notify_uninit_enet(struct hnae3_handle *handle) return 0; }
- hns3_clear_all_ring(handle); - hns3_force_clear_all_ring(handle); + hns3_clear_all_ring(handle, true); + hns3_reset_tx_queue(priv->ae_handle);
hns3_nic_uninit_vector_data(priv);
[ Upstream commit 9fe06a51287b2d41baef7ece94df34b5abf19b90 ]
A recent commit efa14c3985828d ("iavf: allow null RX descriptors") added a null pointer sanity check on rx_buffer, however, rx_buffer is being dereferenced before that check, which implies a null pointer dereference bug can potentially occur. Fix this by only dereferencing rx_buffer until after the null pointer check.
Addresses-Coverity: ("Dereference before null check") Signed-off-by: Colin Ian King colin.king@canonical.com Tested-by: Andrew Bowers andrewx.bowers@intel.com Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf_txrx.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c b/drivers/net/ethernet/intel/iavf/iavf_txrx.c index c97b9ecf026a..26422bc9ca8c 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c +++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c @@ -1296,7 +1296,7 @@ static struct sk_buff *iavf_construct_skb(struct iavf_ring *rx_ring, struct iavf_rx_buffer *rx_buffer, unsigned int size) { - void *va = page_address(rx_buffer->page) + rx_buffer->page_offset; + void *va; #if (PAGE_SIZE < 8192) unsigned int truesize = iavf_rx_pg_size(rx_ring) / 2; #else @@ -1308,6 +1308,7 @@ static struct sk_buff *iavf_construct_skb(struct iavf_ring *rx_ring, if (!rx_buffer) return NULL; /* prefetch first cache line of first page */ + va = page_address(rx_buffer->page) + rx_buffer->page_offset; prefetch(va); #if L1_CACHE_BYTES < 128 prefetch(va + L1_CACHE_BYTES); @@ -1362,7 +1363,7 @@ static struct sk_buff *iavf_build_skb(struct iavf_ring *rx_ring, struct iavf_rx_buffer *rx_buffer, unsigned int size) { - void *va = page_address(rx_buffer->page) + rx_buffer->page_offset; + void *va; #if (PAGE_SIZE < 8192) unsigned int truesize = iavf_rx_pg_size(rx_ring) / 2; #else @@ -1374,6 +1375,7 @@ static struct sk_buff *iavf_build_skb(struct iavf_ring *rx_ring, if (!rx_buffer) return NULL; /* prefetch first cache line of first page */ + va = page_address(rx_buffer->page) + rx_buffer->page_offset; prefetch(va); #if L1_CACHE_BYTES < 128 prefetch(va + L1_CACHE_BYTES);
[ Upstream commit c9b3007feca018d3f7061f5d5a14cb00766ffe9b ]
The iolatency controller is based on rq_qos. It increments on rq_qos_throttle() and decrements on either rq_qos_cleanup() or rq_qos_done_bio(). a3fb01ba5af0 fixes the double accounting issue where blk_mq_make_request() may call both rq_qos_cleanup() and rq_qos_done_bio() on REQ_NO_WAIT. So checking STS_AGAIN prevents the double decrement.
The above works upstream as the only way we can get STS_AGAIN is from blk_mq_get_request() failing. The STS_AGAIN handling isn't a real problem as bio_endio() skipping only happens on reserved tag allocation failures which can only be caused by driver bugs and already triggers WARN.
However, the fix creates a not so great dependency on how STS_AGAIN can be propagated. Internally, we (Facebook) carry a patch that kills read ahead if a cgroup is io congested or a fatal signal is pending. This combined with chained bios progagate their bi_status to the parent is not already set can can cause the parent bio to not clean up properly even though it was successful. This consequently leaks the inflight counter and can hang all IOs under that blkg.
To nip the adverse interaction early, this removes the rq_qos_cleanup() callback in iolatency in favor of cleaning up always on the rq_qos_done_bio() path.
Fixes: a3fb01ba5af0 ("blk-iolatency: only account submitted bios") Debugged-by: Tejun Heo tj@kernel.org Debugged-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Dennis Zhou dennis@kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-iolatency.c | 51 ++++++++++++------------------------------- 1 file changed, 14 insertions(+), 37 deletions(-)
diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c index c91b84bb9d0a..a1eb5e9ac904 100644 --- a/block/blk-iolatency.c +++ b/block/blk-iolatency.c @@ -600,10 +600,6 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio) if (!blkg || !bio_flagged(bio, BIO_TRACKED)) return;
- /* We didn't actually submit this bio, don't account it. */ - if (bio->bi_status == BLK_STS_AGAIN) - return; - iolat = blkg_to_lat(bio->bi_blkg); if (!iolat) return; @@ -622,40 +618,22 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio)
inflight = atomic_dec_return(&rqw->inflight); WARN_ON_ONCE(inflight < 0); - if (iolat->min_lat_nsec == 0) - goto next; - iolatency_record_time(iolat, &bio->bi_issue, now, - issue_as_root); - window_start = atomic64_read(&iolat->window_start); - if (now > window_start && - (now - window_start) >= iolat->cur_win_nsec) { - if (atomic64_cmpxchg(&iolat->window_start, - window_start, now) == window_start) - iolatency_check_latencies(iolat, now); + /* + * If bi_status is BLK_STS_AGAIN, the bio wasn't actually + * submitted, so do not account for it. + */ + if (iolat->min_lat_nsec && bio->bi_status != BLK_STS_AGAIN) { + iolatency_record_time(iolat, &bio->bi_issue, now, + issue_as_root); + window_start = atomic64_read(&iolat->window_start); + if (now > window_start && + (now - window_start) >= iolat->cur_win_nsec) { + if (atomic64_cmpxchg(&iolat->window_start, + window_start, now) == window_start) + iolatency_check_latencies(iolat, now); + } } -next: - wake_up(&rqw->wait); - blkg = blkg->parent; - } -} - -static void blkcg_iolatency_cleanup(struct rq_qos *rqos, struct bio *bio) -{ - struct blkcg_gq *blkg; - - blkg = bio->bi_blkg; - while (blkg && blkg->parent) { - struct rq_wait *rqw; - struct iolatency_grp *iolat; - - iolat = blkg_to_lat(blkg); - if (!iolat) - goto next; - - rqw = &iolat->rq_wait; - atomic_dec(&rqw->inflight); wake_up(&rqw->wait); -next: blkg = blkg->parent; } } @@ -671,7 +649,6 @@ static void blkcg_iolatency_exit(struct rq_qos *rqos)
static struct rq_qos_ops blkcg_iolatency_ops = { .throttle = blkcg_iolatency_throttle, - .cleanup = blkcg_iolatency_cleanup, .done_bio = blkcg_iolatency_done_bio, .exit = blkcg_iolatency_exit, };
[ Upstream commit 763ff0e7d9c72e7094b31e7fb84a859be9325635 ]
Similar issue was fixed in cdfc7f888c2a ("libbpf: fix GCC8 warning for strncpy") already. This one was missed. Fixing now.
Cc: Magnus Karlsson magnus.karlsson@intel.com Signed-off-by: Andrii Nakryiko andriin@fb.com Acked-by: Magnus Karlsson magnus.karlsson@intel.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/xsk.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c index 8a7a05bc657d..ca272c5b67f4 100644 --- a/tools/lib/bpf/xsk.c +++ b/tools/lib/bpf/xsk.c @@ -562,7 +562,8 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname, err = -errno; goto out_socket; } - strncpy(xsk->ifname, ifname, IFNAMSIZ); + strncpy(xsk->ifname, ifname, IFNAMSIZ - 1); + xsk->ifname[IFNAMSIZ - 1] = '\0';
err = xsk_set_xdp_socket_config(&xsk->config, usr_config); if (err)
[ Upstream commit f3554aeb991214cbfafd17d55e2bfddb50282e32 ]
This fixes a divide by zero error in the setup_format_params function of the floppy driver.
Two consecutive ioctls can trigger the bug: The first one should set the drive geometry with such .sect and .rate values for the F_SECT_PER_TRACK to become zero. Next, the floppy format operation should be called.
A floppy disk is not required to be inserted. An unprivileged user could trigger the bug if the device is accessible.
The patch checks F_SECT_PER_TRACK for a non-zero value in the set_geometry function. The proper check should involve a reasonable upper limit for the .sect and .rate fields, but it could change the UAPI.
The patch also checks F_SECT_PER_TRACK in the setup_format_params, and cancels the formatting operation in case of zero.
The bug was found by syzkaller.
Signed-off-by: Denis Efremov efremov@ispras.ru Tested-by: Willy Tarreau w@1wt.eu Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/floppy.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c index 9fb9b312ab6b..51246bc9709a 100644 --- a/drivers/block/floppy.c +++ b/drivers/block/floppy.c @@ -2120,6 +2120,9 @@ static void setup_format_params(int track) raw_cmd->kernel_data = floppy_track_buffer; raw_cmd->length = 4 * F_SECT_PER_TRACK;
+ if (!F_SECT_PER_TRACK) + return; + /* allow for about 30ms for data transport per track */ head_shift = (F_SECT_PER_TRACK + 5) / 6;
@@ -3232,6 +3235,8 @@ static int set_geometry(unsigned int cmd, struct floppy_struct *g, /* sanity checking for parameters. */ if (g->sect <= 0 || g->head <= 0 || + /* check for zero in F_SECT_PER_TRACK */ + (unsigned char)((g->sect << 2) >> FD_SIZECODE(g)) == 0 || g->track <= 0 || g->track > UDP->tracks >> STRETCH(g) || /* check if reserved bits are set */ (g->stretch & ~(FD_STRETCH | FD_SWAPSIDES | FD_SECTBASEMASK)) != 0)
[ Upstream commit 5635f897ed83fd539df78e98ba69ee91592f9bb8 ]
This fixes a global out-of-bounds read access in the next_valid_format function of the floppy driver.
The values from autodetect field of the struct floppy_drive_params are used as indices for the floppy_type array in the next_valid_format function 'floppy_type[DP->autodetect[probed_format]].sect'.
To trigger the bug, one could use a value out of range and set the drive parameters with the FDSETDRVPRM ioctl. A floppy disk is not required to be inserted.
CAP_SYS_ADMIN is required to call FDSETDRVPRM.
The patch adds the check for values of the autodetect field to be in the '0 <= x < ARRAY_SIZE(floppy_type)' range of the floppy_type array indices.
The bug was found by syzkaller.
Signed-off-by: Denis Efremov efremov@ispras.ru Tested-by: Willy Tarreau w@1wt.eu Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/floppy.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c index 51246bc9709a..b70d6e103a57 100644 --- a/drivers/block/floppy.c +++ b/drivers/block/floppy.c @@ -3380,6 +3380,20 @@ static int fd_getgeo(struct block_device *bdev, struct hd_geometry *geo) return 0; }
+static bool valid_floppy_drive_params(const short autodetect[8]) +{ + size_t floppy_type_size = ARRAY_SIZE(floppy_type); + size_t i = 0; + + for (i = 0; i < 8; ++i) { + if (autodetect[i] < 0 || + autodetect[i] >= floppy_type_size) + return false; + } + + return true; +} + static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int cmd, unsigned long param) { @@ -3506,6 +3520,8 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int SUPBOUND(size, strlen((const char *)outparam) + 1); break; case FDSETDRVPRM: + if (!valid_floppy_drive_params(inparam.dp.autodetect)) + return -EINVAL; *UDP = inparam.dp; break; case FDGETDRVPRM: @@ -3703,6 +3719,8 @@ static int compat_setdrvprm(int drive, return -EPERM; if (copy_from_user(&v, arg, sizeof(struct compat_floppy_drive_params))) return -EFAULT; + if (!valid_floppy_drive_params(v.autodetect)) + return -EINVAL; mutex_lock(&floppy_mutex); UDP->cmos = v.cmos; UDP->max_dtr = v.max_dtr;
[ Upstream commit 9b04609b784027968348796a18f601aed9db3789 ]
This fixes the invalid pointer dereference in the drive_name function of the floppy driver.
The native_format field of the struct floppy_drive_params is used as floppy_type array index in the drive_name function. Thus, the field should be checked the same way as the autodetect field.
To trigger the bug, one could use a value out of range and set the drive parameters with the FDSETDRVPRM ioctl. Next, FDGETDRVTYP ioctl should be used to call the drive_name. A floppy disk is not required to be inserted.
CAP_SYS_ADMIN is required to call FDSETDRVPRM.
The patch adds the check for a value of the native_format field to be in the '0 <= x < ARRAY_SIZE(floppy_type)' range of the floppy_type array indices.
The bug was found by syzkaller.
Signed-off-by: Denis Efremov efremov@ispras.ru Tested-by: Willy Tarreau w@1wt.eu Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/floppy.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c index b70d6e103a57..671a0ae434b4 100644 --- a/drivers/block/floppy.c +++ b/drivers/block/floppy.c @@ -3380,7 +3380,8 @@ static int fd_getgeo(struct block_device *bdev, struct hd_geometry *geo) return 0; }
-static bool valid_floppy_drive_params(const short autodetect[8]) +static bool valid_floppy_drive_params(const short autodetect[8], + int native_format) { size_t floppy_type_size = ARRAY_SIZE(floppy_type); size_t i = 0; @@ -3391,6 +3392,9 @@ static bool valid_floppy_drive_params(const short autodetect[8]) return false; }
+ if (native_format < 0 || native_format >= floppy_type_size) + return false; + return true; }
@@ -3520,7 +3524,8 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int SUPBOUND(size, strlen((const char *)outparam) + 1); break; case FDSETDRVPRM: - if (!valid_floppy_drive_params(inparam.dp.autodetect)) + if (!valid_floppy_drive_params(inparam.dp.autodetect, + inparam.dp.native_format)) return -EINVAL; *UDP = inparam.dp; break; @@ -3719,7 +3724,7 @@ static int compat_setdrvprm(int drive, return -EPERM; if (copy_from_user(&v, arg, sizeof(struct compat_floppy_drive_params))) return -EFAULT; - if (!valid_floppy_drive_params(v.autodetect)) + if (!valid_floppy_drive_params(v.autodetect, v.native_format)) return -EINVAL; mutex_lock(&floppy_mutex); UDP->cmos = v.cmos;
[ Upstream commit da99466ac243f15fbba65bd261bfc75ffa1532b6 ]
This fixes a global out-of-bounds read access in the copy_buffer function of the floppy driver.
The FDDEFPRM ioctl allows one to set the geometry of a disk. The sect and head fields (unsigned int) of the floppy_drive structure are used to compute the max_sector (int) in the make_raw_rw_request function. It is possible to overflow the max_sector. Next, max_sector is passed to the copy_buffer function and used in one of the memcpy calls.
An unprivileged user could trigger the bug if the device is accessible, but requires a floppy disk to be inserted.
The patch adds the check for the .sect * .head multiplication for not overflowing in the set_geometry function.
The bug was found by syzkaller.
Signed-off-by: Denis Efremov efremov@ispras.ru Tested-by: Willy Tarreau w@1wt.eu Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/floppy.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c index 671a0ae434b4..fee57f7f3821 100644 --- a/drivers/block/floppy.c +++ b/drivers/block/floppy.c @@ -3233,8 +3233,10 @@ static int set_geometry(unsigned int cmd, struct floppy_struct *g, int cnt;
/* sanity checking for parameters. */ - if (g->sect <= 0 || - g->head <= 0 || + if ((int)g->sect <= 0 || + (int)g->head <= 0 || + /* check for overflow in max_sector */ + (int)(g->sect * g->head) <= 0 || /* check for zero in F_SECT_PER_TRACK */ (unsigned char)((g->sect << 2) >> FD_SIZECODE(g)) == 0 || g->track <= 0 || g->track > UDP->tracks >> STRETCH(g) ||
From: Juergen Gross jgross@suse.com
commit a1078e821b605813b63bf6bca414a85f804d5c66 upstream.
Instead of trying to allocate pages with GFP_USER in add_ballooned_pages() check the available free memory via si_mem_available(). GFP_USER is far less limiting memory exhaustion than the test via si_mem_available().
This will avoid dom0 running out of memory due to excessive foreign page mappings especially on ARM and on x86 in PVH mode, as those don't have a pre-ballooned area which can be used for foreign mappings.
As the normal ballooning suffers from the same problem don't balloon down more than si_mem_available() pages in one iteration. At the same time limit the default maximum number of retries.
This is part of XSA-300.
Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/xen/balloon.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
--- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -538,8 +538,15 @@ static void balloon_process(struct work_ state = reserve_additional_memory(); }
- if (credit < 0) - state = decrease_reservation(-credit, GFP_BALLOON); + if (credit < 0) { + long n_pages; + + n_pages = min(-credit, si_mem_available()); + state = decrease_reservation(n_pages, GFP_BALLOON); + if (state == BP_DONE && n_pages != -credit && + n_pages < totalreserve_pages) + state = BP_EAGAIN; + }
state = update_schedule(state);
@@ -578,6 +585,9 @@ static int add_ballooned_pages(int nr_pa } }
+ if (si_mem_available() < nr_pages) + return -ENOMEM; + st = decrease_reservation(nr_pages, GFP_USER); if (st != BP_DONE) return -ENOMEM; @@ -710,7 +720,7 @@ static int __init balloon_init(void) balloon_stats.schedule_delay = 1; balloon_stats.max_schedule_delay = 32; balloon_stats.retry_count = 1; - balloon_stats.max_retry_count = RETRY_UNLIMITED; + balloon_stats.max_retry_count = 4;
#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG set_online_page_callback(&xen_online_page);
From: Finn Thain fthain@telegraphics.com.au
commit 57f31326518e98ee4cabf9a04efe00ed57c54147 upstream.
The reselection interrupt gets disabled during selection and must be re-enabled when hostdata->connected becomes NULL. If it isn't re-enabled a disconnected command may time-out or the target may wedge the bus while trying to reselect the host. This can happen after a command is aborted.
Fix this by enabling the reselection interrupt in NCR5380_main() after calls to NCR5380_select() and NCR5380_information_transfer() return.
Cc: Michael Schmitz schmitzmic@gmail.com Cc: stable@vger.kernel.org # v4.9+ Fixes: 8b00c3d5d40d ("ncr5380: Implement new eh_abort_handler") Signed-off-by: Finn Thain fthain@telegraphics.com.au Tested-by: Stan Johnson userm57@yahoo.com Tested-by: Michael Schmitz schmitzmic@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/NCR5380.c | 12 ++---------- 1 file changed, 2 insertions(+), 10 deletions(-)
--- a/drivers/scsi/NCR5380.c +++ b/drivers/scsi/NCR5380.c @@ -709,6 +709,8 @@ static void NCR5380_main(struct work_str NCR5380_information_transfer(instance); done = 0; } + if (!hostdata->connected) + NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask); spin_unlock_irq(&hostdata->lock); if (!done) cond_resched(); @@ -1110,8 +1112,6 @@ static bool NCR5380_select(struct Scsi_H spin_lock_irq(&hostdata->lock); NCR5380_write(INITIATOR_COMMAND_REG, ICR_BASE); NCR5380_reselect(instance); - if (!hostdata->connected) - NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask); shost_printk(KERN_ERR, instance, "reselection after won arbitration?\n"); goto out; } @@ -1119,7 +1119,6 @@ static bool NCR5380_select(struct Scsi_H if (err < 0) { spin_lock_irq(&hostdata->lock); NCR5380_write(INITIATOR_COMMAND_REG, ICR_BASE); - NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask);
/* Can't touch cmd if it has been reclaimed by the scsi ML */ if (!hostdata->selecting) @@ -1157,7 +1156,6 @@ static bool NCR5380_select(struct Scsi_H if (err < 0) { shost_printk(KERN_ERR, instance, "select: REQ timeout\n"); NCR5380_write(INITIATOR_COMMAND_REG, ICR_BASE); - NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask); goto out; } if (!hostdata->selecting) { @@ -1826,9 +1824,6 @@ static void NCR5380_information_transfer */ NCR5380_write(TARGET_COMMAND_REG, 0);
- /* Enable reselect interrupts */ - NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask); - maybe_release_dma_irq(instance); return; case MESSAGE_REJECT: @@ -1860,8 +1855,6 @@ static void NCR5380_information_transfer */ NCR5380_write(TARGET_COMMAND_REG, 0);
- /* Enable reselect interrupts */ - NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask); #ifdef SUN3_SCSI_VME dregs->csr |= CSR_DMA_ENABLE; #endif @@ -1964,7 +1957,6 @@ static void NCR5380_information_transfer cmd->result = DID_ERROR << 16; complete_cmd(instance, cmd); maybe_release_dma_irq(instance); - NCR5380_write(SELECT_ENABLE_REG, hostdata->id_mask); return; } msgout = NOP;
From: Finn Thain fthain@telegraphics.com.au
commit f9dfed1c785734b95b08d67600e05d2092508ab0 upstream.
A PDMA error is handled in the core driver by setting the device's 'borken' flag and aborting the command. Unfortunately, do_abort() is not dependable. Perform a SCSI bus reset instead, to make sure that the command fails and gets retried.
Cc: Michael Schmitz schmitzmic@gmail.com Cc: stable@vger.kernel.org # v4.20+ Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Finn Thain fthain@telegraphics.com.au Tested-by: Stan Johnson userm57@yahoo.com Tested-by: Michael Schmitz schmitzmic@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/NCR5380.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/scsi/NCR5380.c +++ b/drivers/scsi/NCR5380.c @@ -1761,10 +1761,8 @@ static void NCR5380_information_transfer scmd_printk(KERN_INFO, cmd, "switching to slow handshake\n"); cmd->device->borken = 1; - sink = 1; - do_abort(instance); - cmd->result = DID_ERROR << 16; - /* XXX - need to source or sink data here, as appropriate */ + do_reset(instance); + bus_reset_cleanup(instance); } } else { /* Transfer a small chunk so that the
From: Finn Thain fthain@telegraphics.com.au
commit 25fcf94a2fa89dd3e73e965ebb0b38a2a4f72aa4 upstream.
This reverts commit 4822827a69d7cd3bc5a07b7637484ebd2cf88db6.
The purpose of that commit was to suppress a timeout warning message which appeared to be caused by target latency. But suppressing the warning is undesirable as the warning may indicate a messed up transfer count.
Another problem with that commit is that 15 ms is too long to keep interrupts disabled as interrupt latency can cause system clock drift and other problems.
Cc: Michael Schmitz schmitzmic@gmail.com Cc: stable@vger.kernel.org Fixes: 4822827a69d7 ("scsi: ncr5380: Increase register polling limit") Signed-off-by: Finn Thain fthain@telegraphics.com.au Tested-by: Stan Johnson userm57@yahoo.com Tested-by: Michael Schmitz schmitzmic@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/NCR5380.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/scsi/NCR5380.h +++ b/drivers/scsi/NCR5380.h @@ -235,7 +235,7 @@ struct NCR5380_cmd { #define NCR5380_PIO_CHUNK_SIZE 256
/* Time limit (ms) to poll registers when IRQs are disabled, e.g. during PDMA */ -#define NCR5380_REG_POLL_TIME 15 +#define NCR5380_REG_POLL_TIME 10
static inline struct scsi_cmnd *NCR5380_to_scmd(struct NCR5380_cmd *ncmd_ptr) {
From: Ming Lei ming.lei@redhat.com
commit f9b0530fa02e0c73f31a49ef743e8f44eb8e32cc upstream.
When scsi_init_sense_cache(host) is called concurrently from different hosts, each code path may find that no cache has been created and allocate a new one. The lack of locking can lead to potentially overriding a cache allocated by a different host.
Fix the issue by moving 'mutex_lock(&scsi_sense_cache_mutex)' before scsi_select_sense_cache().
Fixes: 0a6ac4ee7c21 ("scsi: respect unchecked_isa_dma for blk-mq") Cc: Stable stable@vger.kernel.org Cc: Christoph Hellwig hch@lst.de Cc: Hannes Reinecke hare@suse.com Cc: Ewan D. Milne emilne@redhat.com Signed-off-by: Ming Lei ming.lei@redhat.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/scsi_lib.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -72,11 +72,11 @@ int scsi_init_sense_cache(struct Scsi_Ho struct kmem_cache *cache; int ret = 0;
+ mutex_lock(&scsi_sense_cache_mutex); cache = scsi_select_sense_cache(shost->unchecked_isa_dma); if (cache) - return 0; + goto exit;
- mutex_lock(&scsi_sense_cache_mutex); if (shost->unchecked_isa_dma) { scsi_sense_isadma_cache = kmem_cache_create("scsi_sense_cache(DMA)", @@ -92,7 +92,7 @@ int scsi_init_sense_cache(struct Scsi_Ho if (!scsi_sense_cache) ret = -ENOMEM; } - + exit: mutex_unlock(&scsi_sense_cache_mutex); return ret; }
From: Damien Le Moal damien.lemoal@wdc.com
commit 0cdc58580b37a160fac4b884266b8b7cb096f539 upstream.
kbuild test robot gets the following compilation warning using gcc 7.4 cross compilation for c6x (GCC_VERSION=7.4.0 make.cross ARCH=c6x).
In file included from include/asm-generic/bug.h:18:0, from arch/c6x/include/asm/bug.h:12, from include/linux/bug.h:5, from include/linux/thread_info.h:12, from include/asm-generic/current.h:5, from ./arch/c6x/include/generated/asm/current.h:1, from include/linux/sched.h:12, from include/linux/blkdev.h:5, from drivers//scsi/sd_zbc.c:11: drivers//scsi/sd_zbc.c: In function 'sd_zbc_read_zones':
include/linux/kernel.h:62:48: warning: 'zone_blocks' may be used
uninitialized in this function [-Wmaybe-uninitialized] #define __round_mask(x, y) ((__typeof__(x))((y)-1)) ^ drivers//scsi/sd_zbc.c:464:6: note: 'zone_blocks' was declared here u32 zone_blocks; ^~~~~~~~~~~
This is a false-positive report. The variable zone_blocks is always initialized in sd_zbc_check_zones() before use. It is not initialized only and only if sd_zbc_check_zones() fails.
Avoid this warning by initializing the zone_blocks variable to 0.
Fixes: 5f832a395859 ("scsi: sd_zbc: Fix sd_zbc_check_zones() error checks") Cc: Stable stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/sd_zbc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/scsi/sd_zbc.c +++ b/drivers/scsi/sd_zbc.c @@ -417,7 +417,7 @@ int sd_zbc_read_zones(struct scsi_disk * { struct gendisk *disk = sdkp->disk; unsigned int nr_zones; - u32 zone_blocks; + u32 zone_blocks = 0; int ret;
if (!sd_is_zoned(sdkp))
From: Benjamin Block bblock@linux.ibm.com
commit b76becde2b84137faa29bbc9a3b20953b5980e48 upstream.
With a recent change to our send path for FSF commands we introduced a possible use-after-free of request-objects, that might further lead to zfcp crafting bad requests, which the FCP channel correctly complains about with an error (FSF_PROT_SEQ_NUMB_ERROR). This error is then handled by an adapter-wide recovery.
The following sequence illustrates the possible use-after-free:
Send Path:
int zfcp_fsf_open_port(struct zfcp_erp_action *erp_action) { struct zfcp_fsf_req *req; ... spin_lock_irq(&qdio->req_q_lock); // ^^^^^^^^^^^^^^^^ // protects QDIO queue during sending ... req = zfcp_fsf_req_create(qdio, FSF_QTCB_OPEN_PORT_WITH_DID, SBAL_SFLAGS0_TYPE_READ, qdio->adapter->pool.erp_req); // ^^^^^^^^^^^^^^^^^^^ // allocation of the request-object ... retval = zfcp_fsf_req_send(req); ... spin_unlock_irq(&qdio->req_q_lock); return retval; }
static int zfcp_fsf_req_send(struct zfcp_fsf_req *req) { struct zfcp_adapter *adapter = req->adapter; struct zfcp_qdio *qdio = adapter->qdio; ... zfcp_reqlist_add(adapter->req_list, req); // ^^^^^^^^^^^^^^^^ // add request to our driver-internal hash-table for tracking // (protected by separate lock req_list->lock) ... if (zfcp_qdio_send(qdio, &req->qdio_req)) { // ^^^^^^^^^^^^^^ // hand-off the request to FCP channel; // the request can complete at any point now ... }
/* Don't increase for unsolicited status */ if (!zfcp_fsf_req_is_status_read_buffer(req)) // ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ // possible use-after-free adapter->fsf_req_seq_no++; // ^^^^^^^^^^^^^^^^ // because of the use-after-free we might // miss this accounting, and as follow-up // this results in the FCP channel error // FSF_PROT_SEQ_NUMB_ERROR adapter->req_no++;
return 0; }
static inline bool zfcp_fsf_req_is_status_read_buffer(struct zfcp_fsf_req *req) { return req->qtcb == NULL; // ^^^^^^^^^ // possible use-after-free }
Response Path:
void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx) { ... struct zfcp_fsf_req *fsf_req; ... for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) { ... fsf_req = zfcp_reqlist_find_rm(adapter->req_list, req_id); // ^^^^^^^^^^^^^^^^^^^^ // remove request from our driver-internal // hash-table (lock req_list->lock) ... zfcp_fsf_req_complete(fsf_req); } }
static void zfcp_fsf_req_complete(struct zfcp_fsf_req *req) { ... if (likely(req->status & ZFCP_STATUS_FSFREQ_CLEANUP)) zfcp_fsf_req_free(req); // ^^^^^^^^^^^^^^^^^ // free memory for request-object else complete(&req->completion); // ^^^^^^^^ // completion notification for code-paths that wait // synchronous for the completion of the request; in // those the memory is freed separately }
The result of the use-after-free only affects the send path, and can not lead to any data corruption. In case we miss the sequence-number accounting, because the memory was already re-purposed, the next FSF command will fail with said FCP channel error, and we will recover the whole adapter. This causes no additional errors, but it slows down traffic. There is a slight chance of the same thing happen again recursively after the adapter recovery, but so far this has not been seen.
This was seen under z/VM, where the send path might run on a virtual CPU that gets scheduled away by z/VM, while the return path might still run, and so create the necessary timing. Running with KASAN can also slow down the kernel sufficiently to run into this user-after-free, and then see the report by KASAN.
To fix this, simply pull the test for the sequence-number accounting in front of the hand-off to the FCP channel (this information doesn't change during hand-off), but leave the sequence-number accounting itself where it is.
To make future regressions of the same kind less likely, add comments to all closely related code-paths.
Signed-off-by: Benjamin Block bblock@linux.ibm.com Fixes: f9eca0227600 ("scsi: zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in QTCB header") Cc: stable@vger.kernel.org #5.0+ Reviewed-by: Steffen Maier maier@linux.ibm.com Reviewed-by: Jens Remus jremus@linux.ibm.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/s390/scsi/zfcp_fsf.c | 45 ++++++++++++++++++++++++++++++++++++++----- 1 file changed, 40 insertions(+), 5 deletions(-)
--- a/drivers/s390/scsi/zfcp_fsf.c +++ b/drivers/s390/scsi/zfcp_fsf.c @@ -11,6 +11,7 @@ #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
#include <linux/blktrace_api.h> +#include <linux/types.h> #include <linux/slab.h> #include <scsi/fc/fc_els.h> #include "zfcp_ext.h" @@ -741,6 +742,7 @@ static struct zfcp_fsf_req *zfcp_fsf_req
static int zfcp_fsf_req_send(struct zfcp_fsf_req *req) { + const bool is_srb = zfcp_fsf_req_is_status_read_buffer(req); struct zfcp_adapter *adapter = req->adapter; struct zfcp_qdio *qdio = adapter->qdio; int req_id = req->req_id; @@ -757,8 +759,20 @@ static int zfcp_fsf_req_send(struct zfcp return -EIO; }
+ /* + * NOTE: DO NOT TOUCH ASYNC req PAST THIS POINT. + * ONLY TOUCH SYNC req AGAIN ON req->completion. + * + * The request might complete and be freed concurrently at any point + * now. This is not protected by the QDIO-lock (req_q_lock). So any + * uncontrolled access after this might result in an use-after-free bug. + * Only if the request doesn't have ZFCP_STATUS_FSFREQ_CLEANUP set, and + * when it is completed via req->completion, is it safe to use req + * again. + */ + /* Don't increase for unsolicited status */ - if (!zfcp_fsf_req_is_status_read_buffer(req)) + if (!is_srb) adapter->fsf_req_seq_no++; adapter->req_no++;
@@ -805,6 +819,7 @@ int zfcp_fsf_status_read(struct zfcp_qdi retval = zfcp_fsf_req_send(req); if (retval) goto failed_req_send; + /* NOTE: DO NOT TOUCH req PAST THIS POINT! */
goto out;
@@ -914,8 +929,10 @@ struct zfcp_fsf_req *zfcp_fsf_abort_fcp_ req->qtcb->bottom.support.req_handle = (u64) old_req_id;
zfcp_fsf_start_timer(req, ZFCP_FSF_SCSI_ER_TIMEOUT); - if (!zfcp_fsf_req_send(req)) + if (!zfcp_fsf_req_send(req)) { + /* NOTE: DO NOT TOUCH req, UNTIL IT COMPLETES! */ goto out; + }
out_error_free: zfcp_fsf_req_free(req); @@ -1098,6 +1115,7 @@ int zfcp_fsf_send_ct(struct zfcp_fc_wka_ ret = zfcp_fsf_req_send(req); if (ret) goto failed_send; + /* NOTE: DO NOT TOUCH req PAST THIS POINT! */
goto out;
@@ -1198,6 +1216,7 @@ int zfcp_fsf_send_els(struct zfcp_adapte ret = zfcp_fsf_req_send(req); if (ret) goto failed_send; + /* NOTE: DO NOT TOUCH req PAST THIS POINT! */
goto out;
@@ -1243,6 +1262,7 @@ int zfcp_fsf_exchange_config_data(struct zfcp_fsf_req_free(req); erp_action->fsf_req_id = 0; } + /* NOTE: DO NOT TOUCH req PAST THIS POINT! */ out: spin_unlock_irq(&qdio->req_q_lock); return retval; @@ -1279,8 +1299,10 @@ int zfcp_fsf_exchange_config_data_sync(s zfcp_fsf_start_timer(req, ZFCP_FSF_REQUEST_TIMEOUT); retval = zfcp_fsf_req_send(req); spin_unlock_irq(&qdio->req_q_lock); - if (!retval) + if (!retval) { + /* NOTE: ONLY TOUCH SYNC req AGAIN ON req->completion. */ wait_for_completion(&req->completion); + }
zfcp_fsf_req_free(req); return retval; @@ -1330,6 +1352,7 @@ int zfcp_fsf_exchange_port_data(struct z zfcp_fsf_req_free(req); erp_action->fsf_req_id = 0; } + /* NOTE: DO NOT TOUCH req PAST THIS POINT! */ out: spin_unlock_irq(&qdio->req_q_lock); return retval; @@ -1372,8 +1395,10 @@ int zfcp_fsf_exchange_port_data_sync(str retval = zfcp_fsf_req_send(req); spin_unlock_irq(&qdio->req_q_lock);
- if (!retval) + if (!retval) { + /* NOTE: ONLY TOUCH SYNC req AGAIN ON req->completion. */ wait_for_completion(&req->completion); + }
zfcp_fsf_req_free(req);
@@ -1493,6 +1518,7 @@ int zfcp_fsf_open_port(struct zfcp_erp_a erp_action->fsf_req_id = 0; put_device(&port->dev); } + /* NOTE: DO NOT TOUCH req PAST THIS POINT! */ out: spin_unlock_irq(&qdio->req_q_lock); return retval; @@ -1557,6 +1583,7 @@ int zfcp_fsf_close_port(struct zfcp_erp_ zfcp_fsf_req_free(req); erp_action->fsf_req_id = 0; } + /* NOTE: DO NOT TOUCH req PAST THIS POINT! */ out: spin_unlock_irq(&qdio->req_q_lock); return retval; @@ -1626,6 +1653,7 @@ int zfcp_fsf_open_wka_port(struct zfcp_f retval = zfcp_fsf_req_send(req); if (retval) zfcp_fsf_req_free(req); + /* NOTE: DO NOT TOUCH req PAST THIS POINT! */ out: spin_unlock_irq(&qdio->req_q_lock); if (!retval) @@ -1681,6 +1709,7 @@ int zfcp_fsf_close_wka_port(struct zfcp_ retval = zfcp_fsf_req_send(req); if (retval) zfcp_fsf_req_free(req); + /* NOTE: DO NOT TOUCH req PAST THIS POINT! */ out: spin_unlock_irq(&qdio->req_q_lock); if (!retval) @@ -1776,6 +1805,7 @@ int zfcp_fsf_close_physical_port(struct zfcp_fsf_req_free(req); erp_action->fsf_req_id = 0; } + /* NOTE: DO NOT TOUCH req PAST THIS POINT! */ out: spin_unlock_irq(&qdio->req_q_lock); return retval; @@ -1899,6 +1929,7 @@ int zfcp_fsf_open_lun(struct zfcp_erp_ac zfcp_fsf_req_free(req); erp_action->fsf_req_id = 0; } + /* NOTE: DO NOT TOUCH req PAST THIS POINT! */ out: spin_unlock_irq(&qdio->req_q_lock); return retval; @@ -1987,6 +2018,7 @@ int zfcp_fsf_close_lun(struct zfcp_erp_a zfcp_fsf_req_free(req); erp_action->fsf_req_id = 0; } + /* NOTE: DO NOT TOUCH req PAST THIS POINT! */ out: spin_unlock_irq(&qdio->req_q_lock); return retval; @@ -2299,6 +2331,7 @@ int zfcp_fsf_fcp_cmnd(struct scsi_cmnd * retval = zfcp_fsf_req_send(req); if (unlikely(retval)) goto failed_scsi_cmnd; + /* NOTE: DO NOT TOUCH req PAST THIS POINT! */
goto out;
@@ -2373,8 +2406,10 @@ struct zfcp_fsf_req *zfcp_fsf_fcp_task_m zfcp_fc_fcp_tm(fcp_cmnd, sdev, tm_flags);
zfcp_fsf_start_timer(req, ZFCP_FSF_SCSI_ER_TIMEOUT); - if (!zfcp_fsf_req_send(req)) + if (!zfcp_fsf_req_send(req)) { + /* NOTE: DO NOT TOUCH req, UNTIL IT COMPLETES! */ goto out; + }
zfcp_fsf_req_free(req); req = NULL;
From: Benjamin Block bblock@linux.ibm.com
commit 106d45f350c7cac876844dc685845cba4ffdb70b upstream.
When tracing instances where we open and close WKA ports, we also pass the request-ID of the respective FSF command.
But after successfully sending the FSF command we must not use the request-object anymore, as this might result in an use-after-free (see "zfcp: fix request object use-after-free in send path causing seqno errors" ).
To fix this add a new variable that caches the request-ID before sending the request. This won't change during the hand-off to the FCP channel, and so it's safe to trace this cached request-ID later, instead of using the request object.
Signed-off-by: Benjamin Block bblock@linux.ibm.com Fixes: d27a7cb91960 ("zfcp: trace on request for open and close of WKA port") Cc: stable@vger.kernel.org #2.6.38+ Reviewed-by: Steffen Maier maier@linux.ibm.com Reviewed-by: Jens Remus jremus@linux.ibm.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/s390/scsi/zfcp_fsf.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/s390/scsi/zfcp_fsf.c +++ b/drivers/s390/scsi/zfcp_fsf.c @@ -1627,6 +1627,7 @@ int zfcp_fsf_open_wka_port(struct zfcp_f { struct zfcp_qdio *qdio = wka_port->adapter->qdio; struct zfcp_fsf_req *req; + unsigned long req_id = 0; int retval = -EIO;
spin_lock_irq(&qdio->req_q_lock); @@ -1649,6 +1650,8 @@ int zfcp_fsf_open_wka_port(struct zfcp_f hton24(req->qtcb->bottom.support.d_id, wka_port->d_id); req->data = wka_port;
+ req_id = req->req_id; + zfcp_fsf_start_timer(req, ZFCP_FSF_REQUEST_TIMEOUT); retval = zfcp_fsf_req_send(req); if (retval) @@ -1657,7 +1660,7 @@ int zfcp_fsf_open_wka_port(struct zfcp_f out: spin_unlock_irq(&qdio->req_q_lock); if (!retval) - zfcp_dbf_rec_run_wka("fsowp_1", wka_port, req->req_id); + zfcp_dbf_rec_run_wka("fsowp_1", wka_port, req_id); return retval; }
@@ -1683,6 +1686,7 @@ int zfcp_fsf_close_wka_port(struct zfcp_ { struct zfcp_qdio *qdio = wka_port->adapter->qdio; struct zfcp_fsf_req *req; + unsigned long req_id = 0; int retval = -EIO;
spin_lock_irq(&qdio->req_q_lock); @@ -1705,6 +1709,8 @@ int zfcp_fsf_close_wka_port(struct zfcp_ req->data = wka_port; req->qtcb->header.port_handle = wka_port->handle;
+ req_id = req->req_id; + zfcp_fsf_start_timer(req, ZFCP_FSF_REQUEST_TIMEOUT); retval = zfcp_fsf_req_send(req); if (retval) @@ -1713,7 +1719,7 @@ int zfcp_fsf_close_wka_port(struct zfcp_ out: spin_unlock_irq(&qdio->req_q_lock); if (!retval) - zfcp_dbf_rec_run_wka("fscwp_1", wka_port, req->req_id); + zfcp_dbf_rec_run_wka("fscwp_1", wka_port, req_id); return retval; }
From: Shivasharan S shivasharan.srikanteshwara@broadcom.com
commit c8f96df5b8e633056b7ebf5d52a9d6fb1b156ce3 upstream.
In megasas_get_target_prop(), driver is incorrectly calculating the target ID for devices with channel 1 and 3. Due to this, firmware will either fail the command (if there is no device with the target id sent from driver) or could return the properties for a target which was not intended. Devices could end up with the wrong queue depth due to this.
Fix target id calculation for channel 1 and 3.
Fixes: 96188a89cc6d ("scsi: megaraid_sas: NVME interface target prop added") Cc: stable@vger.kernel.org Signed-off-by: Shivasharan S shivasharan.srikanteshwara@broadcom.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/megaraid/megaraid_sas_base.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -6155,7 +6155,8 @@ megasas_get_target_prop(struct megasas_i int ret; struct megasas_cmd *cmd; struct megasas_dcmd_frame *dcmd; - u16 targetId = (sdev->channel % 2) + sdev->id; + u16 targetId = ((sdev->channel % 2) * MEGASAS_MAX_DEV_PER_CHANNEL) + + sdev->id;
cmd = megasas_get_cmd(instance);
From: Finn Thain fthain@telegraphics.com.au
commit 7398cee4c3e6aea1ba07a6449e5533ecd0b92cdd upstream.
Some targets introduce delays when handshaking the response to certain commands. For example, a disk may send a 96-byte response to an INQUIRY command (or a 24-byte response to a MODE SENSE command) too slowly.
Apparently the first 12 or 14 bytes are handshaked okay but then the system bus error timeout is reached while transferring the next word.
Since the scsi bus phase hasn't changed, the driver then sets the target borken flag to prevent further PDMA transfers. The driver also logs the warning, "switching to slow handshake".
Raise the PDMA threshold to 512 bytes so that PIO transfers will be used for these commands. This default is sufficiently low that PDMA will still be used for READ and WRITE commands.
The existing threshold (16 bytes) was chosen more or less at random. However, best performance requires the threshold to be as low as possible. Those systems that don't need the PIO workaround at all may benefit from mac_scsi.setup_use_pdma=1
Cc: Michael Schmitz schmitzmic@gmail.com Cc: stable@vger.kernel.org # v4.14+ Fixes: 3a0f64bfa907 ("mac_scsi: Fix pseudo DMA implementation") Signed-off-by: Finn Thain fthain@telegraphics.com.au Tested-by: Stan Johnson userm57@yahoo.com Tested-by: Michael Schmitz schmitzmic@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/mac_scsi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/scsi/mac_scsi.c +++ b/drivers/scsi/mac_scsi.c @@ -53,7 +53,7 @@ static int setup_cmd_per_lun = -1; module_param(setup_cmd_per_lun, int, 0); static int setup_sg_tablesize = -1; module_param(setup_sg_tablesize, int, 0); -static int setup_use_pdma = -1; +static int setup_use_pdma = 512; module_param(setup_use_pdma, int, 0); static int setup_hostid = -1; module_param(setup_hostid, int, 0); @@ -306,7 +306,7 @@ static int macscsi_dma_xfer_len(struct N struct scsi_cmnd *cmd) { if (hostdata->flags & FLAG_NO_PSEUDO_DMA || - cmd->SCp.this_residual < 16) + cmd->SCp.this_residual < setup_use_pdma) return 0;
return cmd->SCp.this_residual;
From: Finn Thain fthain@telegraphics.com.au
commit 78ff751f8e6a9446e9fb26b2bff0b8d3f8974cbd upstream.
A system bus error during a PDMA transfer can mess up the calculation of the transfer residual (the PDMA handshaking hardware lacks a byte counter). This results in data corruption.
The algorithm in this patch anticipates a bus error by starting each transfer with a MOVE.B instruction. If a bus error is caught the transfer will be retried. If a bus error is caught later in the transfer (for a MOVE.W instruction) the transfer gets failed and subsequent requests for that target will use PIO instead of PDMA.
This avoids the "!REQ and !ACK" error so the severity level of that message is reduced to KERN_DEBUG.
Cc: Michael Schmitz schmitzmic@gmail.com Cc: Geert Uytterhoeven geert@linux-m68k.org Cc: stable@vger.kernel.org # v4.14+ Fixes: 3a0f64bfa907 ("mac_scsi: Fix pseudo DMA implementation") Signed-off-by: Finn Thain fthain@telegraphics.com.au Reported-by: Chris Jones chris@martin-jones.com Tested-by: Stan Johnson userm57@yahoo.com Tested-by: Michael Schmitz schmitzmic@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/mac_scsi.c | 369 ++++++++++++++++++++++++++++-------------------- 1 file changed, 217 insertions(+), 152 deletions(-)
--- a/drivers/scsi/mac_scsi.c +++ b/drivers/scsi/mac_scsi.c @@ -4,6 +4,8 @@ * * Copyright 1998, Michael Schmitz mschmitz@lbl.gov * + * Copyright 2019 Finn Thain + * * derived in part from: */ /* @@ -12,6 +14,7 @@ * Copyright 1995, Russell King */
+#include <linux/delay.h> #include <linux/types.h> #include <linux/module.h> #include <linux/ioport.h> @@ -90,101 +93,217 @@ static int __init mac_scsi_setup(char *s __setup("mac5380=", mac_scsi_setup); #endif /* !MODULE */
-/* Pseudo DMA asm originally by Ove Edlund */ +/* + * According to "Inside Macintosh: Devices", Mac OS requires disk drivers to + * specify the number of bytes between the delays expected from a SCSI target. + * This allows the operating system to "prevent bus errors when a target fails + * to deliver the next byte within the processor bus error timeout period." + * Linux SCSI drivers lack knowledge of the timing behaviour of SCSI targets + * so bus errors are unavoidable. + * + * If a MOVE.B instruction faults, we assume that zero bytes were transferred + * and simply retry. That assumption probably depends on target behaviour but + * seems to hold up okay. The NOP provides synchronization: without it the + * fault can sometimes occur after the program counter has moved past the + * offending instruction. Post-increment addressing can't be used. + */ + +#define MOVE_BYTE(operands) \ + asm volatile ( \ + "1: moveb " operands " \n" \ + "11: nop \n" \ + " addq #1,%0 \n" \ + " subq #1,%1 \n" \ + "40: \n" \ + " \n" \ + ".section .fixup,"ax" \n" \ + ".even \n" \ + "90: movel #1, %2 \n" \ + " jra 40b \n" \ + ".previous \n" \ + " \n" \ + ".section __ex_table,"a" \n" \ + ".align 4 \n" \ + ".long 1b,90b \n" \ + ".long 11b,90b \n" \ + ".previous \n" \ + : "+a" (addr), "+r" (n), "+r" (result) : "a" (io))
-#define CP_IO_TO_MEM(s,d,n) \ -__asm__ __volatile__ \ - (" cmp.w #4,%2\n" \ - " bls 8f\n" \ - " move.w %1,%%d0\n" \ - " neg.b %%d0\n" \ - " and.w #3,%%d0\n" \ - " sub.w %%d0,%2\n" \ - " bra 2f\n" \ - " 1: move.b (%0),(%1)+\n" \ - " 2: dbf %%d0,1b\n" \ - " move.w %2,%%d0\n" \ - " lsr.w #5,%%d0\n" \ - " bra 4f\n" \ - " 3: move.l (%0),(%1)+\n" \ - "31: move.l (%0),(%1)+\n" \ - "32: move.l (%0),(%1)+\n" \ - "33: move.l (%0),(%1)+\n" \ - "34: move.l (%0),(%1)+\n" \ - "35: move.l (%0),(%1)+\n" \ - "36: move.l (%0),(%1)+\n" \ - "37: move.l (%0),(%1)+\n" \ - " 4: dbf %%d0,3b\n" \ - " move.w %2,%%d0\n" \ - " lsr.w #2,%%d0\n" \ - " and.w #7,%%d0\n" \ - " bra 6f\n" \ - " 5: move.l (%0),(%1)+\n" \ - " 6: dbf %%d0,5b\n" \ - " and.w #3,%2\n" \ - " bra 8f\n" \ - " 7: move.b (%0),(%1)+\n" \ - " 8: dbf %2,7b\n" \ - " moveq.l #0, %2\n" \ - " 9: \n" \ - ".section .fixup,"ax"\n" \ - " .even\n" \ - "91: moveq.l #1, %2\n" \ - " jra 9b\n" \ - "94: moveq.l #4, %2\n" \ - " jra 9b\n" \ - ".previous\n" \ - ".section __ex_table,"a"\n" \ - " .align 4\n" \ - " .long 1b,91b\n" \ - " .long 3b,94b\n" \ - " .long 31b,94b\n" \ - " .long 32b,94b\n" \ - " .long 33b,94b\n" \ - " .long 34b,94b\n" \ - " .long 35b,94b\n" \ - " .long 36b,94b\n" \ - " .long 37b,94b\n" \ - " .long 5b,94b\n" \ - " .long 7b,91b\n" \ - ".previous" \ - : "=a"(s), "=a"(d), "=d"(n) \ - : "0"(s), "1"(d), "2"(n) \ - : "d0") +/* + * If a MOVE.W (or MOVE.L) instruction faults, it cannot be retried because + * the residual byte count would be uncertain. In that situation the MOVE_WORD + * macro clears n in the fixup section to abort the transfer. + */ + +#define MOVE_WORD(operands) \ + asm volatile ( \ + "1: movew " operands " \n" \ + "11: nop \n" \ + " subq #2,%1 \n" \ + "40: \n" \ + " \n" \ + ".section .fixup,"ax" \n" \ + ".even \n" \ + "90: movel #0, %1 \n" \ + " movel #2, %2 \n" \ + " jra 40b \n" \ + ".previous \n" \ + " \n" \ + ".section __ex_table,"a" \n" \ + ".align 4 \n" \ + ".long 1b,90b \n" \ + ".long 11b,90b \n" \ + ".previous \n" \ + : "+a" (addr), "+r" (n), "+r" (result) : "a" (io)) + +#define MOVE_16_WORDS(operands) \ + asm volatile ( \ + "1: movew " operands " \n" \ + "2: movew " operands " \n" \ + "3: movew " operands " \n" \ + "4: movew " operands " \n" \ + "5: movew " operands " \n" \ + "6: movew " operands " \n" \ + "7: movew " operands " \n" \ + "8: movew " operands " \n" \ + "9: movew " operands " \n" \ + "10: movew " operands " \n" \ + "11: movew " operands " \n" \ + "12: movew " operands " \n" \ + "13: movew " operands " \n" \ + "14: movew " operands " \n" \ + "15: movew " operands " \n" \ + "16: movew " operands " \n" \ + "17: nop \n" \ + " subl #32,%1 \n" \ + "40: \n" \ + " \n" \ + ".section .fixup,"ax" \n" \ + ".even \n" \ + "90: movel #0, %1 \n" \ + " movel #2, %2 \n" \ + " jra 40b \n" \ + ".previous \n" \ + " \n" \ + ".section __ex_table,"a" \n" \ + ".align 4 \n" \ + ".long 1b,90b \n" \ + ".long 2b,90b \n" \ + ".long 3b,90b \n" \ + ".long 4b,90b \n" \ + ".long 5b,90b \n" \ + ".long 6b,90b \n" \ + ".long 7b,90b \n" \ + ".long 8b,90b \n" \ + ".long 9b,90b \n" \ + ".long 10b,90b \n" \ + ".long 11b,90b \n" \ + ".long 12b,90b \n" \ + ".long 13b,90b \n" \ + ".long 14b,90b \n" \ + ".long 15b,90b \n" \ + ".long 16b,90b \n" \ + ".long 17b,90b \n" \ + ".previous \n" \ + : "+a" (addr), "+r" (n), "+r" (result) : "a" (io)) + +#define MAC_PDMA_DELAY 32 + +static inline int mac_pdma_recv(void __iomem *io, unsigned char *start, int n) +{ + unsigned char *addr = start; + int result = 0; + + if (n >= 1) { + MOVE_BYTE("%3@,%0@"); + if (result) + goto out; + } + if (n >= 1 && ((unsigned long)addr & 1)) { + MOVE_BYTE("%3@,%0@"); + if (result) + goto out; + } + while (n >= 32) + MOVE_16_WORDS("%3@,%0@+"); + while (n >= 2) + MOVE_WORD("%3@,%0@+"); + if (result) + return start - addr; /* Negated to indicate uncertain length */ + if (n == 1) + MOVE_BYTE("%3@,%0@"); +out: + return addr - start; +} + +static inline int mac_pdma_send(unsigned char *start, void __iomem *io, int n) +{ + unsigned char *addr = start; + int result = 0; + + if (n >= 1) { + MOVE_BYTE("%0@,%3@"); + if (result) + goto out; + } + if (n >= 1 && ((unsigned long)addr & 1)) { + MOVE_BYTE("%0@,%3@"); + if (result) + goto out; + } + while (n >= 32) + MOVE_16_WORDS("%0@+,%3@"); + while (n >= 2) + MOVE_WORD("%0@+,%3@"); + if (result) + return start - addr; /* Negated to indicate uncertain length */ + if (n == 1) + MOVE_BYTE("%0@,%3@"); +out: + return addr - start; +}
static inline int macscsi_pread(struct NCR5380_hostdata *hostdata, unsigned char *dst, int len) { u8 __iomem *s = hostdata->pdma_io + (INPUT_DATA_REG << 4); unsigned char *d = dst; - int n = len; - int transferred; + + hostdata->pdma_residual = len;
while (!NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG, BASR_DRQ | BASR_PHASE_MATCH, BASR_DRQ | BASR_PHASE_MATCH, HZ / 64)) { - CP_IO_TO_MEM(s, d, n); + int bytes; + + bytes = mac_pdma_recv(s, d, min(hostdata->pdma_residual, 512));
- transferred = d - dst - n; - hostdata->pdma_residual = len - transferred; + if (bytes > 0) { + d += bytes; + hostdata->pdma_residual -= bytes; + }
- /* No bus error. */ - if (n == 0) + if (hostdata->pdma_residual == 0) return 0;
- /* Target changed phase early? */ if (NCR5380_poll_politely2(hostdata, STATUS_REG, SR_REQ, SR_REQ, - BUS_AND_STATUS_REG, BASR_ACK, BASR_ACK, HZ / 64) < 0) - scmd_printk(KERN_ERR, hostdata->connected, + BUS_AND_STATUS_REG, BASR_ACK, + BASR_ACK, HZ / 64) < 0) + scmd_printk(KERN_DEBUG, hostdata->connected, "%s: !REQ and !ACK\n", __func__); if (!(NCR5380_read(BUS_AND_STATUS_REG) & BASR_PHASE_MATCH)) return 0;
+ if (bytes == 0) + udelay(MAC_PDMA_DELAY); + + if (bytes >= 0) + continue; + dsprintk(NDEBUG_PSEUDO_DMA, hostdata->host, - "%s: bus error (%d/%d)\n", __func__, transferred, len); + "%s: bus error (%d/%d)\n", __func__, d - dst, len); NCR5380_dprint(NDEBUG_PSEUDO_DMA, hostdata->host); - d = dst + transferred; - n = len - transferred; + return -1; }
scmd_printk(KERN_ERR, hostdata->connected, @@ -193,93 +312,27 @@ static inline int macscsi_pread(struct N return -1; }
- -#define CP_MEM_TO_IO(s,d,n) \ -__asm__ __volatile__ \ - (" cmp.w #4,%2\n" \ - " bls 8f\n" \ - " move.w %0,%%d0\n" \ - " neg.b %%d0\n" \ - " and.w #3,%%d0\n" \ - " sub.w %%d0,%2\n" \ - " bra 2f\n" \ - " 1: move.b (%0)+,(%1)\n" \ - " 2: dbf %%d0,1b\n" \ - " move.w %2,%%d0\n" \ - " lsr.w #5,%%d0\n" \ - " bra 4f\n" \ - " 3: move.l (%0)+,(%1)\n" \ - "31: move.l (%0)+,(%1)\n" \ - "32: move.l (%0)+,(%1)\n" \ - "33: move.l (%0)+,(%1)\n" \ - "34: move.l (%0)+,(%1)\n" \ - "35: move.l (%0)+,(%1)\n" \ - "36: move.l (%0)+,(%1)\n" \ - "37: move.l (%0)+,(%1)\n" \ - " 4: dbf %%d0,3b\n" \ - " move.w %2,%%d0\n" \ - " lsr.w #2,%%d0\n" \ - " and.w #7,%%d0\n" \ - " bra 6f\n" \ - " 5: move.l (%0)+,(%1)\n" \ - " 6: dbf %%d0,5b\n" \ - " and.w #3,%2\n" \ - " bra 8f\n" \ - " 7: move.b (%0)+,(%1)\n" \ - " 8: dbf %2,7b\n" \ - " moveq.l #0, %2\n" \ - " 9: \n" \ - ".section .fixup,"ax"\n" \ - " .even\n" \ - "91: moveq.l #1, %2\n" \ - " jra 9b\n" \ - "94: moveq.l #4, %2\n" \ - " jra 9b\n" \ - ".previous\n" \ - ".section __ex_table,"a"\n" \ - " .align 4\n" \ - " .long 1b,91b\n" \ - " .long 3b,94b\n" \ - " .long 31b,94b\n" \ - " .long 32b,94b\n" \ - " .long 33b,94b\n" \ - " .long 34b,94b\n" \ - " .long 35b,94b\n" \ - " .long 36b,94b\n" \ - " .long 37b,94b\n" \ - " .long 5b,94b\n" \ - " .long 7b,91b\n" \ - ".previous" \ - : "=a"(s), "=a"(d), "=d"(n) \ - : "0"(s), "1"(d), "2"(n) \ - : "d0") - static inline int macscsi_pwrite(struct NCR5380_hostdata *hostdata, unsigned char *src, int len) { unsigned char *s = src; u8 __iomem *d = hostdata->pdma_io + (OUTPUT_DATA_REG << 4); - int n = len; - int transferred; + + hostdata->pdma_residual = len;
while (!NCR5380_poll_politely(hostdata, BUS_AND_STATUS_REG, BASR_DRQ | BASR_PHASE_MATCH, BASR_DRQ | BASR_PHASE_MATCH, HZ / 64)) { - CP_MEM_TO_IO(s, d, n); + int bytes;
- transferred = s - src - n; - hostdata->pdma_residual = len - transferred; + bytes = mac_pdma_send(s, d, min(hostdata->pdma_residual, 512));
- /* Target changed phase early? */ - if (NCR5380_poll_politely2(hostdata, STATUS_REG, SR_REQ, SR_REQ, - BUS_AND_STATUS_REG, BASR_ACK, BASR_ACK, HZ / 64) < 0) - scmd_printk(KERN_ERR, hostdata->connected, - "%s: !REQ and !ACK\n", __func__); - if (!(NCR5380_read(BUS_AND_STATUS_REG) & BASR_PHASE_MATCH)) - return 0; + if (bytes > 0) { + s += bytes; + hostdata->pdma_residual -= bytes; + }
- /* No bus error. */ - if (n == 0) { + if (hostdata->pdma_residual == 0) { if (NCR5380_poll_politely(hostdata, TARGET_COMMAND_REG, TCR_LAST_BYTE_SENT, TCR_LAST_BYTE_SENT, HZ / 64) < 0) @@ -288,17 +341,29 @@ static inline int macscsi_pwrite(struct return 0; }
+ if (NCR5380_poll_politely2(hostdata, STATUS_REG, SR_REQ, SR_REQ, + BUS_AND_STATUS_REG, BASR_ACK, + BASR_ACK, HZ / 64) < 0) + scmd_printk(KERN_DEBUG, hostdata->connected, + "%s: !REQ and !ACK\n", __func__); + if (!(NCR5380_read(BUS_AND_STATUS_REG) & BASR_PHASE_MATCH)) + return 0; + + if (bytes == 0) + udelay(MAC_PDMA_DELAY); + + if (bytes >= 0) + continue; + dsprintk(NDEBUG_PSEUDO_DMA, hostdata->host, - "%s: bus error (%d/%d)\n", __func__, transferred, len); + "%s: bus error (%d/%d)\n", __func__, s - src, len); NCR5380_dprint(NDEBUG_PSEUDO_DMA, hostdata->host); - s = src + transferred; - n = len - transferred; + return -1; }
scmd_printk(KERN_ERR, hostdata->connected, "%s: phase mismatch or !DRQ\n", __func__); NCR5380_dprint(NDEBUG_PSEUDO_DMA, hostdata->host); - return -1; }
From: Eric Biggers ebiggers@google.com
commit 5c6bc4dfa515738149998bb0db2481a4fdead979 upstream.
Changing ghash_mod_init() to be subsys_initcall made it start running before the alignment fault handler has been installed on ARM. In kernel builds where the keys in the ghash test vectors happened to be misaligned in the kernel image, this exposed the longstanding bug that ghash_setkey() is incorrectly casting the key buffer (which can have any alignment) to be128 for passing to gf128mul_init_4k_lle().
Fix this by memcpy()ing the key to a temporary buffer.
Don't fix it by setting an alignmask on the algorithm instead because that would unnecessarily force alignment of the data too.
Fixes: 2cdc6899a88e ("crypto: ghash - Add GHASH digest algorithm for GCM") Reported-by: Peter Robinson pbrobinson@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Tested-by: Peter Robinson pbrobinson@gmail.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/ghash-generic.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/crypto/ghash-generic.c +++ b/crypto/ghash-generic.c @@ -31,6 +31,7 @@ static int ghash_setkey(struct crypto_sh const u8 *key, unsigned int keylen) { struct ghash_ctx *ctx = crypto_shash_ctx(tfm); + be128 k;
if (keylen != GHASH_BLOCK_SIZE) { crypto_shash_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN); @@ -39,7 +40,12 @@ static int ghash_setkey(struct crypto_sh
if (ctx->gf128) gf128mul_free_4k(ctx->gf128); - ctx->gf128 = gf128mul_init_4k_lle((be128 *)key); + + BUILD_BUG_ON(sizeof(k) != GHASH_BLOCK_SIZE); + memcpy(&k, key, GHASH_BLOCK_SIZE); /* avoid violating alignment rules */ + ctx->gf128 = gf128mul_init_4k_lle(&k); + memzero_explicit(&k, GHASH_BLOCK_SIZE); + if (!ctx->gf128) return -ENOMEM;
From: Ard Biesheuvel ard.biesheuvel@linaro.org
commit ed527b13d800dd515a9e6c582f0a73eca65b2e1b upstream.
The CAAM driver currently violates an undocumented and slightly controversial requirement imposed by the crypto stack that a buffer referred to by the request structure via its virtual address may not be modified while any scatterlists passed via the same request structure are mapped for inbound DMA.
This may result in errors like
alg: aead: decryption failed on test 1 for gcm_base(ctr-aes-caam,ghash-generic): ret=74 alg: aead: Failed to load transform for gcm(aes): -2
on non-cache coherent systems, due to the fact that the GCM driver passes an IV buffer by virtual address which shares a cacheline with the auth_tag buffer passed via a scatterlist, resulting in corruption of the auth_tag when the IV is updated while the DMA mapping is live.
Since the IV that is returned to the caller is only valid for CBC mode, and given that the in-kernel users of CBC (such as CTS) don't trigger the same issue as the GCM driver, let's just disable the output IV generation for all modes except CBC for the time being.
Fixes: 854b06f76879 ("crypto: caam - properly set IV after {en,de}crypt") Cc: Horia Geanta horia.geanta@nxp.com Cc: Iuliana Prodan iuliana.prodan@nxp.com Reported-by: Sascha Hauer s.hauer@pengutronix.de Cc: stable@vger.kernel.org Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Reviewed-by: Horia Geanta horia.geanta@nxp.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/caam/caamalg.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/drivers/crypto/caam/caamalg.c +++ b/drivers/crypto/caam/caamalg.c @@ -999,6 +999,7 @@ static void skcipher_encrypt_done(struct struct skcipher_request *req = context; struct skcipher_edesc *edesc; struct crypto_skcipher *skcipher = crypto_skcipher_reqtfm(req); + struct caam_ctx *ctx = crypto_skcipher_ctx(skcipher); int ivsize = crypto_skcipher_ivsize(skcipher);
#ifdef DEBUG @@ -1023,9 +1024,9 @@ static void skcipher_encrypt_done(struct
/* * The crypto API expects us to set the IV (req->iv) to the last - * ciphertext block. This is used e.g. by the CTS mode. + * ciphertext block when running in CBC mode. */ - if (ivsize) + if ((ctx->cdata.algtype & OP_ALG_AAI_MASK) == OP_ALG_AAI_CBC) scatterwalk_map_and_copy(req->iv, req->dst, req->cryptlen - ivsize, ivsize, 0);
@@ -1843,9 +1844,9 @@ static int skcipher_decrypt(struct skcip
/* * The crypto API expects us to set the IV (req->iv) to the last - * ciphertext block. + * ciphertext block when running in CBC mode. */ - if (ivsize) + if ((ctx->cdata.algtype & OP_ALG_AAI_MASK) == OP_ALG_AAI_CBC) scatterwalk_map_and_copy(req->iv, req->src, req->cryptlen - ivsize, ivsize, 0);
From: Hook, Gary Gary.Hook@amd.com
commit 52393d617af7b554f03531e6756facf2ea687d2e upstream.
The error code read from the queue status register is only 6 bits wide, but we need to verify its value is within range before indexing the error messages.
Fixes: 81422badb3907 ("crypto: ccp - Make syslog errors human-readable") Cc: stable@vger.kernel.org Reported-by: Cfir Cohen cfir@google.com Signed-off-by: Gary R Hook gary.hook@amd.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/ccp/ccp-dev.c | 96 ++++++++++++++++++++++--------------------- drivers/crypto/ccp/ccp-dev.h | 2 2 files changed, 52 insertions(+), 46 deletions(-)
--- a/drivers/crypto/ccp/ccp-dev.c +++ b/drivers/crypto/ccp/ccp-dev.c @@ -32,56 +32,62 @@ struct ccp_tasklet_data { };
/* Human-readable error strings */ +#define CCP_MAX_ERROR_CODE 64 static char *ccp_error_codes[] = { "", - "ERR 01: ILLEGAL_ENGINE", - "ERR 02: ILLEGAL_KEY_ID", - "ERR 03: ILLEGAL_FUNCTION_TYPE", - "ERR 04: ILLEGAL_FUNCTION_MODE", - "ERR 05: ILLEGAL_FUNCTION_ENCRYPT", - "ERR 06: ILLEGAL_FUNCTION_SIZE", - "ERR 07: Zlib_MISSING_INIT_EOM", - "ERR 08: ILLEGAL_FUNCTION_RSVD", - "ERR 09: ILLEGAL_BUFFER_LENGTH", - "ERR 10: VLSB_FAULT", - "ERR 11: ILLEGAL_MEM_ADDR", - "ERR 12: ILLEGAL_MEM_SEL", - "ERR 13: ILLEGAL_CONTEXT_ID", - "ERR 14: ILLEGAL_KEY_ADDR", - "ERR 15: 0xF Reserved", - "ERR 16: Zlib_ILLEGAL_MULTI_QUEUE", - "ERR 17: Zlib_ILLEGAL_JOBID_CHANGE", - "ERR 18: CMD_TIMEOUT", - "ERR 19: IDMA0_AXI_SLVERR", - "ERR 20: IDMA0_AXI_DECERR", - "ERR 21: 0x15 Reserved", - "ERR 22: IDMA1_AXI_SLAVE_FAULT", - "ERR 23: IDMA1_AIXI_DECERR", - "ERR 24: 0x18 Reserved", - "ERR 25: ZLIBVHB_AXI_SLVERR", - "ERR 26: ZLIBVHB_AXI_DECERR", - "ERR 27: 0x1B Reserved", - "ERR 27: ZLIB_UNEXPECTED_EOM", - "ERR 27: ZLIB_EXTRA_DATA", - "ERR 30: ZLIB_BTYPE", - "ERR 31: ZLIB_UNDEFINED_SYMBOL", - "ERR 32: ZLIB_UNDEFINED_DISTANCE_S", - "ERR 33: ZLIB_CODE_LENGTH_SYMBOL", - "ERR 34: ZLIB _VHB_ILLEGAL_FETCH", - "ERR 35: ZLIB_UNCOMPRESSED_LEN", - "ERR 36: ZLIB_LIMIT_REACHED", - "ERR 37: ZLIB_CHECKSUM_MISMATCH0", - "ERR 38: ODMA0_AXI_SLVERR", - "ERR 39: ODMA0_AXI_DECERR", - "ERR 40: 0x28 Reserved", - "ERR 41: ODMA1_AXI_SLVERR", - "ERR 42: ODMA1_AXI_DECERR", - "ERR 43: LSB_PARITY_ERR", + "ILLEGAL_ENGINE", + "ILLEGAL_KEY_ID", + "ILLEGAL_FUNCTION_TYPE", + "ILLEGAL_FUNCTION_MODE", + "ILLEGAL_FUNCTION_ENCRYPT", + "ILLEGAL_FUNCTION_SIZE", + "Zlib_MISSING_INIT_EOM", + "ILLEGAL_FUNCTION_RSVD", + "ILLEGAL_BUFFER_LENGTH", + "VLSB_FAULT", + "ILLEGAL_MEM_ADDR", + "ILLEGAL_MEM_SEL", + "ILLEGAL_CONTEXT_ID", + "ILLEGAL_KEY_ADDR", + "0xF Reserved", + "Zlib_ILLEGAL_MULTI_QUEUE", + "Zlib_ILLEGAL_JOBID_CHANGE", + "CMD_TIMEOUT", + "IDMA0_AXI_SLVERR", + "IDMA0_AXI_DECERR", + "0x15 Reserved", + "IDMA1_AXI_SLAVE_FAULT", + "IDMA1_AIXI_DECERR", + "0x18 Reserved", + "ZLIBVHB_AXI_SLVERR", + "ZLIBVHB_AXI_DECERR", + "0x1B Reserved", + "ZLIB_UNEXPECTED_EOM", + "ZLIB_EXTRA_DATA", + "ZLIB_BTYPE", + "ZLIB_UNDEFINED_SYMBOL", + "ZLIB_UNDEFINED_DISTANCE_S", + "ZLIB_CODE_LENGTH_SYMBOL", + "ZLIB _VHB_ILLEGAL_FETCH", + "ZLIB_UNCOMPRESSED_LEN", + "ZLIB_LIMIT_REACHED", + "ZLIB_CHECKSUM_MISMATCH0", + "ODMA0_AXI_SLVERR", + "ODMA0_AXI_DECERR", + "0x28 Reserved", + "ODMA1_AXI_SLVERR", + "ODMA1_AXI_DECERR", };
-void ccp_log_error(struct ccp_device *d, int e) +void ccp_log_error(struct ccp_device *d, unsigned int e) { - dev_err(d->dev, "CCP error: %s (0x%x)\n", ccp_error_codes[e], e); + if (WARN_ON(e >= CCP_MAX_ERROR_CODE)) + return; + + if (e < ARRAY_SIZE(ccp_error_codes)) + dev_err(d->dev, "CCP error %d: %s\n", e, ccp_error_codes[e]); + else + dev_err(d->dev, "CCP error %d: Unknown Error\n", e); }
/* List of CCPs, CCP count, read-write access lock, and access functions --- a/drivers/crypto/ccp/ccp-dev.h +++ b/drivers/crypto/ccp/ccp-dev.h @@ -629,7 +629,7 @@ struct ccp5_desc { void ccp_add_device(struct ccp_device *ccp); void ccp_del_device(struct ccp_device *ccp);
-extern void ccp_log_error(struct ccp_device *, int); +extern void ccp_log_error(struct ccp_device *, unsigned int);
struct ccp_device *ccp_alloc_struct(struct sp_device *sp); bool ccp_queues_suspended(struct ccp_device *ccp);
From: Elena Petrova lenaptr@google.com
commit 1d4aaf16defa86d2665ae7db0259d6cb07e2091f upstream.
The sha1-ce finup implementation for ARM64 produces wrong digest for empty input (len=0). Expected: da39a3ee..., result: 67452301... (initial value of SHA internal state). The error is in sha1_ce_finup: for empty data `finalize` will be 1, so the code is relying on sha1_ce_transform to make the final round. However, in sha1_base_do_update, the block function will not be called when len == 0.
Fix it by setting finalize to 0 if data is empty.
Fixes: 07eb54d306f4 ("crypto: arm64/sha1-ce - move SHA-1 ARMv8 implementation to base layer") Cc: stable@vger.kernel.org Signed-off-by: Elena Petrova lenaptr@google.com Reviewed-by: Ard Biesheuvel ard.biesheuvel@linaro.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/crypto/sha1-ce-glue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/crypto/sha1-ce-glue.c +++ b/arch/arm64/crypto/sha1-ce-glue.c @@ -52,7 +52,7 @@ static int sha1_ce_finup(struct shash_de unsigned int len, u8 *out) { struct sha1_ce_state *sctx = shash_desc_ctx(desc); - bool finalize = !sctx->sst.count && !(len % SHA1_BLOCK_SIZE); + bool finalize = !sctx->sst.count && !(len % SHA1_BLOCK_SIZE) && len;
if (!crypto_simd_usable()) return crypto_sha1_finup(desc, data, len, out);
From: Elena Petrova lenaptr@google.com
commit 6bd934de1e393466b319d29c4427598fda096c57 upstream.
The sha256-ce finup implementation for ARM64 produces wrong digest for empty input (len=0). Expected: the actual digest, result: initial value of SHA internal state. The error is in sha256_ce_finup: for empty data `finalize` will be 1, so the code is relying on sha2_ce_transform to make the final round. However, in sha256_base_do_update, the block function will not be called when len == 0.
Fix it by setting finalize to 0 if data is empty.
Fixes: 03802f6a80b3a ("crypto: arm64/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer") Cc: stable@vger.kernel.org Signed-off-by: Elena Petrova lenaptr@google.com Reviewed-by: Ard Biesheuvel ard.biesheuvel@linaro.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/crypto/sha2-ce-glue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/crypto/sha2-ce-glue.c +++ b/arch/arm64/crypto/sha2-ce-glue.c @@ -57,7 +57,7 @@ static int sha256_ce_finup(struct shash_ unsigned int len, u8 *out) { struct sha256_ce_state *sctx = shash_desc_ctx(desc); - bool finalize = !sctx->sst.count && !(len % SHA256_BLOCK_SIZE); + bool finalize = !sctx->sst.count && !(len % SHA256_BLOCK_SIZE) && len;
if (!crypto_simd_usable()) { if (len)
From: Eric Biggers ebiggers@google.com
commit 7545b6c2087f4ef0287c8c9b7eba6a728c67ff8e upstream.
Clear the CRYPTO_TFM_REQ_MAY_SLEEP flag when the chacha20poly1305 operation is being continued from an async completion callback, since sleeping may not be allowed in that context.
This is basically the same bug that was recently fixed in the xts and lrw templates. But, it's always been broken in chacha20poly1305 too. This was found using syzkaller in combination with the updated crypto self-tests which actually test the MAY_SLEEP flag now.
Reproducer:
python -c 'import socket; socket.socket(socket.AF_ALG, 5, 0).bind( ("aead", "rfc7539(cryptd(chacha20-generic),poly1305-generic)"))'
Kernel output:
BUG: sleeping function called from invalid context at include/crypto/algapi.h:426 in_atomic(): 1, irqs_disabled(): 0, pid: 1001, name: kworker/2:2 [...] CPU: 2 PID: 1001 Comm: kworker/2:2 Not tainted 5.2.0-rc2 #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014 Workqueue: crypto cryptd_queue_worker Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x4d/0x6a lib/dump_stack.c:113 ___might_sleep kernel/sched/core.c:6138 [inline] ___might_sleep.cold.19+0x8e/0x9f kernel/sched/core.c:6095 crypto_yield include/crypto/algapi.h:426 [inline] crypto_hash_walk_done+0xd6/0x100 crypto/ahash.c:113 shash_ahash_update+0x41/0x60 crypto/shash.c:251 shash_async_update+0xd/0x10 crypto/shash.c:260 crypto_ahash_update include/crypto/hash.h:539 [inline] poly_setkey+0xf6/0x130 crypto/chacha20poly1305.c:337 poly_init+0x51/0x60 crypto/chacha20poly1305.c:364 async_done_continue crypto/chacha20poly1305.c:78 [inline] poly_genkey_done+0x15/0x30 crypto/chacha20poly1305.c:369 cryptd_skcipher_complete+0x29/0x70 crypto/cryptd.c:279 cryptd_skcipher_decrypt+0xcd/0x110 crypto/cryptd.c:339 cryptd_queue_worker+0x70/0xa0 crypto/cryptd.c:184 process_one_work+0x1ed/0x420 kernel/workqueue.c:2269 worker_thread+0x3e/0x3a0 kernel/workqueue.c:2415 kthread+0x11f/0x140 kernel/kthread.c:255 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
Fixes: 71ebc4d1b27d ("crypto: chacha20poly1305 - Add a ChaCha20-Poly1305 AEAD construction, RFC7539") Cc: stable@vger.kernel.org # v4.2+ Cc: Martin Willi martin@strongswan.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/chacha20poly1305.c | 30 +++++++++++++++++++----------- 1 file changed, 19 insertions(+), 11 deletions(-)
--- a/crypto/chacha20poly1305.c +++ b/crypto/chacha20poly1305.c @@ -61,6 +61,8 @@ struct chachapoly_req_ctx { unsigned int cryptlen; /* Actual AD, excluding IV */ unsigned int assoclen; + /* request flags, with MAY_SLEEP cleared if needed */ + u32 flags; union { struct poly_req poly; struct chacha_req chacha; @@ -70,8 +72,12 @@ struct chachapoly_req_ctx { static inline void async_done_continue(struct aead_request *req, int err, int (*cont)(struct aead_request *)) { - if (!err) + if (!err) { + struct chachapoly_req_ctx *rctx = aead_request_ctx(req); + + rctx->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP; err = cont(req); + }
if (err != -EINPROGRESS && err != -EBUSY) aead_request_complete(req, err); @@ -138,7 +144,7 @@ static int chacha_decrypt(struct aead_re dst = scatterwalk_ffwd(rctx->dst, req->dst, req->assoclen); }
- skcipher_request_set_callback(&creq->req, aead_request_flags(req), + skcipher_request_set_callback(&creq->req, rctx->flags, chacha_decrypt_done, req); skcipher_request_set_tfm(&creq->req, ctx->chacha); skcipher_request_set_crypt(&creq->req, src, dst, @@ -182,7 +188,7 @@ static int poly_tail(struct aead_request memcpy(&preq->tail.cryptlen, &len, sizeof(len)); sg_set_buf(preq->src, &preq->tail, sizeof(preq->tail));
- ahash_request_set_callback(&preq->req, aead_request_flags(req), + ahash_request_set_callback(&preq->req, rctx->flags, poly_tail_done, req); ahash_request_set_tfm(&preq->req, ctx->poly); ahash_request_set_crypt(&preq->req, preq->src, @@ -213,7 +219,7 @@ static int poly_cipherpad(struct aead_re sg_init_table(preq->src, 1); sg_set_buf(preq->src, &preq->pad, padlen);
- ahash_request_set_callback(&preq->req, aead_request_flags(req), + ahash_request_set_callback(&preq->req, rctx->flags, poly_cipherpad_done, req); ahash_request_set_tfm(&preq->req, ctx->poly); ahash_request_set_crypt(&preq->req, preq->src, NULL, padlen); @@ -244,7 +250,7 @@ static int poly_cipher(struct aead_reque sg_init_table(rctx->src, 2); crypt = scatterwalk_ffwd(rctx->src, crypt, req->assoclen);
- ahash_request_set_callback(&preq->req, aead_request_flags(req), + ahash_request_set_callback(&preq->req, rctx->flags, poly_cipher_done, req); ahash_request_set_tfm(&preq->req, ctx->poly); ahash_request_set_crypt(&preq->req, crypt, NULL, rctx->cryptlen); @@ -274,7 +280,7 @@ static int poly_adpad(struct aead_reques sg_init_table(preq->src, 1); sg_set_buf(preq->src, preq->pad, padlen);
- ahash_request_set_callback(&preq->req, aead_request_flags(req), + ahash_request_set_callback(&preq->req, rctx->flags, poly_adpad_done, req); ahash_request_set_tfm(&preq->req, ctx->poly); ahash_request_set_crypt(&preq->req, preq->src, NULL, padlen); @@ -298,7 +304,7 @@ static int poly_ad(struct aead_request * struct poly_req *preq = &rctx->u.poly; int err;
- ahash_request_set_callback(&preq->req, aead_request_flags(req), + ahash_request_set_callback(&preq->req, rctx->flags, poly_ad_done, req); ahash_request_set_tfm(&preq->req, ctx->poly); ahash_request_set_crypt(&preq->req, req->src, NULL, rctx->assoclen); @@ -325,7 +331,7 @@ static int poly_setkey(struct aead_reque sg_init_table(preq->src, 1); sg_set_buf(preq->src, rctx->key, sizeof(rctx->key));
- ahash_request_set_callback(&preq->req, aead_request_flags(req), + ahash_request_set_callback(&preq->req, rctx->flags, poly_setkey_done, req); ahash_request_set_tfm(&preq->req, ctx->poly); ahash_request_set_crypt(&preq->req, preq->src, NULL, sizeof(rctx->key)); @@ -349,7 +355,7 @@ static int poly_init(struct aead_request struct poly_req *preq = &rctx->u.poly; int err;
- ahash_request_set_callback(&preq->req, aead_request_flags(req), + ahash_request_set_callback(&preq->req, rctx->flags, poly_init_done, req); ahash_request_set_tfm(&preq->req, ctx->poly);
@@ -387,7 +393,7 @@ static int poly_genkey(struct aead_reque
chacha_iv(creq->iv, req, 0);
- skcipher_request_set_callback(&creq->req, aead_request_flags(req), + skcipher_request_set_callback(&creq->req, rctx->flags, poly_genkey_done, req); skcipher_request_set_tfm(&creq->req, ctx->chacha); skcipher_request_set_crypt(&creq->req, creq->src, creq->src, @@ -427,7 +433,7 @@ static int chacha_encrypt(struct aead_re dst = scatterwalk_ffwd(rctx->dst, req->dst, req->assoclen); }
- skcipher_request_set_callback(&creq->req, aead_request_flags(req), + skcipher_request_set_callback(&creq->req, rctx->flags, chacha_encrypt_done, req); skcipher_request_set_tfm(&creq->req, ctx->chacha); skcipher_request_set_crypt(&creq->req, src, dst, @@ -445,6 +451,7 @@ static int chachapoly_encrypt(struct aea struct chachapoly_req_ctx *rctx = aead_request_ctx(req);
rctx->cryptlen = req->cryptlen; + rctx->flags = aead_request_flags(req);
/* encrypt call chain: * - chacha_encrypt/done() @@ -466,6 +473,7 @@ static int chachapoly_decrypt(struct aea struct chachapoly_req_ctx *rctx = aead_request_ctx(req);
rctx->cryptlen = req->cryptlen - POLY1305_DIGEST_SIZE; + rctx->flags = aead_request_flags(req);
/* decrypt call chain: * - poly_genkey/done()
From: Christian Lamparter chunkeey@gmail.com
commit bfa2ba7d9e6b20aca82b99e6842fe18842ae3a0f upstream.
This patch fixes a issue with crypto4xx's ctr(aes) that was discovered by libcapi's kcapi-enc-test.sh test.
The some of the ctr(aes) encryptions test were failing on the non-power-of-two test:
kcapi-enc - Error: encryption failed with error 0 kcapi-enc - Error: decryption failed with error 0 [FAILED: 32-bit - 5.1.0-rc1+] 15 bytes: STDIN / STDOUT enc test (128 bits): original file (1d100e..cc96184c) and generated file (e3b0c442..1b7852b855) [FAILED: 32-bit - 5.1.0-rc1+] 15 bytes: STDIN / STDOUT enc test (128 bits) (openssl generated CT): original file (e3b0..5) and generated file (3..8e) [PASSED: 32-bit - 5.1.0-rc1+] 15 bytes: STDIN / STDOUT enc test (128 bits) (openssl generated PT) [FAILED: 32-bit - 5.1.0-rc1+] 15 bytes: STDIN / STDOUT enc test (password): original file (1d1..84c) and generated file (e3b..852b855)
But the 16, 32, 512, 65536 tests always worked.
Thankfully, this isn't a hidden hardware problem like previously, instead this turned out to be a copy and paste issue.
With this patch, all the tests are passing with and kcapi-enc-test.sh gives crypto4xx's a clean bill of health: "Number of failures: 0" :).
Cc: stable@vger.kernel.org Fixes: 98e87e3d933b ("crypto: crypto4xx - add aes-ctr support") Fixes: f2a13e7cba9e ("crypto: crypto4xx - enable AES RFC3686, ECB, CFB and OFB offloads") Signed-off-by: Christian Lamparter chunkeey@gmail.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/amcc/crypto4xx_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/crypto/amcc/crypto4xx_core.c +++ b/drivers/crypto/amcc/crypto4xx_core.c @@ -1243,7 +1243,7 @@ static struct crypto4xx_alg_common crypt .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC | CRYPTO_ALG_KERN_DRIVER_ONLY, - .cra_blocksize = AES_BLOCK_SIZE, + .cra_blocksize = 1, .cra_ctxsize = sizeof(struct crypto4xx_ctx), .cra_module = THIS_MODULE, }, @@ -1263,7 +1263,7 @@ static struct crypto4xx_alg_common crypt .cra_priority = CRYPTO4XX_CRYPTO_PRIORITY, .cra_flags = CRYPTO_ALG_ASYNC | CRYPTO_ALG_KERN_DRIVER_ONLY, - .cra_blocksize = AES_BLOCK_SIZE, + .cra_blocksize = 1, .cra_ctxsize = sizeof(struct crypto4xx_ctx), .cra_module = THIS_MODULE, },
From: Christian Lamparter chunkeey@gmail.com
commit 70c4997f34b6c6888b3ac157adec49e01d0df2d5 upstream.
While the hardware consider them to be blockciphers, the reference implementation defines them as streamciphers.
Do the right thing and set the blocksize to 1. This was found by CONFIG_CRYPTO_MANAGER_EXTRA_TESTS.
This fixes the following issues: skcipher: blocksize for ofb-aes-ppc4xx (16) doesn't match generic impl (1) skcipher: blocksize for cfb-aes-ppc4xx (16) doesn't match generic impl (1)
Cc: Eric Biggers ebiggers@kernel.org Cc: stable@vger.kernel.org Fixes: f2a13e7cba9e ("crypto: crypto4xx - enable AES RFC3686, ECB, CFB and OFB offloads") Signed-off-by: Christian Lamparter chunkeey@gmail.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/amcc/crypto4xx_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/crypto/amcc/crypto4xx_core.c +++ b/drivers/crypto/amcc/crypto4xx_core.c @@ -1222,7 +1222,7 @@ static struct crypto4xx_alg_common crypt .cra_priority = CRYPTO4XX_CRYPTO_PRIORITY, .cra_flags = CRYPTO_ALG_ASYNC | CRYPTO_ALG_KERN_DRIVER_ONLY, - .cra_blocksize = AES_BLOCK_SIZE, + .cra_blocksize = 1, .cra_ctxsize = sizeof(struct crypto4xx_ctx), .cra_module = THIS_MODULE, }, @@ -1302,7 +1302,7 @@ static struct crypto4xx_alg_common crypt .cra_priority = CRYPTO4XX_CRYPTO_PRIORITY, .cra_flags = CRYPTO_ALG_ASYNC | CRYPTO_ALG_KERN_DRIVER_ONLY, - .cra_blocksize = AES_BLOCK_SIZE, + .cra_blocksize = 1, .cra_ctxsize = sizeof(struct crypto4xx_ctx), .cra_module = THIS_MODULE, },
From: Christian Lamparter chunkeey@gmail.com
commit 0f7a81374060828280fcfdfbaa162cb559017f9f upstream.
The hardware automatically zero pads incomplete block ciphers blocks without raising any errors. This is a screw-up. This was noticed by CONFIG_CRYPTO_MANAGER_EXTRA_TESTS tests that sent a incomplete blocks and expect them to fail.
This fixes: cbc-aes-ppc4xx encryption unexpectedly succeeded on test vector "random: len=2409 klen=32"; expected_error=-22, cfg="random: may_sleep use_digest src_divs=[96.90%@+2295, 2.34%@+4066, 0.32%@alignmask+12, 0.34%@+4087, 0.9%@alignmask+1787, 0.1%@+3767] iv_offset=6"
ecb-aes-ppc4xx encryption unexpectedly succeeded on test vector "random: len=1011 klen=32"; expected_error=-22, cfg="random: may_sleep use_digest src_divs=[100.0%@alignmask+20] dst_divs=[3.12%@+3001, 96.88%@+4070]"
Cc: Eric Biggers ebiggers@kernel.org Cc: stable@vger.kernel.org [4.19, 5.0 and 5.1] Signed-off-by: Christian Lamparter chunkeey@gmail.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/amcc/crypto4xx_alg.c | 36 ++++++++++++++++++++++++----------- drivers/crypto/amcc/crypto4xx_core.c | 16 +++++++-------- drivers/crypto/amcc/crypto4xx_core.h | 10 +++++---- 3 files changed, 39 insertions(+), 23 deletions(-)
--- a/drivers/crypto/amcc/crypto4xx_alg.c +++ b/drivers/crypto/amcc/crypto4xx_alg.c @@ -67,12 +67,16 @@ static void set_dynamic_sa_command_1(str }
static inline int crypto4xx_crypt(struct skcipher_request *req, - const unsigned int ivlen, bool decrypt) + const unsigned int ivlen, bool decrypt, + bool check_blocksize) { struct crypto_skcipher *cipher = crypto_skcipher_reqtfm(req); struct crypto4xx_ctx *ctx = crypto_skcipher_ctx(cipher); __le32 iv[AES_IV_SIZE];
+ if (check_blocksize && !IS_ALIGNED(req->cryptlen, AES_BLOCK_SIZE)) + return -EINVAL; + if (ivlen) crypto4xx_memcpy_to_le32(iv, req->iv, ivlen);
@@ -81,24 +85,34 @@ static inline int crypto4xx_crypt(struct ctx->sa_len, 0, NULL); }
-int crypto4xx_encrypt_noiv(struct skcipher_request *req) +int crypto4xx_encrypt_noiv_block(struct skcipher_request *req) +{ + return crypto4xx_crypt(req, 0, false, true); +} + +int crypto4xx_encrypt_iv_stream(struct skcipher_request *req) +{ + return crypto4xx_crypt(req, AES_IV_SIZE, false, false); +} + +int crypto4xx_decrypt_noiv_block(struct skcipher_request *req) { - return crypto4xx_crypt(req, 0, false); + return crypto4xx_crypt(req, 0, true, true); }
-int crypto4xx_encrypt_iv(struct skcipher_request *req) +int crypto4xx_decrypt_iv_stream(struct skcipher_request *req) { - return crypto4xx_crypt(req, AES_IV_SIZE, false); + return crypto4xx_crypt(req, AES_IV_SIZE, true, false); }
-int crypto4xx_decrypt_noiv(struct skcipher_request *req) +int crypto4xx_encrypt_iv_block(struct skcipher_request *req) { - return crypto4xx_crypt(req, 0, true); + return crypto4xx_crypt(req, AES_IV_SIZE, false, true); }
-int crypto4xx_decrypt_iv(struct skcipher_request *req) +int crypto4xx_decrypt_iv_block(struct skcipher_request *req) { - return crypto4xx_crypt(req, AES_IV_SIZE, true); + return crypto4xx_crypt(req, AES_IV_SIZE, true, true); }
/** @@ -269,8 +283,8 @@ crypto4xx_ctr_crypt(struct skcipher_requ return ret; }
- return encrypt ? crypto4xx_encrypt_iv(req) - : crypto4xx_decrypt_iv(req); + return encrypt ? crypto4xx_encrypt_iv_stream(req) + : crypto4xx_decrypt_iv_stream(req); }
static int crypto4xx_sk_setup_fallback(struct crypto4xx_ctx *ctx, --- a/drivers/crypto/amcc/crypto4xx_core.c +++ b/drivers/crypto/amcc/crypto4xx_core.c @@ -1210,8 +1210,8 @@ static struct crypto4xx_alg_common crypt .max_keysize = AES_MAX_KEY_SIZE, .ivsize = AES_IV_SIZE, .setkey = crypto4xx_setkey_aes_cbc, - .encrypt = crypto4xx_encrypt_iv, - .decrypt = crypto4xx_decrypt_iv, + .encrypt = crypto4xx_encrypt_iv_block, + .decrypt = crypto4xx_decrypt_iv_block, .init = crypto4xx_sk_init, .exit = crypto4xx_sk_exit, } }, @@ -1230,8 +1230,8 @@ static struct crypto4xx_alg_common crypt .max_keysize = AES_MAX_KEY_SIZE, .ivsize = AES_IV_SIZE, .setkey = crypto4xx_setkey_aes_cfb, - .encrypt = crypto4xx_encrypt_iv, - .decrypt = crypto4xx_decrypt_iv, + .encrypt = crypto4xx_encrypt_iv_stream, + .decrypt = crypto4xx_decrypt_iv_stream, .init = crypto4xx_sk_init, .exit = crypto4xx_sk_exit, } }, @@ -1290,8 +1290,8 @@ static struct crypto4xx_alg_common crypt .min_keysize = AES_MIN_KEY_SIZE, .max_keysize = AES_MAX_KEY_SIZE, .setkey = crypto4xx_setkey_aes_ecb, - .encrypt = crypto4xx_encrypt_noiv, - .decrypt = crypto4xx_decrypt_noiv, + .encrypt = crypto4xx_encrypt_noiv_block, + .decrypt = crypto4xx_decrypt_noiv_block, .init = crypto4xx_sk_init, .exit = crypto4xx_sk_exit, } }, @@ -1310,8 +1310,8 @@ static struct crypto4xx_alg_common crypt .max_keysize = AES_MAX_KEY_SIZE, .ivsize = AES_IV_SIZE, .setkey = crypto4xx_setkey_aes_ofb, - .encrypt = crypto4xx_encrypt_iv, - .decrypt = crypto4xx_decrypt_iv, + .encrypt = crypto4xx_encrypt_iv_stream, + .decrypt = crypto4xx_decrypt_iv_stream, .init = crypto4xx_sk_init, .exit = crypto4xx_sk_exit, } }, --- a/drivers/crypto/amcc/crypto4xx_core.h +++ b/drivers/crypto/amcc/crypto4xx_core.h @@ -173,10 +173,12 @@ int crypto4xx_setkey_rfc3686(struct cryp const u8 *key, unsigned int keylen); int crypto4xx_encrypt_ctr(struct skcipher_request *req); int crypto4xx_decrypt_ctr(struct skcipher_request *req); -int crypto4xx_encrypt_iv(struct skcipher_request *req); -int crypto4xx_decrypt_iv(struct skcipher_request *req); -int crypto4xx_encrypt_noiv(struct skcipher_request *req); -int crypto4xx_decrypt_noiv(struct skcipher_request *req); +int crypto4xx_encrypt_iv_stream(struct skcipher_request *req); +int crypto4xx_decrypt_iv_stream(struct skcipher_request *req); +int crypto4xx_encrypt_iv_block(struct skcipher_request *req); +int crypto4xx_decrypt_iv_block(struct skcipher_request *req); +int crypto4xx_encrypt_noiv_block(struct skcipher_request *req); +int crypto4xx_decrypt_noiv_block(struct skcipher_request *req); int crypto4xx_rfc3686_encrypt(struct skcipher_request *req); int crypto4xx_rfc3686_decrypt(struct skcipher_request *req); int crypto4xx_sha1_alg_init(struct crypto_tfm *tfm);
From: Hook, Gary Gary.Hook@amd.com
commit 20e833dc36355ed642d00067641a679c618303fa upstream.
The AES GCM function reuses an 'op' data structure, which members contain values that must be cleared for each (re)use.
This fix resolves a crypto self-test failure: alg: aead: gcm-aes-ccp encryption test failed (wrong result) on test vector 2, cfg="two even aligned splits"
Fixes: 36cf515b9bbe ("crypto: ccp - Enable support for AES GCM on v5 CCPs") Cc: stable@vger.kernel.org Signed-off-by: Gary R Hook gary.hook@amd.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/ccp/ccp-ops.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
--- a/drivers/crypto/ccp/ccp-ops.c +++ b/drivers/crypto/ccp/ccp-ops.c @@ -622,6 +622,7 @@ static int ccp_run_aes_gcm_cmd(struct cc
unsigned long long *final; unsigned int dm_offset; + unsigned int jobid; unsigned int ilen; bool in_place = true; /* Default value */ int ret; @@ -660,9 +661,11 @@ static int ccp_run_aes_gcm_cmd(struct cc p_tag = scatterwalk_ffwd(sg_tag, p_inp, ilen); }
+ jobid = CCP_NEW_JOBID(cmd_q->ccp); + memset(&op, 0, sizeof(op)); op.cmd_q = cmd_q; - op.jobid = CCP_NEW_JOBID(cmd_q->ccp); + op.jobid = jobid; op.sb_key = cmd_q->sb_key; /* Pre-allocated */ op.sb_ctx = cmd_q->sb_ctx; /* Pre-allocated */ op.init = 1; @@ -813,6 +816,13 @@ static int ccp_run_aes_gcm_cmd(struct cc final[0] = cpu_to_be64(aes->aad_len * 8); final[1] = cpu_to_be64(ilen * 8);
+ memset(&op, 0, sizeof(op)); + op.cmd_q = cmd_q; + op.jobid = jobid; + op.sb_key = cmd_q->sb_key; /* Pre-allocated */ + op.sb_ctx = cmd_q->sb_ctx; /* Pre-allocated */ + op.init = 1; + op.u.aes.type = aes->type; op.u.aes.mode = CCP_AES_MODE_GHASH; op.u.aes.action = CCP_AES_GHASHFINAL; op.src.type = CCP_MEMTYPE_SYSTEM;
From: Cfir Cohen cfir@google.com
commit 538a5a072e6ef04377b180ee9b3ce5bae0a85da4 upstream.
Avoid leaking GCM tag through timing side channel.
Fixes: 36cf515b9bbe ("crypto: ccp - Enable support for AES GCM on v5 CCPs") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Cfir Cohen cfir@google.com Acked-by: Gary R Hook ghook@amd.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/ccp/ccp-ops.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/crypto/ccp/ccp-ops.c +++ b/drivers/crypto/ccp/ccp-ops.c @@ -850,7 +850,8 @@ static int ccp_run_aes_gcm_cmd(struct cc if (ret) goto e_tag;
- ret = memcmp(tag.address, final_wa.address, AES_BLOCK_SIZE); + ret = crypto_memneq(tag.address, final_wa.address, + AES_BLOCK_SIZE) ? -EBADMSG : 0; ccp_dm_free(&tag); }
From: Wen Yang wen.yang99@zte.com.cn
commit 95566aa75cd6b3b404502c06f66956b5481194b3 upstream.
There is a possible double free issue in ppc4xx_trng_probe():
85: dev->trng_base = of_iomap(trng, 0); 86: of_node_put(trng); ---> released here 87: if (!dev->trng_base) 88: goto err_out; ... 110: ierr_out: 111: of_node_put(trng); ---> double released here ...
This issue was detected by using the Coccinelle software. We fix it by removing the unnecessary of_node_put().
Fixes: 5343e674f32f ("crypto4xx: integrate ppc4xx-rng into crypto4xx") Signed-off-by: Wen Yang wen.yang99@zte.com.cn Cc: stable@vger.kernel.org Cc: "David S. Miller" davem@davemloft.net Cc: Thomas Gleixner tglx@linutronix.de Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Allison Randal allison@lohutok.net Cc: Armijn Hemel armijn@tjaldur.nl Cc: Julia Lawall Julia.Lawall@lip6.fr Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Acked-by: Julia Lawall julia.lawall@lip6.fr Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/amcc/crypto4xx_trng.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/crypto/amcc/crypto4xx_trng.c +++ b/drivers/crypto/amcc/crypto4xx_trng.c @@ -108,7 +108,6 @@ void ppc4xx_trng_probe(struct crypto4xx_ return;
err_out: - of_node_put(trng); iounmap(dev->trng_base); kfree(rng); dev->trng_base = NULL;
From: Ronnie Sahlberg lsahlber@redhat.com
commit 3e2725796cbdfe4efc7eb7b27cacaeac2ddad1a5 upstream.
not just if CONFIG_CIFS_DEBUG2 is enabled.
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com CC: Stable stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/connect.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -1223,11 +1223,11 @@ next_pdu: atomic_read(&midCount)); cifs_dump_mem("Received Data is: ", bufs[i], HEADER_SIZE(server)); + smb2_add_credits_from_hdr(bufs[i], server); #ifdef CONFIG_CIFS_DEBUG2 if (server->ops->dump_detail) server->ops->dump_detail(bufs[i], server); - smb2_add_credits_from_hdr(bufs[i], server); cifs_dump_mids(server); #endif /* CIFS_DEBUG2 */ }
From: Ronnie Sahlberg lsahlber@redhat.com
commit 88a92c913cef09e70b1744a8877d177aa6cb2189 upstream.
RHBZ: 1722704
In low memory situations the various SMB2_*_init() functions can fail to allocate a request PDU and thus leave the request iovector as NULL.
If we don't check the return code for failure we end up calling smb2_set_next_command() with a NULL iovector causing a crash when it tries to dereference it.
CC: Stable stable@vger.kernel.org Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/smb2inode.c | 12 ++++++++++++ fs/cifs/smb2ops.c | 11 ++++++++++- 2 files changed, 22 insertions(+), 1 deletion(-)
--- a/fs/cifs/smb2inode.c +++ b/fs/cifs/smb2inode.c @@ -120,6 +120,8 @@ smb2_compound_op(const unsigned int xid, SMB2_O_INFO_FILE, 0, sizeof(struct smb2_file_all_info) + PATH_MAX * 2, 0, NULL); + if (rc) + goto finished; smb2_set_next_command(tcon, &rqst[num_rqst]); smb2_set_related(&rqst[num_rqst++]); trace_smb3_query_info_compound_enter(xid, ses->Suid, tcon->tid, @@ -147,6 +149,8 @@ smb2_compound_op(const unsigned int xid, COMPOUND_FID, current->tgid, FILE_DISPOSITION_INFORMATION, SMB2_O_INFO_FILE, 0, data, size); + if (rc) + goto finished; smb2_set_next_command(tcon, &rqst[num_rqst]); smb2_set_related(&rqst[num_rqst++]); trace_smb3_rmdir_enter(xid, ses->Suid, tcon->tid, full_path); @@ -163,6 +167,8 @@ smb2_compound_op(const unsigned int xid, COMPOUND_FID, current->tgid, FILE_END_OF_FILE_INFORMATION, SMB2_O_INFO_FILE, 0, data, size); + if (rc) + goto finished; smb2_set_next_command(tcon, &rqst[num_rqst]); smb2_set_related(&rqst[num_rqst++]); trace_smb3_set_eof_enter(xid, ses->Suid, tcon->tid, full_path); @@ -180,6 +186,8 @@ smb2_compound_op(const unsigned int xid, COMPOUND_FID, current->tgid, FILE_BASIC_INFORMATION, SMB2_O_INFO_FILE, 0, data, size); + if (rc) + goto finished; smb2_set_next_command(tcon, &rqst[num_rqst]); smb2_set_related(&rqst[num_rqst++]); trace_smb3_set_info_compound_enter(xid, ses->Suid, tcon->tid, @@ -206,6 +214,8 @@ smb2_compound_op(const unsigned int xid, COMPOUND_FID, current->tgid, FILE_RENAME_INFORMATION, SMB2_O_INFO_FILE, 0, data, size); + if (rc) + goto finished; smb2_set_next_command(tcon, &rqst[num_rqst]); smb2_set_related(&rqst[num_rqst++]); trace_smb3_rename_enter(xid, ses->Suid, tcon->tid, full_path); @@ -231,6 +241,8 @@ smb2_compound_op(const unsigned int xid, COMPOUND_FID, current->tgid, FILE_LINK_INFORMATION, SMB2_O_INFO_FILE, 0, data, size); + if (rc) + goto finished; smb2_set_next_command(tcon, &rqst[num_rqst]); smb2_set_related(&rqst[num_rqst++]); trace_smb3_hardlink_enter(xid, ses->Suid, tcon->tid, full_path); --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -2027,6 +2027,10 @@ smb2_set_related(struct smb_rqst *rqst) struct smb2_sync_hdr *shdr;
shdr = (struct smb2_sync_hdr *)(rqst->rq_iov[0].iov_base); + if (shdr == NULL) { + cifs_dbg(FYI, "shdr NULL in smb2_set_related\n"); + return; + } shdr->Flags |= SMB2_FLAGS_RELATED_OPERATIONS; }
@@ -2041,6 +2045,12 @@ smb2_set_next_command(struct cifs_tcon * unsigned long len = smb_rqst_len(server, rqst); int i, num_padding;
+ shdr = (struct smb2_sync_hdr *)(rqst->rq_iov[0].iov_base); + if (shdr == NULL) { + cifs_dbg(FYI, "shdr NULL in smb2_set_next_command\n"); + return; + } + /* SMB headers in a compound are 8 byte aligned. */
/* No padding needed */ @@ -2080,7 +2090,6 @@ smb2_set_next_command(struct cifs_tcon * }
finished: - shdr = (struct smb2_sync_hdr *)(rqst->rq_iov[0].iov_base); shdr->NextCommand = cpu_to_le32(len); }
From: Paulo Alcantara (SUSE) paulo@paulo.ac
commit 29fbeb7a908a60a5ae8c50fbe171cb8fdcef1980 upstream.
Fix mount options comparison when serverino option is turned off later in cifs_autodisable_serverino() and thus avoiding mismatch of new cifs mounts.
Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) paulo@paulo.ac Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Pavel Shilovsky pshilove@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/cifs_fs_sb.h | 5 +++++ fs/cifs/connect.c | 8 ++++++-- fs/cifs/misc.c | 1 + 3 files changed, 12 insertions(+), 2 deletions(-)
--- a/fs/cifs/cifs_fs_sb.h +++ b/fs/cifs/cifs_fs_sb.h @@ -83,5 +83,10 @@ struct cifs_sb_info { * failover properly. */ char *origin_fullpath; /* \HOST\SHARE[OPTIONAL PATH] */ + /* + * Indicate whether serverino option was turned off later + * (cifs_autodisable_serverino) in order to match new mounts. + */ + bool mnt_cifs_serverino_autodisabled; }; #endif /* _CIFS_FS_SB_H */ --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -3460,12 +3460,16 @@ compare_mount_options(struct super_block { struct cifs_sb_info *old = CIFS_SB(sb); struct cifs_sb_info *new = mnt_data->cifs_sb; + unsigned int oldflags = old->mnt_cifs_flags & CIFS_MOUNT_MASK; + unsigned int newflags = new->mnt_cifs_flags & CIFS_MOUNT_MASK;
if ((sb->s_flags & CIFS_MS_MASK) != (mnt_data->flags & CIFS_MS_MASK)) return 0;
- if ((old->mnt_cifs_flags & CIFS_MOUNT_MASK) != - (new->mnt_cifs_flags & CIFS_MOUNT_MASK)) + if (old->mnt_cifs_serverino_autodisabled) + newflags &= ~CIFS_MOUNT_SERVER_INUM; + + if (oldflags != newflags) return 0;
/* --- a/fs/cifs/misc.c +++ b/fs/cifs/misc.c @@ -539,6 +539,7 @@ cifs_autodisable_serverino(struct cifs_s tcon = cifs_sb_master_tcon(cifs_sb);
cifs_sb->mnt_cifs_flags &= ~CIFS_MOUNT_SERVER_INUM; + cifs_sb->mnt_cifs_serverino_autodisabled = true; cifs_dbg(VFS, "Autodisabling the use of server inode numbers on %s.\n", tcon ? tcon->treeName : "new server"); cifs_dbg(VFS, "The server doesn't seem to support them properly or the files might be on different servers (DFS).\n");
From: Ronnie Sahlberg lsahlber@redhat.com
commit aa081859b10c5d8b19f5c525c78883a59d73c2b8 upstream.
Servers can defer destaging any data and updating the mtime until close(). This means that if we do a setinfo to modify the mtime while other handles are open for write the server may overwrite our setinfo timestamps when if flushes the file on close() of the writeable handle.
To solve this we add an explicit flush when the mtime is about to be updated.
This fixes "cp -p" to preserve mtime when copying a file onto an SMB2 share.
CC: Stable stable@vger.kernel.org Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/inode.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
--- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -2408,6 +2408,8 @@ cifs_setattr_nounix(struct dentry *diren struct inode *inode = d_inode(direntry); struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb); struct cifsInodeInfo *cifsInode = CIFS_I(inode); + struct cifsFileInfo *wfile; + struct cifs_tcon *tcon; char *full_path = NULL; int rc = -EACCES; __u32 dosattr = 0; @@ -2454,6 +2456,20 @@ cifs_setattr_nounix(struct dentry *diren mapping_set_error(inode->i_mapping, rc); rc = 0;
+ if (attrs->ia_valid & ATTR_MTIME) { + rc = cifs_get_writable_file(cifsInode, false, &wfile); + if (!rc) { + tcon = tlink_tcon(wfile->tlink); + rc = tcon->ses->server->ops->flush(xid, tcon, &wfile->fid); + cifsFileInfo_put(wfile); + if (rc) + return rc; + } else if (rc != -EBADF) + return rc; + else + rc = 0; + } + if (attrs->ia_valid & ATTR_SIZE) { rc = cifs_set_file_size(inode, attrs, xid, full_path); if (rc != 0)
From: Aurelien Aptel aaptel@suse.com
commit 7e5a70ad88b1e6f6d9b934b2efb41afff496820f upstream.
Prevent deadlock between open_shroot() and cifs_mark_open_files_invalid() by releasing the lock before entering SMB2_open, taking it again after and checking if we still need to use the result.
Link: https://lore.kernel.org/linux-cifs/684ed01c-cbca-2716-bc28-b0a59a0f8521@prod... Fixes: 3d4ef9a15343 ("smb3: fix redundant opens on root") Signed-off-by: Aurelien Aptel aaptel@suse.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com CC: Stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/smb2ops.c | 46 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 45 insertions(+), 1 deletion(-)
--- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -694,8 +694,51 @@ int open_shroot(unsigned int xid, struct
smb2_set_related(&rqst[1]);
+ /* + * We do not hold the lock for the open because in case + * SMB2_open needs to reconnect, it will end up calling + * cifs_mark_open_files_invalid() which takes the lock again + * thus causing a deadlock + */ + + mutex_unlock(&tcon->crfid.fid_mutex); rc = compound_send_recv(xid, ses, flags, 2, rqst, resp_buftype, rsp_iov); + mutex_lock(&tcon->crfid.fid_mutex); + + /* + * Now we need to check again as the cached root might have + * been successfully re-opened from a concurrent process + */ + + if (tcon->crfid.is_valid) { + /* work was already done */ + + /* stash fids for close() later */ + struct cifs_fid fid = { + .persistent_fid = pfid->persistent_fid, + .volatile_fid = pfid->volatile_fid, + }; + + /* + * caller expects this func to set pfid to a valid + * cached root, so we copy the existing one and get a + * reference. + */ + memcpy(pfid, tcon->crfid.fid, sizeof(*pfid)); + kref_get(&tcon->crfid.refcount); + + mutex_unlock(&tcon->crfid.fid_mutex); + + if (rc == 0) { + /* close extra handle outside of crit sec */ + SMB2_close(xid, tcon, fid.persistent_fid, fid.volatile_fid); + } + goto oshr_free; + } + + /* Cached root is still invalid, continue normaly */ + if (rc) goto oshr_exit;
@@ -729,8 +772,9 @@ int open_shroot(unsigned int xid, struct (char *)&tcon->crfid.file_all_info)) tcon->crfid.file_all_info_is_valid = 1;
- oshr_exit: +oshr_exit: mutex_unlock(&tcon->crfid.fid_mutex); +oshr_free: SMB2_open_free(&rqst[0]); SMB2_query_info_free(&rqst[1]); free_rsp_buf(resp_buftype[0], rsp_iov[0].iov_base);
From: Coly Li colyli@suse.de
commit 695277f16b3a102fcc22c97fdf2de77c7b19f0b3 upstream.
This reverts commit 6147305c73e4511ca1a975b766b97a779d442567.
Although this patch helps the failed bcache device to stop faster when too many I/O errors detected on corresponding cached device, setting CACHE_SET_IO_DISABLE bit to cache set c->flags was not a good idea. This operation will disable all I/Os on cache set, which means other attached bcache devices won't work neither.
Without this patch, the failed bcache device can also be stopped eventually if internal I/O accomplished (e.g. writeback). Therefore here I revert it.
Fixes: 6147305c73e4 ("bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()") Reported-by: Yong Li mr.liyong@qq.com Signed-off-by: Coly Li colyli@suse.de Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/bcache/super.c | 17 ----------------- 1 file changed, 17 deletions(-)
--- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1437,8 +1437,6 @@ int bch_flash_dev_create(struct cache_se
bool bch_cached_dev_error(struct cached_dev *dc) { - struct cache_set *c; - if (!dc || test_bit(BCACHE_DEV_CLOSING, &dc->disk.flags)) return false;
@@ -1449,21 +1447,6 @@ bool bch_cached_dev_error(struct cached_ pr_err("stop %s: too many IO errors on backing device %s\n", dc->disk.disk->disk_name, dc->backing_dev_name);
- /* - * If the cached device is still attached to a cache set, - * even dc->io_disable is true and no more I/O requests - * accepted, cache device internal I/O (writeback scan or - * garbage collection) may still prevent bcache device from - * being stopped. So here CACHE_SET_IO_DISABLE should be - * set to c->flags too, to make the internal I/O to cache - * device rejected and stopped immediately. - * If c is NULL, that means the bcache device is not attached - * to any cache set, then no CACHE_SET_IO_DISABLE bit to set. - */ - c = dc->disk.c; - if (c && test_and_set_bit(CACHE_SET_IO_DISABLE, &c->flags)) - pr_info("CACHE_SET_IO_DISABLE already set"); - bcache_device_stop(&dc->disk); return true; }
From: Coly Li colyli@suse.de
commit 249a5f6da57c28a903c75d81505d58ec8c10030d upstream.
This reverts commit c4dc2497d50d9c6fb16aa0d07b6a14f3b2adb1e0.
This patch enlarges a race between normal btree flush code path and flush_btree_write(), which causes deadlock when journal space is exhausted. Reverts this patch makes the race window from 128 btree nodes to only 1 btree nodes.
Fixes: c4dc2497d50d ("bcache: fix high CPU occupancy during journal") Signed-off-by: Coly Li colyli@suse.de Cc: stable@vger.kernel.org Cc: Tang Junhui tang.junhui.linux@gmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/bcache/bcache.h | 2 - drivers/md/bcache/journal.c | 47 ++++++++++++++------------------------------ drivers/md/bcache/util.h | 2 - 3 files changed, 15 insertions(+), 36 deletions(-)
--- a/drivers/md/bcache/bcache.h +++ b/drivers/md/bcache/bcache.h @@ -726,8 +726,6 @@ struct cache_set {
#define BUCKET_HASH_BITS 12 struct hlist_head bucket_hash[1 << BUCKET_HASH_BITS]; - - DECLARE_HEAP(struct btree *, flush_btree); };
struct bbio { --- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -391,12 +391,6 @@ err: }
/* Journalling */ -#define journal_max_cmp(l, r) \ - (fifo_idx(&c->journal.pin, btree_current_write(l)->journal) < \ - fifo_idx(&(c)->journal.pin, btree_current_write(r)->journal)) -#define journal_min_cmp(l, r) \ - (fifo_idx(&c->journal.pin, btree_current_write(l)->journal) > \ - fifo_idx(&(c)->journal.pin, btree_current_write(r)->journal))
static void btree_flush_write(struct cache_set *c) { @@ -404,35 +398,25 @@ static void btree_flush_write(struct cac * Try to find the btree node with that references the oldest journal * entry, best is our current candidate and is locked if non NULL: */ - struct btree *b; - int i; + struct btree *b, *best; + unsigned int i;
atomic_long_inc(&c->flush_write); - retry: - spin_lock(&c->journal.lock); - if (heap_empty(&c->flush_btree)) { - for_each_cached_btree(b, c, i) - if (btree_current_write(b)->journal) { - if (!heap_full(&c->flush_btree)) - heap_add(&c->flush_btree, b, - journal_max_cmp); - else if (journal_max_cmp(b, - heap_peek(&c->flush_btree))) { - c->flush_btree.data[0] = b; - heap_sift(&c->flush_btree, 0, - journal_max_cmp); - } - } - - for (i = c->flush_btree.used / 2 - 1; i >= 0; --i) - heap_sift(&c->flush_btree, i, journal_min_cmp); - } + best = NULL;
- b = NULL; - heap_pop(&c->flush_btree, b, journal_min_cmp); - spin_unlock(&c->journal.lock); + for_each_cached_btree(b, c, i) + if (btree_current_write(b)->journal) { + if (!best) + best = b; + else if (journal_pin_cmp(c, + btree_current_write(best)->journal, + btree_current_write(b)->journal)) { + best = b; + } + }
+ b = best; if (b) { mutex_lock(&b->write_lock); if (!btree_current_write(b)->journal) { @@ -874,8 +858,7 @@ int bch_journal_alloc(struct cache_set * j->w[0].c = c; j->w[1].c = c;
- if (!(init_heap(&c->flush_btree, 128, GFP_KERNEL)) || - !(init_fifo(&j->pin, JOURNAL_PIN, GFP_KERNEL)) || + if (!(init_fifo(&j->pin, JOURNAL_PIN, GFP_KERNEL)) || !(j->w[0].data = (void *) __get_free_pages(GFP_KERNEL, JSET_BITS)) || !(j->w[1].data = (void *) __get_free_pages(GFP_KERNEL, JSET_BITS))) return -ENOMEM; --- a/drivers/md/bcache/util.h +++ b/drivers/md/bcache/util.h @@ -113,8 +113,6 @@ do { \
#define heap_full(h) ((h)->used == (h)->size)
-#define heap_empty(h) ((h)->used == 0) - #define DECLARE_FIFO(type, name) \ struct { \ size_t front, back, size, mask; \
From: Coly Li colyli@suse.de
commit ba82c1ac1667d6efb91a268edb13fc9cdaecec9b upstream.
This reverts commit 6268dc2c4703aabfb0b35681be709acf4c2826c6.
This patch depends on commit c4dc2497d50d ("bcache: fix high CPU occupancy during journal") which is reverted in previous patch. So revert this one too.
Fixes: 6268dc2c4703 ("bcache: free heap cache_set->flush_btree in bch_journal_free") Signed-off-by: Coly Li colyli@suse.de Cc: stable@vger.kernel.org Cc: Shenghui Wang shhuiw@foxmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/bcache/journal.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/md/bcache/journal.c +++ b/drivers/md/bcache/journal.c @@ -843,7 +843,6 @@ void bch_journal_free(struct cache_set * free_pages((unsigned long) c->journal.w[1].data, JSET_BITS); free_pages((unsigned long) c->journal.w[0].data, JSET_BITS); free_fifo(&c->journal.pin); - free_heap(&c->flush_btree); }
int bch_journal_alloc(struct cache_set *c)
From: Coly Li colyli@suse.de
commit 578df99b1b0531d19af956530fe4da63d01a1604 upstream.
When md raid device (e.g. raid456) is used as backing device, read-ahead requests on a degrading and recovering md raid device might be failured immediately by md raid code, but indeed this md raid array can still be read or write for normal I/O requests. Therefore such failed read-ahead request are not real hardware failure. Further more, after degrading and recovering accomplished, read-ahead requests will be handled by md raid array again.
For such condition, I/O failures of read-ahead requests don't indicate real health status (because normal I/O still be served), they should not be counted into I/O error counter dc->io_errors.
Since there is no simple way to detect whether the backing divice is a md raid device, this patch simply ignores I/O failures for read-ahead bios on backing device, to avoid bogus backing device failure on a degrading md raid array.
Suggested-and-tested-by: Thorsten Knabe linux@thorsten-knabe.de Signed-off-by: Coly Li colyli@suse.de Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/bcache/io.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/md/bcache/io.c +++ b/drivers/md/bcache/io.c @@ -58,6 +58,18 @@ void bch_count_backing_io_errors(struct
WARN_ONCE(!dc, "NULL pointer of struct cached_dev");
+ /* + * Read-ahead requests on a degrading and recovering md raid + * (e.g. raid6) device might be failured immediately by md + * raid code, which is not a real hardware media failure. So + * we shouldn't count failed REQ_RAHEAD bio to dc->io_errors. + */ + if (bio->bi_opf & REQ_RAHEAD) { + pr_warn_ratelimited("%s: Read-ahead I/O failed on backing device, ignore", + dc->backing_dev_name); + return; + } + errors = atomic_add_return(1, &dc->io_errors); if (errors < dc->error_limit) pr_err("%s: IO error on backing device, unrecoverable",
From: Coly Li colyli@suse.de
commit 5461999848e0462c14f306a62923d22de820a59c upstream.
In bch_cached_dev_files[] from driver/md/bcache/sysfs.c, sysfs_errors is incorrectly inserted in. The correct entry should be sysfs_io_errors.
This patch fixes the problem and now I/O errors of cached device can be read from /sys/block/bcache<N>/bcache/io_errors.
Fixes: c7b7bd07404c5 ("bcache: add io_disable to struct cached_dev") Signed-off-by: Coly Li colyli@suse.de Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/bcache/sysfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/md/bcache/sysfs.c +++ b/drivers/md/bcache/sysfs.c @@ -182,7 +182,7 @@ SHOW(__bch_cached_dev) var_print(writeback_percent); sysfs_hprint(writeback_rate, wb ? atomic_long_read(&dc->writeback_rate.rate) << 9 : 0); - sysfs_hprint(io_errors, atomic_read(&dc->io_errors)); + sysfs_printf(io_errors, "%i", atomic_read(&dc->io_errors)); sysfs_printf(io_error_limit, "%i", dc->error_limit); sysfs_printf(io_disable, "%i", dc->io_disable); var_print(writeback_rate_update_seconds); @@ -474,7 +474,7 @@ static struct attribute *bch_cached_dev_ &sysfs_writeback_rate_p_term_inverse, &sysfs_writeback_rate_minimum, &sysfs_writeback_rate_debug, - &sysfs_errors, + &sysfs_io_errors, &sysfs_io_error_limit, &sysfs_io_disable, &sysfs_dirty_data,
From: Coly Li colyli@suse.de
commit f54d801dda14942dbefa00541d10603015b7859c upstream.
Commit 9baf30972b55 ("bcache: fix for gc and write-back race") added a new work queue dc->writeback_write_wq, but forgot to destroy it in the error condition when creating dc->writeback_thread failed.
This patch destroys dc->writeback_write_wq if kthread_create() returns error pointer to dc->writeback_thread, then a memory leak is avoided.
Fixes: 9baf30972b55 ("bcache: fix for gc and write-back race") Signed-off-by: Coly Li colyli@suse.de Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/bcache/writeback.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/md/bcache/writeback.c +++ b/drivers/md/bcache/writeback.c @@ -834,6 +834,7 @@ int bch_cached_dev_writeback_start(struc "bcache_writeback"); if (IS_ERR(dc->writeback_thread)) { cached_dev_put(dc); + destroy_workqueue(dc->writeback_write_wq); return PTR_ERR(dc->writeback_thread); } dc->writeback_running = true;
From: Grant Hernandez granthernandez@google.com
commit 2a017fd82c5402b3c8df5e3d6e5165d9e6147dc1 upstream.
The GTCO tablet input driver configures itself from an HID report sent via USB during the initial enumeration process. Some debugging messages are generated during the parsing. A debugging message indentation counter is not bounds checked, leading to the ability for a specially crafted HID report to cause '-' and null bytes be written past the end of the indentation array. As long as the kernel has CONFIG_DYNAMIC_DEBUG enabled, this code will not be optimized out. This was discovered during code review after a previous syzkaller bug was found in this driver.
Signed-off-by: Grant Hernandez granthernandez@google.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/input/tablet/gtco.c | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-)
--- a/drivers/input/tablet/gtco.c +++ b/drivers/input/tablet/gtco.c @@ -78,6 +78,7 @@ Scott Hill shill@gtcocalcomp.com
/* Max size of a single report */ #define REPORT_MAX_SIZE 10 +#define MAX_COLLECTION_LEVELS 10
/* Bitmask whether pen is in range */ @@ -223,8 +224,7 @@ static void parse_hid_report_descriptor( char maintype = 'x'; char globtype[12]; int indent = 0; - char indentstr[10] = ""; - + char indentstr[MAX_COLLECTION_LEVELS + 1] = { 0 };
dev_dbg(ddev, "======>>>>>>PARSE<<<<<<======\n");
@@ -350,6 +350,13 @@ static void parse_hid_report_descriptor( case TAG_MAIN_COL_START: maintype = 'S';
+ if (indent == MAX_COLLECTION_LEVELS) { + dev_err(ddev, "Collection level %d would exceed limit of %d\n", + indent + 1, + MAX_COLLECTION_LEVELS); + break; + } + if (data == 0) { dev_dbg(ddev, "======>>>>>> Physical\n"); strcpy(globtype, "Physical"); @@ -369,8 +376,15 @@ static void parse_hid_report_descriptor( break;
case TAG_MAIN_COL_END: - dev_dbg(ddev, "<<<<<<======\n"); maintype = 'E'; + + if (indent == 0) { + dev_err(ddev, "Collection level already at zero\n"); + break; + } + + dev_dbg(ddev, "<<<<<<======\n"); + indent--; for (x = 0; x < indent; x++) indentstr[x] = '-';
From: Hui Wang hui.wang@canonical.com
commit 7e4935ccc3236751e5fe4bd6846f86e46bb2e427 upstream.
On a latest Lenovo laptop, the trackpoint and 3 buttons below it don't work at all, when we move the trackpoint or press those 3 buttons, the kernel will print out: "Rejected trackstick packet from non DualPoint device"
This device is identified as an alps touchpad but the packet has trackpoint format, so the alps.c drops the packet and prints out the message above.
According to XiaoXiao's explanation, this device is named cs19 and is trackpoint-only device, its firmware is only for trackpoint, it is independent of touchpad and is a device completely different from DualPoint ones.
To drive this device with mininal changes to the existing driver, we just let the alps driver not handle this device, then the trackpoint.c will be the driver of this device if the trackpoint driver is enabled. (if not, this device will fallback to a bare PS/2 device)
With the trackpoint.c, this trackpoint and 3 buttons all work well, they have all features that the trackpoint should have, like scrolling-screen, drag-and-drop and frame-selection.
Signed-off-by: XiaoXiao Liu sliuuxiaonxiao@gmail.com Signed-off-by: Hui Wang hui.wang@canonical.com Reviewed-by: Pali Rohár pali.rohar@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/input/mouse/alps.c | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+)
--- a/drivers/input/mouse/alps.c +++ b/drivers/input/mouse/alps.c @@ -21,6 +21,7 @@
#include "psmouse.h" #include "alps.h" +#include "trackpoint.h"
/* * Definitions for ALPS version 3 and 4 command mode protocol @@ -2861,6 +2862,23 @@ static const struct alps_protocol_info * return NULL; }
+static bool alps_is_cs19_trackpoint(struct psmouse *psmouse) +{ + u8 param[2] = { 0 }; + + if (ps2_command(&psmouse->ps2dev, + param, MAKE_PS2_CMD(0, 2, TP_READ_ID))) + return false; + + /* + * param[0] contains the trackpoint device variant_id while + * param[1] contains the firmware_id. So far all alps + * trackpoint-only devices have their variant_ids equal + * TP_VARIANT_ALPS and their firmware_ids are in 0x20~0x2f range. + */ + return param[0] == TP_VARIANT_ALPS && (param[1] & 0x20); +} + static int alps_identify(struct psmouse *psmouse, struct alps_data *priv) { const struct alps_protocol_info *protocol; @@ -3162,6 +3180,20 @@ int alps_detect(struct psmouse *psmouse, return error;
/* + * ALPS cs19 is a trackpoint-only device, and uses different + * protocol than DualPoint ones, so we return -EINVAL here and let + * trackpoint.c drive this device. If the trackpoint driver is not + * enabled, the device will fall back to a bare PS/2 mouse. + * If ps2_command() fails here, we depend on the immediately + * followed psmouse_reset() to reset the device to normal state. + */ + if (alps_is_cs19_trackpoint(psmouse)) { + psmouse_dbg(psmouse, + "ALPS CS19 trackpoint-only device detected, ignoring\n"); + return -EINVAL; + } + + /* * Reset the device to make sure it is fully operational: * on some laptops, like certain Dell Latitudes, we may * fail to properly detect presence of trackstick if device
From: Nick Black dankamongmen@gmail.com
commit 1976d7d200c5a32e72293a2ada36b7b7c9d6dd6e upstream.
Adds the Lenovo T580 to the SMBus intertouch list for Synaptics touchpads. I've tested with this for a week now, and it seems a great improvement. It's also nice to have the complaint gone from dmesg.
Signed-off-by: Nick Black dankamongmen@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/input/mouse/synaptics.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/input/mouse/synaptics.c +++ b/drivers/input/mouse/synaptics.c @@ -176,6 +176,7 @@ static const char * const smbus_pnp_ids[ "LEN0093", /* T480 */ "LEN0096", /* X280 */ "LEN0097", /* X280 -> ALPS trackpoint */ + "LEN009b", /* T580 */ "LEN200f", /* T450s */ "LEN2054", /* E480 */ "LEN2055", /* E580 */
From: Hui Wang hui.wang@canonical.com
commit 771a081e44a9baa1991ef011cc453ef425591740 upstream.
In the function alps_is_cs19_trackpoint(), we check if the param[1] is in the 0x20~0x2f range, but the code we wrote for this checking is not correct: (param[1] & 0x20) does not mean param[1] is in the range of 0x20~0x2f, it also means the param[1] is in the range of 0x30~0x3f, 0x60~0x6f...
Now fix it with a new condition checking ((param[1] & 0xf0) == 0x20).
Fixes: 7e4935ccc323 ("Input: alps - don't handle ALPS cs19 trackpoint-only device") Cc: stable@vger.kernel.org Signed-off-by: Hui Wang hui.wang@canonical.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/input/mouse/alps.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/input/mouse/alps.c +++ b/drivers/input/mouse/alps.c @@ -2876,7 +2876,7 @@ static bool alps_is_cs19_trackpoint(stru * trackpoint-only devices have their variant_ids equal * TP_VARIANT_ALPS and their firmware_ids are in 0x20~0x2f range. */ - return param[0] == TP_VARIANT_ALPS && (param[1] & 0x20); + return param[0] == TP_VARIANT_ALPS && ((param[1] & 0xf0) == 0x20); }
static int alps_identify(struct psmouse *psmouse, struct alps_data *priv)
From: Krzysztof Kozlowski krzk@kernel.org
commit 70ca117b02f3b1c8830fe95e4e3dea2937038e11 upstream.
If devm_gpiod_get_from_of_node() call returns ERR_PTR, it is assigned into an array of GPIO descriptors and used later because such error is not treated as critical thus it is not propagated back to the probe function.
All code later expects that such GPIO descriptor is either a NULL or proper value. This later might lead to dereference of ERR_PTR.
Only devices with S2MPS14 flavor are affected (other do not control regulators with GPIOs).
Fixes: 1c984942f0a4 ("regulator: s2mps11: Pass descriptor instead of GPIO number") Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/regulator/s2mps11.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/regulator/s2mps11.c +++ b/drivers/regulator/s2mps11.c @@ -826,6 +826,7 @@ static void s2mps14_pmic_dt_parse_ext_co else if (IS_ERR(gpio[reg])) { dev_err(&pdev->dev, "Failed to get control GPIO for %d/%s\n", reg, rdata[reg].name); + gpio[reg] = NULL; continue; } if (gpio[reg])
From: Krzysztof Kozlowski krzk@kernel.org
commit 16da0eb5ab6ef2dd1d33431199126e63db9997cc upstream.
On S2MPS11 device, the buck7 and buck8 regulator voltages start at 750 mV, not 600 mV. Using wrong minimal value caused shifting of these regulator values by 150 mV (e.g. buck7 usually configured to v1.35 V was reported as 1.2 V).
On most of the boards these regulators are left in default state so this was only affecting reported voltage. However if any driver wanted to change them, then effectively it would set voltage 150 mV higher than intended.
Cc: stable@vger.kernel.org Fixes: cb74685ecb39 ("regulator: s2mps11: Add samsung s2mps11 regulator driver") Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/regulator/s2mps11.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/regulator/s2mps11.c +++ b/drivers/regulator/s2mps11.c @@ -372,8 +372,8 @@ static const struct regulator_desc s2mps regulator_desc_s2mps11_buck1_4(4), regulator_desc_s2mps11_buck5, regulator_desc_s2mps11_buck67810(6, MIN_600_MV, STEP_6_25_MV), - regulator_desc_s2mps11_buck67810(7, MIN_600_MV, STEP_12_5_MV), - regulator_desc_s2mps11_buck67810(8, MIN_600_MV, STEP_12_5_MV), + regulator_desc_s2mps11_buck67810(7, MIN_750_MV, STEP_12_5_MV), + regulator_desc_s2mps11_buck67810(8, MIN_750_MV, STEP_12_5_MV), regulator_desc_s2mps11_buck9, regulator_desc_s2mps11_buck67810(10, MIN_750_MV, STEP_12_5_MV), };
From: Jon Hunter jonathanh@nvidia.com
commit ece6031ece2dd64d63708cfe1088016cee5b10c0 upstream.
The GPU regulator enable ramp delay for Jetson TX1 is set to 1ms which not sufficient because the enable ramp delay has been measured to be greater than 1ms. Furthermore, the downstream kernels released by NVIDIA for Jetson TX1 are using a enable ramp delay 2ms and a settling delay of 160us. Update the GPU regulator enable ramp delay for Jetson TX1 to be 2ms and add a settling delay of 160us.
Cc: stable@vger.kernel.org Signed-off-by: Jon Hunter jonathanh@nvidia.com Fixes: 5e6b9a89afce ("arm64: tegra: Add VDD_GPU regulator to Jetson TX1") Signed-off-by: Thierry Reding treding@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/boot/dts/nvidia/tegra210-p2180.dtsi | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/arm64/boot/dts/nvidia/tegra210-p2180.dtsi +++ b/arch/arm64/boot/dts/nvidia/tegra210-p2180.dtsi @@ -328,7 +328,8 @@ regulator-max-microvolt = <1320000>; enable-gpios = <&pmic 6 GPIO_ACTIVE_HIGH>; regulator-ramp-delay = <80>; - regulator-enable-ramp-delay = <1000>; + regulator-enable-ramp-delay = <2000>; + regulator-settling-time-us = <160>; }; }; };
From: Jon Hunter jonathanh@nvidia.com
commit 434e8aedeaec595933811c2af191db9f11d3ce3b upstream.
There are a few issues with the GPU regulator defined for Jetson Nano which are:
1. The GPU regulator is a PWM based regulator and not a fixed voltage regulator. 2. The output voltages for the GPU regulator are not correct. 3. The regulator enable ramp delay is too short for the regulator and needs to be increased. 2ms should be sufficient. 4. This is the same regulator used on Jetson TX1 and so make the ramp delay and settling time the same as Jetson TX1.
Cc: stable@vger.kernel.org Signed-off-by: Jon Hunter jonathanh@nvidia.com Fixes: 6772cd0eacc8 ("arm64: tegra: Add NVIDIA Jetson Nano Developer Kit support") Signed-off-by: Thierry Reding treding@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-)
--- a/arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts +++ b/arch/arm64/boot/dts/nvidia/tegra210-p3450-0000.dts @@ -633,17 +633,16 @@ };
vdd_gpu: regulator@6 { - compatible = "regulator-fixed"; + compatible = "pwm-regulator"; reg = <6>; - + pwms = <&pwm 1 4880>; regulator-name = "VDD_GPU"; - regulator-min-microvolt = <5000000>; - regulator-max-microvolt = <5000000>; - regulator-enable-ramp-delay = <250>; - - gpio = <&pmic 6 GPIO_ACTIVE_HIGH>; - enable-active-high; - + regulator-min-microvolt = <710000>; + regulator-max-microvolt = <1320000>; + regulator-ramp-delay = <80>; + regulator-enable-ramp-delay = <2000>; + regulator-settling-time-us = <160>; + enable-gpios = <&pmic 6 GPIO_ACTIVE_HIGH>; vin-supply = <&vdd_5v0_sys>; }; };
From: Oren Givon oren.givon@intel.com
commit 498d3eb5bfbb2e05e40005152976a7b9eadfb59c upstream.
The 22000 series FW that was meant to be used with hr is also the FW that is used for hr1 and has a different RF ID. Add support to load the hr FW when hr1 RF ID is detected.
Cc: stable@vger.kernel.org # 5.1+ Signed-off-by: Oren Givon oren.givon@intel.com Signed-off-by: Luciano Coelho luciano.coelho@intel.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/intel/iwlwifi/iwl-csr.h | 1 + drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 8 +++++--- 2 files changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/net/wireless/intel/iwlwifi/iwl-csr.h +++ b/drivers/net/wireless/intel/iwlwifi/iwl-csr.h @@ -336,6 +336,7 @@ enum { /* RF_ID value */ #define CSR_HW_RF_ID_TYPE_JF (0x00105100) #define CSR_HW_RF_ID_TYPE_HR (0x0010A000) +#define CSR_HW_RF_ID_TYPE_HR1 (0x0010c100) #define CSR_HW_RF_ID_TYPE_HRCDB (0x00109F00) #define CSR_HW_RF_ID_TYPE_GF (0x0010D000) #define CSR_HW_RF_ID_TYPE_GF4 (0x0010E000) --- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c @@ -3575,9 +3575,11 @@ struct iwl_trans *iwl_trans_pcie_alloc(s trans->cfg = &iwlax210_2ax_cfg_so_gf4_a0; } } else if (cfg == &iwl_ax101_cfg_qu_hr) { - if (CSR_HW_RF_ID_TYPE_CHIP_ID(trans->hw_rf_id) == - CSR_HW_RF_ID_TYPE_CHIP_ID(CSR_HW_RF_ID_TYPE_HR) && - trans->hw_rev == CSR_HW_REV_TYPE_QNJ_B0) { + if ((CSR_HW_RF_ID_TYPE_CHIP_ID(trans->hw_rf_id) == + CSR_HW_RF_ID_TYPE_CHIP_ID(CSR_HW_RF_ID_TYPE_HR) && + trans->hw_rev == CSR_HW_REV_TYPE_QNJ_B0) || + (CSR_HW_RF_ID_TYPE_CHIP_ID(trans->hw_rf_id) == + CSR_HW_RF_ID_TYPE_CHIP_ID(CSR_HW_RF_ID_TYPE_HR1))) { trans->cfg = &iwl22000_2ax_cfg_qnj_hr_b0; } else if (CSR_HW_RF_ID_TYPE_CHIP_ID(trans->hw_rf_id) == CSR_HW_RF_ID_TYPE_CHIP_ID(CSR_HW_RF_ID_TYPE_HR)) {
From: Emmanuel Grumbach emmanuel.grumbach@intel.com
commit 3b57a10ca14c619707398dc58fe5ece18c95b20b upstream.
Sometimes the register status can include interrupts that were masked. We can, for example, get the RF-Kill bit set in the interrupt status register although this interrupt was masked. Then if we get the ALIVE interrupt (for example) that was not masked, we need to *not* service the RF-Kill interrupt. Fix this in the MSI-X interrupt handler.
Cc: stable@vger.kernel.org Signed-off-by: Emmanuel Grumbach emmanuel.grumbach@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/intel/iwlwifi/pcie/rx.c | 27 +++++++++++++++++++++------ 1 file changed, 21 insertions(+), 6 deletions(-)
--- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c @@ -2108,10 +2108,18 @@ irqreturn_t iwl_pcie_irq_msix_handler(in return IRQ_NONE; }
- if (iwl_have_debug_level(IWL_DL_ISR)) - IWL_DEBUG_ISR(trans, "ISR inta_fh 0x%08x, enabled 0x%08x\n", - inta_fh, + if (iwl_have_debug_level(IWL_DL_ISR)) { + IWL_DEBUG_ISR(trans, + "ISR inta_fh 0x%08x, enabled (sw) 0x%08x (hw) 0x%08x\n", + inta_fh, trans_pcie->fh_mask, iwl_read32(trans, CSR_MSIX_FH_INT_MASK_AD)); + if (inta_fh & ~trans_pcie->fh_mask) + IWL_DEBUG_ISR(trans, + "We got a masked interrupt (0x%08x)\n", + inta_fh & ~trans_pcie->fh_mask); + } + + inta_fh &= trans_pcie->fh_mask;
if ((trans_pcie->shared_vec_mask & IWL_SHARED_IRQ_NON_RX) && inta_fh & MSIX_FH_INT_CAUSES_Q0) { @@ -2151,11 +2159,18 @@ irqreturn_t iwl_pcie_irq_msix_handler(in }
/* After checking FH register check HW register */ - if (iwl_have_debug_level(IWL_DL_ISR)) + if (iwl_have_debug_level(IWL_DL_ISR)) { IWL_DEBUG_ISR(trans, - "ISR inta_hw 0x%08x, enabled 0x%08x\n", - inta_hw, + "ISR inta_hw 0x%08x, enabled (sw) 0x%08x (hw) 0x%08x\n", + inta_hw, trans_pcie->hw_mask, iwl_read32(trans, CSR_MSIX_HW_INT_MASK_AD)); + if (inta_hw & ~trans_pcie->hw_mask) + IWL_DEBUG_ISR(trans, + "We got a masked interrupt 0x%08x\n", + inta_hw & ~trans_pcie->hw_mask); + } + + inta_hw &= trans_pcie->hw_mask;
/* Alive notification via Rx interrupt will do the real work */ if (inta_hw & MSIX_HW_INT_CAUSES_REG_ALIVE) {
From: Emmanuel Grumbach emmanuel.grumbach@intel.com
commit ec46ae30245ecb41d73f8254613db07c653fb498 upstream.
We added code to restock the buffer upon ALIVE interrupt when MSI-X is disabled. This was added as part of the context info code. This code was added only if the ISR debug level is set which is very unlikely to be related. Move this code to run even when the ISR debug level is not set.
Note that gen2 devices work with MSI-X in most cases so that this path is seldom used.
Cc: stable@vger.kernel.org Signed-off-by: Emmanuel Grumbach emmanuel.grumbach@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/intel/iwlwifi/pcie/rx.c | 34 ++++++++++++--------------- 1 file changed, 16 insertions(+), 18 deletions(-)
--- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c @@ -1827,25 +1827,23 @@ irqreturn_t iwl_pcie_irq_handler(int irq goto out; }
- if (iwl_have_debug_level(IWL_DL_ISR)) { - /* NIC fires this, but we don't use it, redundant with WAKEUP */ - if (inta & CSR_INT_BIT_SCD) { - IWL_DEBUG_ISR(trans, - "Scheduler finished to transmit the frame/frames.\n"); - isr_stats->sch++; - } + /* NIC fires this, but we don't use it, redundant with WAKEUP */ + if (inta & CSR_INT_BIT_SCD) { + IWL_DEBUG_ISR(trans, + "Scheduler finished to transmit the frame/frames.\n"); + isr_stats->sch++; + }
- /* Alive notification via Rx interrupt will do the real work */ - if (inta & CSR_INT_BIT_ALIVE) { - IWL_DEBUG_ISR(trans, "Alive interrupt\n"); - isr_stats->alive++; - if (trans->cfg->gen2) { - /* - * We can restock, since firmware configured - * the RFH - */ - iwl_pcie_rxmq_restock(trans, trans_pcie->rxq); - } + /* Alive notification via Rx interrupt will do the real work */ + if (inta & CSR_INT_BIT_ALIVE) { + IWL_DEBUG_ISR(trans, "Alive interrupt\n"); + isr_stats->alive++; + if (trans->cfg->gen2) { + /* + * We can restock, since firmware configured + * the RFH + */ + iwl_pcie_rxmq_restock(trans, trans_pcie->rxq); } }
From: Emmanuel Grumbach emmanuel.grumbach@intel.com
commit 0d53cfd0cca3c729a089c39eef0e7d8ae7662974 upstream.
iwl_mvm_send_cmd returns 0 when the command won't be sent because RF-Kill is asserted. Do the same when we call iwl_get_shared_mem_conf since it is not sent through iwl_mvm_send_cmd but directly calls the transport layer.
Cc: stable@vger.kernel.org Signed-off-by: Emmanuel Grumbach emmanuel.grumbach@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/intel/iwlwifi/fw/smem.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
--- a/drivers/net/wireless/intel/iwlwifi/fw/smem.c +++ b/drivers/net/wireless/intel/iwlwifi/fw/smem.c @@ -8,7 +8,7 @@ * Copyright(c) 2012 - 2014 Intel Corporation. All rights reserved. * Copyright(c) 2013 - 2015 Intel Mobile Communications GmbH * Copyright(c) 2016 - 2017 Intel Deutschland GmbH - * Copyright(c) 2018 Intel Corporation + * Copyright(c) 2018 - 2019 Intel Corporation * * This program is free software; you can redistribute it and/or modify * it under the terms of version 2 of the GNU General Public License as @@ -31,7 +31,7 @@ * Copyright(c) 2012 - 2014 Intel Corporation. All rights reserved. * Copyright(c) 2013 - 2015 Intel Mobile Communications GmbH * Copyright(c) 2016 - 2017 Intel Deutschland GmbH - * Copyright(c) 2018 Intel Corporation + * Copyright(c) 2018 - 2019 Intel Corporation * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -134,6 +134,7 @@ void iwl_get_shared_mem_conf(struct iwl_ .len = { 0, }, }; struct iwl_rx_packet *pkt; + int ret;
if (fw_has_capa(&fwrt->fw->ucode_capa, IWL_UCODE_TLV_CAPA_EXTEND_SHARED_MEM_CFG)) @@ -141,8 +142,13 @@ void iwl_get_shared_mem_conf(struct iwl_ else cmd.id = SHARED_MEM_CFG;
- if (WARN_ON(iwl_trans_send_cmd(fwrt->trans, &cmd))) + ret = iwl_trans_send_cmd(fwrt->trans, &cmd); + + if (ret) { + WARN(ret != -ERFKILL, + "Could not send the SMEM command: %d\n", ret); return; + }
pkt = cmd.resp_pkt; if (fwrt->trans->cfg->device_family >= IWL_DEVICE_FAMILY_22000)
From: Emmanuel Grumbach emmanuel.grumbach@intel.com
commit ed3e4c6d3cd8f093a3636cb05492429fe2af228d upstream.
Newest devices have a new firmware load mechanism. This mechanism is called the context info. It means that the driver doesn't need to load the sections of the firmware. The driver rather prepares a place in DRAM, with pointers to the relevant sections of the firmware, and the firmware loads itself. At the end of the process, the firmware sends the ALIVE interrupt. This is different from the previous scheme in which the driver expected the FH_TX interrupt after each section being transferred over the DMA.
In order to support this new flow, we enabled all the interrupts. This broke the assumption that we have in the code that the RF-Kill interrupt can't interrupt the firmware load flow.
Change the context info flow to enable only the ALIVE interrupt, and re-enable all the other interrupts only after the firmware is alive. Then, we won't see the RF-Kill interrupt until then. Getting the RF-Kill interrupt while loading the firmware made us kill the firmware while it is loading and we ended up dumping garbage instead of the firmware state.
Re-enable the ALIVE | RX interrupts from the ISR when we get the ALIVE interrupt to be able to get the RX interrupt that comes immediately afterwards for the ALIVE notification. This is needed for non MSI-X only.
Cc: stable@vger.kernel.org Signed-off-by: Emmanuel Grumbach emmanuel.grumbach@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c | 2 - drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info.c | 2 - drivers/net/wireless/intel/iwlwifi/pcie/internal.h | 27 +++++++++++++++ drivers/net/wireless/intel/iwlwifi/pcie/rx.c | 5 ++ drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c | 9 +++++ 5 files changed, 43 insertions(+), 2 deletions(-)
--- a/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c @@ -169,7 +169,7 @@ int iwl_pcie_ctxt_info_gen3_init(struct
memcpy(iml_img, trans->iml, trans->iml_len);
- iwl_enable_interrupts(trans); + iwl_enable_fw_load_int_ctx_info(trans);
/* kick FW self load */ iwl_write64(trans, CSR_CTXT_INFO_ADDR, --- a/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info.c @@ -222,7 +222,7 @@ int iwl_pcie_ctxt_info_init(struct iwl_t
trans_pcie->ctxt_info = ctxt_info;
- iwl_enable_interrupts(trans); + iwl_enable_fw_load_int_ctx_info(trans);
/* Configure debug, if exists */ if (iwl_pcie_dbg_on(trans)) --- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h +++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h @@ -874,6 +874,33 @@ static inline void iwl_enable_fw_load_in } }
+static inline void iwl_enable_fw_load_int_ctx_info(struct iwl_trans *trans) +{ + struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); + + IWL_DEBUG_ISR(trans, "Enabling ALIVE interrupt only\n"); + + if (!trans_pcie->msix_enabled) { + /* + * When we'll receive the ALIVE interrupt, the ISR will call + * iwl_enable_fw_load_int_ctx_info again to set the ALIVE + * interrupt (which is not really needed anymore) but also the + * RX interrupt which will allow us to receive the ALIVE + * notification (which is Rx) and continue the flow. + */ + trans_pcie->inta_mask = CSR_INT_BIT_ALIVE | CSR_INT_BIT_FH_RX; + iwl_write32(trans, CSR_INT_MASK, trans_pcie->inta_mask); + } else { + iwl_enable_hw_int_msk_msix(trans, + MSIX_HW_INT_CAUSES_REG_ALIVE); + /* + * Leave all the FH causes enabled to get the ALIVE + * notification. + */ + iwl_enable_fh_int_msk_msix(trans, trans_pcie->fh_init_mask); + } +} + static inline u16 iwl_pcie_get_cmd_index(const struct iwl_txq *q, u32 index) { return index & (q->n_window - 1); --- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c @@ -1845,6 +1845,8 @@ irqreturn_t iwl_pcie_irq_handler(int irq */ iwl_pcie_rxmq_restock(trans, trans_pcie->rxq); } + + handled |= CSR_INT_BIT_ALIVE; }
/* Safely ignore these bits for debug checks below */ @@ -1963,6 +1965,9 @@ irqreturn_t iwl_pcie_irq_handler(int irq /* Re-enable RF_KILL if it occurred */ else if (handled & CSR_INT_BIT_RF_KILL) iwl_enable_rfkill_int(trans); + /* Re-enable the ALIVE / Rx interrupt if it occurred */ + else if (handled & (CSR_INT_BIT_ALIVE | CSR_INT_BIT_FH_RX)) + iwl_enable_fw_load_int_ctx_info(trans); spin_unlock(&trans_pcie->irq_lock);
out: --- a/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c @@ -273,6 +273,15 @@ void iwl_trans_pcie_gen2_fw_alive(struct * paging memory cannot be freed included since FW will still use it */ iwl_pcie_ctxt_info_free(trans); + + /* + * Re-enable all the interrupts, including the RF-Kill one, now that + * the firmware is alive. + */ + iwl_enable_interrupts(trans); + mutex_lock(&trans_pcie->mutex); + iwl_pcie_check_hw_rf_kill(trans); + mutex_unlock(&trans_pcie->mutex); }
int iwl_trans_pcie_gen2_start_fw(struct iwl_trans *trans,
From: Johannes Berg johannes.berg@intel.com
commit c56e00a3feaee2b46b7d33875fb7f52efd30241f upstream.
In AP (and IBSS) mode, we can only set GTKs to firmware after we have sent down the multicast station, but this we can only do after we've enabled beaconing, etc.
However, during rfkill exit, hostapd will configure the keys before starting the AP, and cfg80211/mac80211 accept it happily.
On earlier devices, this didn't bother us as GTK TX wasn't really handled in firmware, we just put the key material into the TX cmd and thus it only mattered when we actually transmitted a frame.
On newer devices, however, the firmware needs to track all of this and that doesn't work if we add the key before the (multicast) sta it belongs to.
To fix this, keep a list of keys to add during AP enable, and call the function there.
Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c | 53 +++++++++++++++++++++- drivers/net/wireless/intel/iwlwifi/mvm/mvm.h | 3 + 2 files changed, 54 insertions(+), 2 deletions(-)
--- a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c @@ -207,6 +207,12 @@ static const struct cfg80211_pmsr_capabi }, };
+static int iwl_mvm_mac_set_key(struct ieee80211_hw *hw, + enum set_key_cmd cmd, + struct ieee80211_vif *vif, + struct ieee80211_sta *sta, + struct ieee80211_key_conf *key); + void iwl_mvm_ref(struct iwl_mvm *mvm, enum iwl_mvm_ref_type ref_type) { if (!iwl_mvm_is_d0i3_supported(mvm)) @@ -2636,7 +2642,7 @@ static int iwl_mvm_start_ap_ibss(struct { struct iwl_mvm *mvm = IWL_MAC80211_GET_MVM(hw); struct iwl_mvm_vif *mvmvif = iwl_mvm_vif_from_mac80211(vif); - int ret; + int ret, i;
/* * iwl_mvm_mac_ctxt_add() might read directly from the device @@ -2710,6 +2716,20 @@ static int iwl_mvm_start_ap_ibss(struct /* must be set before quota calculations */ mvmvif->ap_ibss_active = true;
+ /* send all the early keys to the device now */ + for (i = 0; i < ARRAY_SIZE(mvmvif->ap_early_keys); i++) { + struct ieee80211_key_conf *key = mvmvif->ap_early_keys[i]; + + if (!key) + continue; + + mvmvif->ap_early_keys[i] = NULL; + + ret = iwl_mvm_mac_set_key(hw, SET_KEY, vif, NULL, key); + if (ret) + goto out_quota_failed; + } + if (vif->type == NL80211_IFTYPE_AP && !vif->p2p) { iwl_mvm_vif_set_low_latency(mvmvif, true, LOW_LATENCY_VIF_TYPE); @@ -3479,11 +3499,12 @@ static int iwl_mvm_mac_set_key(struct ie struct ieee80211_sta *sta, struct ieee80211_key_conf *key) { + struct iwl_mvm_vif *mvmvif = iwl_mvm_vif_from_mac80211(vif); struct iwl_mvm *mvm = IWL_MAC80211_GET_MVM(hw); struct iwl_mvm_sta *mvmsta; struct iwl_mvm_key_pn *ptk_pn; int keyidx = key->keyidx; - int ret; + int ret, i; u8 key_offset;
if (iwlwifi_mod_params.swcrypto) { @@ -3556,6 +3577,22 @@ static int iwl_mvm_mac_set_key(struct ie key->hw_key_idx = STA_KEY_IDX_INVALID; break; } + + if (!mvmvif->ap_ibss_active) { + for (i = 0; + i < ARRAY_SIZE(mvmvif->ap_early_keys); + i++) { + if (!mvmvif->ap_early_keys[i]) { + mvmvif->ap_early_keys[i] = key; + break; + } + } + + if (i >= ARRAY_SIZE(mvmvif->ap_early_keys)) + ret = -ENOSPC; + + break; + } }
/* During FW restart, in order to restore the state as it was, @@ -3624,6 +3661,18 @@ static int iwl_mvm_mac_set_key(struct ie
break; case DISABLE_KEY: + ret = -ENOENT; + for (i = 0; i < ARRAY_SIZE(mvmvif->ap_early_keys); i++) { + if (mvmvif->ap_early_keys[i] == key) { + mvmvif->ap_early_keys[i] = NULL; + ret = 0; + } + } + + /* found in pending list - don't do anything else */ + if (ret == 0) + break; + if (key->hw_key_idx == STA_KEY_IDX_INVALID) { ret = 0; break; --- a/drivers/net/wireless/intel/iwlwifi/mvm/mvm.h +++ b/drivers/net/wireless/intel/iwlwifi/mvm/mvm.h @@ -501,6 +501,9 @@ struct iwl_mvm_vif { netdev_features_t features;
struct iwl_probe_resp_data __rcu *probe_resp_data; + + /* we can only have 2 GTK + 2 IGTK active at a time */ + struct ieee80211_key_conf *ap_early_keys[4]; };
static inline struct iwl_mvm_vif *
From: Emmanuel Grumbach emmanuel.grumbach@intel.com
commit 940225628652b340b2bfe99f42f3d2db9fd9ce6c upstream.
Otherwise it'll stay set forever which is clearly buggy.
Cc: stable@vger.kernel.org Signed-off-by: Emmanuel Grumbach emmanuel.grumbach@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c @@ -419,6 +419,8 @@ static int iwl_run_unified_mvm_ucode(str
lockdep_assert_held(&mvm->mutex);
+ mvm->rfkill_safe_init_done = false; + iwl_init_notification_wait(&mvm->notif_wait, &init_wait, init_complete, @@ -537,8 +539,7 @@ int iwl_run_init_mvm_ucode(struct iwl_mv
lockdep_assert_held(&mvm->mutex);
- if (WARN_ON_ONCE(mvm->rfkill_safe_init_done)) - return 0; + mvm->rfkill_safe_init_done = false;
iwl_init_notification_wait(&mvm->notif_wait, &calib_wait, @@ -1108,10 +1109,13 @@ static int iwl_mvm_load_rt_fw(struct iwl
iwl_fw_dbg_apply_point(&mvm->fwrt, IWL_FW_INI_APPLY_EARLY);
+ mvm->rfkill_safe_init_done = false; ret = iwl_mvm_load_ucode_wait_alive(mvm, IWL_UCODE_REGULAR); if (ret) return ret;
+ mvm->rfkill_safe_init_done = true; + iwl_fw_dbg_apply_point(&mvm->fwrt, IWL_FW_INI_APPLY_AFTER_ALIVE);
return iwl_init_paging(&mvm->fwrt, mvm->fwrt.cur_fw_img);
From: Dmitry Osipenko digetx@gmail.com
commit 560d1bcad715c215e7ffe5d7cffe045974b623d0 upstream.
_set_opp_custom() receives a set of OPP supplies as its arguments and the caller of it passes NULL when the supplies are not valid. But _set_opp_custom(), by mistake, checks for error by performing IS_ERR(old_supply) on it which will always evaluate to false.
The problem was spotted during of testing of upcoming update for the NVIDIA Tegra CPUFreq driver.
Cc: stable stable@vger.kernel.org Fixes: 7e535993fa4f ("OPP: Separate out custom OPP handler specific code") Reported-by: Marc Dietrich marvin24@gmx.de Signed-off-by: Dmitry Osipenko digetx@gmail.com [ Viresh: Massaged changelog ] Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/opp/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/opp/core.c +++ b/drivers/opp/core.c @@ -682,7 +682,7 @@ static int _set_opp_custom(const struct
data->old_opp.rate = old_freq; size = sizeof(*old_supply) * opp_table->regulator_count; - if (IS_ERR(old_supply)) + if (!old_supply) memset(data->old_opp.supplies, 0, size); else memcpy(data->old_opp.supplies, old_supply, size);
From: Julien Thierry julien.thierry@arm.com
commit 17ce302f3117e9518395847a3120c8a108b587b8 upstream.
In the presence of any form of instrumentation, nmi_enter() should be done before calling any traceable code and any instrumentation code.
Currently, nmi_enter() is done in handle_domain_nmi(), which is much too late as instrumentation code might get called before. Move the nmi_enter/exit() calls to the arch IRQ vector handler.
On arm64, it is not possible to know if the IRQ vector handler was called because of an NMI before acknowledging the interrupt. However, It is possible to know whether normal interrupts could be taken in the interrupted context (i.e. if taking an NMI in that context could introduce a potential race condition).
When interrupting a context with IRQs disabled, call nmi_enter() as soon as possible. In contexts with IRQs enabled, defer this to the interrupt controller, which is in a better position to know if an interrupt taken is an NMI.
Fixes: bc3c03ccb464 ("arm64: Enable the support of pseudo-NMIs") Cc: stable@vger.kernel.org # 5.1.x- Cc: Will Deacon will.deacon@arm.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Mark Rutland mark.rutland@arm.com Reviewed-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Julien Thierry julien.thierry@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/kernel/entry.S | 44 ++++++++++++++++++++++++++++++++----------- arch/arm64/kernel/irq.c | 17 ++++++++++++++++ drivers/irqchip/irq-gic-v3.c | 7 ++++++ kernel/irq/irqdesc.c | 8 +++++-- 4 files changed, 63 insertions(+), 13 deletions(-)
--- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -424,6 +424,20 @@ tsk .req x28 // current thread_info irq_stack_exit .endm
+#ifdef CONFIG_ARM64_PSEUDO_NMI + /* + * Set res to 0 if irqs were unmasked in interrupted context. + * Otherwise set res to non-0 value. + */ + .macro test_irqs_unmasked res:req, pmr:req +alternative_if ARM64_HAS_IRQ_PRIO_MASKING + sub \res, \pmr, #GIC_PRIO_IRQON +alternative_else + mov \res, xzr +alternative_endif + .endm +#endif + .text
/* @@ -620,19 +634,19 @@ ENDPROC(el1_sync) el1_irq: kernel_entry 1 enable_da_f -#ifdef CONFIG_TRACE_IRQFLAGS + #ifdef CONFIG_ARM64_PSEUDO_NMI alternative_if ARM64_HAS_IRQ_PRIO_MASKING ldr x20, [sp, #S_PMR_SAVE] -alternative_else - mov x20, #GIC_PRIO_IRQON -alternative_endif - cmp x20, #GIC_PRIO_IRQOFF - /* Irqs were disabled, don't trace */ - b.ls 1f +alternative_else_nop_endif + test_irqs_unmasked res=x0, pmr=x20 + cbz x0, 1f + bl asm_nmi_enter +1: #endif + +#ifdef CONFIG_TRACE_IRQFLAGS bl trace_hardirqs_off -1: #endif
irq_handler @@ -651,14 +665,22 @@ alternative_else_nop_endif bl preempt_schedule_irq // irq en/disable is done inside 1: #endif -#ifdef CONFIG_TRACE_IRQFLAGS + #ifdef CONFIG_ARM64_PSEUDO_NMI /* * if IRQs were disabled when we received the interrupt, we have an NMI * and we are not re-enabling interrupt upon eret. Skip tracing. */ - cmp x20, #GIC_PRIO_IRQOFF - b.ls 1f + test_irqs_unmasked res=x0, pmr=x20 + cbz x0, 1f + bl asm_nmi_exit +1: +#endif + +#ifdef CONFIG_TRACE_IRQFLAGS +#ifdef CONFIG_ARM64_PSEUDO_NMI + test_irqs_unmasked res=x0, pmr=x20 + cbnz x0, 1f #endif bl trace_hardirqs_on 1: --- a/arch/arm64/kernel/irq.c +++ b/arch/arm64/kernel/irq.c @@ -16,8 +16,10 @@ #include <linux/smp.h> #include <linux/init.h> #include <linux/irqchip.h> +#include <linux/kprobes.h> #include <linux/seq_file.h> #include <linux/vmalloc.h> +#include <asm/daifflags.h> #include <asm/vmap_stack.h>
unsigned long irq_err_count; @@ -65,3 +67,18 @@ void __init init_IRQ(void) if (!handle_arch_irq) panic("No interrupt controller found."); } + +/* + * Stubs to make nmi_enter/exit() code callable from ASM + */ +asmlinkage void notrace asm_nmi_enter(void) +{ + nmi_enter(); +} +NOKPROBE_SYMBOL(asm_nmi_enter); + +asmlinkage void notrace asm_nmi_exit(void) +{ + nmi_exit(); +} +NOKPROBE_SYMBOL(asm_nmi_exit); --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -461,8 +461,12 @@ static void gic_deactivate_unhandled(u32
static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) { + bool irqs_enabled = interrupts_enabled(regs); int err;
+ if (irqs_enabled) + nmi_enter(); + if (static_branch_likely(&supports_deactivate_key)) gic_write_eoir(irqnr); /* @@ -474,6 +478,9 @@ static inline void gic_handle_nmi(u32 ir err = handle_domain_nmi(gic_data.domain, irqnr, regs); if (err) gic_deactivate_unhandled(irqnr); + + if (irqs_enabled) + nmi_exit(); }
static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs) --- a/kernel/irq/irqdesc.c +++ b/kernel/irq/irqdesc.c @@ -680,6 +680,8 @@ int __handle_domain_irq(struct irq_domai * @hwirq: The HW irq number to convert to a logical one * @regs: Register file coming from the low-level handling code * + * This function must be called from an NMI context. + * * Returns: 0 on success, or -EINVAL if conversion has failed */ int handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq, @@ -689,7 +691,10 @@ int handle_domain_nmi(struct irq_domain unsigned int irq; int ret = 0;
- nmi_enter(); + /* + * NMI context needs to be setup earlier in order to deal with tracing. + */ + WARN_ON(!in_nmi());
irq = irq_find_mapping(domain, hwirq);
@@ -702,7 +707,6 @@ int handle_domain_nmi(struct irq_domain else ret = -EINVAL;
- nmi_exit(); set_irq_regs(old_regs); return ret; }
From: Eiichi Tsukata devel@etsukata.com
commit 6d54ceb539aacc3df65c89500e8b045924f3ef81 upstream.
Commit c5c27a0a5838 ("x86/stacktrace: Remove the pointless ULONG_MAX marker") removes ULONG_MAX marker from user stack trace entries but trace_user_stack_print() still uses the marker and it outputs unnecessary "??".
For example:
less-1911 [001] d..2 34.758944: <user stack trace> => <00007f16f2295910> => ?? => ?? => ?? => ?? => ?? => ?? => ??
The user stack trace code zeroes the storage before saving the stack, so if the trace is shorter than the maximum number of entries it can terminate the print loop if a zero entry is detected.
Link: http://lkml.kernel.org/r/20190630085438.25545-1-devel@etsukata.com
Cc: stable@vger.kernel.org Fixes: 4285f2fcef80 ("tracing: Remove the ULONG_MAX stack trace hackery") Signed-off-by: Eiichi Tsukata devel@etsukata.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/trace/trace_output.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-)
--- a/kernel/trace/trace_output.c +++ b/kernel/trace/trace_output.c @@ -1109,17 +1109,10 @@ static enum print_line_t trace_user_stac for (i = 0; i < FTRACE_STACK_ENTRIES; i++) { unsigned long ip = field->caller[i];
- if (ip == ULONG_MAX || trace_seq_has_overflowed(s)) + if (!ip || trace_seq_has_overflowed(s)) break;
trace_seq_puts(s, " => "); - - if (!ip) { - trace_seq_puts(s, "??"); - trace_seq_putc(s, '\n'); - continue; - } - seq_print_user_ip(s, mm, ip, flags); trace_seq_putc(s, '\n'); }
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 44942b4e457beda00981f616402a1a791e8c616e upstream.
According to the open() manpage, Linux reserves the access mode 3 to mean "check for read and write permission on the file and return a file descriptor that can't be used for reading or writing."
Currently, the NFSv4 code will ask the server to open the file, and will use an incorrect share access mode of 0. Since it has an incorrect share access mode, the client later forgets to send a corresponding close, meaning it can leak stateids on the server.
Fixes: ce4ef7c0a8a05 ("NFS: Split out NFS v4 file operations") Cc: stable@vger.kernel.org # 3.6+ Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/inode.c | 1 + fs/nfs/nfs4file.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)
--- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -1100,6 +1100,7 @@ int nfs_open(struct inode *inode, struct nfs_fscache_open_file(inode, filp); return 0; } +EXPORT_SYMBOL_GPL(nfs_open);
/* * This function is called whenever some part of NFS notices that --- a/fs/nfs/nfs4file.c +++ b/fs/nfs/nfs4file.c @@ -49,7 +49,7 @@ nfs4_file_open(struct inode *inode, stru return err;
if ((openflags & O_ACCMODE) == 3) - openflags--; + return nfs_open(inode, filp);
/* We can't create new files here */ openflags &= ~(O_CREAT|O_EXCL);
From: Max Kellermann mk@cm4all.com
commit db531db951f950b86d274cc8ed7b21b9e2240036 upstream.
This reverts commit be4c2d4723a4a637f0d1b4f7c66447141a4b3564.
That commit caused a severe memory leak in nfs_readdir_make_qstr().
When listing a directory with more than 100 files (this is how many struct nfs_cache_array_entry elements fit in one 4kB page), all allocated file name strings past those 100 leak.
The root of the leakage is that those string pointers are managed in pages which are never linked into the page cache.
fs/nfs/dir.c puts pages into the page cache by calling read_cache_page(); the callback function nfs_readdir_filler() will then fill the given page struct which was passed to it, which is already linked in the page cache (by do_read_cache_page() calling add_to_page_cache_lru()).
Commit be4c2d4723a4 added another (local) array of allocated pages, to be filled with more data, instead of discarding excess items received from the NFS server. Those additional pages can be used by the next nfs_readdir_filler() call (from within the same nfs_readdir() call).
The leak happens when some of those additional pages are never used (copied to the page cache using copy_highpage()). The pages will be freed by nfs_readdir_free_pages(), but their contents will not. The commit did not invoke nfs_readdir_clear_array() (and doing so would have been dangerous, because it did not track which of those pages were already copied to the page cache, risking double free bugs).
How to reproduce the leak:
- Use a kernel with CONFIG_SLUB_DEBUG_ON.
- Create a directory on a NFS mount with more than 100 files with names long enough to use the "kmalloc-32" slab (so we can easily look up the allocation counts):
for i in `seq 110`; do touch ${i}_0123456789abcdef; done
- Drop all caches:
echo 3 >/proc/sys/vm/drop_caches
- Check the allocation counter:
grep nfs_readdir /sys/kernel/slab/kmalloc-32/alloc_calls 30564391 nfs_readdir_add_to_array+0x73/0xd0 age=534558/4791307/6540952 pid=370-1048386 cpus=0-47 nodes=0-1
- Request a directory listing and check the allocation counters again:
ls [...] grep nfs_readdir /sys/kernel/slab/kmalloc-32/alloc_calls 30564511 nfs_readdir_add_to_array+0x73/0xd0 age=207/4792999/6542663 pid=370-1048386 cpus=0-47 nodes=0-1
There are now 120 new allocations.
- Drop all caches and check the counters again:
echo 3 >/proc/sys/vm/drop_caches grep nfs_readdir /sys/kernel/slab/kmalloc-32/alloc_calls 30564401 nfs_readdir_add_to_array+0x73/0xd0 age=735/4793524/6543176 pid=370-1048386 cpus=0-47 nodes=0-1
110 allocations are gone, but 10 have leaked and will never be freed.
Unhelpfully, those allocations are explicitly excluded from KMEMLEAK, that's why my initial attempts with KMEMLEAK were not successful:
/* * Avoid a kmemleak false positive. The pointer to the name is stored * in a page cache page which kmemleak does not scan. */ kmemleak_not_leak(string->name);
It would be possible to solve this bug without reverting the whole commit:
- keep track of which pages were not used, and call nfs_readdir_clear_array() on them, or - manually link those pages into the page cache
But for now I have decided to just revert the commit, because the real fix would require complex considerations, risking more dangerous (crash) bugs, which may seem unsuitable for the stable branches.
Signed-off-by: Max Kellermann mk@cm4all.com Cc: stable@vger.kernel.org # v5.1+ Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/dir.c | 90 +++--------------------------------------------------- fs/nfs/internal.h | 3 - 2 files changed, 7 insertions(+), 86 deletions(-)
--- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -140,19 +140,12 @@ struct nfs_cache_array { struct nfs_cache_array_entry array[0]; };
-struct readdirvec { - unsigned long nr; - unsigned long index; - struct page *pages[NFS_MAX_READDIR_RAPAGES]; -}; - typedef int (*decode_dirent_t)(struct xdr_stream *, struct nfs_entry *, bool); typedef struct { struct file *file; struct page *page; struct dir_context *ctx; unsigned long page_index; - struct readdirvec pvec; u64 *dir_cookie; u64 last_cookie; loff_t current_index; @@ -532,10 +525,6 @@ int nfs_readdir_page_filler(nfs_readdir_ struct nfs_cache_array *array; unsigned int count = 0; int status; - int max_rapages = NFS_MAX_READDIR_RAPAGES; - - desc->pvec.index = desc->page_index; - desc->pvec.nr = 0;
scratch = alloc_page(GFP_KERNEL); if (scratch == NULL) @@ -560,40 +549,20 @@ int nfs_readdir_page_filler(nfs_readdir_ if (desc->plus) nfs_prime_dcache(file_dentry(desc->file), entry);
- status = nfs_readdir_add_to_array(entry, desc->pvec.pages[desc->pvec.nr]); - if (status == -ENOSPC) { - desc->pvec.nr++; - if (desc->pvec.nr == max_rapages) - break; - status = nfs_readdir_add_to_array(entry, desc->pvec.pages[desc->pvec.nr]); - } + status = nfs_readdir_add_to_array(entry, page); if (status != 0) break; } while (!entry->eof);
- /* - * page and desc->pvec.pages[0] are valid, don't need to check - * whether or not to be NULL. - */ - copy_highpage(page, desc->pvec.pages[0]); - out_nopages: if (count == 0 || (status == -EBADCOOKIE && entry->eof != 0)) { - array = kmap_atomic(desc->pvec.pages[desc->pvec.nr]); + array = kmap(page); array->eof_index = array->size; status = 0; - kunmap_atomic(array); + kunmap(page); }
put_page(scratch); - - /* - * desc->pvec.nr > 0 means at least one page was completely filled, - * we should return -ENOSPC. Otherwise function - * nfs_readdir_xdr_to_array will enter infinite loop. - */ - if (desc->pvec.nr > 0) - return -ENOSPC; return status; }
@@ -627,24 +596,6 @@ out_freepages: return -ENOMEM; }
-/* - * nfs_readdir_rapages_init initialize rapages by nfs_cache_array structure. - */ -static -void nfs_readdir_rapages_init(nfs_readdir_descriptor_t *desc) -{ - struct nfs_cache_array *array; - int max_rapages = NFS_MAX_READDIR_RAPAGES; - int index; - - for (index = 0; index < max_rapages; index++) { - array = kmap_atomic(desc->pvec.pages[index]); - memset(array, 0, sizeof(struct nfs_cache_array)); - array->eof_index = -1; - kunmap_atomic(array); - } -} - static int nfs_readdir_xdr_to_array(nfs_readdir_descriptor_t *desc, struct page *page, struct inode *inode) { @@ -655,12 +606,6 @@ int nfs_readdir_xdr_to_array(nfs_readdir int status = -ENOMEM; unsigned int array_size = ARRAY_SIZE(pages);
- /* - * This means we hit readdir rdpages miss, the preallocated rdpages - * are useless, the preallocate rdpages should be reinitialized. - */ - nfs_readdir_rapages_init(desc); - entry.prev_cookie = 0; entry.cookie = desc->last_cookie; entry.eof = 0; @@ -721,24 +666,9 @@ int nfs_readdir_filler(void *data, struc struct inode *inode = file_inode(desc->file); int ret;
- /* - * If desc->page_index in range desc->pvec.index and - * desc->pvec.index + desc->pvec.nr, we get readdir cache hit. - */ - if (desc->page_index >= desc->pvec.index && - desc->page_index < (desc->pvec.index + desc->pvec.nr)) { - /* - * page and desc->pvec.pages[x] are valid, don't need to check - * whether or not to be NULL. - */ - copy_highpage(page, desc->pvec.pages[desc->page_index - desc->pvec.index]); - ret = 0; - } else { - ret = nfs_readdir_xdr_to_array(desc, page, inode); - if (ret < 0) - goto error; - } - + ret = nfs_readdir_xdr_to_array(desc, page, inode); + if (ret < 0) + goto error; SetPageUptodate(page);
if (invalidate_inode_pages2_range(inode->i_mapping, page->index + 1, -1) < 0) { @@ -903,7 +833,6 @@ static int nfs_readdir(struct file *file *desc = &my_desc; struct nfs_open_dir_context *dir_ctx = file->private_data; int res = 0; - int max_rapages = NFS_MAX_READDIR_RAPAGES;
dfprintk(FILE, "NFS: readdir(%pD2) starting at cookie %llu\n", file, (long long)ctx->pos); @@ -923,12 +852,6 @@ static int nfs_readdir(struct file *file desc->decode = NFS_PROTO(inode)->decode_dirent; desc->plus = nfs_use_readdirplus(inode, ctx);
- res = nfs_readdir_alloc_pages(desc->pvec.pages, max_rapages); - if (res < 0) - return -ENOMEM; - - nfs_readdir_rapages_init(desc); - if (ctx->pos == 0 || nfs_attribute_cache_expired(inode)) res = nfs_revalidate_mapping(inode, file->f_mapping); if (res < 0) @@ -964,7 +887,6 @@ static int nfs_readdir(struct file *file break; } while (!desc->eof); out: - nfs_readdir_free_pages(desc->pvec.pages, max_rapages); if (res > 0) res = 0; dfprintk(FILE, "NFS: readdir(%pD2) returns %d\n", file, res); --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -69,8 +69,7 @@ struct nfs_clone_mount { * Maximum number of pages that readdir can use for creating * a vmapped array of pages. */ -#define NFS_MAX_READDIR_PAGES 64 -#define NFS_MAX_READDIR_RAPAGES 8 +#define NFS_MAX_READDIR_PAGES 8
struct nfs_client_initdata { unsigned long init_flags;
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 8e04fdfadda75a849c649f7e50fe7d97772e1fcb upstream.
mirror->mirror_ds can be NULL if uninitialised, but can contain a PTR_ERR() if call to GETDEVICEINFO failed.
Fixes: 65990d1afbd2 ("pNFS/flexfiles: Fix a deadlock on LAYOUTGET") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Cc: stable@vger.kernel.org # 4.10+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/flexfilelayout/flexfilelayoutdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c +++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c @@ -257,7 +257,7 @@ int ff_layout_track_ds_error(struct nfs4 if (status == 0) return 0;
- if (mirror->mirror_ds == NULL) + if (IS_ERR_OR_NULL(mirror->mirror_ds)) return -EINVAL;
dserr = kmalloc(sizeof(*dserr), gfp_flags);
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 58bbeab425c6c5e318f5b6ae31d351331ddfb34b upstream.
If the client has to stop in pnfs_update_layout() to wait for another layoutget to complete, it currently exits and defaults to I/O through the MDS if the layoutget was successful.
Fixes: d03360aaf5cc ("pNFS: Ensure we return the error if someone kills...") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/pnfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -1890,7 +1890,7 @@ lookup_again: spin_unlock(&ino->i_lock); lseg = ERR_PTR(wait_var_event_killable(&lo->plh_outstanding, !atomic_read(&lo->plh_outstanding))); - if (IS_ERR(lseg) || !list_empty(&lo->plh_segs)) + if (IS_ERR(lseg)) goto out_put_layout_hdr; pnfs_put_layout_hdr(lo); goto lookup_again;
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 75369089820473eac45e9ddd970081901a373c08 upstream.
The bvec tracks the list of pages, so if the number of pages changes due to a re-encode, we need to reset the bvec as well.
Fixes: 277e4ab7d530 ("SUNRPC: Simplify TCP receive code by switching...") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sunrpc/clnt.c | 3 +-- net/sunrpc/xprt.c | 2 ++ net/sunrpc/xprtsock.c | 1 + 3 files changed, 4 insertions(+), 2 deletions(-)
--- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -1788,6 +1788,7 @@ rpc_xdr_encode(struct rpc_task *task) req->rq_snd_buf.head[0].iov_len = 0; xdr_init_encode(&xdr, &req->rq_snd_buf, req->rq_snd_buf.head[0].iov_base, req); + xdr_free_bvec(&req->rq_snd_buf); if (rpc_encode_header(task, &xdr)) return;
@@ -1827,8 +1828,6 @@ call_encode(struct rpc_task *task) rpc_call_rpcerror(task, task->tk_status); } return; - } else { - xprt_request_prepare(task->tk_rqstp); }
/* Add task to reply queue before transmission to avoid races */ --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -1013,6 +1013,8 @@ xprt_request_enqueue_receive(struct rpc_
if (!xprt_request_need_enqueue_receive(task, req)) return; + + xprt_request_prepare(task->tk_rqstp); spin_lock(&xprt->queue_lock);
/* Update the softirq receive buffer */ --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -909,6 +909,7 @@ static int xs_nospace(struct rpc_rqst *r static void xs_stream_prepare_request(struct rpc_rqst *req) { + xdr_free_bvec(&req->rq_rcv_buf); req->rq_task->tk_status = xdr_alloc_bvec(&req->rq_rcv_buf, GFP_KERNEL); }
From: Christophe Leroy christophe.leroy@c-s.fr
commit aeb87246537a83c2aff482f3f34a2e0991e02cbc upstream.
All mapping iterator logic is based on the assumption that sg->offset is always lower than PAGE_SIZE.
But there are situations where sg->offset is such that the SG item is on the second page. In that case sg_copy_to_buffer() fails properly copying the data into the buffer. One of the reason is that the data will be outside the kmapped area used to access that data.
This patch fixes the issue by adjusting the mapping iterator offset and pgoffset fields such that offset is always lower than PAGE_SIZE.
Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr Fixes: 4225fc8555a9 ("lib/scatterlist: use page iterator in the mapping iterator") Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- lib/scatterlist.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -676,17 +676,18 @@ static bool sg_miter_get_next_page(struc { if (!miter->__remaining) { struct scatterlist *sg; - unsigned long pgoffset;
if (!__sg_page_iter_next(&miter->piter)) return false;
sg = miter->piter.sg; - pgoffset = miter->piter.sg_pgoffset;
- miter->__offset = pgoffset ? 0 : sg->offset; + miter->__offset = miter->piter.sg_pgoffset ? 0 : sg->offset; + miter->piter.sg_pgoffset += miter->__offset >> PAGE_SHIFT; + miter->__offset &= PAGE_SIZE - 1; miter->__remaining = sg->offset + sg->length - - (pgoffset << PAGE_SHIFT) - miter->__offset; + (miter->piter.sg_pgoffset << PAGE_SHIFT) - + miter->__offset; miter->__remaining = min_t(unsigned long, miter->__remaining, PAGE_SIZE - miter->__offset); }
From: Mark Brown broonie@kernel.org
commit ceaea851b9ea75f9ea2bbefb53ff0d4b27cd5a6e upstream.
Back in ff9fb72bc07705c (debugfs: return error values, not NULL) the debugfs APIs were changed to return error pointers rather than NULL pointers on error, breaking the error checking in ASoC. Update the code to use IS_ERR() and log the codes that are returned as part of the error messages.
Fixes: ff9fb72bc07705c (debugfs: return error values, not NULL) Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/soc/soc-dapm.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-)
--- a/sound/soc/soc-dapm.c +++ b/sound/soc/soc-dapm.c @@ -2155,23 +2155,25 @@ void snd_soc_dapm_debugfs_init(struct sn { struct dentry *d;
- if (!parent) + if (!parent || IS_ERR(parent)) return;
dapm->debugfs_dapm = debugfs_create_dir("dapm", parent);
- if (!dapm->debugfs_dapm) { + if (IS_ERR(dapm->debugfs_dapm)) { dev_warn(dapm->dev, - "ASoC: Failed to create DAPM debugfs directory\n"); + "ASoC: Failed to create DAPM debugfs directory %ld\n", + PTR_ERR(dapm->debugfs_dapm)); return; }
d = debugfs_create_file("bias_level", 0444, dapm->debugfs_dapm, dapm, &dapm_bias_fops); - if (!d) + if (IS_ERR(d)) dev_warn(dapm->dev, - "ASoC: Failed to create bias level debugfs file\n"); + "ASoC: Failed to create bias level debugfs file: %ld\n", + PTR_ERR(d)); }
static void dapm_debugfs_add_widget(struct snd_soc_dapm_widget *w) @@ -2185,10 +2187,10 @@ static void dapm_debugfs_add_widget(stru d = debugfs_create_file(w->name, 0444, dapm->debugfs_dapm, w, &dapm_widget_power_fops); - if (!d) + if (IS_ERR(d)) dev_warn(w->dapm->dev, - "ASoC: Failed to create %s debugfs file\n", - w->name); + "ASoC: Failed to create %s debugfs file: %ld\n", + w->name, PTR_ERR(d)); }
static void dapm_debugfs_cleanup(struct snd_soc_dapm_context *dapm)
From: Mark Brown broonie@kernel.org
commit c2c928c93173f220955030e8440517b87ec7df92 upstream.
Back in ff9fb72bc07705c (debugfs: return error values, not NULL) the debugfs APIs were changed to return error pointers rather than NULL pointers on error, breaking the error checking in ASoC. Update the code to use IS_ERR() and log the codes that are returned as part of the error messages.
Fixes: ff9fb72bc07705c (debugfs: return error values, not NULL) Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/soc/soc-core.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
--- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -158,9 +158,10 @@ static void soc_init_component_debugfs(s component->card->debugfs_card_root); }
- if (!component->debugfs_root) { + if (IS_ERR(component->debugfs_root)) { dev_warn(component->dev, - "ASoC: Failed to create component debugfs directory\n"); + "ASoC: Failed to create component debugfs directory: %ld\n", + PTR_ERR(component->debugfs_root)); return; }
@@ -212,18 +213,21 @@ static void soc_init_card_debugfs(struct
card->debugfs_card_root = debugfs_create_dir(card->name, snd_soc_debugfs_root); - if (!card->debugfs_card_root) { + if (IS_ERR(card->debugfs_card_root)) { dev_warn(card->dev, - "ASoC: Failed to create card debugfs directory\n"); + "ASoC: Failed to create card debugfs directory: %ld\n", + PTR_ERR(card->debugfs_card_root)); + card->debugfs_card_root = NULL; return; }
card->debugfs_pop_time = debugfs_create_u32("dapm_pop_time", 0644, card->debugfs_card_root, &card->pop_time); - if (!card->debugfs_pop_time) + if (IS_ERR(card->debugfs_pop_time)) dev_warn(card->dev, - "ASoC: Failed to create pop time debugfs file\n"); + "ASoC: Failed to create pop time debugfs file: %ld\n", + PTR_ERR(card->debugfs_pop_time)); }
static void soc_cleanup_card_debugfs(struct snd_soc_card *card)
From: Xiao Ni xni@redhat.com
commit d9771f5ec46c282d518b453c793635dbdc3a2a94 upstream.
commit d5d885fd514f ("md: introduce new personality funciton start()") splits the init job to two parts. The first part run() does the jobs that do not require the md threads. The second part start() does the jobs that require the md threads.
Now it just does run() in adding new journal device. It needs to do the second part start() too.
Fixes: d5d885fd514f ("md: introduce new personality funciton start()") Cc: stable@vger.kernel.org #v4.9+ Reported-by: Michal Soltys soltys@ziu.info Signed-off-by: Xiao Ni xni@redhat.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/raid5.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -7672,7 +7672,7 @@ abort: static int raid5_add_disk(struct mddev *mddev, struct md_rdev *rdev) { struct r5conf *conf = mddev->private; - int err = -EEXIST; + int ret, err = -EEXIST; int disk; struct disk_info *p; int first = 0; @@ -7687,7 +7687,14 @@ static int raid5_add_disk(struct mddev * * The array is in readonly mode if journal is missing, so no * write requests running. We should be safe */ - log_init(conf, rdev, false); + ret = log_init(conf, rdev, false); + if (ret) + return ret; + + ret = r5l_start(conf->log); + if (ret) + return ret; + return 0; } if (mddev->recovery_disabled == conf->recovery_disabled)
From: Masahiro Yamada yamada.masahiro@socionext.com
commit 8e2442a5f86e1f77b86401fce274a7f622740bc4 upstream.
Since commit 00c864f8903d ("kconfig: allow all config targets to write auto.conf if missing"), Kconfig creates include/config/auto.conf in the defconfig stage when it is missing.
Joonas Kylmälä reported incorrect auto.conf generation under some circumstances.
To reproduce it, apply the following diff:
| --- a/arch/arm/configs/imx_v6_v7_defconfig | +++ b/arch/arm/configs/imx_v6_v7_defconfig | @@ -345,14 +345,7 @@ CONFIG_USB_CONFIGFS_F_MIDI=y | CONFIG_USB_CONFIGFS_F_HID=y | CONFIG_USB_CONFIGFS_F_UVC=y | CONFIG_USB_CONFIGFS_F_PRINTER=y | -CONFIG_USB_ZERO=m | -CONFIG_USB_AUDIO=m | -CONFIG_USB_ETH=m | -CONFIG_USB_G_NCM=m | -CONFIG_USB_GADGETFS=m | -CONFIG_USB_FUNCTIONFS=m | -CONFIG_USB_MASS_STORAGE=m | -CONFIG_USB_G_SERIAL=m | +CONFIG_USB_FUNCTIONFS=y | CONFIG_MMC=y | CONFIG_MMC_SDHCI=y | CONFIG_MMC_SDHCI_PLTFM=y
And then, run:
$ make ARCH=arm mrproper imx_v6_v7_defconfig
You will see CONFIG_USB_FUNCTIONFS=y is correctly contained in the .config, but not in the auto.conf.
Please note drivers/usb/gadget/legacy/Kconfig is included from a choice block in drivers/usb/gadget/Kconfig. So USB_FUNCTIONFS is a choice value.
This is probably a similar situation described in commit beaaddb62540 ("kconfig: tests: test defconfig when two choices interact").
When sym_calc_choice() is called, the choice symbol forgets the SYMBOL_DEF_USER unless all of its choice values are explicitly set by the user.
The choice symbol is given just one chance to recall it because set_all_choice_values() is called if SYMBOL_NEED_SET_CHOICE_VALUES is set.
When sym_calc_choice() is called again, the choice symbol forgets it forever, since SYMBOL_NEED_SET_CHOICE_VALUES is a one-time aid. Hence, we cannot call sym_clear_all_valid() again and again.
It is crazy to repeat set and unset of internal flags. However, we cannot simply get rid of "sym->flags &= flags | ~SYMBOL_DEF_USER;" Doing so would re-introduce the problem solved by commit 5d09598d488f ("kconfig: fix new choices being skipped upon config update").
To work around the issue, conf_write_autoconf() stopped calling sym_clear_all_valid().
conf_write() must be changed accordingly. Currently, it clears SYMBOL_WRITE after the symbol is written into the .config file. This is needed to prevent it from writing the same symbol multiple times in case the symbol is declared in two or more locations. I added the new flag SYMBOL_WRITTEN, to track the symbols that have been written.
Anyway, this is a cheesy workaround in order to suppress the issue as far as defconfig is concerned.
Handling of choices is totally broken. sym_clear_all_valid() is called every time a user touches a symbol from the GUI interface. To reproduce it, just add a new symbol drivers/usb/gadget/legacy/Kconfig, then touch around unrelated symbols from menuconfig. USB_FUNCTIONFS will disappear from the .config file.
I added the Fixes tag since it is more fatal than before. But, this has been broken since long long time before, and still it is. We should take a closer look to fix this correctly somehow.
Fixes: 00c864f8903d ("kconfig: allow all config targets to write auto.conf if missing") Cc: linux-stable stable@vger.kernel.org # 4.19+ Reported-by: Joonas Kylmälä joonas.kylmala@iki.fi Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Tested-by: Joonas Kylmälä joonas.kylmala@iki.fi Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- scripts/kconfig/confdata.c | 7 +++---- scripts/kconfig/expr.h | 1 + 2 files changed, 4 insertions(+), 4 deletions(-)
--- a/scripts/kconfig/confdata.c +++ b/scripts/kconfig/confdata.c @@ -914,7 +914,8 @@ int conf_write(const char *name) "# %s\n" "#\n", str); need_newline = false; - } else if (!(sym->flags & SYMBOL_CHOICE)) { + } else if (!(sym->flags & SYMBOL_CHOICE) && + !(sym->flags & SYMBOL_WRITTEN)) { sym_calc_value(sym); if (!(sym->flags & SYMBOL_WRITE)) goto next; @@ -922,7 +923,7 @@ int conf_write(const char *name) fprintf(out, "\n"); need_newline = false; } - sym->flags &= ~SYMBOL_WRITE; + sym->flags |= SYMBOL_WRITTEN; conf_write_symbol(out, sym, &kconfig_printer_cb, NULL); }
@@ -1082,8 +1083,6 @@ int conf_write_autoconf(int overwrite) if (!overwrite && is_present(autoconf_name)) return 0;
- sym_clear_all_valid(); - conf_write_dep("include/config/auto.conf.cmd");
if (conf_touch_deps()) --- a/scripts/kconfig/expr.h +++ b/scripts/kconfig/expr.h @@ -141,6 +141,7 @@ struct symbol { #define SYMBOL_OPTIONAL 0x0100 /* choice is optional - values can be 'n' */ #define SYMBOL_WRITE 0x0200 /* write symbol to file (KCONFIG_CONFIG) */ #define SYMBOL_CHANGED 0x0400 /* ? */ +#define SYMBOL_WRITTEN 0x0800 /* track info to avoid double-write to .config */ #define SYMBOL_NO_WRITE 0x1000 /* Symbol for internal use only; it will not be written */ #define SYMBOL_CHECKED 0x2000 /* used during dependency checking */ #define SYMBOL_WARNED 0x8000 /* warning has been issued */
From: Takashi Iwai tiwai@suse.de
commit ede34f397ddb063b145b9e7d79c6026f819ded13 upstream.
The fix for the racy writes and ioctls to sequencer widened the application of client->ioctl_mutex to the whole write loop. Although it does unlock/relock for the lengthy operation like the event dup, the loop keeps the ioctl_mutex for the whole time in other situations. This may take quite long time if the user-space would give a huge buffer, and this is a likely cause of some weird behavior spotted by syzcaller fuzzer.
This patch puts a simple workaround, just adding a mutex break in the loop when a large number of events have been processed. This shouldn't hit any performance drop because the threshold is set high enough for usual operations.
Fixes: 7bd800915677 ("ALSA: seq: More protection for concurrent write and ioctl races") Reported-by: syzbot+97aae04ce27e39cbfca9@syzkaller.appspotmail.com Reported-by: syzbot+4c595632b98bb8ffcc66@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/core/seq/seq_clientmgr.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
--- a/sound/core/seq/seq_clientmgr.c +++ b/sound/core/seq/seq_clientmgr.c @@ -1021,7 +1021,7 @@ static ssize_t snd_seq_write(struct file { struct snd_seq_client *client = file->private_data; int written = 0, len; - int err; + int err, handled; struct snd_seq_event event;
if (!(snd_seq_file_flags(file) & SNDRV_SEQ_LFLG_OUTPUT)) @@ -1034,6 +1034,8 @@ static ssize_t snd_seq_write(struct file if (!client->accept_output || client->pool == NULL) return -ENXIO;
+ repeat: + handled = 0; /* allocate the pool now if the pool is not allocated yet */ mutex_lock(&client->ioctl_mutex); if (client->pool->size > 0 && !snd_seq_write_pool_allocated(client)) { @@ -1093,12 +1095,19 @@ static ssize_t snd_seq_write(struct file 0, 0, &client->ioctl_mutex); if (err < 0) break; + handled++;
__skip_event: /* Update pointers and counts */ count -= len; buf += len; written += len; + + /* let's have a coffee break if too many events are queued */ + if (++handled >= 200) { + mutex_unlock(&client->ioctl_mutex); + goto repeat; + } }
out:
From: Takashi Iwai tiwai@suse.de
commit 4914da2fb0c89205790503f20dfdde854f3afdd8 upstream.
We apply the codec resume forcibly at system resume callback for updating and syncing the jack detection state that may have changed during sleeping. This is, however, superfluous for the codec like Intel HDMI/DP, where the jack detection is managed via the audio component notification; i.e. the jack state change shall be reported sooner or later from the graphics side at mode change.
This patch changes the codec resume callback to avoid the forcible resume conditionally with a new flag, codec->relaxed_resume, for reducing the resume time. The flag is set in the codec probe.
Although this doesn't fix the entire bug mentioned in the bugzilla entry below, it's still a good optimization and some improvements are seen.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201901 Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/sound/hda_codec.h | 2 ++ sound/pci/hda/hda_codec.c | 8 ++++++-- sound/pci/hda/patch_hdmi.c | 6 +++++- 3 files changed, 13 insertions(+), 3 deletions(-)
--- a/include/sound/hda_codec.h +++ b/include/sound/hda_codec.h @@ -249,6 +249,8 @@ struct hda_codec { unsigned int auto_runtime_pm:1; /* enable automatic codec runtime pm */ unsigned int force_pin_prefix:1; /* Add location prefix */ unsigned int link_down_at_suspend:1; /* link down at runtime suspend */ + unsigned int relaxed_resume:1; /* don't resume forcibly for jack */ + #ifdef CONFIG_PM unsigned long power_on_acct; unsigned long power_off_acct; --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -2941,15 +2941,19 @@ static int hda_codec_runtime_resume(stru #ifdef CONFIG_PM_SLEEP static int hda_codec_force_resume(struct device *dev) { + struct hda_codec *codec = dev_to_hda_codec(dev); + bool forced_resume = !codec->relaxed_resume; int ret;
/* The get/put pair below enforces the runtime resume even if the * device hasn't been used at suspend time. This trick is needed to * update the jack state change during the sleep. */ - pm_runtime_get_noresume(dev); + if (forced_resume) + pm_runtime_get_noresume(dev); ret = pm_runtime_force_resume(dev); - pm_runtime_put(dev); + if (forced_resume) + pm_runtime_put(dev); return ret; }
--- a/sound/pci/hda/patch_hdmi.c +++ b/sound/pci/hda/patch_hdmi.c @@ -2291,8 +2291,10 @@ static void generic_hdmi_free(struct hda struct hdmi_spec *spec = codec->spec; int pin_idx, pcm_idx;
- if (codec_has_acomp(codec)) + if (codec_has_acomp(codec)) { snd_hdac_acomp_register_notifier(&codec->bus->core, NULL); + codec->relaxed_resume = 0; + }
for (pin_idx = 0; pin_idx < spec->num_pins; pin_idx++) { struct hdmi_spec_per_pin *per_pin = get_pin(spec, pin_idx); @@ -2565,6 +2567,8 @@ static void register_i915_notifier(struc spec->drm_audio_ops.pin_eld_notify = intel_pin_eld_notify; snd_hdac_acomp_register_notifier(&codec->bus->core, &spec->drm_audio_ops); + /* no need for forcible resume for jack check thanks to notifier */ + codec->relaxed_resume = 1; }
/* setup_stream ops override for HSW+ */
From: Kailang Yang kailang@realtek.com
commit fbc571290d9f7bfe089c50f4ac4028dd98ebfe98 upstream.
It assigned to wrong model. So, The headphone Mic can't work.
Fixes: 3f640970a414 ("ALSA: hda - Fix headset mic detection problem for several Dell laptops") Signed-off-by: Kailang Yang kailang@realtek.com Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -7657,9 +7657,12 @@ static const struct snd_hda_pin_quirk al {0x12, 0x90a60130}, {0x17, 0x90170110}, {0x21, 0x03211020}), - SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, + SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE, {0x14, 0x90170110}, {0x21, 0x04211020}), + SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE, + {0x14, 0x90170110}, + {0x21, 0x04211030}), SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, ALC295_STANDARD_PINS, {0x17, 0x21014020},
From: Hui Wang hui.wang@canonical.com
commit 4b4e0e32e4b09274dbc9d173016c1a026f44608c upstream.
Without this patch, the headset-mic and headphone-mic don't work.
Cc: stable@vger.kernel.org Signed-off-by: Hui Wang hui.wang@canonical.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -8803,6 +8803,11 @@ static const struct snd_hda_pin_quirk al {0x18, 0x01a19030}, {0x1a, 0x01813040}, {0x21, 0x01014020}), + SND_HDA_PIN_QUIRK(0x10ec0867, 0x1028, "Dell", ALC891_FIXUP_DELL_MIC_NO_PRESENCE, + {0x16, 0x01813030}, + {0x17, 0x02211010}, + {0x18, 0x01a19040}, + {0x21, 0x01014020}), SND_HDA_PIN_QUIRK(0x10ec0662, 0x1028, "Dell", ALC662_FIXUP_DELL_MIC_NO_PRESENCE, {0x14, 0x01014010}, {0x18, 0x01a19020},
From: Takashi Iwai tiwai@suse.de
commit eb4177116bf568a413c544eca3f4446cb4064be9 upstream.
INTEL_GET_VENDOR_VERB is defined twice identically. Let's remove a superfluous line.
Fixes: b0d8bc50b9f2 ("ALSA: hda: hdmi - add Icelake support") Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_hdmi.c | 1 - 1 file changed, 1 deletion(-)
--- a/sound/pci/hda/patch_hdmi.c +++ b/sound/pci/hda/patch_hdmi.c @@ -2418,7 +2418,6 @@ static void intel_haswell_fixup_connect_ }
#define INTEL_GET_VENDOR_VERB 0xf81 -#define INTEL_GET_VENDOR_VERB 0xf81 #define INTEL_SET_VENDOR_VERB 0x781 #define INTEL_EN_DP12 0x02 /* enable DP 1.2 features */ #define INTEL_EN_ALL_PIN_CVTS 0x01 /* enable 2nd & 3rd pins and convertors */
From: Takashi Iwai tiwai@suse.de
commit 3140aafb22edeab0cc41f15f53b12a118c0ac215 upstream.
The recent fix for Icelake HDMI codec introduced the mapping from pin NID to the i915 gfx port number. However, it forgot the reverse mapping from the port number to the pin NID that is used in the ELD notifier callback. As a result, it's processed to a wrong widget and gives a warning like snd_hda_codec_hdmi hdaudioC0D2: HDMI: pin nid 5 not registered
This patch corrects it with a proper reverse mapping function.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204133 Fixes: b0d8bc50b9f2 ("ALSA: hda: hdmi - add Icelake support") Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_hdmi.c | 24 +++++++++++++++++++----- 1 file changed, 19 insertions(+), 5 deletions(-)
--- a/sound/pci/hda/patch_hdmi.c +++ b/sound/pci/hda/patch_hdmi.c @@ -2525,18 +2525,32 @@ static int intel_pin2port(void *audio_pt return -1; }
+static int intel_port2pin(struct hda_codec *codec, int port) +{ + struct hdmi_spec *spec = codec->spec; + + if (!spec->port_num) { + /* we assume only from port-B to port-D */ + if (port < 1 || port > 3) + return 0; + /* intel port is 1-based */ + return port + intel_base_nid(codec) - 1; + } + + if (port < 1 || port > spec->port_num) + return 0; + return spec->port_map[port - 1]; +} + static void intel_pin_eld_notify(void *audio_ptr, int port, int pipe) { struct hda_codec *codec = audio_ptr; int pin_nid; int dev_id = pipe;
- /* we assume only from port-B to port-D */ - if (port < 1 || port > 3) + pin_nid = intel_port2pin(codec, port); + if (!pin_nid) return; - - pin_nid = port + intel_base_nid(codec) - 1; /* intel port is 1-based */ - /* skip notification during system suspend (but not in runtime PM); * the state will be updated at resume */
From: Luis Henriques lhenriques@suse.com
commit d31d07b97a5e76f41e00eb81dcca740e84aa7782 upstream.
Commit e450f4d1a5d6 ("ceph: pass inclusive lend parameter to filemap_write_and_wait_range()") fixed the end offset parameter used to call filemap_write_and_wait_range and invalidate_inode_pages2_range. Unfortunately it missed truncate_inode_pages_range, introducing a regression that is easily detected by xfstest generic/130.
The problem is that when doing direct IO it is possible that an extra page is truncated from the page cache when the end offset is page aligned. This can cause data loss if that page hasn't been sync'ed to the OSDs.
While there, change code to use PAGE_ALIGN macro instead.
Cc: stable@vger.kernel.org Fixes: e450f4d1a5d6 ("ceph: pass inclusive lend parameter to filemap_write_and_wait_range()") Signed-off-by: Luis Henriques lhenriques@suse.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ceph/file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -1007,7 +1007,7 @@ ceph_direct_read_write(struct kiocb *ioc * may block. */ truncate_inode_pages_range(inode->i_mapping, pos, - (pos+len) | (PAGE_SIZE - 1)); + PAGE_ALIGN(pos + len) - 1);
req->r_mtime = mtime; }
From: Yan, Zheng zyan@redhat.com
commit 87bc5b895d94a0f40fe170d4cf5771c8e8f85d15 upstream.
remove_session_caps() relies on __wait_on_freeing_inode(), to wait for freeing inode to remove its caps. But VFS wakes freeing inode waiters before calling destroy_inode().
Cc: stable@vger.kernel.org Link: https://tracker.ceph.com/issues/40102 Signed-off-by: "Yan, Zheng" zyan@redhat.com Reviewed-by: Jeff Layton jlayton@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ceph/inode.c | 7 +++++-- fs/ceph/super.c | 2 +- fs/ceph/super.h | 2 +- 3 files changed, 7 insertions(+), 4 deletions(-)
--- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -523,13 +523,16 @@ void ceph_free_inode(struct inode *inode kmem_cache_free(ceph_inode_cachep, ci); }
-void ceph_destroy_inode(struct inode *inode) +void ceph_evict_inode(struct inode *inode) { struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_inode_frag *frag; struct rb_node *n;
- dout("destroy_inode %p ino %llx.%llx\n", inode, ceph_vinop(inode)); + dout("evict_inode %p ino %llx.%llx\n", inode, ceph_vinop(inode)); + + truncate_inode_pages_final(&inode->i_data); + clear_inode(inode);
ceph_fscache_unregister_inode_cookie(ci);
--- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -840,10 +840,10 @@ static int ceph_remount(struct super_blo
static const struct super_operations ceph_super_ops = { .alloc_inode = ceph_alloc_inode, - .destroy_inode = ceph_destroy_inode, .free_inode = ceph_free_inode, .write_inode = ceph_write_inode, .drop_inode = ceph_drop_inode, + .evict_inode = ceph_evict_inode, .sync_fs = ceph_sync_fs, .put_super = ceph_put_super, .remount_fs = ceph_remount, --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -876,7 +876,7 @@ static inline bool __ceph_have_pending_c extern const struct inode_operations ceph_file_iops;
extern struct inode *ceph_alloc_inode(struct super_block *sb); -extern void ceph_destroy_inode(struct inode *inode); +extern void ceph_evict_inode(struct inode *inode); extern void ceph_free_inode(struct inode *inode); extern int ceph_drop_inode(struct inode *inode);
From: Boris Brezillon boris.brezillon@collabora.com
commit 07d89227a983df957a6a7c56f7c040cde9ac571f upstream.
cfg->type can be overridden by v4l2_ctrl_fill() and the new value is stored in the local type var. Fix the tests to use this local var.
Fixes: 0996517cf8ea ("V4L/DVB: v4l2: Add new control handling framework") Cc: stable@vger.kernel.org Signed-off-by: Boris Brezillon boris.brezillon@collabora.com [hverkuil-cisco@xs4all.nl: change to !qmenu and !qmenu_int (checkpatch)] Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/v4l2-core/v4l2-ctrls.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
--- a/drivers/media/v4l2-core/v4l2-ctrls.c +++ b/drivers/media/v4l2-core/v4l2-ctrls.c @@ -2369,16 +2369,15 @@ struct v4l2_ctrl *v4l2_ctrl_new_custom(s v4l2_ctrl_fill(cfg->id, &name, &type, &min, &max, &step, &def, &flags);
- is_menu = (cfg->type == V4L2_CTRL_TYPE_MENU || - cfg->type == V4L2_CTRL_TYPE_INTEGER_MENU); + is_menu = (type == V4L2_CTRL_TYPE_MENU || + type == V4L2_CTRL_TYPE_INTEGER_MENU); if (is_menu) WARN_ON(step); else WARN_ON(cfg->menu_skip_mask); - if (cfg->type == V4L2_CTRL_TYPE_MENU && qmenu == NULL) + if (type == V4L2_CTRL_TYPE_MENU && !qmenu) { qmenu = v4l2_ctrl_get_menu(cfg->id); - else if (cfg->type == V4L2_CTRL_TYPE_INTEGER_MENU && - qmenu_int == NULL) { + } else if (type == V4L2_CTRL_TYPE_INTEGER_MENU && !qmenu_int) { handler_set_err(hdl, -EINVAL); return NULL; }
From: Ezequiel Garcia ezequiel@collabora.com
commit 766b9b168f6c75c350dd87c3e0bc6a9b322f0013 upstream.
The mutex unlock in the threaded interrupt handler is not paired with any mutex lock. Remove it.
This bug has been here for a really long time, so it applies to any stable repo.
Reviewed-by: Philipp Zabel p.zabel@pengutronix.de Signed-off-by: Ezequiel Garcia ezequiel@collabora.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Cc: stable@vger.kernel.org Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/platform/coda/coda-bit.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/media/platform/coda/coda-bit.c +++ b/drivers/media/platform/coda/coda-bit.c @@ -2310,7 +2310,6 @@ irqreturn_t coda_irq_handler(int irq, vo if (ctx == NULL) { v4l2_err(&dev->v4l2_dev, "Instance released before the end of transaction\n"); - mutex_unlock(&dev->coda_mutex); return IRQ_HANDLED; }
From: Sakari Ailus sakari.ailus@linux.intel.com
commit defcdc5d89ced780fb45196d539d6570ec5b1ba5 upstream.
PAGE_ALIGN() may wrap the buffer size around to 0. Prevent this by checking that the aligned value is not smaller than the unaligned one.
Note on backporting to stable: the file used to be under drivers/media/v4l2-core, it was moved to the current location after 4.14.
Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Cc: stable@vger.kernel.org Reviewed-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/common/videobuf2/videobuf2-core.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/media/common/videobuf2/videobuf2-core.c +++ b/drivers/media/common/videobuf2/videobuf2-core.c @@ -207,6 +207,10 @@ static int __vb2_buf_mem_alloc(struct vb for (plane = 0; plane < vb->num_planes; ++plane) { unsigned long size = PAGE_ALIGN(vb->planes[plane].length);
+ /* Did it wrap around? */ + if (size < vb->planes[plane].length) + goto free; + mem_priv = call_ptr_memop(vb, alloc, q->alloc_devs[plane] ? : q->dev, q->dma_attrs, size, q->dma_dir, q->gfp_flags);
From: Sakari Ailus sakari.ailus@linux.intel.com
commit 14f28f5cea9e3998442de87846d1907a531b6748 upstream.
buf->size is an unsigned long; casting that to int will lead to an overflow if buf->size exceeds INT_MAX.
Fix this by changing the type to unsigned long instead. This is possible as the buf->size is always aligned to PAGE_SIZE, and therefore the size will never have values lesser than 0.
Note on backporting to stable: the file used to be under drivers/media/v4l2-core, it was moved to the current location after 4.14.
Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Cc: stable@vger.kernel.org Reviewed-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/common/videobuf2/videobuf2-dma-sg.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c @@ -59,7 +59,7 @@ static int vb2_dma_sg_alloc_compacted(st gfp_t gfp_flags) { unsigned int last_page = 0; - int size = buf->size; + unsigned long size = buf->size;
while (size > 0) { struct page *pages;
From: Sean Christopherson sean.j.christopherson@intel.com
commit 73cb85568433feadb79e963bf2efba9b3e9ae3df upstream.
... as a malicious userspace can run a toy guest to generate invalid virtual-APIC page addresses in L1, i.e. flood the kernel log with error messages.
Fixes: 690908104e39d ("KVM: nVMX: allow tests to use bad virtual-APIC page address") Cc: stable@vger.kernel.org Cc: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx/nested.c | 3 --- 1 file changed, 3 deletions(-)
--- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -2878,9 +2878,6 @@ static void nested_get_vmcs12_pages(stru */ vmcs_clear_bits(CPU_BASED_VM_EXEC_CONTROL, CPU_BASED_TPR_SHADOW); - } else { - printk("bad virtual-APIC page address\n"); - dump_vmcs(); } }
From: Sean Christopherson sean.j.christopherson@intel.com
commit d28f4290b53a157191ed9991ad05dffe9e8c0c89 upstream.
The behavior of WRMSR is in no way dependent on whether or not KVM consumes the value.
Fixes: 4566654bb9be9 ("KVM: vmx: Inject #GP on invalid PAT CR") Cc: stable@vger.kernel.org Cc: Nadav Amit nadav.amit@gmail.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx/vmx.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1896,9 +1896,10 @@ static int vmx_set_msr(struct kvm_vcpu * MSR_TYPE_W); break; case MSR_IA32_CR_PAT: + if (!kvm_pat_valid(data)) + return 1; + if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) { - if (!kvm_pat_valid(data)) - return 1; vmcs_write64(GUEST_IA32_PAT, data); vcpu->arch.pat = data; break;
From: Sean Christopherson sean.j.christopherson@intel.com
commit 3b013a2972d5bc344d6eaa8f24fdfe268211e45f upstream.
If L1 does not set VM_ENTRY_LOAD_BNDCFGS, then L1's BNDCFGS value must be propagated to vmcs02 since KVM always runs with VM_ENTRY_LOAD_BNDCFGS when MPX is supported. Because the value effectively comes from vmcs01, vmcs02 must be updated even if vmcs12 is clean.
Fixes: 62cf9bd8118c4 ("KVM: nVMX: Fix emulation of VM_ENTRY_LOAD_BNDCFGS") Cc: stable@vger.kernel.org Cc: Liran Alon liran.alon@oracle.com Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx/nested.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
--- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -2234,13 +2234,9 @@ static void prepare_vmcs02_full(struct v
set_cr4_guest_host_mask(vmx);
- if (kvm_mpx_supported()) { - if (vmx->nested.nested_run_pending && - (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)) - vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs); - else - vmcs_write64(GUEST_BNDCFGS, vmx->nested.vmcs01_guest_bndcfgs); - } + if (kvm_mpx_supported() && vmx->nested.nested_run_pending && + (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)) + vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs); }
/* @@ -2283,6 +2279,9 @@ static int prepare_vmcs02(struct kvm_vcp kvm_set_dr(vcpu, 7, vcpu->arch.dr7); vmcs_write64(GUEST_IA32_DEBUGCTL, vmx->nested.vmcs01_debugctl); } + if (kvm_mpx_supported() && (!vmx->nested.nested_run_pending || + !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))) + vmcs_write64(GUEST_BNDCFGS, vmx->nested.vmcs01_guest_bndcfgs); vmx_set_rflags(vcpu, vmcs12->guest_rflags);
/* EXCEPTION_BITMAP and CR0_GUEST_HOST_MASK should basically be the
From: Sean Christopherson sean.j.christopherson@intel.com
commit beb8d93b3e423043e079ef3dda19dad7b28467a8 upstream.
A previous fix to prevent KVM from consuming stale VMCS state after a failed VM-Entry inadvertantly blocked KVM's handling of machine checks that occur during VM-Entry.
Per Intel's SDM, a #MC during VM-Entry is handled in one of three ways, depending on when the #MC is recognoized. As it pertains to this bug fix, the third case explicitly states EXIT_REASON_MCE_DURING_VMENTRY is handled like any other VM-Exit during VM-Entry, i.e. sets bit 31 to indicate the VM-Entry failed.
If a machine-check event occurs during a VM entry, one of the following occurs: - The machine-check event is handled as if it occurred before the VM entry: ... - The machine-check event is handled after VM entry completes: ... - A VM-entry failure occurs as described in Section 26.7. The basic exit reason is 41, for "VM-entry failure due to machine-check event".
Explicitly handle EXIT_REASON_MCE_DURING_VMENTRY as a one-off case in vmx_vcpu_run() instead of binning it into vmx_complete_atomic_exit(). Doing so allows vmx_vcpu_run() to handle VMX_EXIT_REASONS_FAILED_VMENTRY in a sane fashion and also simplifies vmx_complete_atomic_exit() since VMCS.VM_EXIT_INTR_INFO is guaranteed to be fresh.
Fixes: b060ca3b2e9e7 ("kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Reviewed-by: Jim Mattson jmattson@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx/vmx.c | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-)
--- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6110,28 +6110,21 @@ static void vmx_apicv_post_state_restore
static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx) { - u32 exit_intr_info = 0; - u16 basic_exit_reason = (u16)vmx->exit_reason; - - if (!(basic_exit_reason == EXIT_REASON_MCE_DURING_VMENTRY - || basic_exit_reason == EXIT_REASON_EXCEPTION_NMI)) + if (vmx->exit_reason != EXIT_REASON_EXCEPTION_NMI) return;
- if (!(vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)) - exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO); - vmx->exit_intr_info = exit_intr_info; + vmx->exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
/* if exit due to PF check for async PF */ - if (is_page_fault(exit_intr_info)) + if (is_page_fault(vmx->exit_intr_info)) vmx->vcpu.arch.apf.host_apf_reason = kvm_read_and_reset_pf_reason();
/* Handle machine checks before interrupts are enabled */ - if (basic_exit_reason == EXIT_REASON_MCE_DURING_VMENTRY || - is_machine_check(exit_intr_info)) + if (is_machine_check(vmx->exit_intr_info)) kvm_machine_check();
/* We need to handle NMIs before interrupts are enabled */ - if (is_nmi(exit_intr_info)) { + if (is_nmi(vmx->exit_intr_info)) { kvm_before_interrupt(&vmx->vcpu); asm("int $2"); kvm_after_interrupt(&vmx->vcpu); @@ -6534,6 +6527,9 @@ static void vmx_vcpu_run(struct kvm_vcpu vmx->idt_vectoring_info = 0;
vmx->exit_reason = vmx->fail ? 0xdead : vmcs_read32(VM_EXIT_REASON); + if ((u16)vmx->exit_reason == EXIT_REASON_MCE_DURING_VMENTRY) + kvm_machine_check(); + if (vmx->fail || (vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)) return;
From: Wanpeng Li wanpengli@tencent.com
commit 4d763b168e9c5c366b05812c7bba7662e5ea3669 upstream.
Raise #GP when guest read/write IA32_XSS, but the CPUID bits say that it shouldn't exist.
Fixes: 203000993de5 (kvm: vmx: add MSR logic for XSAVES) Reported-by: Xiaoyao Li xiaoyao.li@linux.intel.com Reported-by: Tao Xu tao3.xu@intel.com Cc: Paolo Bonzini pbonzini@redhat.com Cc: Radim Krčmář rkrcmar@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li wanpengli@tencent.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx/vmx.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1718,7 +1718,10 @@ static int vmx_get_msr(struct kvm_vcpu * return vmx_get_vmx_msr(&vmx->nested.msrs, msr_info->index, &msr_info->data); case MSR_IA32_XSS: - if (!vmx_xsaves_supported()) + if (!vmx_xsaves_supported() || + (!msr_info->host_initiated && + !(guest_cpuid_has(vcpu, X86_FEATURE_XSAVE) && + guest_cpuid_has(vcpu, X86_FEATURE_XSAVES)))) return 1; msr_info->data = vcpu->arch.ia32_xss; break; @@ -1933,7 +1936,10 @@ static int vmx_set_msr(struct kvm_vcpu * return 1; return vmx_set_vmx_msr(vcpu, msr_index, data); case MSR_IA32_XSS: - if (!vmx_xsaves_supported()) + if (!vmx_xsaves_supported() || + (!msr_info->host_initiated && + !(guest_cpuid_has(vcpu, X86_FEATURE_XSAVE) && + guest_cpuid_has(vcpu, X86_FEATURE_XSAVES)))) return 1; /* * The only supported bit as of Skylake is bit 8, but
From: KarimAllah Ahmed karahmed@amazon.de
commit b614c6027896ff9ad6757122e84760d938cab15e upstream.
The field "page" is initialized to KVM_UNMAPPED_PAGE when it is not used (i.e. when the memory lives outside kernel control). So this check will always end up using kunmap even for memremap regions.
Fixes: e45adf665a53 ("KVM: Introduce a new guest mapping API") Cc: stable@vger.kernel.org Signed-off-by: KarimAllah Ahmed karahmed@amazon.de Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- virt/kvm/kvm_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1790,7 +1790,7 @@ void kvm_vcpu_unmap(struct kvm_vcpu *vcp if (!map->hva) return;
- if (map->page) + if (map->page != KVM_UNMAPPED_PAGE) kunmap(map->page); #ifdef CONFIG_HAS_IOMEM else
From: Suraj Jitindar Singh sjitindarsingh@gmail.com
commit 869537709ebf1dc865e75c3fc97b23f8acf37c16 upstream.
On POWER9 the decrementer can operate in large decrementer mode where the decrementer is 56 bits and signed extended to 64 bits. When not operating in this mode the decrementer behaves as a 32 bit decrementer which is NOT signed extended (as on POWER8).
Currently when reading a guest decrementer value we don't take into account whether the large decrementer is enabled or not, and this means the value will be incorrect when the guest is not using the large decrementer. Fix this by sign extending the value read when the guest isn't using the large decrementer.
Fixes: 95a6432ce903 ("KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Suraj Jitindar Singh sjitindarsingh@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kvm/book3s_hv.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -3603,6 +3603,8 @@ int kvmhv_p9_guest_entry(struct kvm_vcpu
vcpu->arch.slb_max = 0; dec = mfspr(SPRN_DEC); + if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */ + dec = (s32) dec; tb = mftb(); vcpu->arch.dec_expires = dec + tb; vcpu->cpu = -1;
From: Suraj Jitindar Singh sjitindarsingh@gmail.com
commit 3c25ab35fbc8526ac0c9b298e8a78e7ad7a55479 upstream.
If we enter an L1 guest with a pending decrementer exception then this is cleared on guest exit if the guest has writtien a positive value into the decrementer (indicating that it handled the decrementer exception) since there is no other way to detect that the guest has handled the pending exception and that it should be dequeued. In the event that the L1 guest tries to run a nested (L2) guest immediately after this and the L2 guest decrementer is negative (which is loaded by L1 before making the H_ENTER_NESTED hcall), then the pending decrementer exception isn't cleared and the L2 entry is blocked since L1 has a pending exception, even though L1 may have already handled the exception and written a positive value for it's decrementer. This results in a loop of L1 trying to enter the L2 guest and L0 blocking the entry since L1 has an interrupt pending with the outcome being that L2 never gets to run and hangs.
Fix this by clearing any pending decrementer exceptions when L1 makes the H_ENTER_NESTED hcall since it won't do this if it's decrementer has gone negative, and anyway it's decrementer has been communicated to L0 in the hdec_expires field and L0 will return control to L1 when this goes negative by delivering an H_DECREMENTER exception.
Fixes: 95a6432ce903 ("KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Suraj Jitindar Singh sjitindarsingh@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kvm/book3s_hv.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -4124,8 +4124,15 @@ int kvmhv_run_single_vcpu(struct kvm_run
preempt_enable();
- /* cancel pending decrementer exception if DEC is now positive */ - if (get_tb() < vcpu->arch.dec_expires && kvmppc_core_pending_dec(vcpu)) + /* + * cancel pending decrementer exception if DEC is now positive, or if + * entering a nested guest in which case the decrementer is now owned + * by L2 and the L1 decrementer is provided in hdec_expires + */ + if (kvmppc_core_pending_dec(vcpu) && + ((get_tb() < vcpu->arch.dec_expires) || + (trap == BOOK3S_INTERRUPT_SYSCALL && + kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED))) kvmppc_core_dequeue_dec(vcpu);
trace_kvm_guest_exit(vcpu);
From: Michael Neuling mikey@neuling.org
commit 3fefd1cd95df04da67c83c1cb93b663f04b3324f upstream.
When emulating tsr, treclaim and trechkpt, we incorrectly set CR0. The code currently sets: CR0 <- 00 || MSR[TS] but according to the ISA it should be: CR0 <- 0 || MSR[TS] || 0
This fixes the bit shift to put the bits in the correct location.
This is a data integrity issue as CR0 is corrupted.
Fixes: 4bb3c7a0208f ("KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9") Cc: stable@vger.kernel.org # v4.17+ Tested-by: Suraj Jitindar Singh sjitindarsingh@gmail.com Signed-off-by: Michael Neuling mikey@neuling.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kvm/book3s_hv_tm.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/powerpc/kvm/book3s_hv_tm.c +++ b/arch/powerpc/kvm/book3s_hv_tm.c @@ -128,7 +128,7 @@ int kvmhv_p9_tm_emulation(struct kvm_vcp } /* Set CR0 to indicate previous transactional state */ vcpu->arch.regs.ccr = (vcpu->arch.regs.ccr & 0x0fffffff) | - (((msr & MSR_TS_MASK) >> MSR_TS_S_LG) << 28); + (((msr & MSR_TS_MASK) >> MSR_TS_S_LG) << 29); /* L=1 => tresume, L=0 => tsuspend */ if (instr & (1 << 21)) { if (MSR_TM_SUSPENDED(msr)) @@ -172,7 +172,7 @@ int kvmhv_p9_tm_emulation(struct kvm_vcp
/* Set CR0 to indicate previous transactional state */ vcpu->arch.regs.ccr = (vcpu->arch.regs.ccr & 0x0fffffff) | - (((msr & MSR_TS_MASK) >> MSR_TS_S_LG) << 28); + (((msr & MSR_TS_MASK) >> MSR_TS_S_LG) << 29); vcpu->arch.shregs.msr &= ~MSR_TS_MASK; return RESUME_GUEST;
@@ -202,7 +202,7 @@ int kvmhv_p9_tm_emulation(struct kvm_vcp
/* Set CR0 to indicate previous transactional state */ vcpu->arch.regs.ccr = (vcpu->arch.regs.ccr & 0x0fffffff) | - (((msr & MSR_TS_MASK) >> MSR_TS_S_LG) << 28); + (((msr & MSR_TS_MASK) >> MSR_TS_S_LG) << 29); vcpu->arch.shregs.msr = msr | MSR_TS_S; return RESUME_GUEST; }
From: Like Xu like.xu@linux.intel.com
commit 6fc3977ccc5d3c22e851f2dce2d3ce2a0a843842 upstream.
If a perf_event creation fails due to any reason of the host perf subsystem, it has no chance to log the corresponding event for guest which may cause abnormal sampling data in guest result. In debug mode, this message helps to understand the state of vPMC and we may not limit the number of occurrences but not in a spamming style.
Suggested-by: Joe Perches joe@perches.com Signed-off-by: Like Xu like.xu@linux.intel.com Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/pmu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/pmu.c +++ b/arch/x86/kvm/pmu.c @@ -128,8 +128,8 @@ static void pmc_reprogram_counter(struct intr ? kvm_perf_overflow_intr : kvm_perf_overflow, pmc); if (IS_ERR(event)) { - printk_once("kvm_pmu: event creation failed %ld\n", - PTR_ERR(event)); + pr_debug_ratelimited("kvm_pmu: event creation failed %ld for pmc->idx = %d\n", + PTR_ERR(event), pmc->idx); return; }
From: Jon Hunter jonathanh@nvidia.com
commit ba24eee6686f6ed3738602b54d959253316a9541 upstream.
The Tegra AGIC interrupt controller is an ARM GIC400 interrupt controller. Per the ARM GIC device-tree binding, the first address region is for the GIC distributor registers and the second address region is for the GIC CPU interface registers. The address space for the distributor registers is 4kB, but currently this is incorrectly defined as 8kB for the Tegra AGIC and overlaps with the CPU interface registers. Correct the address space for the distributor to be 4kB.
Cc: stable@vger.kernel.org Signed-off-by: Jon Hunter jonathanh@nvidia.com Fixes: bcdbde433542 ("arm64: tegra: Add AGIC node for Tegra210") Signed-off-by: Thierry Reding treding@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/boot/dts/nvidia/tegra210.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi +++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi @@ -1258,7 +1258,7 @@ compatible = "nvidia,tegra210-agic"; #interrupt-cells = <3>; interrupt-controller; - reg = <0x702f9000 0x2000>, + reg = <0x702f9000 0x1000>, <0x702fa000 0x2000>; interrupts = <GIC_SPI 102 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>; clocks = <&tegra_car TEGRA210_CLK_APE>;
From: Julien Thierry julien.thierry@arm.com
commit f57065782f245ca96f1472209a485073bbc11247 upstream.
Some of the inline assembly instruction use the condition flags and need to include "cc" in the clobber list.
Fixes: 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking") Cc: stable@vger.kernel.org # 5.1.x- Suggested-by: Marc Zyngier marc.zyngier@arm.com Cc: Will Deacon will.deacon@arm.com Reviewed-by: Marc Zyngier marc.zyngier@arm.com Acked-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Julien Thierry julien.thierry@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/include/asm/irqflags.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arm64/include/asm/irqflags.h +++ b/arch/arm64/include/asm/irqflags.h @@ -81,7 +81,7 @@ static inline unsigned long arch_local_s ARM64_HAS_IRQ_PRIO_MASKING) : "=&r" (flags), "+r" (daif_bits) : "r" ((unsigned long) GIC_PRIO_IRQOFF) - : "memory"); + : "cc", "memory");
return flags; } @@ -125,7 +125,7 @@ static inline int arch_irqs_disabled_fla ARM64_HAS_IRQ_PRIO_MASKING) : "=&r" (res) : "r" ((int) flags) - : "memory"); + : "cc", "memory");
return res; }
From: Julien Thierry julien.thierry@arm.com
commit bd82d4bd21880b7c4d5f5756be435095d6ae07b5 upstream.
When using IRQ priority masking to disable interrupts, in order to deal with the PSR.I state, local_irq_save() would convert the I bit into a PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore() potentially modifying the value of PMR in undesired location due to the state of PSR.I upon flag saving [1].
In an attempt to solve this issue in a less hackish manner, introduce a bit (GIC_PRIO_IGNORE_PMR) for the PMR values that can represent whether PSR.I is being used to disable interrupts, in which case it takes precedence of the status of interrupt masking via PMR.
GIC_PRIO_PSR_I_SET is chosen such that (<pmr_value> | GIC_PRIO_PSR_I_SET) does not mask more interrupts than <pmr_value> as some sections (e.g. arch_cpu_idle(), interrupt acknowledge path) requires PMR not to mask interrupts that could be signaled to the CPU when using only PSR.I.
[1] https://www.spinics.net/lists/arm-kernel/msg716956.html
Fixes: 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking") Cc: stable@vger.kernel.org # 5.1.x- Reported-by: Zenghui Yu yuzenghui@huawei.com Cc: Steven Rostedt rostedt@goodmis.org Cc: Wei Li liwei391@huawei.com Cc: Will Deacon will.deacon@arm.com Cc: Christoffer Dall christoffer.dall@arm.com Cc: James Morse james.morse@arm.com Cc: Suzuki K Pouloze suzuki.poulose@arm.com Cc: Oleg Nesterov oleg@redhat.com Reviewed-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Julien Thierry julien.thierry@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/include/asm/arch_gicv3.h | 4 +- arch/arm64/include/asm/daifflags.h | 68 +++++++++++++++++++++--------------- arch/arm64/include/asm/irqflags.h | 67 ++++++++++++++--------------------- arch/arm64/include/asm/kvm_host.h | 7 ++- arch/arm64/include/asm/ptrace.h | 10 ++++- arch/arm64/kernel/entry.S | 38 +++++++++++++++++--- arch/arm64/kernel/process.c | 2 - arch/arm64/kernel/smp.c | 8 ++-- arch/arm64/kvm/hyp/switch.c | 2 - 9 files changed, 123 insertions(+), 83 deletions(-)
--- a/arch/arm64/include/asm/arch_gicv3.h +++ b/arch/arm64/include/asm/arch_gicv3.h @@ -152,7 +152,9 @@ static inline bool gic_prio_masking_enab
static inline void gic_pmr_mask_irqs(void) { - BUILD_BUG_ON(GICD_INT_DEF_PRI <= GIC_PRIO_IRQOFF); + BUILD_BUG_ON(GICD_INT_DEF_PRI < (GIC_PRIO_IRQOFF | + GIC_PRIO_PSR_I_SET)); + BUILD_BUG_ON(GICD_INT_DEF_PRI >= GIC_PRIO_IRQON); gic_write_pmr(GIC_PRIO_IRQOFF); }
--- a/arch/arm64/include/asm/daifflags.h +++ b/arch/arm64/include/asm/daifflags.h @@ -7,6 +7,7 @@
#include <linux/irqflags.h>
+#include <asm/arch_gicv3.h> #include <asm/cpufeature.h>
#define DAIF_PROCCTX 0 @@ -21,6 +22,11 @@ static inline void local_daif_mask(void) : : : "memory"); + + /* Don't really care for a dsb here, we don't intend to enable IRQs */ + if (system_uses_irq_prio_masking()) + gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET); + trace_hardirqs_off(); }
@@ -32,7 +38,7 @@ static inline unsigned long local_daif_s
if (system_uses_irq_prio_masking()) { /* If IRQs are masked with PMR, reflect it in the flags */ - if (read_sysreg_s(SYS_ICC_PMR_EL1) <= GIC_PRIO_IRQOFF) + if (read_sysreg_s(SYS_ICC_PMR_EL1) != GIC_PRIO_IRQON) flags |= PSR_I_BIT; }
@@ -48,36 +54,44 @@ static inline void local_daif_restore(un if (!irq_disabled) { trace_hardirqs_on();
- if (system_uses_irq_prio_masking()) - arch_local_irq_enable(); - } else if (!(flags & PSR_A_BIT)) { - /* - * If interrupts are disabled but we can take - * asynchronous errors, we can take NMIs - */ if (system_uses_irq_prio_masking()) { - flags &= ~PSR_I_BIT; + gic_write_pmr(GIC_PRIO_IRQON); + dsb(sy); + } + } else if (system_uses_irq_prio_masking()) { + u64 pmr; + + if (!(flags & PSR_A_BIT)) { /* - * There has been concern that the write to daif - * might be reordered before this write to PMR. - * From the ARM ARM DDI 0487D.a, section D1.7.1 - * "Accessing PSTATE fields": - * Writes to the PSTATE fields have side-effects on - * various aspects of the PE operation. All of these - * side-effects are guaranteed: - * - Not to be visible to earlier instructions in - * the execution stream. - * - To be visible to later instructions in the - * execution stream - * - * Also, writes to PMR are self-synchronizing, so no - * interrupts with a lower priority than PMR is signaled - * to the PE after the write. - * - * So we don't need additional synchronization here. + * If interrupts are disabled but we can take + * asynchronous errors, we can take NMIs */ - arch_local_irq_disable(); + flags &= ~PSR_I_BIT; + pmr = GIC_PRIO_IRQOFF; + } else { + pmr = GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET; } + + /* + * There has been concern that the write to daif + * might be reordered before this write to PMR. + * From the ARM ARM DDI 0487D.a, section D1.7.1 + * "Accessing PSTATE fields": + * Writes to the PSTATE fields have side-effects on + * various aspects of the PE operation. All of these + * side-effects are guaranteed: + * - Not to be visible to earlier instructions in + * the execution stream. + * - To be visible to later instructions in the + * execution stream + * + * Also, writes to PMR are self-synchronizing, so no + * interrupts with a lower priority than PMR is signaled + * to the PE after the write. + * + * So we don't need additional synchronization here. + */ + gic_write_pmr(pmr); }
write_sysreg(flags, daif); --- a/arch/arm64/include/asm/irqflags.h +++ b/arch/arm64/include/asm/irqflags.h @@ -56,43 +56,46 @@ static inline void arch_local_irq_disabl */ static inline unsigned long arch_local_save_flags(void) { - unsigned long daif_bits; unsigned long flags;
- daif_bits = read_sysreg(daif); - - /* - * The asm is logically equivalent to: - * - * if (system_uses_irq_prio_masking()) - * flags = (daif_bits & PSR_I_BIT) ? - * GIC_PRIO_IRQOFF : - * read_sysreg_s(SYS_ICC_PMR_EL1); - * else - * flags = daif_bits; - */ asm volatile(ALTERNATIVE( - "mov %0, %1\n" - "nop\n" - "nop", - __mrs_s("%0", SYS_ICC_PMR_EL1) - "ands %1, %1, " __stringify(PSR_I_BIT) "\n" - "csel %0, %0, %2, eq", - ARM64_HAS_IRQ_PRIO_MASKING) - : "=&r" (flags), "+r" (daif_bits) - : "r" ((unsigned long) GIC_PRIO_IRQOFF) - : "cc", "memory"); + "mrs %0, daif", + __mrs_s("%0", SYS_ICC_PMR_EL1), + ARM64_HAS_IRQ_PRIO_MASKING) + : "=&r" (flags) + : + : "memory");
return flags; }
+static inline int arch_irqs_disabled_flags(unsigned long flags) +{ + int res; + + asm volatile(ALTERNATIVE( + "and %w0, %w1, #" __stringify(PSR_I_BIT), + "eor %w0, %w1, #" __stringify(GIC_PRIO_IRQON), + ARM64_HAS_IRQ_PRIO_MASKING) + : "=&r" (res) + : "r" ((int) flags) + : "memory"); + + return res; +} + static inline unsigned long arch_local_irq_save(void) { unsigned long flags;
flags = arch_local_save_flags();
- arch_local_irq_disable(); + /* + * There are too many states with IRQs disabled, just keep the current + * state if interrupts are already disabled/masked. + */ + if (!arch_irqs_disabled_flags(flags)) + arch_local_irq_disable();
return flags; } @@ -113,21 +116,5 @@ static inline void arch_local_irq_restor : "memory"); }
-static inline int arch_irqs_disabled_flags(unsigned long flags) -{ - int res; - - asm volatile(ALTERNATIVE( - "and %w0, %w1, #" __stringify(PSR_I_BIT) "\n" - "nop", - "cmp %w1, #" __stringify(GIC_PRIO_IRQOFF) "\n" - "cset %w0, ls", - ARM64_HAS_IRQ_PRIO_MASKING) - : "=&r" (res) - : "r" ((int) flags) - : "cc", "memory"); - - return res; -} #endif #endif --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -597,11 +597,12 @@ static inline void kvm_arm_vhe_guest_ent * will not signal the CPU of interrupts of lower priority, and the * only way to get out will be via guest exceptions. * Naturally, we want to avoid this. + * + * local_daif_mask() already sets GIC_PRIO_PSR_I_SET, we just need a + * dsb to ensure the redistributor is forwards EL2 IRQs to the CPU. */ - if (system_uses_irq_prio_masking()) { - gic_write_pmr(GIC_PRIO_IRQON); + if (system_uses_irq_prio_masking()) dsb(sy); - } }
static inline void kvm_arm_vhe_guest_exit(void) --- a/arch/arm64/include/asm/ptrace.h +++ b/arch/arm64/include/asm/ptrace.h @@ -24,9 +24,15 @@ * means masking more IRQs (or at least that the same IRQs remain masked). * * To mask interrupts, we clear the most significant bit of PMR. + * + * Some code sections either automatically switch back to PSR.I or explicitly + * require to not use priority masking. If bit GIC_PRIO_PSR_I_SET is included + * in the the priority mask, it indicates that PSR.I should be set and + * interrupt disabling temporarily does not rely on IRQ priorities. */ -#define GIC_PRIO_IRQON 0xf0 -#define GIC_PRIO_IRQOFF (GIC_PRIO_IRQON & ~0x80) +#define GIC_PRIO_IRQON 0xc0 +#define GIC_PRIO_IRQOFF (GIC_PRIO_IRQON & ~0x80) +#define GIC_PRIO_PSR_I_SET (1 << 4)
/* Additional SPSR bits not exposed in the UABI */ #define PSR_IL_BIT (1 << 20) --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -247,6 +247,7 @@ alternative_else_nop_endif /* * Registers that may be useful after this macro is invoked: * + * x20 - ICC_PMR_EL1 * x21 - aborted SP * x22 - aborted PC * x23 - aborted PSTATE @@ -438,6 +439,24 @@ alternative_endif .endm #endif
+ .macro gic_prio_kentry_setup, tmp:req +#ifdef CONFIG_ARM64_PSEUDO_NMI + alternative_if ARM64_HAS_IRQ_PRIO_MASKING + mov \tmp, #(GIC_PRIO_PSR_I_SET | GIC_PRIO_IRQON) + msr_s SYS_ICC_PMR_EL1, \tmp + alternative_else_nop_endif +#endif + .endm + + .macro gic_prio_irq_setup, pmr:req, tmp:req +#ifdef CONFIG_ARM64_PSEUDO_NMI + alternative_if ARM64_HAS_IRQ_PRIO_MASKING + orr \tmp, \pmr, #GIC_PRIO_PSR_I_SET + msr_s SYS_ICC_PMR_EL1, \tmp + alternative_else_nop_endif +#endif + .endm + .text
/* @@ -616,6 +635,7 @@ el1_dbg: cmp x24, #ESR_ELx_EC_BRK64 // if BRK64 cinc x24, x24, eq // set bit '0' tbz x24, #0, el1_inv // EL1 only + gic_prio_kentry_setup tmp=x3 mrs x0, far_el1 mov x2, sp // struct pt_regs bl do_debug_exception @@ -633,12 +653,10 @@ ENDPROC(el1_sync) .align 6 el1_irq: kernel_entry 1 + gic_prio_irq_setup pmr=x20, tmp=x1 enable_da_f
#ifdef CONFIG_ARM64_PSEUDO_NMI -alternative_if ARM64_HAS_IRQ_PRIO_MASKING - ldr x20, [sp, #S_PMR_SAVE] -alternative_else_nop_endif test_irqs_unmasked res=x0, pmr=x20 cbz x0, 1f bl asm_nmi_enter @@ -668,8 +686,9 @@ alternative_else_nop_endif
#ifdef CONFIG_ARM64_PSEUDO_NMI /* - * if IRQs were disabled when we received the interrupt, we have an NMI - * and we are not re-enabling interrupt upon eret. Skip tracing. + * When using IRQ priority masking, we can get spurious interrupts while + * PMR is set to GIC_PRIO_IRQOFF. An NMI might also have occurred in a + * section with interrupts disabled. Skip tracing in those cases. */ test_irqs_unmasked res=x0, pmr=x20 cbz x0, 1f @@ -798,6 +817,7 @@ el0_ia: * Instruction abort handling */ mrs x26, far_el1 + gic_prio_kentry_setup tmp=x0 enable_da_f #ifdef CONFIG_TRACE_IRQFLAGS bl trace_hardirqs_off @@ -843,6 +863,7 @@ el0_sp_pc: * Stack or PC alignment exception handling */ mrs x26, far_el1 + gic_prio_kentry_setup tmp=x0 enable_da_f #ifdef CONFIG_TRACE_IRQFLAGS bl trace_hardirqs_off @@ -877,6 +898,7 @@ el0_dbg: * Debug exception handling */ tbnz x24, #0, el0_inv // EL0 only + gic_prio_kentry_setup tmp=x3 mrs x0, far_el1 mov x1, x25 mov x2, sp @@ -898,7 +920,9 @@ ENDPROC(el0_sync) el0_irq: kernel_entry 0 el0_irq_naked: + gic_prio_irq_setup pmr=x20, tmp=x0 enable_da_f + #ifdef CONFIG_TRACE_IRQFLAGS bl trace_hardirqs_off #endif @@ -920,6 +944,7 @@ ENDPROC(el0_irq) el1_error: kernel_entry 1 mrs x1, esr_el1 + gic_prio_kentry_setup tmp=x2 enable_dbg mov x0, sp bl do_serror @@ -930,6 +955,7 @@ el0_error: kernel_entry 0 el0_error_naked: mrs x1, esr_el1 + gic_prio_kentry_setup tmp=x2 enable_dbg mov x0, sp bl do_serror @@ -954,6 +980,7 @@ work_pending: */ ret_to_user: disable_daif + gic_prio_kentry_setup tmp=x3 ldr x1, [tsk, #TSK_TI_FLAGS] and x2, x1, #_TIF_WORK_MASK cbnz x2, work_pending @@ -970,6 +997,7 @@ ENDPROC(ret_to_user) */ .align 6 el0_svc: + gic_prio_kentry_setup tmp=x1 mov x0, sp bl el0_svc_handler b ret_to_user --- a/arch/arm64/kernel/process.c +++ b/arch/arm64/kernel/process.c @@ -83,7 +83,7 @@ static void __cpu_do_idle_irqprio(void) * be raised. */ pmr = gic_read_pmr(); - gic_write_pmr(GIC_PRIO_IRQON); + gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET);
__cpu_do_idle();
--- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -181,11 +181,13 @@ static void init_gic_priority_masking(vo
WARN_ON(!(cpuflags & PSR_I_BIT));
- gic_write_pmr(GIC_PRIO_IRQOFF); - /* We can only unmask PSR.I if we can take aborts */ - if (!(cpuflags & PSR_A_BIT)) + if (!(cpuflags & PSR_A_BIT)) { + gic_write_pmr(GIC_PRIO_IRQOFF); write_sysreg(cpuflags & ~PSR_I_BIT, daif); + } else { + gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET); + } }
/* --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -604,7 +604,7 @@ int __hyp_text __kvm_vcpu_run_nvhe(struc * Naturally, we want to avoid this. */ if (system_uses_irq_prio_masking()) { - gic_write_pmr(GIC_PRIO_IRQON); + gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET); dsb(sy); }
From: Shaokun Zhang zhangshaokun@hisilicon.com
commit b96fb368b08f1637cbf780a6b83e36c2c5ed4ff5 upstream.
Commit ba39bd8306057 ("intel_th: msu: Switch over to scatterlist") introduced the following warnings on non-x86 architectures, as a result of reordering the multi mode buffer allocation sequence:
drivers/hwtracing/intel_th/msu.c: In function ‘msc_buffer_win_alloc’: drivers/hwtracing/intel_th/msu.c:783:21: warning: unused variable ‘i’ [-Wunused-variable] int ret = -ENOMEM, i; ^ drivers/hwtracing/intel_th/msu.c: In function ‘msc_buffer_win_free’: drivers/hwtracing/intel_th/msu.c:863:6: warning: unused variable ‘i’ [-Wunused-variable] int i; ^
Fix this compiler warning by factoring out set_memory sequences and making them x86-only.
Suggested-by: Alexander Shishkin alexander.shishkin@linux.intel.com Signed-off-by: Shaokun Zhang zhangshaokun@hisilicon.com Fixes: ba39bd8306057 ("intel_th: msu: Switch over to scatterlist") Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20190621161930.60785-2-alexander.shishkin@linux.in... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwtracing/intel_th/msu.c | 40 ++++++++++++++++++++++++++------------- 1 file changed, 27 insertions(+), 13 deletions(-)
--- a/drivers/hwtracing/intel_th/msu.c +++ b/drivers/hwtracing/intel_th/msu.c @@ -767,6 +767,30 @@ err_nomem: return -ENOMEM; }
+#ifdef CONFIG_X86 +static void msc_buffer_set_uc(struct msc_window *win, unsigned int nr_blocks) +{ + int i; + + for (i = 0; i < nr_blocks; i++) + /* Set the page as uncached */ + set_memory_uc((unsigned long)msc_win_block(win, i), 1); +} + +static void msc_buffer_set_wb(struct msc_window *win) +{ + int i; + + for (i = 0; i < win->nr_blocks; i++) + /* Reset the page to write-back */ + set_memory_wb((unsigned long)msc_win_block(win, i), 1); +} +#else /* !X86 */ +static inline void +msc_buffer_set_uc(struct msc_window *win, unsigned int nr_blocks) {} +static inline void msc_buffer_set_wb(struct msc_window *win) {} +#endif /* CONFIG_X86 */ + /** * msc_buffer_win_alloc() - alloc a window for a multiblock mode * @msc: MSC device @@ -780,7 +804,7 @@ err_nomem: static int msc_buffer_win_alloc(struct msc *msc, unsigned int nr_blocks) { struct msc_window *win; - int ret = -ENOMEM, i; + int ret = -ENOMEM;
if (!nr_blocks) return 0; @@ -811,11 +835,7 @@ static int msc_buffer_win_alloc(struct m if (ret < 0) goto err_nomem;
-#ifdef CONFIG_X86 - for (i = 0; i < ret; i++) - /* Set the page as uncached */ - set_memory_uc((unsigned long)msc_win_block(win, i), 1); -#endif + msc_buffer_set_uc(win, ret);
win->nr_blocks = ret;
@@ -860,8 +880,6 @@ static void __msc_buffer_win_free(struct */ static void msc_buffer_win_free(struct msc *msc, struct msc_window *win) { - int i; - msc->nr_pages -= win->nr_blocks;
list_del(&win->entry); @@ -870,11 +888,7 @@ static void msc_buffer_win_free(struct m msc->base_addr = 0; }
-#ifdef CONFIG_X86 - for (i = 0; i < win->nr_blocks; i++) - /* Reset the page to write-back */ - set_memory_wb((unsigned long)msc_win_block(win, i), 1); -#endif + msc_buffer_set_wb(win);
__msc_buffer_win_free(msc, win);
From: Eric W. Biederman ebiederm@xmission.com
commit 70f1b0d34bdf03065fe869e93cc17cad1ea20c4a upstream.
The usb support for asyncio encoded one of it's values in the wrong field. It should have used si_value but instead used si_addr which is not present in the _rt union member of struct siginfo.
The practical result of this is that on a 64bit big endian kernel when delivering a signal to a 32bit process the si_addr field is set to NULL, instead of the expected pointer value.
This issue can not be fixed in copy_siginfo_to_user32 as the usb usage of the the _sigfault (aka si_addr) member of the siginfo union when SI_ASYNCIO is set is incompatible with the POSIX and glibc usage of the _rt member of the siginfo union.
Therefore replace kill_pid_info_as_cred with kill_pid_usb_asyncio a dedicated function for this one specific case. There are no other users of kill_pid_info_as_cred so this specialization should have no impact on the amount of code in the kernel. Have kill_pid_usb_asyncio take instead of a siginfo_t which is difficult and error prone, 3 arguments, a signal number, an errno value, and an address enconded as a sigval_t. The encoding of the address as a sigval_t allows the code that reads the userspace request for a signal to handle this compat issue along with all of the other compat issues.
Add BUILD_BUG_ONs in kernel/signal.c to ensure that we can now place the pointer value at the in si_pid (instead of si_addr). That is the code now verifies that si_pid and si_addr always occur at the same location. Further the code veries that for native structures a value placed in si_pid and spilling into si_uid will appear in userspace in si_addr (on a byte by byte copy of siginfo or a field by field copy of siginfo). The code also verifies that for a 64bit kernel and a 32bit userspace the 32bit pointer will fit in si_pid.
I have used the usbsig.c program below written by Alan Stern and slightly tweaked by me to run on a big endian machine to verify the issue exists (on sparc64) and to confirm the patch below fixes the issue.
/* usbsig.c -- test USB async signal delivery */
#define _GNU_SOURCE #include <stdio.h> #include <fcntl.h> #include <signal.h> #include <string.h> #include <sys/ioctl.h> #include <unistd.h> #include <endian.h> #include <linux/usb/ch9.h> #include <linux/usbdevice_fs.h>
static struct usbdevfs_urb urb; static struct usbdevfs_disconnectsignal ds; static volatile sig_atomic_t done = 0;
void urb_handler(int sig, siginfo_t *info , void *ucontext) { printf("Got signal %d, signo %d errno %d code %d addr: %p urb: %p\n", sig, info->si_signo, info->si_errno, info->si_code, info->si_addr, &urb);
printf("%s\n", (info->si_addr == &urb) ? "Good" : "Bad"); }
void ds_handler(int sig, siginfo_t *info , void *ucontext) { printf("Got signal %d, signo %d errno %d code %d addr: %p ds: %p\n", sig, info->si_signo, info->si_errno, info->si_code, info->si_addr, &ds);
printf("%s\n", (info->si_addr == &ds) ? "Good" : "Bad"); done = 1; }
int main(int argc, char **argv) { char *devfilename; int fd; int rc; struct sigaction act; struct usb_ctrlrequest *req; void *ptr; char buf[80];
if (argc != 2) { fprintf(stderr, "Usage: usbsig device-file-name\n"); return 1; }
devfilename = argv[1]; fd = open(devfilename, O_RDWR); if (fd == -1) { perror("Error opening device file"); return 1; }
act.sa_sigaction = urb_handler; sigemptyset(&act.sa_mask); act.sa_flags = SA_SIGINFO;
rc = sigaction(SIGUSR1, &act, NULL); if (rc == -1) { perror("Error in sigaction"); return 1; }
act.sa_sigaction = ds_handler; sigemptyset(&act.sa_mask); act.sa_flags = SA_SIGINFO;
rc = sigaction(SIGUSR2, &act, NULL); if (rc == -1) { perror("Error in sigaction"); return 1; }
memset(&urb, 0, sizeof(urb)); urb.type = USBDEVFS_URB_TYPE_CONTROL; urb.endpoint = USB_DIR_IN | 0; urb.buffer = buf; urb.buffer_length = sizeof(buf); urb.signr = SIGUSR1;
req = (struct usb_ctrlrequest *) buf; req->bRequestType = USB_DIR_IN | USB_TYPE_STANDARD | USB_RECIP_DEVICE; req->bRequest = USB_REQ_GET_DESCRIPTOR; req->wValue = htole16(USB_DT_DEVICE << 8); req->wIndex = htole16(0); req->wLength = htole16(sizeof(buf) - sizeof(*req));
rc = ioctl(fd, USBDEVFS_SUBMITURB, &urb); if (rc == -1) { perror("Error in SUBMITURB ioctl"); return 1; }
rc = ioctl(fd, USBDEVFS_REAPURB, &ptr); if (rc == -1) { perror("Error in REAPURB ioctl"); return 1; }
memset(&ds, 0, sizeof(ds)); ds.signr = SIGUSR2; ds.context = &ds; rc = ioctl(fd, USBDEVFS_DISCSIGNAL, &ds); if (rc == -1) { perror("Error in DISCSIGNAL ioctl"); return 1; }
printf("Waiting for usb disconnect\n"); while (!done) { sleep(1); }
close(fd); return 0; }
Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: linux-usb@vger.kernel.org Cc: Alan Stern stern@rowland.harvard.edu Cc: Oliver Neukum oneukum@suse.com Fixes: v2.3.39 Cc: stable@vger.kernel.org Acked-by: Alan Stern stern@rowland.harvard.edu Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/core/devio.c | 48 ++++++++++++++--------------- include/linux/sched/signal.h | 2 - kernel/signal.c | 69 ++++++++++++++++++++++++++++++++++++++----- 3 files changed, 86 insertions(+), 33 deletions(-)
--- a/drivers/usb/core/devio.c +++ b/drivers/usb/core/devio.c @@ -63,7 +63,7 @@ struct usb_dev_state { unsigned int discsignr; struct pid *disc_pid; const struct cred *cred; - void __user *disccontext; + sigval_t disccontext; unsigned long ifclaimed; u32 disabled_bulk_eps; bool privileges_dropped; @@ -90,6 +90,7 @@ struct async { unsigned int ifnum; void __user *userbuffer; void __user *userurb; + sigval_t userurb_sigval; struct urb *urb; struct usb_memory *usbm; unsigned int mem_usage; @@ -582,22 +583,19 @@ static void async_completed(struct urb * { struct async *as = urb->context; struct usb_dev_state *ps = as->ps; - struct kernel_siginfo sinfo; struct pid *pid = NULL; const struct cred *cred = NULL; unsigned long flags; - int signr; + sigval_t addr; + int signr, errno;
spin_lock_irqsave(&ps->lock, flags); list_move_tail(&as->asynclist, &ps->async_completed); as->status = urb->status; signr = as->signr; if (signr) { - clear_siginfo(&sinfo); - sinfo.si_signo = as->signr; - sinfo.si_errno = as->status; - sinfo.si_code = SI_ASYNCIO; - sinfo.si_addr = as->userurb; + errno = as->status; + addr = as->userurb_sigval; pid = get_pid(as->pid); cred = get_cred(as->cred); } @@ -615,7 +613,7 @@ static void async_completed(struct urb * spin_unlock_irqrestore(&ps->lock, flags);
if (signr) { - kill_pid_info_as_cred(sinfo.si_signo, &sinfo, pid, cred); + kill_pid_usb_asyncio(signr, errno, addr, pid, cred); put_pid(pid); put_cred(cred); } @@ -1427,7 +1425,7 @@ find_memory_area(struct usb_dev_state *p
static int proc_do_submiturb(struct usb_dev_state *ps, struct usbdevfs_urb *uurb, struct usbdevfs_iso_packet_desc __user *iso_frame_desc, - void __user *arg) + void __user *arg, sigval_t userurb_sigval) { struct usbdevfs_iso_packet_desc *isopkt = NULL; struct usb_host_endpoint *ep; @@ -1727,6 +1725,7 @@ static int proc_do_submiturb(struct usb_ isopkt = NULL; as->ps = ps; as->userurb = arg; + as->userurb_sigval = userurb_sigval; if (as->usbm) { unsigned long uurb_start = (unsigned long)uurb->buffer;
@@ -1801,13 +1800,17 @@ static int proc_do_submiturb(struct usb_ static int proc_submiturb(struct usb_dev_state *ps, void __user *arg) { struct usbdevfs_urb uurb; + sigval_t userurb_sigval;
if (copy_from_user(&uurb, arg, sizeof(uurb))) return -EFAULT;
+ memset(&userurb_sigval, 0, sizeof(userurb_sigval)); + userurb_sigval.sival_ptr = arg; + return proc_do_submiturb(ps, &uurb, (((struct usbdevfs_urb __user *)arg)->iso_frame_desc), - arg); + arg, userurb_sigval); }
static int proc_unlinkurb(struct usb_dev_state *ps, void __user *arg) @@ -1977,7 +1980,7 @@ static int proc_disconnectsignal_compat( if (copy_from_user(&ds, arg, sizeof(ds))) return -EFAULT; ps->discsignr = ds.signr; - ps->disccontext = compat_ptr(ds.context); + ps->disccontext.sival_int = ds.context; return 0; }
@@ -2005,13 +2008,17 @@ static int get_urb32(struct usbdevfs_urb static int proc_submiturb_compat(struct usb_dev_state *ps, void __user *arg) { struct usbdevfs_urb uurb; + sigval_t userurb_sigval;
if (get_urb32(&uurb, (struct usbdevfs_urb32 __user *)arg)) return -EFAULT;
+ memset(&userurb_sigval, 0, sizeof(userurb_sigval)); + userurb_sigval.sival_int = ptr_to_compat(arg); + return proc_do_submiturb(ps, &uurb, ((struct usbdevfs_urb32 __user *)arg)->iso_frame_desc, - arg); + arg, userurb_sigval); }
static int processcompl_compat(struct async *as, void __user * __user *arg) @@ -2092,7 +2099,7 @@ static int proc_disconnectsignal(struct if (copy_from_user(&ds, arg, sizeof(ds))) return -EFAULT; ps->discsignr = ds.signr; - ps->disccontext = ds.context; + ps->disccontext.sival_ptr = ds.context; return 0; }
@@ -2614,22 +2621,15 @@ const struct file_operations usbdev_file static void usbdev_remove(struct usb_device *udev) { struct usb_dev_state *ps; - struct kernel_siginfo sinfo;
while (!list_empty(&udev->filelist)) { ps = list_entry(udev->filelist.next, struct usb_dev_state, list); destroy_all_async(ps); wake_up_all(&ps->wait); list_del_init(&ps->list); - if (ps->discsignr) { - clear_siginfo(&sinfo); - sinfo.si_signo = ps->discsignr; - sinfo.si_errno = EPIPE; - sinfo.si_code = SI_ASYNCIO; - sinfo.si_addr = ps->disccontext; - kill_pid_info_as_cred(ps->discsignr, &sinfo, - ps->disc_pid, ps->cred); - } + if (ps->discsignr) + kill_pid_usb_asyncio(ps->discsignr, EPIPE, ps->disccontext, + ps->disc_pid, ps->cred); } }
--- a/include/linux/sched/signal.h +++ b/include/linux/sched/signal.h @@ -329,7 +329,7 @@ extern void force_sigsegv(int sig, struc extern int force_sig_info(int, struct kernel_siginfo *, struct task_struct *); extern int __kill_pgrp_info(int sig, struct kernel_siginfo *info, struct pid *pgrp); extern int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid); -extern int kill_pid_info_as_cred(int, struct kernel_siginfo *, struct pid *, +extern int kill_pid_usb_asyncio(int sig, int errno, sigval_t addr, struct pid *, const struct cred *); extern int kill_pgrp(struct pid *pid, int sig, int priv); extern int kill_pid(struct pid *pid, int sig, int priv); --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1440,13 +1440,44 @@ static inline bool kill_as_cred_perm(con uid_eq(cred->uid, pcred->uid); }
-/* like kill_pid_info(), but doesn't use uid/euid of "current" */ -int kill_pid_info_as_cred(int sig, struct kernel_siginfo *info, struct pid *pid, - const struct cred *cred) +/* + * The usb asyncio usage of siginfo is wrong. The glibc support + * for asyncio which uses SI_ASYNCIO assumes the layout is SIL_RT. + * AKA after the generic fields: + * kernel_pid_t si_pid; + * kernel_uid32_t si_uid; + * sigval_t si_value; + * + * Unfortunately when usb generates SI_ASYNCIO it assumes the layout + * after the generic fields is: + * void __user *si_addr; + * + * This is a practical problem when there is a 64bit big endian kernel + * and a 32bit userspace. As the 32bit address will encoded in the low + * 32bits of the pointer. Those low 32bits will be stored at higher + * address than appear in a 32 bit pointer. So userspace will not + * see the address it was expecting for it's completions. + * + * There is nothing in the encoding that can allow + * copy_siginfo_to_user32 to detect this confusion of formats, so + * handle this by requiring the caller of kill_pid_usb_asyncio to + * notice when this situration takes place and to store the 32bit + * pointer in sival_int, instead of sival_addr of the sigval_t addr + * parameter. + */ +int kill_pid_usb_asyncio(int sig, int errno, sigval_t addr, + struct pid *pid, const struct cred *cred) { - int ret = -EINVAL; + struct kernel_siginfo info; struct task_struct *p; unsigned long flags; + int ret = -EINVAL; + + clear_siginfo(&info); + info.si_signo = sig; + info.si_errno = errno; + info.si_code = SI_ASYNCIO; + *((sigval_t *)&info.si_pid) = addr;
if (!valid_signal(sig)) return ret; @@ -1457,17 +1488,17 @@ int kill_pid_info_as_cred(int sig, struc ret = -ESRCH; goto out_unlock; } - if (si_fromuser(info) && !kill_as_cred_perm(cred, p)) { + if (!kill_as_cred_perm(cred, p)) { ret = -EPERM; goto out_unlock; } - ret = security_task_kill(p, info, sig, cred); + ret = security_task_kill(p, &info, sig, cred); if (ret) goto out_unlock;
if (sig) { if (lock_task_sighand(p, &flags)) { - ret = __send_signal(sig, info, p, PIDTYPE_TGID, 0); + ret = __send_signal(sig, &info, p, PIDTYPE_TGID, 0); unlock_task_sighand(p, &flags); } else ret = -ESRCH; @@ -1476,7 +1507,7 @@ out_unlock: rcu_read_unlock(); return ret; } -EXPORT_SYMBOL_GPL(kill_pid_info_as_cred); +EXPORT_SYMBOL_GPL(kill_pid_usb_asyncio);
/* * kill_something_info() interprets pid in interesting ways just like kill(2). @@ -4477,6 +4508,28 @@ static inline void siginfo_buildtime_che CHECK_OFFSET(si_syscall); CHECK_OFFSET(si_arch); #undef CHECK_OFFSET + + /* usb asyncio */ + BUILD_BUG_ON(offsetof(struct siginfo, si_pid) != + offsetof(struct siginfo, si_addr)); + if (sizeof(int) == sizeof(void __user *)) { + BUILD_BUG_ON(sizeof_field(struct siginfo, si_pid) != + sizeof(void __user *)); + } else { + BUILD_BUG_ON((sizeof_field(struct siginfo, si_pid) + + sizeof_field(struct siginfo, si_uid)) != + sizeof(void __user *)); + BUILD_BUG_ON(offsetofend(struct siginfo, si_pid) != + offsetof(struct siginfo, si_uid)); + } +#ifdef CONFIG_COMPAT + BUILD_BUG_ON(offsetof(struct compat_siginfo, si_pid) != + offsetof(struct compat_siginfo, si_addr)); + BUILD_BUG_ON(sizeof_field(struct compat_siginfo, si_pid) != + sizeof(compat_uptr_t)); + BUILD_BUG_ON(sizeof_field(struct compat_siginfo, si_pid) != + sizeof_field(struct siginfo, si_pid)); +#endif }
void __init signals_init(void)
From: Eric W. Biederman ebiederm@xmission.com
commit 7a0cf094944e2540758b7f957eb6846d5126f535 upstream.
The function send_signal was split from __send_signal so that it would be possible to bypass the namespace logic based upon current[1]. As it turns out the si_pid and the si_uid fixup are both inappropriate in the case of kill_pid_usb_asyncio so move that logic into send_signal.
It is difficult to arrange but possible for a signal with an si_code of SI_TIMER or SI_SIGIO to be sent across namespace boundaries. In which case tests for when it is ok to change si_pid and si_uid based on SI_FROMUSER are incorrect. Replace the use of SI_FROMUSER with a new test has_si_pid_and_used based on siginfo_layout.
Now that the uid fixup is no longer present after expanding SEND_SIG_NOINFO properly calculate the si_uid that the target task needs to read.
[1] 7978b567d315 ("signals: add from_ancestor_ns parameter to send_signal()") Cc: stable@vger.kernel.org Fixes: 6588c1e3ff01 ("signals: SI_USER: Masquerade si_pid when crossing pid ns boundary") Fixes: 6b550f949594 ("user namespace: make signal.c respect user namespaces") Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/signal.c | 67 +++++++++++++++++++++++++++++++++----------------------- 1 file changed, 40 insertions(+), 27 deletions(-)
--- a/kernel/signal.c +++ b/kernel/signal.c @@ -1057,27 +1057,6 @@ static inline bool legacy_queue(struct s return (sig < SIGRTMIN) && sigismember(&signals->signal, sig); }
-#ifdef CONFIG_USER_NS -static inline void userns_fixup_signal_uid(struct kernel_siginfo *info, struct task_struct *t) -{ - if (current_user_ns() == task_cred_xxx(t, user_ns)) - return; - - if (SI_FROMKERNEL(info)) - return; - - rcu_read_lock(); - info->si_uid = from_kuid_munged(task_cred_xxx(t, user_ns), - make_kuid(current_user_ns(), info->si_uid)); - rcu_read_unlock(); -} -#else -static inline void userns_fixup_signal_uid(struct kernel_siginfo *info, struct task_struct *t) -{ - return; -} -#endif - static int __send_signal(int sig, struct kernel_siginfo *info, struct task_struct *t, enum pid_type type, int from_ancestor_ns) { @@ -1135,7 +1114,11 @@ static int __send_signal(int sig, struct q->info.si_code = SI_USER; q->info.si_pid = task_tgid_nr_ns(current, task_active_pid_ns(t)); - q->info.si_uid = from_kuid_munged(current_user_ns(), current_uid()); + rcu_read_lock(); + q->info.si_uid = + from_kuid_munged(task_cred_xxx(t, user_ns), + current_uid()); + rcu_read_unlock(); break; case (unsigned long) SEND_SIG_PRIV: clear_siginfo(&q->info); @@ -1147,13 +1130,8 @@ static int __send_signal(int sig, struct break; default: copy_siginfo(&q->info, info); - if (from_ancestor_ns) - q->info.si_pid = 0; break; } - - userns_fixup_signal_uid(&q->info, t); - } else if (!is_si_special(info)) { if (sig >= SIGRTMIN && info->si_code != SI_USER) { /* @@ -1197,6 +1175,28 @@ ret: return ret; }
+static inline bool has_si_pid_and_uid(struct kernel_siginfo *info) +{ + bool ret = false; + switch (siginfo_layout(info->si_signo, info->si_code)) { + case SIL_KILL: + case SIL_CHLD: + case SIL_RT: + ret = true; + break; + case SIL_TIMER: + case SIL_POLL: + case SIL_FAULT: + case SIL_FAULT_MCEERR: + case SIL_FAULT_BNDERR: + case SIL_FAULT_PKUERR: + case SIL_SYS: + ret = false; + break; + } + return ret; +} + static int send_signal(int sig, struct kernel_siginfo *info, struct task_struct *t, enum pid_type type) { @@ -1206,7 +1206,20 @@ static int send_signal(int sig, struct k from_ancestor_ns = si_fromuser(info) && !task_pid_nr_ns(current, task_active_pid_ns(t)); #endif + if (!is_si_special(info) && has_si_pid_and_uid(info)) { + struct user_namespace *t_user_ns; + + rcu_read_lock(); + t_user_ns = task_cred_xxx(t, user_ns); + if (current_user_ns() != t_user_ns) { + kuid_t uid = make_kuid(current_user_ns(), info->si_uid); + info->si_uid = from_kuid_munged(t_user_ns, uid); + } + rcu_read_unlock();
+ if (!task_pid_nr_ns(current, task_active_pid_ns(t))) + info->si_pid = 0; + } return __send_signal(sig, info, t, type, from_ancestor_ns); }
From: Radoslaw Burny rburny@google.com
commit 5ec27ec735ba0477d48c80561cc5e856f0c5dfaf upstream.
Normally, the inode's i_uid/i_gid are translated relative to s_user_ns, but this is not a correct behavior for proc. Since sysctl permission check in test_perm is done against GLOBAL_ROOT_[UG]ID, it makes more sense to use these values in u_[ug]id of proc inodes. In other words: although uid/gid in the inode is not read during test_perm, the inode logically belongs to the root of the namespace. I have confirmed this with Eric Biederman at LPC and in this thread: https://lore.kernel.org/lkml/87k1kzjdff.fsf@xmission.com
Consequences ============
Since the i_[ug]id values of proc nodes are not used for permissions checks, this change usually makes no functional difference. However, it causes an issue in a setup where:
* a namespace container is created without root user in container - hence the i_[ug]id of proc nodes are set to INVALID_[UG]ID
* container creator tries to configure it by writing /proc/sys files, e.g. writing /proc/sys/kernel/shmmax to configure shared memory limit
Kernel does not allow to open an inode for writing if its i_[ug]id are invalid, making it impossible to write shmmax and thus - configure the container.
Using a container with no root mapping is apparently rare, but we do use this configuration at Google. Also, we use a generic tool to configure the container limits, and the inability to write any of them causes a failure.
History =======
The invalid uids/gids in inodes first appeared due to 81754357770e (fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns). However, AFAIK, this did not immediately cause any issues. The inability to write to these "invalid" inodes was only caused by a later commit 0bd23d09b874 (vfs: Don't modify inodes with a uid or gid unknown to the vfs).
Tested: Used a repro program that creates a user namespace without any mapping and stat'ed /proc/$PID/root/proc/sys/kernel/shmmax from outside. Before the change, it shows the overflow uid, with the change it's 0. The overflow uid indicates that the uid in the inode is not correct and thus it is not possible to open the file for writing.
Link: http://lkml.kernel.org/r/20190708115130.250149-1-rburny@google.com Fixes: 0bd23d09b874 ("vfs: Don't modify inodes with a uid or gid unknown to the vfs") Signed-off-by: Radoslaw Burny rburny@google.com Acked-by: Luis Chamberlain mcgrof@kernel.org Cc: Kees Cook keescook@chromium.org Cc: "Eric W . Biederman" ebiederm@xmission.com Cc: Seth Forshee seth.forshee@canonical.com Cc: John Sperbeck jsperbeck@google.com Cc: Alexey Dobriyan adobriyan@gmail.com Cc: stable@vger.kernel.org [4.8+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/proc/proc_sysctl.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -499,6 +499,10 @@ static struct inode *proc_sys_make_inode
if (root->set_ownership) root->set_ownership(head, table, &inode->i_uid, &inode->i_gid); + else { + inode->i_uid = GLOBAL_ROOT_UID; + inode->i_gid = GLOBAL_ROOT_GID; + }
return inode; }
From: Vitor Soares Vitor.Soares@synopsys.com
commit ecc8fb54bd443bf69996d9d5ddb8d90a50f14936 upstream.
Currently the I3C framework limits SCL frequency to FM speed when dealing with a mixed slow bus, even if all I2C devices are FM+ capable.
The core was also not accounting for I3C speed limitations when operating in mixed slow mode and was erroneously using FM+ speed as the max I2C speed when operating in mixed fast mode.
Fixes: 3a379bbcea0a ("i3c: Add core I3C infrastructure") Signed-off-by: Vitor Soares vitor.soares@synopsys.com Cc: Boris Brezillon bbrezillon@kernel.org Cc: stable@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Boris Brezillon boris.brezillon@collabora.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/i3c/master.c | 51 ++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 38 insertions(+), 13 deletions(-)
--- a/drivers/i3c/master.c +++ b/drivers/i3c/master.c @@ -91,6 +91,12 @@ void i3c_bus_normaluse_unlock(struct i3c up_read(&bus->lock); }
+static struct i3c_master_controller * +i3c_bus_to_i3c_master(struct i3c_bus *i3cbus) +{ + return container_of(i3cbus, struct i3c_master_controller, bus); +} + static struct i3c_master_controller *dev_to_i3cmaster(struct device *dev) { return container_of(dev, struct i3c_master_controller, dev); @@ -565,20 +571,38 @@ static const struct device_type i3c_mast .groups = i3c_masterdev_groups, };
-int i3c_bus_set_mode(struct i3c_bus *i3cbus, enum i3c_bus_mode mode) +int i3c_bus_set_mode(struct i3c_bus *i3cbus, enum i3c_bus_mode mode, + unsigned long max_i2c_scl_rate) { - i3cbus->mode = mode; + struct i3c_master_controller *master = i3c_bus_to_i3c_master(i3cbus);
- if (!i3cbus->scl_rate.i3c) - i3cbus->scl_rate.i3c = I3C_BUS_TYP_I3C_SCL_RATE; + i3cbus->mode = mode;
- if (!i3cbus->scl_rate.i2c) { - if (i3cbus->mode == I3C_BUS_MODE_MIXED_SLOW) - i3cbus->scl_rate.i2c = I3C_BUS_I2C_FM_SCL_RATE; - else - i3cbus->scl_rate.i2c = I3C_BUS_I2C_FM_PLUS_SCL_RATE; + switch (i3cbus->mode) { + case I3C_BUS_MODE_PURE: + if (!i3cbus->scl_rate.i3c) + i3cbus->scl_rate.i3c = I3C_BUS_TYP_I3C_SCL_RATE; + break; + case I3C_BUS_MODE_MIXED_FAST: + if (!i3cbus->scl_rate.i3c) + i3cbus->scl_rate.i3c = I3C_BUS_TYP_I3C_SCL_RATE; + if (!i3cbus->scl_rate.i2c) + i3cbus->scl_rate.i2c = max_i2c_scl_rate; + break; + case I3C_BUS_MODE_MIXED_SLOW: + if (!i3cbus->scl_rate.i2c) + i3cbus->scl_rate.i2c = max_i2c_scl_rate; + if (!i3cbus->scl_rate.i3c || + i3cbus->scl_rate.i3c > i3cbus->scl_rate.i2c) + i3cbus->scl_rate.i3c = i3cbus->scl_rate.i2c; + break; + default: + return -EINVAL; }
+ dev_dbg(&master->dev, "i2c-scl = %ld Hz i3c-scl = %ld Hz\n", + i3cbus->scl_rate.i2c, i3cbus->scl_rate.i3c); + /* * I3C/I2C frequency may have been overridden, check that user-provided * values are not exceeding max possible frequency. @@ -1966,9 +1990,6 @@ of_i3c_master_add_i2c_boardinfo(struct i /* LVR is encoded in reg[2]. */ boardinfo->lvr = reg[2];
- if (boardinfo->lvr & I3C_LVR_I2C_FM_MODE) - master->bus.scl_rate.i2c = I3C_BUS_I2C_FM_SCL_RATE; - list_add_tail(&boardinfo->node, &master->boardinfo.i2c); of_node_get(node);
@@ -2417,6 +2438,7 @@ int i3c_master_register(struct i3c_maste const struct i3c_master_controller_ops *ops, bool secondary) { + unsigned long i2c_scl_rate = I3C_BUS_I2C_FM_PLUS_SCL_RATE; struct i3c_bus *i3cbus = i3c_master_get_bus(master); enum i3c_bus_mode mode = I3C_BUS_MODE_PURE; struct i2c_dev_boardinfo *i2cbi; @@ -2466,9 +2488,12 @@ int i3c_master_register(struct i3c_maste ret = -EINVAL; goto err_put_dev; } + + if (i2cbi->lvr & I3C_LVR_I2C_FM_MODE) + i2c_scl_rate = I3C_BUS_I2C_FM_SCL_RATE; }
- ret = i3c_bus_set_mode(i3cbus, mode); + ret = i3c_bus_set_mode(i3cbus, mode, i2c_scl_rate); if (ret) goto err_put_dev;
From: Linus Walleij linus.walleij@linaro.org
commit f90b8fda3a9d72a9422ea80ae95843697f94ea4a upstream.
The SPI to the display on the DIR-685 is active low, we were just saved by the SPI library enforcing active low on everything before, so set it as active low to avoid ambiguity.
Link: https://lore.kernel.org/r/20190715202101.16060-1-linus.walleij@linaro.org Cc: stable@vger.kernel.org Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Olof Johansson olof@lixom.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/boot/dts/gemini-dlink-dir-685.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/boot/dts/gemini-dlink-dir-685.dts +++ b/arch/arm/boot/dts/gemini-dlink-dir-685.dts @@ -64,7 +64,7 @@ gpio-sck = <&gpio1 5 GPIO_ACTIVE_HIGH>; gpio-miso = <&gpio1 8 GPIO_ACTIVE_HIGH>; gpio-mosi = <&gpio1 7 GPIO_ACTIVE_HIGH>; - cs-gpios = <&gpio0 20 GPIO_ACTIVE_HIGH>; + cs-gpios = <&gpio0 20 GPIO_ACTIVE_LOW>; num-chipselects = <1>;
panel: display@0 {
From: Lyude Paul lyude@redhat.com
commit 7cb95eeea6706c790571042a06782e378b2561ea upstream.
It turns out that while disabling i2c bus access from software when the GPU is suspended was a step in the right direction with:
commit 342406e4fbba ("drm/nouveau/i2c: Disable i2c bus access after ->fini()")
We also ended up accidentally breaking the vbios init scripts on some older Tesla GPUs, as apparently said scripts can actually use the i2c bus. Since these scripts are executed before initializing any subdevices, we end up failing to acquire access to the i2c bus which has left a number of cards with their fan controllers uninitialized. Luckily this doesn't break hardware - it just means the fan gets stuck at 100%.
This also means that we've always been using our i2c busses before initializing them during the init scripts for older GPUs, we just didn't notice it until we started preventing them from being used until init. It's pretty impressive this never caused us any issues before!
So, fix this by initializing our i2c pad and busses during subdev pre-init. We skip initializing aux busses during pre-init, as those are guaranteed to only ever be used by nouveau for DP aux transactions.
Signed-off-by: Lyude Paul lyude@redhat.com Tested-by: Marc Meledandri m.meledandri@gmail.com Fixes: 342406e4fbba ("drm/nouveau/i2c: Disable i2c bus access after ->fini()") Cc: stable@vger.kernel.org Signed-off-by: Ben Skeggs bskeggs@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c @@ -185,6 +185,25 @@ nvkm_i2c_fini(struct nvkm_subdev *subdev }
static int +nvkm_i2c_preinit(struct nvkm_subdev *subdev) +{ + struct nvkm_i2c *i2c = nvkm_i2c(subdev); + struct nvkm_i2c_bus *bus; + struct nvkm_i2c_pad *pad; + + /* + * We init our i2c busses as early as possible, since they may be + * needed by the vbios init scripts on some cards + */ + list_for_each_entry(pad, &i2c->pad, head) + nvkm_i2c_pad_init(pad); + list_for_each_entry(bus, &i2c->bus, head) + nvkm_i2c_bus_init(bus); + + return 0; +} + +static int nvkm_i2c_init(struct nvkm_subdev *subdev) { struct nvkm_i2c *i2c = nvkm_i2c(subdev); @@ -238,6 +257,7 @@ nvkm_i2c_dtor(struct nvkm_subdev *subdev static const struct nvkm_subdev_func nvkm_i2c = { .dtor = nvkm_i2c_dtor, + .preinit = nvkm_i2c_preinit, .init = nvkm_i2c_init, .fini = nvkm_i2c_fini, .intr = nvkm_i2c_intr,
From: Daniel Jordan daniel.m.jordan@oracle.com
commit cf144f81a99d1a3928f90b0936accfd3f45c9a0a upstream.
Testing padata with the tcrypt module on a 5.2 kernel...
# modprobe tcrypt alg="pcrypt(rfc4106(gcm(aes)))" type=3 # modprobe tcrypt mode=211 sec=1
...produces this splat:
INFO: task modprobe:10075 blocked for more than 120 seconds. Not tainted 5.2.0-base+ #16 modprobe D 0 10075 10064 0x80004080 Call Trace: ? __schedule+0x4dd/0x610 ? ring_buffer_unlock_commit+0x23/0x100 schedule+0x6c/0x90 schedule_timeout+0x3b/0x320 ? trace_buffer_unlock_commit_regs+0x4f/0x1f0 wait_for_common+0x160/0x1a0 ? wake_up_q+0x80/0x80 { crypto_wait_req } # entries in braces added by hand { do_one_aead_op } { test_aead_jiffies } test_aead_speed.constprop.17+0x681/0xf30 [tcrypt] do_test+0x4053/0x6a2b [tcrypt] ? 0xffffffffa00f4000 tcrypt_mod_init+0x50/0x1000 [tcrypt] ...
The second modprobe command never finishes because in padata_reorder, CPU0's load of reorder_objects is executed before the unlocking store in spin_unlock_bh(pd->lock), causing CPU0 to miss CPU1's increment:
CPU0 CPU1
padata_reorder padata_do_serial LOAD reorder_objects // 0 INC reorder_objects // 1 padata_reorder TRYLOCK pd->lock // failed UNLOCK pd->lock
CPU0 deletes the timer before returning from padata_reorder and since no other job is submitted to padata, modprobe waits indefinitely.
Add a pair of full barriers to guarantee proper ordering:
CPU0 CPU1
padata_reorder padata_do_serial UNLOCK pd->lock smp_mb() LOAD reorder_objects INC reorder_objects smp_mb__after_atomic() padata_reorder TRYLOCK pd->lock
smp_mb__after_atomic is needed so the read part of the trylock operation comes after the INC, as Andrea points out. Thanks also to Andrea for help with writing a litmus test.
Fixes: 16295bec6398 ("padata: Generic parallelization/serialization interface") Signed-off-by: Daniel Jordan daniel.m.jordan@oracle.com Cc: stable@vger.kernel.org Cc: Andrea Parri andrea.parri@amarulasolutions.com Cc: Boqun Feng boqun.feng@gmail.com Cc: Herbert Xu herbert@gondor.apana.org.au Cc: Paul E. McKenney paulmck@linux.ibm.com Cc: Peter Zijlstra peterz@infradead.org Cc: Steffen Klassert steffen.klassert@secunet.com Cc: linux-arch@vger.kernel.org Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/padata.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/kernel/padata.c +++ b/kernel/padata.c @@ -267,7 +267,12 @@ static void padata_reorder(struct parall * The next object that needs serialization might have arrived to * the reorder queues in the meantime, we will be called again * from the timer function if no one else cares for it. + * + * Ensure reorder_objects is read after pd->lock is dropped so we see + * an increment from another task in padata_do_serial. Pairs with + * smp_mb__after_atomic in padata_do_serial. */ + smp_mb(); if (atomic_read(&pd->reorder_objects) && !(pinst->flags & PADATA_RESET)) mod_timer(&pd->timer, jiffies + HZ); @@ -387,6 +392,13 @@ void padata_do_serial(struct padata_priv list_add_tail(&padata->list, &pqueue->reorder.list); spin_unlock(&pqueue->reorder.lock);
+ /* + * Ensure the atomic_inc of reorder_objects above is ordered correctly + * with the trylock of pd->lock in padata_reorder. Pairs with smp_mb + * in padata_reorder. + */ + smp_mb__after_atomic(); + put_cpu();
/* If we're running on the wrong CPU, call padata_reorder() via a
From: Damien Le Moal damien.lemoal@wdc.com
commit 3b8cafdd5436f9298b3bf6eb831df5eef5ee82b6 upstream.
dm-zoned uses the zone flag DMZ_ACTIVE to indicate that a zone of the backend device is being actively read or written and so cannot be reclaimed. This flag is set as long as the zone atomic reference counter is not 0. When this atomic is decremented and reaches 0 (e.g. on BIO completion), the active flag is cleared and set again whenever the zone is reused and BIO issued with the atomic counter incremented. These 2 operations (atomic inc/dec and flag set/clear) are however not always executed atomically under the target metadata mutex lock and this causes the warning:
WARN_ON(!test_bit(DMZ_ACTIVE, &zone->flags));
in dmz_deactivate_zone() to be displayed. This problem is regularly triggered with xfstests generic/209, generic/300, generic/451 and xfs/077 with XFS being used as the file system on the dm-zoned target device. Similarly, xfstests ext4/303, ext4/304, generic/209 and generic/300 trigger the warning with ext4 use.
This problem can be easily fixed by simply removing the DMZ_ACTIVE flag and managing the "ACTIVE" state by directly looking at the reference counter value. To do so, the functions dmz_activate_zone() and dmz_deactivate_zone() are changed to inline functions respectively calling atomic_inc() and atomic_dec(), while the dmz_is_active() macro is changed to an inline function calling atomic_read().
Fixes: 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Reported-by: Masato Suzuki masato.suzuki@wdc.com Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-zoned-metadata.c | 24 ------------------------ drivers/md/dm-zoned.h | 28 ++++++++++++++++++++++++---- 2 files changed, 24 insertions(+), 28 deletions(-)
--- a/drivers/md/dm-zoned-metadata.c +++ b/drivers/md/dm-zoned-metadata.c @@ -1594,30 +1594,6 @@ struct dm_zone *dmz_get_zone_for_reclaim }
/* - * Activate a zone (increment its reference count). - */ -void dmz_activate_zone(struct dm_zone *zone) -{ - set_bit(DMZ_ACTIVE, &zone->flags); - atomic_inc(&zone->refcount); -} - -/* - * Deactivate a zone. This decrement the zone reference counter - * and clears the active state of the zone once the count reaches 0, - * indicating that all BIOs to the zone have completed. Returns - * true if the zone was deactivated. - */ -void dmz_deactivate_zone(struct dm_zone *zone) -{ - if (atomic_dec_and_test(&zone->refcount)) { - WARN_ON(!test_bit(DMZ_ACTIVE, &zone->flags)); - clear_bit_unlock(DMZ_ACTIVE, &zone->flags); - smp_mb__after_atomic(); - } -} - -/* * Get the zone mapping a chunk, if the chunk is mapped already. * If no mapping exist and the operation is WRITE, a zone is * allocated and used to map the chunk. --- a/drivers/md/dm-zoned.h +++ b/drivers/md/dm-zoned.h @@ -115,7 +115,6 @@ enum { DMZ_BUF,
/* Zone internal state */ - DMZ_ACTIVE, DMZ_RECLAIM, DMZ_SEQ_WRITE_ERR, }; @@ -128,7 +127,6 @@ enum { #define dmz_is_empty(z) ((z)->wp_block == 0) #define dmz_is_offline(z) test_bit(DMZ_OFFLINE, &(z)->flags) #define dmz_is_readonly(z) test_bit(DMZ_READ_ONLY, &(z)->flags) -#define dmz_is_active(z) test_bit(DMZ_ACTIVE, &(z)->flags) #define dmz_in_reclaim(z) test_bit(DMZ_RECLAIM, &(z)->flags) #define dmz_seq_write_err(z) test_bit(DMZ_SEQ_WRITE_ERR, &(z)->flags)
@@ -188,8 +186,30 @@ void dmz_unmap_zone(struct dmz_metadata unsigned int dmz_nr_rnd_zones(struct dmz_metadata *zmd); unsigned int dmz_nr_unmap_rnd_zones(struct dmz_metadata *zmd);
-void dmz_activate_zone(struct dm_zone *zone); -void dmz_deactivate_zone(struct dm_zone *zone); +/* + * Activate a zone (increment its reference count). + */ +static inline void dmz_activate_zone(struct dm_zone *zone) +{ + atomic_inc(&zone->refcount); +} + +/* + * Deactivate a zone. This decrement the zone reference counter + * indicating that all BIOs to the zone have completed when the count is 0. + */ +static inline void dmz_deactivate_zone(struct dm_zone *zone) +{ + atomic_dec(&zone->refcount); +} + +/* + * Test if a zone is active, that is, has a refcount > 0. + */ +static inline bool dmz_is_active(struct dm_zone *zone) +{ + return atomic_read(&zone->refcount); +}
int dmz_lock_zone_reclaim(struct dm_zone *zone); void dmz_unlock_zone_reclaim(struct dm_zone *zone);
From: Juergen Gross jgross@suse.com
commit bce5963bcb4f9934faa52be323994511d59fd13c upstream.
When binding an interdomain event channel to a vcpu via IOCTL_EVTCHN_BIND_INTERDOMAIN not only the event channel needs to be bound, but the affinity of the associated IRQi must be changed, too. Otherwise the IRQ and the event channel won't be moved to another vcpu in case the original vcpu they were bound to is going offline.
Cc: stable@vger.kernel.org # 4.13 Fixes: c48f64ab472389df ("xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online VCPU") Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/xen/events/events_base.c | 12 ++++++++++-- drivers/xen/evtchn.c | 2 +- include/xen/events.h | 3 ++- 3 files changed, 13 insertions(+), 4 deletions(-)
--- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -1294,7 +1294,7 @@ void rebind_evtchn_irq(int evtchn, int i }
/* Rebind an evtchn so that it gets delivered to a specific cpu */ -int xen_rebind_evtchn_to_cpu(int evtchn, unsigned tcpu) +static int xen_rebind_evtchn_to_cpu(int evtchn, unsigned int tcpu) { struct evtchn_bind_vcpu bind_vcpu; int masked; @@ -1328,7 +1328,6 @@ int xen_rebind_evtchn_to_cpu(int evtchn,
return 0; } -EXPORT_SYMBOL_GPL(xen_rebind_evtchn_to_cpu);
static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest, bool force) @@ -1342,6 +1341,15 @@ static int set_affinity_irq(struct irq_d return ret; }
+/* To be called with desc->lock held. */ +int xen_set_affinity_evtchn(struct irq_desc *desc, unsigned int tcpu) +{ + struct irq_data *d = irq_desc_get_irq_data(desc); + + return set_affinity_irq(d, cpumask_of(tcpu), false); +} +EXPORT_SYMBOL_GPL(xen_set_affinity_evtchn); + static void enable_dynirq(struct irq_data *data) { int evtchn = evtchn_from_irq(data->irq); --- a/drivers/xen/evtchn.c +++ b/drivers/xen/evtchn.c @@ -447,7 +447,7 @@ static void evtchn_bind_interdom_next_vc this_cpu_write(bind_last_selected_cpu, selected_cpu);
/* unmask expects irqs to be disabled */ - xen_rebind_evtchn_to_cpu(evtchn, selected_cpu); + xen_set_affinity_evtchn(desc, selected_cpu); raw_spin_unlock_irqrestore(&desc->lock, flags); }
--- a/include/xen/events.h +++ b/include/xen/events.h @@ -3,6 +3,7 @@ #define _XEN_EVENTS_H
#include <linux/interrupt.h> +#include <linux/irq.h> #ifdef CONFIG_PCI_MSI #include <linux/msi.h> #endif @@ -59,7 +60,7 @@ void evtchn_put(unsigned int evtchn);
void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector); void rebind_evtchn_irq(int evtchn, int irq); -int xen_rebind_evtchn_to_cpu(int evtchn, unsigned tcpu); +int xen_set_affinity_evtchn(struct irq_desc *desc, unsigned int tcpu);
static inline void notify_remote_via_evtchn(int port) {
From: YueHaibing yuehaibing@huawei.com
commit 80a316ff16276b36d0392a8f8b2f63259857ae98 upstream.
If xenbus_register_frontend() fails in p9_trans_xen_init, we should call v9fs_unregister_trans() to do cleanup.
Link: http://lkml.kernel.org/r/20190430143933.19368-1-yuehaibing@huawei.com Cc: stable@vger.kernel.org Fixes: 868eb122739a ("xen/9pfs: introduce Xen 9pfs transport driver") Signed-off-by: YueHaibing yuehaibing@huawei.com Signed-off-by: Dominique Martinet dominique.martinet@cea.fr Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/9p/trans_xen.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/net/9p/trans_xen.c +++ b/net/9p/trans_xen.c @@ -530,13 +530,19 @@ static struct xenbus_driver xen_9pfs_fro
static int p9_trans_xen_init(void) { + int rc; + if (!xen_domain()) return -ENODEV;
pr_info("Initialising Xen transport for 9pfs\n");
v9fs_register_trans(&p9_xen_trans); - return xenbus_register_frontend(&xen_9pfs_front_driver); + rc = xenbus_register_frontend(&xen_9pfs_front_driver); + if (rc) + v9fs_unregister_trans(&p9_xen_trans); + + return rc; } module_init(p9_trans_xen_init);
From: YueHaibing yuehaibing@huawei.com
commit d4548543fc4ece56c6f04b8586f435fb4fd84c20 upstream.
KASAN report this:
BUG: unable to handle kernel paging request at ffffffffa0097000 PGD 3870067 P4D 3870067 PUD 3871063 PMD 2326e2067 PTE 0 Oops: 0000 [#1 CPU: 0 PID: 5340 Comm: modprobe Not tainted 5.1.0-rc7+ #25 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:__list_add_valid+0x10/0x70 Code: c3 48 8b 06 55 48 89 e5 5d 48 39 07 0f 94 c0 0f b6 c0 c3 90 90 90 90 90 90 90 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 <48> 8b 32 48 39 f0 75 3a
RSP: 0018:ffffc90000e23c68 EFLAGS: 00010246 RAX: ffffffffa00ad000 RBX: ffffffffa009d000 RCX: 0000000000000000 RDX: ffffffffa0097000 RSI: ffffffffa0097000 RDI: ffffffffa009d000 RBP: ffffc90000e23c68 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0097000 R13: ffff888231797180 R14: 0000000000000000 R15: ffffc90000e23e78 FS: 00007fb215285540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa0097000 CR3: 000000022f144000 CR4: 00000000000006f0 Call Trace: v9fs_register_trans+0x2f/0x60 [9pnet ? 0xffffffffa0087000 p9_virtio_init+0x25/0x1000 [9pnet_virtio do_one_initcall+0x6c/0x3cc ? kmem_cache_alloc_trace+0x248/0x3b0 do_init_module+0x5b/0x1f1 load_module+0x1db1/0x2690 ? m_show+0x1d0/0x1d0 __do_sys_finit_module+0xc5/0xd0 __x64_sys_finit_module+0x15/0x20 do_syscall_64+0x6b/0x1d0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fb214d8e839 Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01
RSP: 002b:00007ffc96554278 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 000055e67eed2aa0 RCX: 00007fb214d8e839 RDX: 0000000000000000 RSI: 000055e67ce95c2e RDI: 0000000000000003 RBP: 000055e67ce95c2e R08: 0000000000000000 R09: 000055e67eed2aa0 R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000 R13: 000055e67eeda500 R14: 0000000000040000 R15: 000055e67eed2aa0 Modules linked in: 9pnet_virtio(+) 9pnet gre rfkill vmw_vsock_virtio_transport_common vsock [last unloaded: 9pnet_virtio CR2: ffffffffa0097000 ---[ end trace 4a52bb13ff07b761
If register_virtio_driver() fails in p9_virtio_init, we should call v9fs_unregister_trans() to do cleanup.
Link: http://lkml.kernel.org/r/20190430115942.41840-1-yuehaibing@huawei.com Cc: stable@vger.kernel.org Reported-by: Hulk Robot hulkci@huawei.com Fixes: b530cc794024 ("9p: add virtio transport") Signed-off-by: YueHaibing yuehaibing@huawei.com Signed-off-by: Dominique Martinet dominique.martinet@cea.fr Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/9p/trans_virtio.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/net/9p/trans_virtio.c +++ b/net/9p/trans_virtio.c @@ -767,10 +767,16 @@ static struct p9_trans_module p9_virtio_ /* The standard init function */ static int __init p9_virtio_init(void) { + int rc; + INIT_LIST_HEAD(&virtio_chan_list);
v9fs_register_trans(&p9_virtio_trans); - return register_virtio_driver(&p9_virtio_drv); + rc = register_virtio_driver(&p9_virtio_drv); + if (rc) + v9fs_unregister_trans(&p9_virtio_trans); + + return rc; }
static void __exit p9_virtio_cleanup(void)
From: Soeren Moch smoch@web.de
commit 41a531ffa4c5aeb062f892227c00fabb3b4a9c91 upstream.
Since commit ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler") the handler rt2x00usb_interrupt_rxdone() is not running with interrupts disabled anymore. So this completion handler is not guaranteed to run completely before workqueue processing starts for the same queue entry. Be sure to set all other flags in the entry correctly before marking this entry ready for workqueue processing. This way we cannot miss error conditions that need to be signalled from the completion handler to the worker thread. Note that rt2x00usb_work_rxdone() processes all available entries, not only such for which queue_work() was called.
This patch is similar to what commit df71c9cfceea ("rt2x00: fix order of entry flags modification") did for TX processing.
This fixes a regression on a RT5370 based wifi stick in AP mode, which suddenly stopped data transmission after some period of heavy load. Also stopping the hanging hostapd resulted in the error message "ieee80211 phy0: rt2x00queue_flush_queue: Warning - Queue 14 failed to flush". Other operation modes are probably affected as well, this just was the used testcase.
Fixes: ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler") Cc: stable@vger.kernel.org # 4.20+ Signed-off-by: Soeren Moch smoch@web.de Acked-by: Stanislaw Gruszka sgruszka@redhat.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/ralink/rt2x00/rt2x00usb.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c +++ b/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c @@ -372,15 +372,10 @@ static void rt2x00usb_interrupt_rxdone(s struct queue_entry *entry = (struct queue_entry *)urb->context; struct rt2x00_dev *rt2x00dev = entry->queue->rt2x00dev;
- if (!test_and_clear_bit(ENTRY_OWNER_DEVICE_DATA, &entry->flags)) + if (!test_bit(ENTRY_OWNER_DEVICE_DATA, &entry->flags)) return;
/* - * Report the frame as DMA done - */ - rt2x00lib_dmadone(entry); - - /* * Check if the received data is simply too small * to be actually valid, or if the urb is signaling * a problem. @@ -389,6 +384,11 @@ static void rt2x00usb_interrupt_rxdone(s set_bit(ENTRY_DATA_IO_FAILED, &entry->flags);
/* + * Report the frame as DMA done + */ + rt2x00lib_dmadone(entry); + + /* * Schedule the delayed work for reading the RX status * from the device. */
From: Dexuan Cui decui@microsoft.com
commit e320ab3cec7dd8b1606964d81ae1e14391ff8e96 upstream.
The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section 5.2.1 "GPA Overlay Pages" for the details) and here is an excerpt:
"The hypervisor defines several special pages that "overlay" the guest's Guest Physical Addresses (GPA) space. Overlays are addressed GPA but are not included in the normal GPA map maintained internally by the hypervisor. Conceptually, they exist in a separate map that overlays the GPA map.
If a page within the GPA space is overlaid, any SPA page mapped to the GPA page is effectively "obscured" and generally unreachable by the virtual processor through processor memory accesses.
If an overlay page is disabled, the underlying GPA page is "uncovered", and an existing mapping becomes accessible to the guest."
SPA = System Physical Address = the final real physical address.
When a CPU (e.g. CPU1) is onlined, hv_cpu_init() allocates the VP ASSIST PAGE and enables the EOI optimization for this CPU by writing the MSR HV_X64_MSR_VP_ASSIST_PAGE. From now on, hvp->apic_assist belongs to the special SPA page, and this CPU *always* uses hvp->apic_assist (which is shared with the hypervisor) to decide if it needs to write the EOI MSR.
When a CPU is offlined then on the outgoing CPU: 1. hv_cpu_die() disables the EOI optimizaton for this CPU, and from now on hvp->apic_assist belongs to the original "normal" SPA page; 2. the remaining work of stopping this CPU is done 3. this CPU is completely stopped.
Between 1 and 3, this CPU can still receive interrupts (e.g. reschedule IPIs from CPU0, and Local APIC timer interrupts), and this CPU *must* write the EOI MSR for every interrupt received, otherwise the hypervisor may not deliver further interrupts, which may be needed to completely stop the CPU.
So, after the EOI optimization is disabled in hv_cpu_die(), it's required that the hvp->apic_assist's bit0 is zero, which is not guaranteed by the current allocation mode because it lacks __GFP_ZERO. As a consequence the bit might be set and interrupt handling would not write the EOI MSR causing interrupt delivery to become stuck.
Add the missing __GFP_ZERO to the allocation.
Note 1: after the "normal" SPA page is allocted and zeroed out, neither the hypervisor nor the guest writes into the page, so the page remains with zeros.
Note 2: see Section 10.3.5 "EOI Assist" for the details of the EOI optimization. When the optimization is enabled, the guest can still write the EOI MSR register irrespective of the "No EOI required" value, but that's slower than the optimized assist based variant.
Fixes: ba696429d290 ("x86/hyper-v: Implement EOI assist") Signed-off-by: Dexuan Cui decui@microsoft.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/ <PU1P153MB0169B716A637FABF07433C04BFCB0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/hyperv/hv_init.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
--- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -111,8 +111,17 @@ static int hv_cpu_init(unsigned int cpu) if (!hv_vp_assist_page) return 0;
- if (!*hvp) - *hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL, PAGE_KERNEL); + /* + * The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section + * 5.2.1 "GPA Overlay Pages"). Here it must be zeroed out to make sure + * we always write the EOI MSR in hv_apic_eoi_write() *after* the + * EOI optimization is disabled in hv_cpu_die(), otherwise a CPU may + * not be stopped in the case of CPU offlining and the VM will hang. + */ + if (!*hvp) { + *hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO, + PAGE_KERNEL); + }
if (*hvp) { u64 val;
From: David Rientjes rientjes@google.com
commit e74bd96989dd42a51a73eddb4a5510a6f5e42ac3 upstream.
When default_get_smp_config() is called with early == 1 and mpf->feature1 is non-zero, mpf is leaked because the return path does not do early_memunmap().
Fix this and share a common exit routine.
Fixes: 5997efb96756 ("x86/boot: Use memremap() to map the MPF and MPC data") Reported-by: Cfir Cohen cfir@google.com Signed-off-by: David Rientjes rientjes@google.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1907091942570.28240@chino.kir.corp... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/mpparse.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
--- a/arch/x86/kernel/mpparse.c +++ b/arch/x86/kernel/mpparse.c @@ -546,17 +546,15 @@ void __init default_get_smp_config(unsig * local APIC has default address */ mp_lapic_addr = APIC_DEFAULT_PHYS_BASE; - return; + goto out; }
pr_info("Default MP configuration #%d\n", mpf->feature1); construct_default_ISA_mptable(mpf->feature1);
} else if (mpf->physptr) { - if (check_physptr(mpf, early)) { - early_memunmap(mpf, sizeof(*mpf)); - return; - } + if (check_physptr(mpf, early)) + goto out; } else BUG();
@@ -565,7 +563,7 @@ void __init default_get_smp_config(unsig /* * Only use the first configuration found. */ - +out: early_memunmap(mpf, sizeof(*mpf)); }
From: Kan Liang kan.liang@linux.intel.com
commit e4557c1a46b0d32746bd309e1941914b5a6912b4 upstream.
If a user first sample a PEBS event on a fixed counter, then sample a non-PEBS event on the same fixed counter on Icelake, it will trigger spurious NMI. For example:
perf record -e 'cycles:p' -a perf record -e 'cycles' -a
The error message for spurious NMI:
[June 21 15:38] Uhhuh. NMI received for unknown reason 30 on CPU 2. [ +0.000000] Do you have a strange power saving mode enabled? [ +0.000000] Dazed and confused, but trying to continue
The bug was introduced by the following commit:
commit 6f55967ad9d9 ("perf/x86/intel: Fix race in intel_pmu_disable_event()")
The commit moves the intel_pmu_pebs_disable() after intel_pmu_disable_fixed(), which returns immediately. The related bit of PEBS_ENABLE MSR will never be cleared for the fixed counter. Then a non-PEBS event runs on the fixed counter, but the bit on PEBS_ENABLE is still set, which triggers spurious NMIs.
Check and disable PEBS for fixed counters after intel_pmu_disable_fixed().
Reported-by: Yi, Ammy ammy.yi@intel.com Signed-off-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Jiri Olsa jolsa@kernel.org Cc: stable@vger.kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Fixes: 6f55967ad9d9 ("perf/x86/intel: Fix race in intel_pmu_disable_event()") Link: https://lkml.kernel.org/r/20190625142135.22112-1-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/events/intel/core.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
--- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -2161,12 +2161,10 @@ static void intel_pmu_disable_event(stru cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx); cpuc->intel_cp_status &= ~(1ull << hwc->idx);
- if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) { + if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) intel_pmu_disable_fixed(hwc); - return; - } - - x86_pmu_disable_event(event); + else + x86_pmu_disable_event(event);
/* * Needs to be called after x86_pmu_disable_event,
From: Kim Phillips kim.phillips@amd.com
commit 16f4641166b10e199f0d7b68c2c5f004fef0bda3 upstream.
The following commit:
d7cbbe49a930 ("perf/x86/amd/uncore: Set ThreadMask and SliceMask for L3 Cache perf events")
enables L3 PMC events for all threads and slices by writing 1's in 'ChL3PmcCfg' (L3 PMC PERF_CTL) register fields.
Those bitfields overlap with high order event select bits in the Data Fabric PMC control register, however.
So when a user requests raw Data Fabric events (-e amd_df/event=0xYYY/), the two highest order bits get inadvertently set, changing the counter select to events that don't exist, and for which no counts are read.
This patch changes the logic to write the L3 masks only when dealing with L3 PMC counters.
AMD Family 16h and below Northbridge (NB) counters were not affected.
Signed-off-by: Kim Phillips kim.phillips@amd.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@vger.kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: Gary Hook Gary.Hook@amd.com Cc: H. Peter Anvin hpa@zytor.com Cc: Janakarajan Natarajan Janakarajan.Natarajan@amd.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Martin Liska mliska@suse.cz Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Pu Wen puwen@hygon.cn Cc: Stephane Eranian eranian@google.com Cc: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Fixes: d7cbbe49a930 ("perf/x86/amd/uncore: Set ThreadMask and SliceMask for L3 Cache perf events") Link: https://lkml.kernel.org/r/20190628215906.4276-1-kim.phillips@amd.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/events/amd/uncore.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/events/amd/uncore.c +++ b/arch/x86/events/amd/uncore.c @@ -206,7 +206,7 @@ static int amd_uncore_event_init(struct * SliceMask and ThreadMask need to be set for certain L3 events in * Family 17h. For other events, the two fields do not affect the count. */ - if (l3_mask) + if (l3_mask && is_llc_event(event)) hwc->config |= (AMD64_L3_SLICE_MASK | AMD64_L3_THREAD_MASK);
if (event->cpu < 0)
From: Kim Phillips kim.phillips@amd.com
commit 2f217d58a8a086d3399fecce39fb358848e799c4 upstream.
Fill in the L3 performance event select register ThreadMask bitfield, to enable per hardware thread accounting.
Signed-off-by: Kim Phillips kim.phillips@amd.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@vger.kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: Gary Hook Gary.Hook@amd.com Cc: H. Peter Anvin hpa@zytor.com Cc: Janakarajan Natarajan Janakarajan.Natarajan@amd.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Martin Liska mliska@suse.cz Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Pu Wen puwen@hygon.cn Cc: Stephane Eranian eranian@google.com Cc: Suravee Suthikulpanit Suravee.Suthikulpanit@amd.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Link: https://lkml.kernel.org/r/20190628215906.4276-2-kim.phillips@amd.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/events/amd/uncore.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
--- a/arch/x86/events/amd/uncore.c +++ b/arch/x86/events/amd/uncore.c @@ -202,15 +202,22 @@ static int amd_uncore_event_init(struct hwc->config = event->attr.config & AMD64_RAW_EVENT_MASK_NB; hwc->idx = -1;
+ if (event->cpu < 0) + return -EINVAL; + /* * SliceMask and ThreadMask need to be set for certain L3 events in * Family 17h. For other events, the two fields do not affect the count. */ - if (l3_mask && is_llc_event(event)) - hwc->config |= (AMD64_L3_SLICE_MASK | AMD64_L3_THREAD_MASK); + if (l3_mask && is_llc_event(event)) { + int thread = 2 * (cpu_data(event->cpu).cpu_core_id % 4);
- if (event->cpu < 0) - return -EINVAL; + if (smp_num_siblings > 1) + thread += cpu_data(event->cpu).apicid & 1; + + hwc->config |= (1ULL << (AMD64_L3_THREAD_SHIFT + thread) & + AMD64_L3_THREAD_MASK) | AMD64_L3_SLICE_MASK; + }
uncore = event_to_amd_uncore(event); if (!uncore)
From: Eiichi Tsukata devel@etsukata.com
commit cbf5b73d162b22e044fe0b7d51dcaa33be065253 upstream.
arch_stack_walk_user() checks `if (fp == frame.next_fp)` to prevent a infinite loop by self reference but it's not enogh for circular reference.
Once a lack of return address is found, there is no point to continue the loop, so break out.
Fixes: 02b67518e2b1 ("tracing: add support for userspace stacktraces in tracing/iter_ctrl") Signed-off-by: Eiichi Tsukata devel@etsukata.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Linus Torvalds torvalds@linux-foundation.org Link: https://lkml.kernel.org/r/20190711023501.963-1-devel@etsukata.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/stacktrace.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
--- a/arch/x86/kernel/stacktrace.c +++ b/arch/x86/kernel/stacktrace.c @@ -129,11 +129,9 @@ void arch_stack_walk_user(stack_trace_co break; if ((unsigned long)fp < regs->sp) break; - if (frame.ret_addr) { - if (!consume_entry(cookie, frame.ret_addr, false)) - return; - } - if (fp == frame.next_fp) + if (!frame.ret_addr) + break; + if (!consume_entry(cookie, frame.ret_addr, false)) break; fp = frame.next_fp; }
From: Andres Rodriguez andresx7@gmail.com
commit e28ad544f462231d3fd081a7316339359efbb481 upstream.
DisplayID blocks allow embedding of CEA blocks. The payloads are identical to traditional top level CEA extension blocks, but the header is slightly different.
This change allows the CEA parser to find a CEA block inside a DisplayID block. Additionally, it adds support for parsing the embedded CTA header. No further changes are necessary due to payload parity.
This change fixes audio support for the Valve Index HMD.
Signed-off-by: Andres Rodriguez andresx7@gmail.com Reviewed-by: Dave Airlie airlied@redhat.com Cc: Jani Nikula jani.nikula@linux.intel.com Cc: stable@vger.kernel.org # v4.15 Signed-off-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20190619180901.17901-1-andresx... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/drm_edid.c | 81 ++++++++++++++++++++++++++++++++++++++------ include/drm/drm_displayid.h | 10 +++++ 2 files changed, 80 insertions(+), 11 deletions(-)
--- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -1339,6 +1339,7 @@ MODULE_PARM_DESC(edid_fixup,
static void drm_get_displayid(struct drm_connector *connector, struct edid *edid); +static int validate_displayid(u8 *displayid, int length, int idx);
static int drm_edid_block_checksum(const u8 *raw_edid) { @@ -2922,16 +2923,46 @@ static u8 *drm_find_edid_extension(const return edid_ext; }
-static u8 *drm_find_cea_extension(const struct edid *edid) -{ - return drm_find_edid_extension(edid, CEA_EXT); -}
static u8 *drm_find_displayid_extension(const struct edid *edid) { return drm_find_edid_extension(edid, DISPLAYID_EXT); }
+static u8 *drm_find_cea_extension(const struct edid *edid) +{ + int ret; + int idx = 1; + int length = EDID_LENGTH; + struct displayid_block *block; + u8 *cea; + u8 *displayid; + + /* Look for a top level CEA extension block */ + cea = drm_find_edid_extension(edid, CEA_EXT); + if (cea) + return cea; + + /* CEA blocks can also be found embedded in a DisplayID block */ + displayid = drm_find_displayid_extension(edid); + if (!displayid) + return NULL; + + ret = validate_displayid(displayid, length, idx); + if (ret) + return NULL; + + idx += sizeof(struct displayid_hdr); + for_each_displayid_db(displayid, block, idx, length) { + if (block->tag == DATA_BLOCK_CTA) { + cea = (u8 *)block; + break; + } + } + + return cea; +} + /* * Calculate the alternate clock for the CEA mode * (60Hz vs. 59.94Hz etc.) @@ -3655,13 +3686,38 @@ cea_revision(const u8 *cea) static int cea_db_offsets(const u8 *cea, int *start, int *end) { - /* Data block offset in CEA extension block */ - *start = 4; - *end = cea[2]; - if (*end == 0) - *end = 127; - if (*end < 4 || *end > 127) - return -ERANGE; + /* DisplayID CTA extension blocks and top-level CEA EDID + * block header definitions differ in the following bytes: + * 1) Byte 2 of the header specifies length differently, + * 2) Byte 3 is only present in the CEA top level block. + * + * The different definitions for byte 2 follow. + * + * DisplayID CTA extension block defines byte 2 as: + * Number of payload bytes + * + * CEA EDID block defines byte 2 as: + * Byte number (decimal) within this block where the 18-byte + * DTDs begin. If no non-DTD data is present in this extension + * block, the value should be set to 04h (the byte after next). + * If set to 00h, there are no DTDs present in this block and + * no non-DTD data. + */ + if (cea[0] == DATA_BLOCK_CTA) { + *start = 3; + *end = *start + cea[2]; + } else if (cea[0] == CEA_EXT) { + /* Data block offset in CEA extension block */ + *start = 4; + *end = cea[2]; + if (*end == 0) + *end = 127; + if (*end < 4 || *end > 127) + return -ERANGE; + } else { + return -ENOTSUPP; + } + return 0; }
@@ -5279,6 +5335,9 @@ static int drm_parse_display_id(struct d case DATA_BLOCK_TYPE_1_DETAILED_TIMING: /* handled in mode gathering code. */ break; + case DATA_BLOCK_CTA: + /* handled in the cea parser code. */ + break; default: DRM_DEBUG_KMS("found DisplayID tag 0x%x, unhandled\n", block->tag); break; --- a/include/drm/drm_displayid.h +++ b/include/drm/drm_displayid.h @@ -40,6 +40,7 @@ #define DATA_BLOCK_DISPLAY_INTERFACE 0x0f #define DATA_BLOCK_STEREO_DISPLAY_INTERFACE 0x10 #define DATA_BLOCK_TILED_DISPLAY 0x12 +#define DATA_BLOCK_CTA 0x81
#define DATA_BLOCK_VENDOR_SPECIFIC 0x7f
@@ -90,4 +91,13 @@ struct displayid_detailed_timing_block { struct displayid_block base; struct displayid_detailed_timings_1 timings[0]; }; + +#define for_each_displayid_db(displayid, block, idx, length) \ + for ((block) = (struct displayid_block *)&(displayid)[idx]; \ + (idx) + sizeof(struct displayid_block) <= (length) && \ + (idx) + sizeof(struct displayid_block) + (block)->num_bytes <= (length) && \ + (block)->num_bytes > 0; \ + (idx) += (block)->num_bytes + sizeof(struct displayid_block), \ + (block) = (struct displayid_block *)&(displayid)[idx]) + #endif
From: Damien Le Moal damien.lemoal@wdc.com
commit b4c5875d36178e8df409bdce232f270cac89fafe upstream.
To allow the SCSI subsystem scsi_execute_req() function to issue requests using large buffers that are better allocated with vmalloc() rather than kmalloc(), modify bio_map_kern() to allow passing a buffer allocated with vmalloc().
To do so, detect vmalloc-ed buffers using is_vmalloc_addr(). For vmalloc-ed buffers, flush the buffer using flush_kernel_vmap_range(), use vmalloc_to_page() instead of virt_to_page() to obtain the pages of the buffer, and invalidate the buffer addresses with invalidate_kernel_vmap_range() on completion of read BIOs. This last point is executed using the function bio_invalidate_vmalloc_pages() which is defined only if the architecture defines ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE, that is, if the architecture actually needs the invalidation done.
Fixes: 515ce6061312 ("scsi: sd_zbc: Fix sd_zbc_report_zones() buffer allocation") Fixes: e76239a3748c ("block: add a report_zones method") Cc: stable@vger.kernel.org Reviewed-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com Reviewed-by: Ming Lei ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- block/bio.c | 28 +++++++++++++++++++++++++++- 1 file changed, 27 insertions(+), 1 deletion(-)
--- a/block/bio.c +++ b/block/bio.c @@ -16,6 +16,7 @@ #include <linux/workqueue.h> #include <linux/cgroup.h> #include <linux/blk-cgroup.h> +#include <linux/highmem.h>
#include <trace/events/block.h> #include "blk.h" @@ -1479,8 +1480,22 @@ void bio_unmap_user(struct bio *bio) bio_put(bio); }
+static void bio_invalidate_vmalloc_pages(struct bio *bio) +{ +#ifdef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE + if (bio->bi_private && !op_is_write(bio_op(bio))) { + unsigned long i, len = 0; + + for (i = 0; i < bio->bi_vcnt; i++) + len += bio->bi_io_vec[i].bv_len; + invalidate_kernel_vmap_range(bio->bi_private, len); + } +#endif +} + static void bio_map_kern_endio(struct bio *bio) { + bio_invalidate_vmalloc_pages(bio); bio_put(bio); }
@@ -1501,6 +1516,8 @@ struct bio *bio_map_kern(struct request_ unsigned long end = (kaddr + len + PAGE_SIZE - 1) >> PAGE_SHIFT; unsigned long start = kaddr >> PAGE_SHIFT; const int nr_pages = end - start; + bool is_vmalloc = is_vmalloc_addr(data); + struct page *page; int offset, i; struct bio *bio;
@@ -1508,6 +1525,11 @@ struct bio *bio_map_kern(struct request_ if (!bio) return ERR_PTR(-ENOMEM);
+ if (is_vmalloc) { + flush_kernel_vmap_range(data, len); + bio->bi_private = data; + } + offset = offset_in_page(kaddr); for (i = 0; i < nr_pages; i++) { unsigned int bytes = PAGE_SIZE - offset; @@ -1518,7 +1540,11 @@ struct bio *bio_map_kern(struct request_ if (bytes > len) bytes = len;
- if (bio_add_pc_page(q, bio, virt_to_page(data), bytes, + if (!is_vmalloc) + page = virt_to_page(data); + else + page = vmalloc_to_page(data); + if (bio_add_pc_page(q, bio, page, bytes, offset) < bytes) { /* we don't support partial mappings */ bio_put(bio);
From: Damien Le Moal damien.lemoal@wdc.com
commit 113ab72ed4794c193509a97d7c6d32a6886e1682 upstream.
For large values of the number of zones reported and/or large zone sizes, the sector increment calculated with
blk_queue_zone_sectors(q) * n
in blk_report_zones() loop can overflow the unsigned int type used for the calculation as both "n" and blk_queue_zone_sectors() value are unsigned int. E.g. for a device with 256 MB zones (524288 sectors), overflow happens with 8192 or more zones reported.
Changing the return type of blk_queue_zone_sectors() to sector_t, fixes this problem and avoids overflow problem for all other callers of this helper too. The same change is also applied to the bdev_zone_sectors() helper.
Fixes: e76239a3748c ("block: add a report_zones method") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- block/blk-zoned.c | 2 +- include/linux/blkdev.h | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-)
--- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -70,7 +70,7 @@ EXPORT_SYMBOL_GPL(__blk_req_zone_write_u static inline unsigned int __blkdev_nr_zones(struct request_queue *q, sector_t nr_sectors) { - unsigned long zone_sectors = blk_queue_zone_sectors(q); + sector_t zone_sectors = blk_queue_zone_sectors(q);
return (nr_sectors + zone_sectors - 1) >> ilog2(zone_sectors); } --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -681,7 +681,7 @@ static inline bool blk_queue_is_zoned(st } }
-static inline unsigned int blk_queue_zone_sectors(struct request_queue *q) +static inline sector_t blk_queue_zone_sectors(struct request_queue *q) { return blk_queue_is_zoned(q) ? q->limits.chunk_sectors : 0; } @@ -1429,7 +1429,7 @@ static inline bool bdev_is_zoned(struct return false; }
-static inline unsigned int bdev_zone_sectors(struct block_device *bdev) +static inline sector_t bdev_zone_sectors(struct block_device *bdev) { struct request_queue *q = bdev_get_queue(bdev);
From: Bart Van Assche bvanassche@acm.org
commit bcef5b7215681250c4bf8961dfe15e9e4fef97d0 upstream.
The function srp_parse_in() is used both for parsing source address specifications and for target address specifications. Target addresses must have a port number. Having to specify a port number for source addresses is inconvenient. Make sure that srp_parse_in() supports again parsing addresses with no port number.
Cc: stable@vger.kernel.org Fixes: c62adb7def71 ("IB/srp: Fix IPv6 address parsing") Signed-off-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/ulp/srp/ib_srp.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-)
--- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -3483,13 +3483,14 @@ static const match_table_t srp_opt_token * @net: [in] Network namespace. * @sa: [out] Address family, IP address and port number. * @addr_port_str: [in] IP address and port number. + * @has_port: [out] Whether or not @addr_port_str includes a port number. * * Parse the following address formats: * - IPv4: <ip_address>:<port>, e.g. 1.2.3.4:5. * - IPv6: [<ipv6_address>]:<port>, e.g. [1::2:3%4]:5. */ static int srp_parse_in(struct net *net, struct sockaddr_storage *sa, - const char *addr_port_str) + const char *addr_port_str, bool *has_port) { char *addr_end, *addr = kstrdup(addr_port_str, GFP_KERNEL); char *port_str; @@ -3498,9 +3499,12 @@ static int srp_parse_in(struct net *net, if (!addr) return -ENOMEM; port_str = strrchr(addr, ':'); - if (!port_str) - return -EINVAL; - *port_str++ = '\0'; + if (port_str && strchr(port_str, ']')) + port_str = NULL; + if (port_str) + *port_str++ = '\0'; + if (has_port) + *has_port = port_str != NULL; ret = inet_pton_with_scope(net, AF_INET, addr, port_str, sa); if (ret && addr[0]) { addr_end = addr + strlen(addr) - 1; @@ -3522,6 +3526,7 @@ static int srp_parse_options(struct net char *p; substring_t args[MAX_OPT_ARGS]; unsigned long long ull; + bool has_port; int opt_mask = 0; int token; int ret = -EINVAL; @@ -3620,7 +3625,8 @@ static int srp_parse_options(struct net ret = -ENOMEM; goto out; } - ret = srp_parse_in(net, &target->rdma_cm.src.ss, p); + ret = srp_parse_in(net, &target->rdma_cm.src.ss, p, + NULL); if (ret < 0) { pr_warn("bad source parameter '%s'\n", p); kfree(p); @@ -3636,7 +3642,10 @@ static int srp_parse_options(struct net ret = -ENOMEM; goto out; } - ret = srp_parse_in(net, &target->rdma_cm.dst.ss, p); + ret = srp_parse_in(net, &target->rdma_cm.dst.ss, p, + &has_port); + if (!has_port) + ret = -EINVAL; if (ret < 0) { pr_warn("bad dest parameter '%s'\n", p); kfree(p);
From: Jason Gunthorpe jgg@mellanox.com
commit 7608bf40cf2480057ec0da31456cc428791c32ef upstream.
If invalidate_start returns with EAGAIN then the umem_rwsem needs to be unlocked as no invalidate_end will be called.
Cc: stable@vger.kernel.org Fixes: ca748c39ea3f ("RDMA/umem: Get rid of per_mm->notifier_count") Signed-off-by: Jason Gunthorpe jgg@mellanox.com Reviewed-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Doug Ledford dledford@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/core/umem_odp.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
--- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -151,6 +151,7 @@ static int ib_umem_notifier_invalidate_r { struct ib_ucontext_per_mm *per_mm = container_of(mn, struct ib_ucontext_per_mm, mn); + int rc;
if (mmu_notifier_range_blockable(range)) down_read(&per_mm->umem_rwsem); @@ -167,11 +168,14 @@ static int ib_umem_notifier_invalidate_r return 0; }
- return rbt_ib_umem_for_each_in_range(&per_mm->umem_tree, range->start, - range->end, - invalidate_range_start_trampoline, - mmu_notifier_range_blockable(range), - NULL); + rc = rbt_ib_umem_for_each_in_range(&per_mm->umem_tree, range->start, + range->end, + invalidate_range_start_trampoline, + mmu_notifier_range_blockable(range), + NULL); + if (rc) + up_read(&per_mm->umem_rwsem); + return rc; }
static int invalidate_range_end_trampoline(struct ib_umem_odp *item, u64 start,
From: Alexander Shishkin alexander.shishkin@linux.intel.com
commit 4aa5aed2b6f267592705a526f57518a5d715b769 upstream.
This adds Ice Lake NNPI support to the Intel(R) Trace Hub.
Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20190621161930.60785-5-alexander.shishkin@linux.in... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwtracing/intel_th/pci.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/hwtracing/intel_th/pci.c +++ b/drivers/hwtracing/intel_th/pci.c @@ -194,6 +194,11 @@ static const struct pci_device_id intel_ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x02a6), .driver_data = (kernel_ulong_t)&intel_th_2x, }, + { + /* Ice Lake NNPI */ + PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x45c5), + .driver_data = (kernel_ulong_t)&intel_th_2x, + }, { 0 }, };
From: Dexuan Cui decui@microsoft.com
commit 4df591b20b80cb77920953812d894db259d85bd7 upstream.
Fix a use-after-free in hv_eject_device_work().
Fixes: 05f151a73ec2 ("PCI: hv: Fix a memory leak in hv_eject_device_work()") Signed-off-by: Dexuan Cui decui@microsoft.com Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Michael Kelley mikelley@microsoft.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/controller/pci-hyperv.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
--- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -1875,6 +1875,7 @@ static void hv_pci_devices_present(struc static void hv_eject_device_work(struct work_struct *work) { struct pci_eject_response *ejct_pkt; + struct hv_pcibus_device *hbus; struct hv_pci_dev *hpdev; struct pci_dev *pdev; unsigned long flags; @@ -1885,6 +1886,7 @@ static void hv_eject_device_work(struct } ctxt;
hpdev = container_of(work, struct hv_pci_dev, wrk); + hbus = hpdev->hbus;
WARN_ON(hpdev->state != hv_pcichild_ejecting);
@@ -1895,8 +1897,7 @@ static void hv_eject_device_work(struct * because hbus->pci_bus may not exist yet. */ wslot = wslot_to_devfn(hpdev->desc.win_slot.slot); - pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, 0, - wslot); + pdev = pci_get_domain_bus_and_slot(hbus->sysdata.domain, 0, wslot); if (pdev) { pci_lock_rescan_remove(); pci_stop_and_remove_bus_device(pdev); @@ -1904,9 +1905,9 @@ static void hv_eject_device_work(struct pci_unlock_rescan_remove(); }
- spin_lock_irqsave(&hpdev->hbus->device_list_lock, flags); + spin_lock_irqsave(&hbus->device_list_lock, flags); list_del(&hpdev->list_entry); - spin_unlock_irqrestore(&hpdev->hbus->device_list_lock, flags); + spin_unlock_irqrestore(&hbus->device_list_lock, flags);
if (hpdev->pci_slot) pci_destroy_slot(hpdev->pci_slot); @@ -1915,7 +1916,7 @@ static void hv_eject_device_work(struct ejct_pkt = (struct pci_eject_response *)&ctxt.pkt.message; ejct_pkt->message_type.type = PCI_EJECTION_COMPLETE; ejct_pkt->wslot.slot = hpdev->desc.win_slot.slot; - vmbus_sendpacket(hpdev->hbus->hdev->channel, ejct_pkt, + vmbus_sendpacket(hbus->hdev->channel, ejct_pkt, sizeof(*ejct_pkt), (unsigned long)&ctxt.pkt, VM_PKT_DATA_INBAND, 0);
@@ -1924,7 +1925,9 @@ static void hv_eject_device_work(struct /* For the two refs got in new_pcichild_device() */ put_pcichild(hpdev); put_pcichild(hpdev); - put_hvpcibus(hpdev->hbus); + /* hpdev has been freed. Do not use it any more. */ + + put_hvpcibus(hbus); }
/**
From: Mika Westerberg mika.westerberg@linux.intel.com
commit 000dd5316e1c756a1c028f22e01d06a38249dd4d upstream.
PME polling does not take into account that a device that is directly connected to the host bridge may go into D3cold as well. This leads to a situation where the PME poll thread reads from a config space of a device that is in D3cold and gets incorrect information because the config space is not accessible.
Here is an example from Intel Ice Lake system where two PCIe root ports are in D3cold (I've instrumented the kernel to log the PMCSR register contents):
[ 62.971442] pcieport 0000:00:07.1: Check PME status, PMCSR=0xffff [ 62.971504] pcieport 0000:00:07.0: Check PME status, PMCSR=0xffff
Since 0xffff is interpreted so that PME is pending, the root ports will be runtime resumed. This repeats over and over again essentially blocking all runtime power management.
Prevent this from happening by checking whether the device is in D3cold before its PME status is read.
Fixes: 71a83bd727cc ("PCI/PM: add runtime PM support to PCIe port") Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Reviewed-by: Lukas Wunner lukas@wunner.de Cc: 3.6+ stable@vger.kernel.org # v3.6+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/pci.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -2060,6 +2060,13 @@ static void pci_pme_list_scan(struct wor */ if (bridge && bridge->current_state != PCI_D0) continue; + /* + * If the device is in D3cold it should not be + * polled either. + */ + if (pme_dev->dev->current_state == PCI_D3cold) + continue; + pci_pme_wakeup(pme_dev->dev, NULL); } else { list_del(&pme_dev->list);
From: Niklas Cassel niklas.cassel@linaro.org
commit 64adde31c8e996a6db6f7a1a4131180e363aa9f2 upstream.
Currently, there is only a 1 ms sleep after asserting PERST.
Reading the datasheets for different endpoints, some require PERST to be asserted for 10 ms in order for the endpoint to perform a reset, others require it to be asserted for 50 ms.
Several SoCs using this driver uses PCIe Mini Card, where we don't know what endpoint will be plugged in.
The PCI Express Card Electromechanical Specification r2.0, section 2.2, "PERST# Signal" specifies:
"On power up, the deassertion of PERST# is delayed 100 ms (TPVPERL) from the power rails achieving specified operating limits."
Add a sleep of 100 ms before deasserting PERST, in order to ensure that we are compliant with the spec.
Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver") Signed-off-by: Niklas Cassel niklas.cassel@linaro.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Acked-by: Stanimir Varbanov svarbanov@mm-sol.com Cc: stable@vger.kernel.org # 4.5+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/controller/dwc/pcie-qcom.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -178,6 +178,8 @@ static void qcom_ep_reset_assert(struct
static void qcom_ep_reset_deassert(struct qcom_pcie *pcie) { + /* Ensure that PERST has been asserted for at least 100 ms */ + msleep(100); gpiod_set_value_cansleep(pcie->reset, 0); usleep_range(PERST_DELAY_US, PERST_DELAY_US + 500); }
From: Johannes Thumshirn jthumshirn@suse.de
commit aa53e3bfac7205fb3a8815ac1c937fd6ed01b41e upstream.
Nikolay reported the following KASAN splat when running btrfs/048:
[ 1843.470920] ================================================================== [ 1843.471971] BUG: KASAN: slab-out-of-bounds in strncmp+0x66/0xb0 [ 1843.472775] Read of size 1 at addr ffff888111e369e2 by task btrfs/3979
[ 1843.473904] CPU: 3 PID: 3979 Comm: btrfs Not tainted 5.2.0-rc3-default #536 [ 1843.475009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 1843.476322] Call Trace: [ 1843.476674] dump_stack+0x7c/0xbb [ 1843.477132] ? strncmp+0x66/0xb0 [ 1843.477587] print_address_description+0x114/0x320 [ 1843.478256] ? strncmp+0x66/0xb0 [ 1843.478740] ? strncmp+0x66/0xb0 [ 1843.479185] __kasan_report+0x14e/0x192 [ 1843.479759] ? strncmp+0x66/0xb0 [ 1843.480209] kasan_report+0xe/0x20 [ 1843.480679] strncmp+0x66/0xb0 [ 1843.481105] prop_compression_validate+0x24/0x70 [ 1843.481798] btrfs_xattr_handler_set_prop+0x65/0x160 [ 1843.482509] __vfs_setxattr+0x71/0x90 [ 1843.483012] __vfs_setxattr_noperm+0x84/0x130 [ 1843.483606] vfs_setxattr+0xac/0xb0 [ 1843.484085] setxattr+0x18c/0x230 [ 1843.484546] ? vfs_setxattr+0xb0/0xb0 [ 1843.485048] ? __mod_node_page_state+0x1f/0xa0 [ 1843.485672] ? _raw_spin_unlock+0x24/0x40 [ 1843.486233] ? __handle_mm_fault+0x988/0x1290 [ 1843.486823] ? lock_acquire+0xb4/0x1e0 [ 1843.487330] ? lock_acquire+0xb4/0x1e0 [ 1843.487842] ? mnt_want_write_file+0x3c/0x80 [ 1843.488442] ? debug_lockdep_rcu_enabled+0x22/0x40 [ 1843.489089] ? rcu_sync_lockdep_assert+0xe/0x70 [ 1843.489707] ? __sb_start_write+0x158/0x200 [ 1843.490278] ? mnt_want_write_file+0x3c/0x80 [ 1843.490855] ? __mnt_want_write+0x98/0xe0 [ 1843.491397] __x64_sys_fsetxattr+0xba/0xe0 [ 1843.492201] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 1843.493201] do_syscall_64+0x6c/0x230 [ 1843.493988] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 1843.495041] RIP: 0033:0x7fa7a8a7707a [ 1843.495819] Code: 48 8b 0d 21 de 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 be 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ee dd 2b 00 f7 d8 64 89 01 48 [ 1843.499203] RSP: 002b:00007ffcb73bca38 EFLAGS: 00000202 ORIG_RAX: 00000000000000be [ 1843.500210] RAX: ffffffffffffffda RBX: 00007ffcb73bda9d RCX: 00007fa7a8a7707a [ 1843.501170] RDX: 00007ffcb73bda9d RSI: 00000000006dc050 RDI: 0000000000000003 [ 1843.502152] RBP: 00000000006dc050 R08: 0000000000000000 R09: 0000000000000000 [ 1843.503109] R10: 0000000000000002 R11: 0000000000000202 R12: 00007ffcb73bda91 [ 1843.504055] R13: 0000000000000003 R14: 00007ffcb73bda82 R15: ffffffffffffffff
[ 1843.505268] Allocated by task 3979: [ 1843.505771] save_stack+0x19/0x80 [ 1843.506211] __kasan_kmalloc.constprop.5+0xa0/0xd0 [ 1843.506836] setxattr+0xeb/0x230 [ 1843.507264] __x64_sys_fsetxattr+0xba/0xe0 [ 1843.507886] do_syscall_64+0x6c/0x230 [ 1843.508429] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1843.509558] Freed by task 0: [ 1843.510188] (stack is not available)
[ 1843.511309] The buggy address belongs to the object at ffff888111e369e0 which belongs to the cache kmalloc-8 of size 8 [ 1843.514095] The buggy address is located 2 bytes inside of 8-byte region [ffff888111e369e0, ffff888111e369e8) [ 1843.516524] The buggy address belongs to the page: [ 1843.517561] page:ffff88813f478d80 refcount:1 mapcount:0 mapping:ffff88811940c300 index:0xffff888111e373b8 compound_mapcount: 0 [ 1843.519993] flags: 0x4404000010200(slab|head) [ 1843.520951] raw: 0004404000010200 ffff88813f48b008 ffff888119403d50 ffff88811940c300 [ 1843.522616] raw: ffff888111e373b8 000000000016000f 00000001ffffffff 0000000000000000 [ 1843.524281] page dumped because: kasan: bad access detected
[ 1843.525936] Memory state around the buggy address: [ 1843.526975] ffff888111e36880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 1843.528479] ffff888111e36900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 1843.530138] >ffff888111e36980: fc fc fc fc fc fc fc fc fc fc fc fc 02 fc fc fc [ 1843.531877] ^ [ 1843.533287] ffff888111e36a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 1843.534874] ffff888111e36a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 1843.536468] ==================================================================
This is caused by supplying a too short compression value ('lz') in the test-case and comparing it to 'lzo' with strncmp() and a length of 3. strncmp() read past the 'lz' when looking for the 'o' and thus caused an out-of-bounds read.
Introduce a new check 'btrfs_compress_is_valid_type()' which not only checks the user-supplied value against known compression types, but also employs checks for too short values.
Reported-by: Nikolay Borisov nborisov@suse.com Fixes: 272e5326c783 ("btrfs: prop: fix vanished compression property after failed set") CC: stable@vger.kernel.org # 5.1+ Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Johannes Thumshirn jthumshirn@suse.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/compression.c | 16 ++++++++++++++++ fs/btrfs/compression.h | 1 + fs/btrfs/props.c | 6 +----- 3 files changed, 18 insertions(+), 5 deletions(-)
--- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -42,6 +42,22 @@ const char* btrfs_compress_type2str(enum return NULL; }
+bool btrfs_compress_is_valid_type(const char *str, size_t len) +{ + int i; + + for (i = 1; i < ARRAY_SIZE(btrfs_compress_types); i++) { + size_t comp_len = strlen(btrfs_compress_types[i]); + + if (len < comp_len) + continue; + + if (!strncmp(btrfs_compress_types[i], str, comp_len)) + return true; + } + return false; +} + static int btrfs_decompress_bio(struct compressed_bio *cb);
static inline int compressed_bio_size(struct btrfs_fs_info *fs_info, --- a/fs/btrfs/compression.h +++ b/fs/btrfs/compression.h @@ -173,6 +173,7 @@ extern const struct btrfs_compress_op bt extern const struct btrfs_compress_op btrfs_zstd_compress;
const char* btrfs_compress_type2str(enum btrfs_compression_type type); +bool btrfs_compress_is_valid_type(const char *str, size_t len);
int btrfs_compress_heuristic(struct inode *inode, u64 start, u64 end);
--- a/fs/btrfs/props.c +++ b/fs/btrfs/props.c @@ -257,11 +257,7 @@ static int prop_compression_validate(con if (!value) return 0;
- if (!strncmp("lzo", value, 3)) - return 0; - else if (!strncmp("zlib", value, 4)) - return 0; - else if (!strncmp("zstd", value, 4)) + if (btrfs_compress_is_valid_type(value, len)) return 0;
return -EINVAL;
From: Filipe Manana fdmanana@suse.com
commit d1d832a0b51dd9570429bb4b81b2a6c1759e681a upstream.
When we log an inode, regardless of logging it completely or only that it exists, we always update it as logged (logged_trans and last_log_commit fields of the inode are updated). This is generally fine and avoids future attempts to log it from having to do repeated work that brings no value.
However, if we write data to a file, then evict its inode after all the dealloc was flushed (and ordered extents completed), rename the file and fsync it, we end up not logging the new extents, since the rename may result in logging that the inode exists in case the parent directory was logged before. The following reproducer shows and explains how this can happen:
$ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt
$ mkdir /mnt/dir $ touch /mnt/dir/foo $ touch /mnt/dir/bar
# Do a direct IO write instead of a buffered write because with a # buffered write we would need to make sure dealloc gets flushed and # complete before we do the inode eviction later, and we can not do that # from user space with call to things such as sync(2) since that results # in a transaction commit as well. $ xfs_io -d -c "pwrite -S 0xd3 0 4K" /mnt/dir/bar
# Keep the directory dir in use while we evict inodes. We want our file # bar's inode to be evicted but we don't want our directory's inode to # be evicted (if it were evicted too, we would not be able to reproduce # the issue since the first fsync below, of file foo, would result in a # transaction commit. $ ( cd /mnt/dir; while true; do :; done ) & $ pid=$!
# Wait a bit to give time for the background process to chdir. $ sleep 0.1
# Evict all inodes, except the inode for the directory dir because it is # currently in use by our background process. $ echo 2 > /proc/sys/vm/drop_caches
# fsync file foo, which ends up persisting information about the parent # directory because it is a new inode. $ xfs_io -c fsync /mnt/dir/foo
# Rename bar, this results in logging that this inode exists (inode item, # names, xattrs) because the parent directory is in the log. $ mv /mnt/dir/bar /mnt/dir/baz
# Now fsync baz, which ends up doing absolutely nothing because of the # rename operation which logged that the inode exists only. $ xfs_io -c fsync /mnt/dir/baz
<power failure>
$ mount /dev/sdb /mnt $ od -t x1 -A d /mnt/dir/baz 0000000
--> Empty file, data we wrote is missing.
Fix this by not updating last_sub_trans of an inode when we are logging only that it exists and the inode was not yet logged since it was loaded from disk (full_sync bit set), this is enough to make btrfs_inode_in_log() return false for this scenario and make us log the inode. The logged_trans of the inode is still always setsince that alone is used to track if names need to be deleted as part of unlink operations.
Fixes: 257c62e1bce03e ("Btrfs: avoid tree log commit when there are no changes") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/tree-log.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
--- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -5420,9 +5420,19 @@ log_extents: } }
+ /* + * Don't update last_log_commit if we logged that an inode exists after + * it was loaded to memory (full_sync bit set). + * This is to prevent data loss when we do a write to the inode, then + * the inode gets evicted after all delalloc was flushed, then we log + * it exists (due to a rename for example) and then fsync it. This last + * fsync would do nothing (not logging the extents previously written). + */ spin_lock(&inode->lock); inode->logged_trans = trans->transid; - inode->last_log_commit = inode->last_sub_trans; + if (inode_only != LOG_INODE_EXISTS || + !test_bit(BTRFS_INODE_NEEDS_FULL_SYNC, &inode->runtime_flags)) + inode->last_log_commit = inode->last_sub_trans; spin_unlock(&inode->lock); out_unlock: mutex_unlock(&inode->log_mutex);
From: Filipe Manana fdmanana@suse.com
commit 803f0f64d17769071d7287d9e3e3b79a3e1ae937 upstream.
In order to avoid searches on a log tree when unlinking an inode, we check if the inode being unlinked was logged in the current transaction, as well as the inode of its parent directory. When any of the inodes are logged, we proceed to delete directory items and inode reference items from the log, to ensure that if a subsequent fsync of only the inode being unlinked or only of the parent directory when the other is not fsync'ed as well, does not result in the entry still existing after a power failure.
That check however is not reliable when one of the inodes involved (the one being unlinked or its parent directory's inode) is evicted, since the logged_trans field is transient, that is, it is not stored on disk, so it is lost when the inode is evicted and loaded into memory again (which is set to zero on load). As a consequence the checks currently being done by btrfs_del_dir_entries_in_log() and btrfs_del_inode_ref_in_log() always return true if the inode was evicted before, regardless of the inode having been logged or not before (and in the current transaction), this results in the dentry being unlinked still existing after a log replay if after the unlink operation only one of the inodes involved is fsync'ed.
Example:
$ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt
$ mkdir /mnt/dir $ touch /mnt/dir/foo $ xfs_io -c fsync /mnt/dir/foo
# Keep an open file descriptor on our directory while we evict inodes. # We just want to evict the file's inode, the directory's inode must not # be evicted. $ ( cd /mnt/dir; while true; do :; done ) & $ pid=$!
# Wait a bit to give time to background process to chdir to our test # directory. $ sleep 0.5
# Trigger eviction of the file's inode. $ echo 2 > /proc/sys/vm/drop_caches
# Unlink our file and fsync the parent directory. After a power failure # we don't expect to see the file anymore, since we fsync'ed the parent # directory. $ rm -f $SCRATCH_MNT/dir/foo $ xfs_io -c fsync /mnt/dir
<power failure>
$ mount /dev/sdb /mnt $ ls /mnt/dir foo $ --> file still there, unlink not persisted despite explicit fsync on dir
Fix this by checking if the inode has the full_sync bit set in its runtime flags as well, since that bit is set everytime an inode is loaded from disk, or for other less common cases such as after a shrinking truncate or failure to allocate extent maps for holes, and gets cleared after the first fsync. Also consider the inode as possibly logged only if it was last modified in the current transaction (besides having the full_fsync flag set).
Fixes: 3a5f1d458ad161 ("Btrfs: Optimize btree walking while logging inodes") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/tree-log.c | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-)
--- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -3323,6 +3323,30 @@ int btrfs_free_log_root_tree(struct btrf }
/* + * Check if an inode was logged in the current transaction. We can't always rely + * on an inode's logged_trans value, because it's an in-memory only field and + * therefore not persisted. This means that its value is lost if the inode gets + * evicted and loaded again from disk (in which case it has a value of 0, and + * certainly it is smaller then any possible transaction ID), when that happens + * the full_sync flag is set in the inode's runtime flags, so on that case we + * assume eviction happened and ignore the logged_trans value, assuming the + * worst case, that the inode was logged before in the current transaction. + */ +static bool inode_logged(struct btrfs_trans_handle *trans, + struct btrfs_inode *inode) +{ + if (inode->logged_trans == trans->transid) + return true; + + if (inode->last_trans == trans->transid && + test_bit(BTRFS_INODE_NEEDS_FULL_SYNC, &inode->runtime_flags) && + !test_bit(BTRFS_FS_LOG_RECOVERING, &trans->fs_info->flags)) + return true; + + return false; +} + +/* * If both a file and directory are logged, and unlinks or renames are * mixed in, we have a few interesting corners: * @@ -3356,7 +3380,7 @@ int btrfs_del_dir_entries_in_log(struct int bytes_del = 0; u64 dir_ino = btrfs_ino(dir);
- if (dir->logged_trans < trans->transid) + if (!inode_logged(trans, dir)) return 0;
ret = join_running_log_trans(root); @@ -3460,7 +3484,7 @@ int btrfs_del_inode_ref_in_log(struct bt u64 index; int ret;
- if (inode->logged_trans < trans->transid) + if (!inode_logged(trans, inode)) return 0;
ret = join_running_log_trans(root);
From: Filipe Manana fdmanana@suse.com
commit 179006688a7e888cbff39577189f2e034786d06a upstream.
If the range for which we are punching a hole covers only part of a page, we end up updating the inode item but we skip the update of the inode's iversion, mtime and ctime. Fix that by ensuring we update those properties of the inode.
A patch for fstests test case generic/059 that tests this as been sent along with this fix.
Fixes: 2aaa66558172b0 ("Btrfs: add hole punching") Fixes: e8c1c76e804b18 ("Btrfs: add missing inode update when punching hole") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/file.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2721,6 +2721,11 @@ out_only_mutex: * for detecting, at fsync time, if the inode isn't yet in the * log tree or it's there but not up to date. */ + struct timespec64 now = current_time(inode); + + inode_inc_iversion(inode); + inode->i_mtime = now; + inode->i_ctime = now; trans = btrfs_start_transaction(root, 1); if (IS_ERR(trans)) { err = PTR_ERR(trans);
From: Danit Goldberg danitg@mellanox.com
commit 89705e92700170888236555fe91b45e4c1bb0985 upstream.
Userspace expects the IB_TM_CAP_RC bit to indicate that the device supports RC transport tag matching with rendezvous offload. However the firmware splits this into two capabilities for eager and rendezvous tag matching.
Only if the FW supports both modes should userspace be told the tag matching capability is available.
Cc: stable@vger.kernel.org # 4.13 Fixes: eb761894351d ("IB/mlx5: Fill XRQ capabilities") Signed-off-by: Danit Goldberg danitg@mellanox.com Reviewed-by: Yishai Hadas yishaih@mellanox.com Reviewed-by: Artemy Kovalyov artemyko@mellanox.com Signed-off-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/hw/mlx5/main.c | 8 ++++++-- include/rdma/ib_verbs.h | 4 ++-- 2 files changed, 8 insertions(+), 4 deletions(-)
--- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -1043,15 +1043,19 @@ static int mlx5_ib_query_device(struct i }
if (MLX5_CAP_GEN(mdev, tag_matching)) { - props->tm_caps.max_rndv_hdr_size = MLX5_TM_MAX_RNDV_MSG_SIZE; props->tm_caps.max_num_tags = (1 << MLX5_CAP_GEN(mdev, log_tag_matching_list_sz)) - 1; - props->tm_caps.flags = IB_TM_CAP_RC; props->tm_caps.max_ops = 1 << MLX5_CAP_GEN(mdev, log_max_qp_sz); props->tm_caps.max_sge = MLX5_TM_MAX_SGE; }
+ if (MLX5_CAP_GEN(mdev, tag_matching) && + MLX5_CAP_GEN(mdev, rndv_offload_rc)) { + props->tm_caps.flags = IB_TM_CAP_RNDV_RC; + props->tm_caps.max_rndv_hdr_size = MLX5_TM_MAX_RNDV_MSG_SIZE; + } + if (MLX5_CAP_GEN(dev->mdev, cq_moderation)) { props->cq_caps.max_cq_moderation_count = MLX5_MAX_CQ_COUNT; --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -327,8 +327,8 @@ struct ib_rss_caps { };
enum ib_tm_cap_flags { - /* Support tag matching on RC transport */ - IB_TM_CAP_RC = 1 << 0, + /* Support tag matching with rendezvous offload for RC transport */ + IB_TM_CAP_RNDV_RC = 1 << 0, };
struct ib_tm_caps {
From: Aaron Armstrong Skomra skomra@gmail.com
commit d8e9806005f28bbb49899dab2068e3359e22ba35 upstream.
Currently, the driver will attempt to set the mode on all devices with a center button, but some devices with a center button lack LEDs, and attempting to set the LEDs on devices without LEDs results in the kernel error message of the form:
"leds input8::wacom-0.1: Setting an LED's brightness failed (-32)"
This is because the generic codepath erroneously assumes that the BUTTON_CENTER usage indicates that the device has LEDs, the previously ignored TOUCH_RING_SETTING usage is a more accurate indication of the existence of LEDs on the device.
Fixes: 10c55cacb8b2 ("HID: wacom: generic: support LEDs") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Aaron Armstrong Skomra aaron.skomra@wacom.com Reviewed-by: Jason Gerecke jason.gerecke@wacom.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hid/wacom_sys.c | 3 +++ drivers/hid/wacom_wac.c | 2 -- drivers/hid/wacom_wac.h | 1 + 3 files changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/hid/wacom_sys.c +++ b/drivers/hid/wacom_sys.c @@ -304,6 +304,9 @@ static void wacom_feature_mapping(struct wacom_hid_usage_quirk(hdev, field, usage);
switch (equivalent_usage) { + case WACOM_HID_WD_TOUCH_RING_SETTING: + wacom->generic_has_leds = true; + break; case HID_DG_CONTACTMAX: /* leave touch_max as is if predefined */ if (!features->touch_max) { --- a/drivers/hid/wacom_wac.c +++ b/drivers/hid/wacom_wac.c @@ -1926,8 +1926,6 @@ static void wacom_wac_pad_usage_mapping( features->device_type |= WACOM_DEVICETYPE_PAD; break; case WACOM_HID_WD_BUTTONCENTER: - wacom->generic_has_leds = true; - /* fall through */ case WACOM_HID_WD_BUTTONHOME: case WACOM_HID_WD_BUTTONUP: case WACOM_HID_WD_BUTTONDOWN: --- a/drivers/hid/wacom_wac.h +++ b/drivers/hid/wacom_wac.h @@ -141,6 +141,7 @@ #define WACOM_HID_WD_OFFSETBOTTOM (WACOM_HID_UP_WACOMDIGITIZER | 0x0d33) #define WACOM_HID_WD_DATAMODE (WACOM_HID_UP_WACOMDIGITIZER | 0x1002) #define WACOM_HID_WD_DIGITIZERINFO (WACOM_HID_UP_WACOMDIGITIZER | 0x1013) +#define WACOM_HID_WD_TOUCH_RING_SETTING (WACOM_HID_UP_WACOMDIGITIZER | 0x1032) #define WACOM_HID_UP_G9 0xff090000 #define WACOM_HID_G9_PEN (WACOM_HID_UP_G9 | 0x02) #define WACOM_HID_G9_TOUCHSCREEN (WACOM_HID_UP_G9 | 0x11)
From: Aaron Armstrong Skomra skomra@gmail.com
commit d4b8efeb46d99a5d02e7f88ac4eaccbe49370770 upstream.
Only sync the pad once per report, not once per collection. Also avoid syncing the pad on battery reports.
Fixes: f8b6a74719b5 ("HID: wacom: generic: Support multiple tools per report") Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Aaron Armstrong Skomra aaron.skomra@wacom.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hid/wacom_wac.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-)
--- a/drivers/hid/wacom_wac.c +++ b/drivers/hid/wacom_wac.c @@ -2117,14 +2117,12 @@ static void wacom_wac_pad_report(struct bool active = wacom_wac->hid_data.inrange_state != 0;
/* report prox for expresskey events */ - if ((wacom_equivalent_usage(field->physical) == HID_DG_TABLETFUNCTIONKEY) && - wacom_wac->hid_data.pad_input_event_flag) { + if (wacom_wac->hid_data.pad_input_event_flag) { input_event(input, EV_ABS, ABS_MISC, active ? PAD_DEVICE_ID : 0); input_sync(input); if (!active) wacom_wac->hid_data.pad_input_event_flag = false; } - }
static void wacom_wac_pen_usage_mapping(struct hid_device *hdev, @@ -2700,9 +2698,7 @@ static int wacom_wac_collection(struct h if (report->type != HID_INPUT_REPORT) return -1;
- if (WACOM_PAD_FIELD(field) && wacom->wacom_wac.pad_input) - wacom_wac_pad_report(hdev, report, field); - else if (WACOM_PEN_FIELD(field) && wacom->wacom_wac.pen_input) + if (WACOM_PEN_FIELD(field) && wacom->wacom_wac.pen_input) wacom_wac_pen_report(hdev, report); else if (WACOM_FINGER_FIELD(field) && wacom->wacom_wac.touch_input) wacom_wac_finger_report(hdev, report); @@ -2716,7 +2712,7 @@ void wacom_wac_report(struct hid_device struct wacom_wac *wacom_wac = &wacom->wacom_wac; struct hid_field *field; bool pad_in_hid_field = false, pen_in_hid_field = false, - finger_in_hid_field = false; + finger_in_hid_field = false, true_pad = false; int r; int prev_collection = -1;
@@ -2732,6 +2728,8 @@ void wacom_wac_report(struct hid_device pen_in_hid_field = true; if (WACOM_FINGER_FIELD(field)) finger_in_hid_field = true; + if (wacom_equivalent_usage(field->physical) == HID_DG_TABLETFUNCTIONKEY) + true_pad = true; }
wacom_wac_battery_pre_report(hdev, report); @@ -2755,6 +2753,9 @@ void wacom_wac_report(struct hid_device }
wacom_wac_battery_report(hdev, report); + + if (true_pad && wacom->wacom_wac.pad_input) + wacom_wac_pad_report(hdev, report, field); }
static int wacom_bpt_touch(struct wacom_wac *wacom)
From: Aaron Armstrong Skomra skomra@gmail.com
commit 68c20cc2164cc5c7c73f8012ae6491afdb1f7f72 upstream.
This affects the 2nd-gen Intuos Pro Medium and Large when using their Bluetooth connection.
Fixes: 4922cd26f03c ("HID: wacom: Support 2nd-gen Intuos Pro's Bluetooth classic interface") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Aaron Armstrong Skomra aaron.skomra@wacom.com Reviewed-by: Jason Gerecke jason.gerecke@wacom.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hid/wacom_wac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/hid/wacom_wac.c +++ b/drivers/hid/wacom_wac.c @@ -3712,7 +3712,7 @@ int wacom_setup_touch_input_capabilities 0, 5920, 4, 0); } input_abs_set_res(input_dev, ABS_MT_POSITION_X, 40); - input_abs_set_res(input_dev, ABS_MT_POSITION_X, 40); + input_abs_set_res(input_dev, ABS_MT_POSITION_Y, 40);
/* fall through */
From: Kuo-Hsin Yang vovoy@chromium.org
commit 2c012a4ad1a2cd3fb5a0f9307b9d219f84eda1fa upstream.
When file refaults are detected and there are many inactive file pages, the system never reclaim anonymous pages, the file pages are dropped aggressively when there are still a lot of cold anonymous pages and system thrashes. This issue impacts the performance of applications with large executable, e.g. chrome.
With this patch, when file refault is detected, inactive_list_is_low() always returns true for file pages in get_scan_count() to enable scanning anonymous pages.
The problem can be reproduced by the following test program.
---8<--- void fallocate_file(const char *filename, off_t size) { struct stat st; int fd;
if (!stat(filename, &st) && st.st_size >= size) return;
fd = open(filename, O_WRONLY | O_CREAT, 0600); if (fd < 0) { perror("create file"); exit(1); } if (posix_fallocate(fd, 0, size)) { perror("fallocate"); exit(1); } close(fd); }
long *alloc_anon(long size) { long *start = malloc(size); memset(start, 1, size); return start; }
long access_file(const char *filename, long size, long rounds) { int fd, i; volatile char *start1, *end1, *start2; const int page_size = getpagesize(); long sum = 0;
fd = open(filename, O_RDONLY); if (fd == -1) { perror("open"); exit(1); }
/* * Some applications, e.g. chrome, use a lot of executable file * pages, map some of the pages with PROT_EXEC flag to simulate * the behavior. */ start1 = mmap(NULL, size / 2, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0); if (start1 == MAP_FAILED) { perror("mmap"); exit(1); } end1 = start1 + size / 2;
start2 = mmap(NULL, size / 2, PROT_READ, MAP_SHARED, fd, size / 2); if (start2 == MAP_FAILED) { perror("mmap"); exit(1); }
for (i = 0; i < rounds; ++i) { struct timeval before, after; volatile char *ptr1 = start1, *ptr2 = start2; gettimeofday(&before, NULL); for (; ptr1 < end1; ptr1 += page_size, ptr2 += page_size) sum += *ptr1 + *ptr2; gettimeofday(&after, NULL); printf("File access time, round %d: %f (sec) ", i, (after.tv_sec - before.tv_sec) + (after.tv_usec - before.tv_usec) / 1000000.0); } return sum; }
int main(int argc, char *argv[]) { const long MB = 1024 * 1024; long anon_mb, file_mb, file_rounds; const char filename[] = "large"; long *ret1; long ret2;
if (argc != 4) { printf("usage: thrash ANON_MB FILE_MB FILE_ROUNDS "); exit(0); } anon_mb = atoi(argv[1]); file_mb = atoi(argv[2]); file_rounds = atoi(argv[3]);
fallocate_file(filename, file_mb * MB); printf("Allocate %ld MB anonymous pages ", anon_mb); ret1 = alloc_anon(anon_mb * MB); printf("Access %ld MB file pages ", file_mb); ret2 = access_file(filename, file_mb * MB, file_rounds); printf("Print result to prevent optimization: %ld ", *ret1 + ret2); return 0; } ---8<---
Running the test program on 2GB RAM VM with kernel 5.2.0-rc5, the program fills ram with 2048 MB memory, access a 200 MB file for 10 times. Without this patch, the file cache is dropped aggresively and every access to the file is from disk.
$ ./thrash 2048 200 10 Allocate 2048 MB anonymous pages Access 200 MB file pages File access time, round 0: 2.489316 (sec) File access time, round 1: 2.581277 (sec) File access time, round 2: 2.487624 (sec) File access time, round 3: 2.449100 (sec) File access time, round 4: 2.420423 (sec) File access time, round 5: 2.343411 (sec) File access time, round 6: 2.454833 (sec) File access time, round 7: 2.483398 (sec) File access time, round 8: 2.572701 (sec) File access time, round 9: 2.493014 (sec)
With this patch, these file pages can be cached.
$ ./thrash 2048 200 10 Allocate 2048 MB anonymous pages Access 200 MB file pages File access time, round 0: 2.475189 (sec) File access time, round 1: 2.440777 (sec) File access time, round 2: 2.411671 (sec) File access time, round 3: 1.955267 (sec) File access time, round 4: 0.029924 (sec) File access time, round 5: 0.000808 (sec) File access time, round 6: 0.000771 (sec) File access time, round 7: 0.000746 (sec) File access time, round 8: 0.000738 (sec) File access time, round 9: 0.000747 (sec)
Checked the swap out stats during the test [1], 19006 pages swapped out with this patch, 3418 pages swapped out without this patch. There are more swap out, but I think it's within reasonable range when file backed data set doesn't fit into the memory.
$ ./thrash 2000 100 2100 5 1 # ANON_MB FILE_EXEC FILE_NOEXEC ROUNDS PROCESSES Allocate 2000 MB anonymous pages active_anon: 1613644, inactive_anon: 348656, active_file: 892, inactive_file: 1384 (kB) pswpout: 7972443, pgpgin: 478615246 Access 100 MB executable file pages Access 2100 MB regular file pages File access time, round 0: 12.165, (sec) active_anon: 1433788, inactive_anon: 478116, active_file: 17896, inactive_file: 24328 (kB) File access time, round 1: 11.493, (sec) active_anon: 1430576, inactive_anon: 477144, active_file: 25440, inactive_file: 26172 (kB) File access time, round 2: 11.455, (sec) active_anon: 1427436, inactive_anon: 476060, active_file: 21112, inactive_file: 28808 (kB) File access time, round 3: 11.454, (sec) active_anon: 1420444, inactive_anon: 473632, active_file: 23216, inactive_file: 35036 (kB) File access time, round 4: 11.479, (sec) active_anon: 1413964, inactive_anon: 471460, active_file: 31728, inactive_file: 32224 (kB) pswpout: 7991449 (+ 19006), pgpgin: 489924366 (+ 11309120)
With 4 processes accessing non-overlapping parts of a large file, 30316 pages swapped out with this patch, 5152 pages swapped out without this patch. The swapout number is small comparing to pgpgin.
[1]: https://github.com/vovo/testing/blob/master/mem_thrash.c
Link: http://lkml.kernel.org/r/20190701081038.GA83398@google.com Fixes: e9868505987a ("mm,vmscan: only evict file pages when we have plenty") Fixes: 7c5bd705d8f9 ("mm: memcg: only evict file pages when we have plenty") Signed-off-by: Kuo-Hsin Yang vovoy@chromium.org Acked-by: Johannes Weiner hannes@cmpxchg.org Cc: Michal Hocko mhocko@suse.com Cc: Sonny Rao sonnyrao@chromium.org Cc: Mel Gorman mgorman@techsingularity.net Cc: Rik van Riel riel@redhat.com Cc: Vladimir Davydov vdavydov.dev@gmail.com Cc: Minchan Kim minchan@kernel.org Cc: stable@vger.kernel.org [4.12+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/vmscan.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -2125,7 +2125,7 @@ static void shrink_active_list(unsigned * 10TB 320 32GB */ static bool inactive_list_is_low(struct lruvec *lruvec, bool file, - struct scan_control *sc, bool actual_reclaim) + struct scan_control *sc, bool trace) { enum lru_list active_lru = file * LRU_FILE + LRU_ACTIVE; struct pglist_data *pgdat = lruvec_pgdat(lruvec); @@ -2151,7 +2151,7 @@ static bool inactive_list_is_low(struct * rid of the stale workingset quickly. */ refaults = lruvec_page_state_local(lruvec, WORKINGSET_ACTIVATE); - if (file && actual_reclaim && lruvec->refaults != refaults) { + if (file && lruvec->refaults != refaults) { inactive_ratio = 0; } else { gb = (inactive + active) >> (30 - PAGE_SHIFT); @@ -2161,7 +2161,7 @@ static bool inactive_list_is_low(struct inactive_ratio = 1; }
- if (actual_reclaim) + if (trace) trace_mm_vmscan_inactive_list_is_low(pgdat->node_id, sc->reclaim_idx, lruvec_lru_size(lruvec, inactive_lru, MAX_NR_ZONES), inactive, lruvec_lru_size(lruvec, active_lru, MAX_NR_ZONES), active,
From: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com
commit 9bd3bb6703d8c0a5fb8aec8e3287bd55b7341dcd upstream.
Architectures like powerpc use different address range to map ioremap and vmalloc range. The memunmap() check used by the nvdimm layer was wrongly using is_vmalloc_addr() to check for ioremap range which fails for ppc64. This result in ppc64 not freeing the ioremap mapping. The side effect of this is an unbind failure during module unload with papr_scm nvdimm driver
Link: http://lkml.kernel.org/r/20190701134038.14165-1-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com Fixes: b5beae5e224f ("powerpc/pseries: Add driver for PAPR SCM regions") Cc: Dan Williams dan.j.williams@intel.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/include/asm/pgtable.h | 14 ++++++++++++++ include/linux/mm.h | 5 +++++ kernel/iomem.c | 2 +- 3 files changed, 20 insertions(+), 1 deletion(-)
--- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -140,6 +140,20 @@ static inline void pte_frag_set(mm_conte } #endif
+#ifdef CONFIG_PPC64 +#define is_ioremap_addr is_ioremap_addr +static inline bool is_ioremap_addr(const void *x) +{ +#ifdef CONFIG_MMU + unsigned long addr = (unsigned long)x; + + return addr >= IOREMAP_BASE && addr < IOREMAP_END; +#else + return false; +#endif +} +#endif /* CONFIG_PPC64 */ + #endif /* __ASSEMBLY__ */
#endif /* _ASM_POWERPC_PGTABLE_H */ --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -633,6 +633,11 @@ static inline bool is_vmalloc_addr(const return false; #endif } + +#ifndef is_ioremap_addr +#define is_ioremap_addr(x) is_vmalloc_addr(x) +#endif + #ifdef CONFIG_MMU extern int is_vmalloc_or_module_addr(const void *x); #else --- a/kernel/iomem.c +++ b/kernel/iomem.c @@ -121,7 +121,7 @@ EXPORT_SYMBOL(memremap);
void memunmap(void *addr) { - if (is_vmalloc_addr(addr)) + if (is_ioremap_addr(addr)) iounmap((void __iomem *) addr); } EXPORT_SYMBOL(memunmap);
From: Dan Williams dan.j.williams@intel.com
commit 7e3e888dfc138089f4c15a81b418e88f0978f744 upstream.
At namespace creation time there is the potential for the "expected to be zero" fields of a 'pfn' info-block to be filled with indeterminate data. While the kernel buffer is zeroed on allocation it is immediately overwritten by nd_pfn_validate() filling it with the current contents of the on-media info-block location. For fields like, 'flags' and the 'padding' it potentially means that future implementations can not rely on those fields being zero.
In preparation to stop using the 'start_pad' and 'end_trunc' fields for section alignment, arrange for fields that are not explicitly initialized to be guaranteed zero. Bump the minor version to indicate it is safe to assume the 'padding' and 'flags' are zero. Otherwise, this corruption is expected to benign since all other critical fields are explicitly initialized.
Note The cc: stable is about spreading this new policy to as many kernels as possible not fixing an issue in those kernels. It is not until the change titled "libnvdimm/pfn: Stop padding pmem namespaces to section alignment" where this improper initialization becomes a problem. So if someone decides to backport "libnvdimm/pfn: Stop padding pmem namespaces to section alignment" (which is not tagged for stable), make sure this pre-requisite is flagged.
Link: http://lkml.kernel.org/r/156092356065.979959.6681003754765958296.stgit@dwill... Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem") Signed-off-by: Dan Williams dan.j.williams@intel.com Tested-by: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com [ppc64] Cc: stable@vger.kernel.org Cc: David Hildenbrand david@redhat.com Cc: Jane Chu jane.chu@oracle.com Cc: Jeff Moyer jmoyer@redhat.com Cc: Jérôme Glisse jglisse@redhat.com Cc: Jonathan Corbet corbet@lwn.net Cc: Logan Gunthorpe logang@deltatee.com Cc: Michal Hocko mhocko@suse.com Cc: Mike Rapoport rppt@linux.ibm.com Cc: Oscar Salvador osalvador@suse.de Cc: Pavel Tatashin pasha.tatashin@soleen.com Cc: Toshi Kani toshi.kani@hpe.com Cc: Vlastimil Babka vbabka@suse.cz Cc: Wei Yang richardw.yang@linux.intel.com Cc: Jason Gunthorpe jgg@mellanox.com Cc: Christoph Hellwig hch@lst.de Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/nvdimm/dax_devs.c | 2 +- drivers/nvdimm/pfn.h | 1 + drivers/nvdimm/pfn_devs.c | 18 +++++++++++++++--- 3 files changed, 17 insertions(+), 4 deletions(-)
--- a/drivers/nvdimm/dax_devs.c +++ b/drivers/nvdimm/dax_devs.c @@ -118,7 +118,7 @@ int nd_dax_probe(struct device *dev, str nvdimm_bus_unlock(&ndns->dev); if (!dax_dev) return -ENOMEM; - pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); + pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); nd_pfn->pfn_sb = pfn_sb; rc = nd_pfn_validate(nd_pfn, DAX_SIG); dev_dbg(dev, "dax: %s\n", rc == 0 ? dev_name(dax_dev) : "<none>"); --- a/drivers/nvdimm/pfn.h +++ b/drivers/nvdimm/pfn.h @@ -28,6 +28,7 @@ struct nd_pfn_sb { __le32 end_trunc; /* minor-version-2 record the base alignment of the mapping */ __le32 align; + /* minor-version-3 guarantee the padding and flags are zero */ u8 padding[4000]; __le64 checksum; }; --- a/drivers/nvdimm/pfn_devs.c +++ b/drivers/nvdimm/pfn_devs.c @@ -412,6 +412,15 @@ static int nd_pfn_clear_memmap_errors(st return 0; }
+/** + * nd_pfn_validate - read and validate info-block + * @nd_pfn: fsdax namespace runtime state / properties + * @sig: 'devdax' or 'fsdax' signature + * + * Upon return the info-block buffer contents (->pfn_sb) are + * indeterminate when validation fails, and a coherent info-block + * otherwise. + */ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig) { u64 checksum, offset; @@ -557,7 +566,7 @@ int nd_pfn_probe(struct device *dev, str nvdimm_bus_unlock(&ndns->dev); if (!pfn_dev) return -ENOMEM; - pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); + pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); nd_pfn = to_nd_pfn(pfn_dev); nd_pfn->pfn_sb = pfn_sb; rc = nd_pfn_validate(nd_pfn, PFN_SIG); @@ -694,7 +703,7 @@ static int nd_pfn_init(struct nd_pfn *nd u64 checksum; int rc;
- pfn_sb = devm_kzalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL); + pfn_sb = devm_kmalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL); if (!pfn_sb) return -ENOMEM;
@@ -703,11 +712,14 @@ static int nd_pfn_init(struct nd_pfn *nd sig = DAX_SIG; else sig = PFN_SIG; + rc = nd_pfn_validate(nd_pfn, sig); if (rc != -ENODEV) return rc;
/* no info block, do init */; + memset(pfn_sb, 0, sizeof(*pfn_sb)); + nd_region = to_nd_region(nd_pfn->dev.parent); if (nd_region->ro) { dev_info(&nd_pfn->dev, @@ -760,7 +772,7 @@ static int nd_pfn_init(struct nd_pfn *nd memcpy(pfn_sb->uuid, nd_pfn->uuid, 16); memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16); pfn_sb->version_major = cpu_to_le16(1); - pfn_sb->version_minor = cpu_to_le16(2); + pfn_sb->version_minor = cpu_to_le16(3); pfn_sb->start_pad = cpu_to_le32(start_pad); pfn_sb->end_trunc = cpu_to_le32(end_trunc); pfn_sb->align = cpu_to_le32(nd_pfn->align);
From: Yafang Shao laoar.shao@gmail.com
commit dd9239900e12db84c198855b262ae7796db1123b upstream.
When we calculate total statistics for memcg1_stats and memcg1_events, we use the the index 'i' in the for loop as the events index. Actually we should use memcg1_stats[i] and memcg1_events[i] as the events index.
Link: http://lkml.kernel.org/r/1562116978-19539-1-git-send-email-laoar.shao@gmail.... Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty"). Signed-off-by: Yafang Shao <laoar.shao@gmail.com Reviewed-by: Shakeel Butt shakeelb@google.com Cc: Michal Hocko mhocko@suse.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Yafang Shao shaoyafang@didiglobal.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/memcontrol.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3530,12 +3530,13 @@ static int memcg_stat_show(struct seq_fi if (memcg1_stats[i] == MEMCG_SWAP && !do_memsw_account()) continue; seq_printf(m, "total_%s %llu\n", memcg1_stat_names[i], - (u64)memcg_page_state(memcg, i) * PAGE_SIZE); + (u64)memcg_page_state(memcg, memcg1_stats[i]) * + PAGE_SIZE); }
for (i = 0; i < ARRAY_SIZE(memcg1_events); i++) seq_printf(m, "total_%s %llu\n", memcg1_event_names[i], - (u64)memcg_events(memcg, i)); + (u64)memcg_events(memcg, memcg1_events[i]));
for (i = 0; i < NR_LRU_LISTS; i++) seq_printf(m, "total_%s %llu\n", mem_cgroup_lru_names[i],
From: Henry Burns henryburns@google.com
commit 810481a246089117d98e3373a3cb735c3efc1f90 upstream.
Following zsmalloc.c's example we call trylock_page() and unlock_page(). Also make z3fold_page_migrate() assert that newpage is passed in locked, as per the documentation.
[akpm@linux-foundation.org: fix trylock_page return value test, per Shakeel] Link: http://lkml.kernel.org/r/20190702005122.41036-1-henryburns@google.com Link: http://lkml.kernel.org/r/20190702233538.52793-1-henryburns@google.com Signed-off-by: Henry Burns henryburns@google.com Suggested-by: Vitaly Wool vitalywool@gmail.com Acked-by: Vitaly Wool vitalywool@gmail.com Acked-by: David Rientjes rientjes@google.com Reviewed-by: Shakeel Butt shakeelb@google.com Cc: Vitaly Vul vitaly.vul@sony.com Cc: Mike Rapoport rppt@linux.vnet.ibm.com Cc: Xidong Wang wangxidong_97@163.com Cc: Jonathan Adams jwadams@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/z3fold.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
--- a/mm/z3fold.c +++ b/mm/z3fold.c @@ -924,7 +924,16 @@ retry: set_bit(PAGE_HEADLESS, &page->private); goto headless; } - __SetPageMovable(page, pool->inode->i_mapping); + if (can_sleep) { + lock_page(page); + __SetPageMovable(page, pool->inode->i_mapping); + unlock_page(page); + } else { + if (trylock_page(page)) { + __SetPageMovable(page, pool->inode->i_mapping); + unlock_page(page); + } + } z3fold_page_lock(zhdr);
found: @@ -1331,6 +1340,7 @@ static int z3fold_page_migrate(struct ad
VM_BUG_ON_PAGE(!PageMovable(page), page); VM_BUG_ON_PAGE(!PageIsolated(page), page); + VM_BUG_ON_PAGE(!PageLocked(newpage), newpage);
zhdr = page_address(page); pool = zhdr_to_pool(zhdr);
From: Jan Harkes jaharkes@cs.cmu.edu
commit 7fa0a1da3dadfd9216df7745a1331fdaa0940d1c upstream.
Patch series "Coda updates".
The following patch series is a collection of various fixes for Coda, most of which were collected from linux-fsdevel or linux-kernel but which have as yet not found their way upstream.
This patch (of 22):
Various file systems expect that vma->vm_file points at their own file handle, several use file_inode(vma->vm_file) to get at their inode or use vma->vm_file->private_data. However the way Coda wrapped mmap on a host file broke this assumption, vm_file was still pointing at the Coda file and the host file systems would scribble over Coda's inode and private file data.
This patch fixes the incorrect expectation and wraps vm_ops->open and vm_ops->close to allow Coda to track when the vm_area_struct is destroyed so we still release the reference on the Coda file handle at the right time.
Link: http://lkml.kernel.org/r/0e850c6e59c0b147dc2dcd51a3af004c948c3697.1558117389... Signed-off-by: Jan Harkes jaharkes@cs.cmu.edu Cc: Arnd Bergmann arnd@arndb.de Cc: Colin Ian King colin.king@canonical.com Cc: Dan Carpenter dan.carpenter@oracle.com Cc: David Howells dhowells@redhat.com Cc: Fabian Frederick fabf@skynet.be Cc: Mikko Rapeli mikko.rapeli@iki.fi Cc: Sam Protsenko semen.protsenko@linaro.org Cc: Yann Droneaud ydroneaud@opteya.com Cc: Zhouyang Jia jiazhouyang09@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/coda/file.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 68 insertions(+), 2 deletions(-)
--- a/fs/coda/file.c +++ b/fs/coda/file.c @@ -27,6 +27,13 @@ #include "coda_linux.h" #include "coda_int.h"
+struct coda_vm_ops { + atomic_t refcnt; + struct file *coda_file; + const struct vm_operations_struct *host_vm_ops; + struct vm_operations_struct vm_ops; +}; + static ssize_t coda_file_read_iter(struct kiocb *iocb, struct iov_iter *to) { @@ -61,6 +68,34 @@ coda_file_write_iter(struct kiocb *iocb, return ret; }
+static void +coda_vm_open(struct vm_area_struct *vma) +{ + struct coda_vm_ops *cvm_ops = + container_of(vma->vm_ops, struct coda_vm_ops, vm_ops); + + atomic_inc(&cvm_ops->refcnt); + + if (cvm_ops->host_vm_ops && cvm_ops->host_vm_ops->open) + cvm_ops->host_vm_ops->open(vma); +} + +static void +coda_vm_close(struct vm_area_struct *vma) +{ + struct coda_vm_ops *cvm_ops = + container_of(vma->vm_ops, struct coda_vm_ops, vm_ops); + + if (cvm_ops->host_vm_ops && cvm_ops->host_vm_ops->close) + cvm_ops->host_vm_ops->close(vma); + + if (atomic_dec_and_test(&cvm_ops->refcnt)) { + vma->vm_ops = cvm_ops->host_vm_ops; + fput(cvm_ops->coda_file); + kfree(cvm_ops); + } +} + static int coda_file_mmap(struct file *coda_file, struct vm_area_struct *vma) { @@ -68,6 +103,8 @@ coda_file_mmap(struct file *coda_file, s struct coda_inode_info *cii; struct file *host_file; struct inode *coda_inode, *host_inode; + struct coda_vm_ops *cvm_ops; + int ret;
cfi = CODA_FTOC(coda_file); BUG_ON(!cfi || cfi->cfi_magic != CODA_MAGIC); @@ -76,6 +113,13 @@ coda_file_mmap(struct file *coda_file, s if (!host_file->f_op->mmap) return -ENODEV;
+ if (WARN_ON(coda_file != vma->vm_file)) + return -EIO; + + cvm_ops = kmalloc(sizeof(struct coda_vm_ops), GFP_KERNEL); + if (!cvm_ops) + return -ENOMEM; + coda_inode = file_inode(coda_file); host_inode = file_inode(host_file);
@@ -89,6 +133,7 @@ coda_file_mmap(struct file *coda_file, s * the container file on us! */ else if (coda_inode->i_mapping != host_inode->i_mapping) { spin_unlock(&cii->c_lock); + kfree(cvm_ops); return -EBUSY; }
@@ -97,7 +142,29 @@ coda_file_mmap(struct file *coda_file, s cfi->cfi_mapcount++; spin_unlock(&cii->c_lock);
- return call_mmap(host_file, vma); + vma->vm_file = get_file(host_file); + ret = call_mmap(vma->vm_file, vma); + + if (ret) { + /* if call_mmap fails, our caller will put coda_file so we + * should drop the reference to the host_file that we got. + */ + fput(host_file); + kfree(cvm_ops); + } else { + /* here we add redirects for the open/close vm_operations */ + cvm_ops->host_vm_ops = vma->vm_ops; + if (vma->vm_ops) + cvm_ops->vm_ops = *vma->vm_ops; + + cvm_ops->vm_ops.open = coda_vm_open; + cvm_ops->vm_ops.close = coda_vm_close; + cvm_ops->coda_file = coda_file; + atomic_set(&cvm_ops->refcnt, 1); + + vma->vm_ops = &cvm_ops->vm_ops; + } + return ret; }
int coda_open(struct inode *coda_inode, struct file *coda_file) @@ -207,4 +274,3 @@ const struct file_operations coda_file_o .fsync = coda_fsync, .splice_read = generic_file_splice_read, }; -
From: Drew Davenport ddavenport@chromium.org
commit 6b15f678fb7d5ef54e089e6ace72f007fe6e9895 upstream.
For architectures using __WARN_TAINT, the WARN_ON macro did not print out the "cut here" string. The other WARN_XXX macros would print "cut here" inside __warn_printk, which is not called for WARN_ON since it doesn't have a message to print.
Link: http://lkml.kernel.org/r/20190624154831.163888-1-ddavenport@chromium.org Fixes: a7bed27af194 ("bug: fix "cut here" location for __WARN_TAINT architectures") Signed-off-by: Drew Davenport ddavenport@chromium.org Acked-by: Kees Cook keescook@chromium.org Tested-by: Kees Cook keescook@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/asm-generic/bug.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/include/asm-generic/bug.h +++ b/include/asm-generic/bug.h @@ -104,8 +104,10 @@ extern void warn_slowpath_null(const cha warn_slowpath_fmt_taint(__FILE__, __LINE__, taint, arg) #else extern __printf(1, 2) void __warn_printk(const char *fmt, ...); -#define __WARN() __WARN_TAINT(TAINT_WARN) -#define __WARN_printf(arg...) do { __warn_printk(arg); __WARN(); } while (0) +#define __WARN() do { \ + printk(KERN_WARNING CUT_HERE); __WARN_TAINT(TAINT_WARN); \ +} while (0) +#define __WARN_printf(arg...) __WARN_printf_taint(TAINT_WARN, arg) #define __WARN_printf_taint(taint, arg...) \ do { __warn_printk(arg); __WARN_TAINT(taint); } while (0) #endif
From: Nadav Amit namit@vmware.com
commit 49f17c26c123b60fd1c74629eef077740d16ffc2 upstream.
Since resources can be removed, locking should ensure that the resource is not removed while accessing it. However, find_next_iomem_res() does not hold the lock while copying the data of the resource.
Keep holding the lock while the data is copied. While at it, change the return value to a more informative value. It is disregarded by the callers.
[akpm@linux-foundation.org: fix find_next_iomem_res() documentation] Link: http://lkml.kernel.org/r/20190613045903.4922-2-namit@vmware.com Fixes: ff3cc952d3f00 ("resource: Add remove_resource interface") Signed-off-by: Nadav Amit namit@vmware.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Dan Williams dan.j.williams@intel.com Cc: Borislav Petkov bp@suse.de Cc: Toshi Kani toshi.kani@hpe.com Cc: Peter Zijlstra peterz@infradead.org Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Bjorn Helgaas bhelgaas@google.com Cc: Ingo Molnar mingo@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/resource.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
--- a/kernel/resource.c +++ b/kernel/resource.c @@ -326,7 +326,7 @@ EXPORT_SYMBOL(release_resource); * * If a resource is found, returns 0 and @*res is overwritten with the part * of the resource that's within [@start..@end]; if none is found, returns - * -1 or -EINVAL for other invalid parameters. + * -ENODEV. Returns -EINVAL for invalid parameters. * * This function walks the whole tree and not just first level children * unless @first_lvl is true. @@ -365,16 +365,16 @@ static int find_next_iomem_res(resource_ break; }
- read_unlock(&resource_lock); - if (!p) - return -1; + if (p) { + /* copy data */ + res->start = max(start, p->start); + res->end = min(end, p->end); + res->flags = p->flags; + res->desc = p->desc; + }
- /* copy data */ - res->start = max(start, p->start); - res->end = min(end, p->end); - res->flags = p->flags; - res->desc = p->desc; - return 0; + read_unlock(&resource_lock); + return p ? 0 : -ENODEV; }
static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
From: Steve Longerbeam slongerbeam@gmail.com
commit 3d1f62c686acdedf5ed9642b763f3808d6a47d1e upstream.
The saturation bit was being set at bit 9 in the second 32-bit word of the TPMEM CSC. This isn't correct, the saturation bit is bit 42, which is bit 10 of the second word.
Fixes: 1aa8ea0d2bd5d ("gpu: ipu-v3: Add Image Converter unit")
Signed-off-by: Steve Longerbeam slongerbeam@gmail.com Reviewed-by: Philipp Zabel p.zabel@pengutronix.de Cc: stable@vger.kernel.org Signed-off-by: Philipp Zabel p.zabel@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/ipu-v3/ipu-ic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/ipu-v3/ipu-ic.c +++ b/drivers/gpu/ipu-v3/ipu-ic.c @@ -251,7 +251,7 @@ static int init_csc(struct ipu_ic *ic, writel(param, base++);
param = ((a[0] & 0x1fe0) >> 5) | (params->scale << 8) | - (params->sat << 9); + (params->sat << 10); writel(param, base++);
param = ((a[1] & 0x1f) << 27) | ((c[0][1] & 0x1ff) << 18) |
From: Helge Deller deller@gmx.de
commit 34c32fc603311a72cb558e5e337555434f64c27b upstream.
On parisc the privilege level of a process is stored in the lowest two bits of the instruction pointers (IAOQ0 and IAOQ1). On Linux we use privilege level 0 for the kernel and privilege level 3 for user-space. So userspace should not be allowed to modify IAOQ0 or IAOQ1 of a ptraced process to change it's privilege level to e.g. 0 to try to gain kernel privileges.
This patch prevents such modifications in the regset support functions by always setting the two lowest bits to one (which relates to privilege level 3 for user-space) if IAOQ0 or IAOQ1 are modified via ptrace regset calls.
Link: https://bugs.gentoo.org/481768 Cc: stable@vger.kernel.org # v4.7+ Tested-by: Rolf Eike Beer eike-kernel@sf-tec.de Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/parisc/kernel/ptrace.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/parisc/kernel/ptrace.c +++ b/arch/parisc/kernel/ptrace.c @@ -496,7 +496,8 @@ static void set_reg(struct pt_regs *regs return; case RI(iaoq[0]): case RI(iaoq[1]): - regs->iaoq[num - RI(iaoq[0])] = val; + /* set 2 lowest bits to ensure userspace privilege: */ + regs->iaoq[num - RI(iaoq[0])] = val | 3; return; case RI(sar): regs->sar = val; return;
From: Helge Deller deller@gmx.de
commit 59a783dbc0d5fd6792aabff933055373b6dcbf2a upstream.
When running gdb I was able to trigger this kernel panic:
Kernel Fault: Code=26 (Data memory access rights trap) at addr 0000000000000060 CPU: 0 PID: 1401 Comm: gdb-crash Not tainted 5.2.0-rc7-64bit+ #1053
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001000000000000001111 Not tainted r00-03 000000000804000f 0000000040dee1a0 0000000040c78cf0 00000000b8d50160 r04-07 0000000040d2b1a0 000000004360a098 00000000bbbe87b8 0000000000000003 r08-11 00000000fac20a70 00000000fac24160 00000000fac1bbe0 0000000000000000 r12-15 00000000fabfb79a 00000000fac244a4 0000000000010000 0000000000000001 r16-19 00000000bbbe87b8 00000000f8f02910 0000000000010034 0000000000000000 r20-23 00000000fac24630 00000000fac24630 000000006474e552 00000000fac1aa52 r24-27 0000000000000028 00000000bbbe87b8 00000000bbbe87b8 0000000040d2b1a0 r28-31 0000000000000000 00000000b8d501c0 00000000b8d501f0 0000000003424000 sr00-03 0000000000423000 0000000000000000 0000000000000000 0000000000423000 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040c78cf0 0000000040c78cf4 IIR: 539f00c0 ISR: 0000000000000000 IOR: 0000000000000060 CPU: 0 CR30: 00000000b8d50000 CR31: 00000000d22345e2 ORIG_R28: 0000000040250798 IAOQ[0]: parisc_kprobe_ss_handler+0x58/0x170 IAOQ[1]: parisc_kprobe_ss_handler+0x5c/0x170 RP(r2): parisc_kprobe_ss_handler+0x58/0x170 Backtrace: [<0000000040206ff8>] handle_interruption+0x178/0xbb8 Kernel panic - not syncing: Kernel Fault
Avoid this panic by checking the return value of kprobe_running() and skip kprobe if none is currently active.
Cc: stable@vger.kernel.org # v5.2 Acked-by: Sven Schnelle svens@stackframe.org Tested-by: Rolf Eike Beer eike-kernel@sf-tec.de Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/parisc/kernel/kprobes.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/arch/parisc/kernel/kprobes.c +++ b/arch/parisc/kernel/kprobes.c @@ -133,6 +133,9 @@ int __kprobes parisc_kprobe_ss_handler(s struct kprobe_ctlblk *kcb = get_kprobe_ctlblk(); struct kprobe *p = kprobe_running();
+ if (!p) + return 0; + if (regs->iaoq[0] != (unsigned long)p->ainsn.insn+4) return 0;
From: Helge Deller deller@gmx.de
commit 10835c854685393a921b68f529bf740fa7c9984d upstream.
On parisc the privilege level of a process is stored in the lowest two bits of the instruction pointers (IAOQ0 and IAOQ1). On Linux we use privilege level 0 for the kernel and privilege level 3 for user-space. So userspace should not be allowed to modify IAOQ0 or IAOQ1 of a ptraced process to change it's privilege level to e.g. 0 to try to gain kernel privileges.
This patch prevents such modifications by always setting the two lowest bits to one (which relates to privilege level 3 for user-space) if IAOQ0 or IAOQ1 are modified via ptrace calls in the native and compat ptrace paths.
Link: https://bugs.gentoo.org/481768 Reported-by: Jeroen Roovers jer@gentoo.org Cc: stable@vger.kernel.org Tested-by: Rolf Eike Beer eike-kernel@sf-tec.de Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/parisc/kernel/ptrace.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-)
--- a/arch/parisc/kernel/ptrace.c +++ b/arch/parisc/kernel/ptrace.c @@ -167,6 +167,9 @@ long arch_ptrace(struct task_struct *chi if ((addr & (sizeof(unsigned long)-1)) || addr >= sizeof(struct pt_regs)) break; + if (addr == PT_IAOQ0 || addr == PT_IAOQ1) { + data |= 3; /* ensure userspace privilege */ + } if ((addr >= PT_GR1 && addr <= PT_GR31) || addr == PT_IAOQ0 || addr == PT_IAOQ1 || (addr >= PT_FR0 && addr <= PT_FR31 + 4) || @@ -228,16 +231,18 @@ long arch_ptrace(struct task_struct *chi
static compat_ulong_t translate_usr_offset(compat_ulong_t offset) { - if (offset < 0) - return sizeof(struct pt_regs); - else if (offset <= 32*4) /* gr[0..31] */ - return offset * 2 + 4; - else if (offset <= 32*4+32*8) /* gr[0..31] + fr[0..31] */ - return offset + 32*4; - else if (offset < sizeof(struct pt_regs)/2 + 32*4) - return offset * 2 + 4 - 32*8; + compat_ulong_t pos; + + if (offset < 32*4) /* gr[0..31] */ + pos = offset * 2 + 4; + else if (offset < 32*4+32*8) /* fr[0] ... fr[31] */ + pos = (offset - 32*4) + PT_FR0; + else if (offset < sizeof(struct pt_regs)/2 + 32*4) /* sr[0] ... ipsw */ + pos = (offset - 32*4 - 32*8) * 2 + PT_SR0 + 4; else - return sizeof(struct pt_regs); + pos = sizeof(struct pt_regs); + + return pos; }
long compat_arch_ptrace(struct task_struct *child, compat_long_t request, @@ -281,9 +286,12 @@ long compat_arch_ptrace(struct task_stru addr = translate_usr_offset(addr); if (addr >= sizeof(struct pt_regs)) break; + if (addr == PT_IAOQ0+4 || addr == PT_IAOQ1+4) { + data |= 3; /* ensure userspace privilege */ + } if (addr >= PT_FR0 && addr <= PT_FR31 + 4) { /* Special case, fp regs are 64 bits anyway */ - *(__u64 *) ((char *) task_regs(child) + addr) = data; + *(__u32 *) ((char *) task_regs(child) + addr) = data; ret = 0; } else if ((addr >= PT_GR1+4 && addr <= PT_GR31+4) ||
From: Christophe Leroy christophe.leroy@c-s.fr
commit 6ecb78ef56e08d2119d337ae23cb951a640dc52d upstream.
Previously, only IBAT1 and IBAT2 were used to map kernel linear mem. Since commit 63b2bc619565 ("powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX"), we may have all 8 BATs used for mapping kernel text. But the suspend/restore functions only save/restore BATs 0 to 3, and clears BATs 4 to 7.
Make suspend and restore functions respectively save and reload the 8 BATs on CPUs having MMU_FTR_USE_HIGH_BATS feature.
Reported-by: Andreas Schwab schwab@linux-m68k.org Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kernel/swsusp_32.S | 73 ++++++++++++++++++++++++++++---- arch/powerpc/platforms/powermac/sleep.S | 68 +++++++++++++++++++++++++++-- 2 files changed, 128 insertions(+), 13 deletions(-)
--- a/arch/powerpc/kernel/swsusp_32.S +++ b/arch/powerpc/kernel/swsusp_32.S @@ -25,11 +25,19 @@ #define SL_IBAT2 0x48 #define SL_DBAT3 0x50 #define SL_IBAT3 0x58 -#define SL_TB 0x60 -#define SL_R2 0x68 -#define SL_CR 0x6c -#define SL_LR 0x70 -#define SL_R12 0x74 /* r12 to r31 */ +#define SL_DBAT4 0x60 +#define SL_IBAT4 0x68 +#define SL_DBAT5 0x70 +#define SL_IBAT5 0x78 +#define SL_DBAT6 0x80 +#define SL_IBAT6 0x88 +#define SL_DBAT7 0x90 +#define SL_IBAT7 0x98 +#define SL_TB 0xa0 +#define SL_R2 0xa8 +#define SL_CR 0xac +#define SL_LR 0xb0 +#define SL_R12 0xb4 /* r12 to r31 */ #define SL_SIZE (SL_R12 + 80)
.section .data @@ -114,6 +122,41 @@ _GLOBAL(swsusp_arch_suspend) mfibatl r4,3 stw r4,SL_IBAT3+4(r11)
+BEGIN_MMU_FTR_SECTION + mfspr r4,SPRN_DBAT4U + stw r4,SL_DBAT4(r11) + mfspr r4,SPRN_DBAT4L + stw r4,SL_DBAT4+4(r11) + mfspr r4,SPRN_DBAT5U + stw r4,SL_DBAT5(r11) + mfspr r4,SPRN_DBAT5L + stw r4,SL_DBAT5+4(r11) + mfspr r4,SPRN_DBAT6U + stw r4,SL_DBAT6(r11) + mfspr r4,SPRN_DBAT6L + stw r4,SL_DBAT6+4(r11) + mfspr r4,SPRN_DBAT7U + stw r4,SL_DBAT7(r11) + mfspr r4,SPRN_DBAT7L + stw r4,SL_DBAT7+4(r11) + mfspr r4,SPRN_IBAT4U + stw r4,SL_IBAT4(r11) + mfspr r4,SPRN_IBAT4L + stw r4,SL_IBAT4+4(r11) + mfspr r4,SPRN_IBAT5U + stw r4,SL_IBAT5(r11) + mfspr r4,SPRN_IBAT5L + stw r4,SL_IBAT5+4(r11) + mfspr r4,SPRN_IBAT6U + stw r4,SL_IBAT6(r11) + mfspr r4,SPRN_IBAT6L + stw r4,SL_IBAT6+4(r11) + mfspr r4,SPRN_IBAT7U + stw r4,SL_IBAT7(r11) + mfspr r4,SPRN_IBAT7L + stw r4,SL_IBAT7+4(r11) +END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_HIGH_BATS) + #if 0 /* Backup various CPU config stuffs */ bl __save_cpu_setup @@ -279,27 +322,41 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC) mtibatu 3,r4 lwz r4,SL_IBAT3+4(r11) mtibatl 3,r4 -#endif - BEGIN_MMU_FTR_SECTION - li r4,0 + lwz r4,SL_DBAT4(r11) mtspr SPRN_DBAT4U,r4 + lwz r4,SL_DBAT4+4(r11) mtspr SPRN_DBAT4L,r4 + lwz r4,SL_DBAT5(r11) mtspr SPRN_DBAT5U,r4 + lwz r4,SL_DBAT5+4(r11) mtspr SPRN_DBAT5L,r4 + lwz r4,SL_DBAT6(r11) mtspr SPRN_DBAT6U,r4 + lwz r4,SL_DBAT6+4(r11) mtspr SPRN_DBAT6L,r4 + lwz r4,SL_DBAT7(r11) mtspr SPRN_DBAT7U,r4 + lwz r4,SL_DBAT7+4(r11) mtspr SPRN_DBAT7L,r4 + lwz r4,SL_IBAT4(r11) mtspr SPRN_IBAT4U,r4 + lwz r4,SL_IBAT4+4(r11) mtspr SPRN_IBAT4L,r4 + lwz r4,SL_IBAT5(r11) mtspr SPRN_IBAT5U,r4 + lwz r4,SL_IBAT5+4(r11) mtspr SPRN_IBAT5L,r4 + lwz r4,SL_IBAT6(r11) mtspr SPRN_IBAT6U,r4 + lwz r4,SL_IBAT6+4(r11) mtspr SPRN_IBAT6L,r4 + lwz r4,SL_IBAT7(r11) mtspr SPRN_IBAT7U,r4 + lwz r4,SL_IBAT7+4(r11) mtspr SPRN_IBAT7L,r4 END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_HIGH_BATS) +#endif
/* Flush all TLBs */ lis r4,0x1000 --- a/arch/powerpc/platforms/powermac/sleep.S +++ b/arch/powerpc/platforms/powermac/sleep.S @@ -33,10 +33,18 @@ #define SL_IBAT2 0x48 #define SL_DBAT3 0x50 #define SL_IBAT3 0x58 -#define SL_TB 0x60 -#define SL_R2 0x68 -#define SL_CR 0x6c -#define SL_R12 0x70 /* r12 to r31 */ +#define SL_DBAT4 0x60 +#define SL_IBAT4 0x68 +#define SL_DBAT5 0x70 +#define SL_IBAT5 0x78 +#define SL_DBAT6 0x80 +#define SL_IBAT6 0x88 +#define SL_DBAT7 0x90 +#define SL_IBAT7 0x98 +#define SL_TB 0xa0 +#define SL_R2 0xa8 +#define SL_CR 0xac +#define SL_R12 0xb0 /* r12 to r31 */ #define SL_SIZE (SL_R12 + 80)
.section .text @@ -121,6 +129,41 @@ _GLOBAL(low_sleep_handler) mfibatl r4,3 stw r4,SL_IBAT3+4(r1)
+BEGIN_MMU_FTR_SECTION + mfspr r4,SPRN_DBAT4U + stw r4,SL_DBAT4(r1) + mfspr r4,SPRN_DBAT4L + stw r4,SL_DBAT4+4(r1) + mfspr r4,SPRN_DBAT5U + stw r4,SL_DBAT5(r1) + mfspr r4,SPRN_DBAT5L + stw r4,SL_DBAT5+4(r1) + mfspr r4,SPRN_DBAT6U + stw r4,SL_DBAT6(r1) + mfspr r4,SPRN_DBAT6L + stw r4,SL_DBAT6+4(r1) + mfspr r4,SPRN_DBAT7U + stw r4,SL_DBAT7(r1) + mfspr r4,SPRN_DBAT7L + stw r4,SL_DBAT7+4(r1) + mfspr r4,SPRN_IBAT4U + stw r4,SL_IBAT4(r1) + mfspr r4,SPRN_IBAT4L + stw r4,SL_IBAT4+4(r1) + mfspr r4,SPRN_IBAT5U + stw r4,SL_IBAT5(r1) + mfspr r4,SPRN_IBAT5L + stw r4,SL_IBAT5+4(r1) + mfspr r4,SPRN_IBAT6U + stw r4,SL_IBAT6(r1) + mfspr r4,SPRN_IBAT6L + stw r4,SL_IBAT6+4(r1) + mfspr r4,SPRN_IBAT7U + stw r4,SL_IBAT7(r1) + mfspr r4,SPRN_IBAT7L + stw r4,SL_IBAT7+4(r1) +END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_HIGH_BATS) + /* Backup various CPU config stuffs */ bl __save_cpu_setup
@@ -321,22 +364,37 @@ grackle_wake_up: mtibatl 3,r4
BEGIN_MMU_FTR_SECTION - li r4,0 + lwz r4,SL_DBAT4(r1) mtspr SPRN_DBAT4U,r4 + lwz r4,SL_DBAT4+4(r1) mtspr SPRN_DBAT4L,r4 + lwz r4,SL_DBAT5(r1) mtspr SPRN_DBAT5U,r4 + lwz r4,SL_DBAT5+4(r1) mtspr SPRN_DBAT5L,r4 + lwz r4,SL_DBAT6(r1) mtspr SPRN_DBAT6U,r4 + lwz r4,SL_DBAT6+4(r1) mtspr SPRN_DBAT6L,r4 + lwz r4,SL_DBAT7(r1) mtspr SPRN_DBAT7U,r4 + lwz r4,SL_DBAT7+4(r1) mtspr SPRN_DBAT7L,r4 + lwz r4,SL_IBAT4(r1) mtspr SPRN_IBAT4U,r4 + lwz r4,SL_IBAT4+4(r1) mtspr SPRN_IBAT4L,r4 + lwz r4,SL_IBAT5(r1) mtspr SPRN_IBAT5U,r4 + lwz r4,SL_IBAT5+4(r1) mtspr SPRN_IBAT5L,r4 + lwz r4,SL_IBAT6(r1) mtspr SPRN_IBAT6U,r4 + lwz r4,SL_IBAT6+4(r1) mtspr SPRN_IBAT6L,r4 + lwz r4,SL_IBAT7(r1) mtspr SPRN_IBAT7U,r4 + lwz r4,SL_IBAT7+4(r1) mtspr SPRN_IBAT7L,r4 END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_HIGH_BATS)
From: Andreas Schwab schwab@linux-m68k.org
commit 46c2478af610efb3212b8b08f74389d69899ef70 upstream.
Move a misplaced paren that makes the condition always true.
Fixes: 63b2bc619565 ("powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX") Cc: stable@vger.kernel.org # v5.1+ Signed-off-by: Andreas Schwab schwab@linux-m68k.org Reviewed-by: Christophe Leroy christophe.leroy@c-s.fr Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/mm/pgtable_32.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -360,7 +360,7 @@ void mark_initmem_nx(void) unsigned long numpages = PFN_UP((unsigned long)_einittext) - PFN_DOWN((unsigned long)_sinittext);
- if (v_block_mapped((unsigned long)_stext) + 1) + if (v_block_mapped((unsigned long)_stext + 1)) mmu_mark_initmem_nx(); else change_page_attr(page, numpages, PAGE_KERNEL);
From: Ravi Bangoria ravi.bangoria@linux.ibm.com
commit f474c28fbcbe42faca4eb415172c07d76adcb819 upstream.
powerpc hardware triggers watchpoint before executing the instruction. To make trigger-after-execute behavior, kernel emulates the instruction. If the instruction is 'load something into non-volatile register', exception handler should restore emulated register state while returning back, otherwise there will be register state corruption. eg, adding a watchpoint on a list can corrput the list:
# cat /proc/kallsyms | grep kthread_create_list c00000000121c8b8 d kthread_create_list
Add watchpoint on kthread_create_list->prev:
# perf record -e mem:0xc00000000121c8c0
Run some workload such that new kthread gets invoked. eg, I just logged out from console:
list_add corruption. next->prev should be prev (c000000001214e00), \ but was c00000000121c8b8. (next=c00000000121c8b8). WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0 CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69 ... NIP __list_add_valid+0xb4/0xc0 LR __list_add_valid+0xb0/0xc0 Call Trace: __list_add_valid+0xb0/0xc0 (unreliable) __kthread_create_on_node+0xe0/0x260 kthread_create_on_node+0x34/0x50 create_worker+0xe8/0x260 worker_thread+0x444/0x560 kthread+0x160/0x1a0 ret_from_kernel_thread+0x5c/0x70
List corruption happened because it uses 'load into non-volatile register' instruction:
Snippet from __kthread_create_on_node:
c000000000136be8: addis r29,r2,-19 c000000000136bec: ld r29,31424(r29) if (!__list_add_valid(new, prev, next)) c000000000136bf0: mr r3,r30 c000000000136bf4: mr r5,r28 c000000000136bf8: mr r4,r29 c000000000136bfc: bl c00000000059a2f8 <__list_add_valid+0x8>
Register state from WARN_ON():
GPR00: c00000000059a3a0 c000007ff23afb50 c000000001344e00 0000000000000075 GPR04: 0000000000000000 0000000000000000 0000001852af8bc1 0000000000000000 GPR08: 0000000000000001 0000000000000007 0000000000000006 00000000000004aa GPR12: 0000000000000000 c000007ffffeb080 c000000000137038 c000005ff62aaa00 GPR16: 0000000000000000 0000000000000000 c000007fffbe7600 c000007fffbe7370 GPR20: c000007fffbe7320 c000007fffbe7300 c000000001373a00 0000000000000000 GPR24: fffffffffffffef7 c00000000012e320 c000007ff23afcb0 c000000000cb8628 GPR28: c00000000121c8b8 c000000001214e00 c000007fef5b17e8 c000007fef5b17c0
Watchpoint hit at 0xc000000000136bec.
addis r29,r2,-19 => r29 = 0xc000000001344e00 + (-19 << 16) => r29 = 0xc000000001214e00
ld r29,31424(r29) => r29 = *(0xc000000001214e00 + 31424) => r29 = *(0xc00000000121c8c0)
0xc00000000121c8c0 is where we placed a watchpoint and thus this instruction was emulated by emulate_step. But because handle_dabr_fault did not restore emulated register state, r29 still contains stale value in above register state.
Fixes: 5aae8a5370802 ("powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors") Signed-off-by: Ravi Bangoria ravi.bangoria@linux.ibm.com Cc: stable@vger.kernel.org # 2.6.36+ Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kernel/exceptions-64s.S | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -1746,7 +1746,7 @@ handle_page_fault: addi r3,r1,STACK_FRAME_OVERHEAD bl do_page_fault cmpdi r3,0 - beq+ 12f + beq+ ret_from_except_lite bl save_nvgprs mr r5,r3 addi r3,r1,STACK_FRAME_OVERHEAD @@ -1761,7 +1761,12 @@ handle_dabr_fault: ld r5,_DSISR(r1) addi r3,r1,STACK_FRAME_OVERHEAD bl do_break -12: b ret_from_except_lite + /* + * do_break() may have changed the NV GPRS while handling a breakpoint. + * If so, we need to restore them with their updated values. Don't use + * ret_from_except_lite here. + */ + b ret_from_except
#ifdef CONFIG_PPC_BOOK3S_64
From: Greg Kurz groug@kaod.org
commit 02c5f5394918b9b47ff4357b1b18335768cd867d upstream.
Since 902bdc57451c, get_pci_dev() calls pci_get_domain_bus_and_slot(). This has the effect of incrementing the reference count of the PCI device, as explained in drivers/pci/search.c:
* Given a PCI domain, bus, and slot/function number, the desired PCI * device is located in the list of PCI devices. If the device is * found, its reference count is increased and this function returns a * pointer to its data structure. The caller must decrement the * reference count by calling pci_dev_put(). If no device is found, * %NULL is returned.
Nothing was done to call pci_dev_put() and the reference count of GPU and NPU PCI devices rockets up.
A natural way to fix this would be to teach the callers about the change, so that they call pci_dev_put() when done with the pointer. This turns out to be quite intrusive, as it affects many paths in npu-dma.c, pci-ioda.c and vfio_pci_nvlink2.c. Also, the issue appeared in 4.16 and some affected code got moved around since then: it would be problematic to backport the fix to stable releases.
All that code never cared for reference counting anyway. Call pci_dev_put() from get_pci_dev() to revert to the previous behavior.
Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn") Cc: stable@vger.kernel.org # v4.16 Signed-off-by: Greg Kurz groug@kaod.org Reviewed-by: Alexey Kardashevskiy aik@ozlabs.ru Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/platforms/powernv/npu-dma.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)
--- a/arch/powerpc/platforms/powernv/npu-dma.c +++ b/arch/powerpc/platforms/powernv/npu-dma.c @@ -28,9 +28,22 @@ static DEFINE_SPINLOCK(npu_context_lock) static struct pci_dev *get_pci_dev(struct device_node *dn) { struct pci_dn *pdn = PCI_DN(dn); + struct pci_dev *pdev;
- return pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus), + pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus), pdn->busno, pdn->devfn); + + /* + * pci_get_domain_bus_and_slot() increased the reference count of + * the PCI device, but callers don't need that actually as the PE + * already holds a reference to the device. Since callers aren't + * aware of the reference count change, call pci_dev_put() now to + * avoid leaks. + */ + if (pdev) + pci_dev_put(pdev); + + return pdev; }
/* Given a NPU device get the associated PCI device. */
From: Athira Rajeev atrajeev@linux.vnet.ibm.com
commit f5a9e488d62360c91c5770bd55a0b40e419a71ce upstream.
commit 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C") reimplemented book3S code to pltform/powernv/idle.c. But when doing so missed to add the per-thread LDBAR update in the core_woken path of the power9_idle_stop(). Patch fixes the same.
Fixes: 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Athira Rajeev atrajeev@linux.vnet.ibm.com Signed-off-by: Madhavan Srinivasan maddy@linux.vnet.ibm.com Reviewed-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20190702105836.26695-1-maddy@linux.vnet.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/platforms/powernv/idle.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/platforms/powernv/idle.c +++ b/arch/powerpc/platforms/powernv/idle.c @@ -758,7 +758,6 @@ static unsigned long power9_idle_stop(un mtspr(SPRN_PTCR, sprs.ptcr); mtspr(SPRN_RPR, sprs.rpr); mtspr(SPRN_TSCR, sprs.tscr); - mtspr(SPRN_LDBAR, sprs.ldbar);
if (pls >= pnv_first_tb_loss_level) { /* TB loss */ @@ -790,6 +789,7 @@ core_woken: mtspr(SPRN_MMCR0, sprs.mmcr0); mtspr(SPRN_MMCR1, sprs.mmcr1); mtspr(SPRN_MMCR2, sprs.mmcr2); + mtspr(SPRN_LDBAR, sprs.ldbar);
mtspr(SPRN_SPRG3, local_paca->sprg_vdso);
From: Alexey Kardashevskiy aik@ozlabs.ru
commit 5636427d087a55842c1a199dfb839e6545d30e5d upstream.
The powernv platform uses @dma_iommu_ops for non-bypass DMA. These ops need an iommu_table pointer which is stored in dev->archdata.iommu_table_base. It is initialized during pcibios_setup_device() which handles boot time devices. However when a device is taken from the system in order to pass it through, the default IOMMU table is destroyed but the pointer in a device is not updated; also when a device is returned back to the system, a new table pointer is not stored in dev->archdata.iommu_table_base either. So when a just returned device tries using IOMMU, it crashes on accessing stale iommu_table or its members.
This calls set_iommu_table_base() when the default window is created. Note it used to be there before but was wrongly removed (see "fixes"). It did not appear before as these days most devices simply use bypass.
This adds set_iommu_table_base(NULL) when a device is taken from the system to make it clear that IOMMU DMA cannot be used past that point.
Fixes: c4e9d3c1e65a ("powerpc/powernv/pseries: Rework device adding to IOMMU groups") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Alexey Kardashevskiy aik@ozlabs.ru Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/platforms/powernv/pci-ioda.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -2456,6 +2456,14 @@ static long pnv_pci_ioda2_setup_default_ if (!pnv_iommu_bypass_disabled) pnv_pci_ioda2_set_bypass(pe, true);
+ /* + * Set table base for the case of IOMMU DMA use. Usually this is done + * from dma_dev_setup() which is not called when a device is returned + * from VFIO so do it here. + */ + if (pe->pdev) + set_iommu_table_base(&pe->pdev->dev, tbl); + return 0; }
@@ -2543,6 +2551,8 @@ static void pnv_ioda2_take_ownership(str pnv_pci_ioda2_unset_window(&pe->table_group, 0); if (pe->pbus) pnv_ioda_setup_bus_dma(pe, pe->pbus); + else if (pe->pdev) + set_iommu_table_base(&pe->pdev->dev, NULL); iommu_tce_table_put(tbl); }
From: Greg Kurz groug@kaod.org
commit a3bf9fbdad600b1e4335dd90979f8d6072e4f602 upstream.
On POWER9, if the hypervisor supports XIVE exploitation mode, the guest OS will unconditionally requests for the XIVE interrupt mode even if XIVE was deactivated with the kernel command line xive=off. Later on, when the spapr XIVE init code handles xive=off, it disables XIVE and tries to fall back on the legacy mode XICS.
This discrepency causes a kernel panic because the hypervisor is configured to provide the XIVE interrupt mode to the guest :
kernel BUG at arch/powerpc/sysdev/xics/xics-common.c:135! ... NIP xics_smp_probe+0x38/0x98 LR xics_smp_probe+0x2c/0x98 Call Trace: xics_smp_probe+0x2c/0x98 (unreliable) pSeries_smp_probe+0x40/0xa0 smp_prepare_cpus+0x62c/0x6ec kernel_init_freeable+0x148/0x448 kernel_init+0x2c/0x148 ret_from_kernel_thread+0x5c/0x68
Look for xive=off during prom_init and don't ask for XIVE in this case. One exception though: if the host only supports XIVE, we still want to boot so we ignore xive=off.
Similarly, have the spapr XIVE init code to looking at the interrupt mode negotiated during CAS, and ignore xive=off if the hypervisor only supports XIVE.
Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.20 Reported-by: Pavithra R. Prakash pavrampu@in.ibm.com Signed-off-by: Greg Kurz groug@kaod.org Reviewed-by: Cédric Le Goater clg@kaod.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kernel/prom_init.c | 16 +++++++++++- arch/powerpc/sysdev/xive/spapr.c | 52 ++++++++++++++++++++++++++++++++++++++- 2 files changed, 66 insertions(+), 2 deletions(-)
--- a/arch/powerpc/kernel/prom_init.c +++ b/arch/powerpc/kernel/prom_init.c @@ -168,6 +168,7 @@ static unsigned long __prombss prom_tce_
#ifdef CONFIG_PPC_PSERIES static bool __prombss prom_radix_disable; +static bool __prombss prom_xive_disable; #endif
struct platform_support { @@ -804,6 +805,12 @@ static void __init early_cmdline_parse(v } if (prom_radix_disable) prom_debug("Radix disabled from cmdline\n"); + + opt = prom_strstr(prom_cmd_line, "xive=off"); + if (opt) { + prom_xive_disable = true; + prom_debug("XIVE disabled from cmdline\n"); + } #endif /* CONFIG_PPC_PSERIES */ }
@@ -1212,10 +1219,17 @@ static void __init prom_parse_xive_model switch (val) { case OV5_FEAT(OV5_XIVE_EITHER): /* Either Available */ prom_debug("XIVE - either mode supported\n"); - support->xive = true; + support->xive = !prom_xive_disable; break; case OV5_FEAT(OV5_XIVE_EXPLOIT): /* Only Exploitation mode */ prom_debug("XIVE - exploitation mode supported\n"); + if (prom_xive_disable) { + /* + * If we __have__ to do XIVE, we're better off ignoring + * the command line rather than not booting. + */ + prom_printf("WARNING: Ignoring cmdline option xive=off\n"); + } support->xive = true; break; case OV5_FEAT(OV5_XIVE_LEGACY): /* Only Legacy mode */ --- a/arch/powerpc/sysdev/xive/spapr.c +++ b/arch/powerpc/sysdev/xive/spapr.c @@ -16,6 +16,7 @@ #include <linux/cpumask.h> #include <linux/mm.h> #include <linux/delay.h> +#include <linux/libfdt.h>
#include <asm/prom.h> #include <asm/io.h> @@ -659,6 +660,55 @@ static bool xive_get_max_prio(u8 *max_pr return true; }
+static const u8 *get_vec5_feature(unsigned int index) +{ + unsigned long root, chosen; + int size; + const u8 *vec5; + + root = of_get_flat_dt_root(); + chosen = of_get_flat_dt_subnode_by_name(root, "chosen"); + if (chosen == -FDT_ERR_NOTFOUND) + return NULL; + + vec5 = of_get_flat_dt_prop(chosen, "ibm,architecture-vec-5", &size); + if (!vec5) + return NULL; + + if (size <= index) + return NULL; + + return vec5 + index; +} + +static bool xive_spapr_disabled(void) +{ + const u8 *vec5_xive; + + vec5_xive = get_vec5_feature(OV5_INDX(OV5_XIVE_SUPPORT)); + if (vec5_xive) { + u8 val; + + val = *vec5_xive & OV5_FEAT(OV5_XIVE_SUPPORT); + switch (val) { + case OV5_FEAT(OV5_XIVE_EITHER): + case OV5_FEAT(OV5_XIVE_LEGACY): + break; + case OV5_FEAT(OV5_XIVE_EXPLOIT): + /* Hypervisor only supports XIVE */ + if (xive_cmdline_disabled) + pr_warn("WARNING: Ignoring cmdline option xive=off\n"); + return false; + default: + pr_warn("%s: Unknown xive support option: 0x%x\n", + __func__, val); + break; + } + } + + return xive_cmdline_disabled; +} + bool __init xive_spapr_init(void) { struct device_node *np; @@ -671,7 +721,7 @@ bool __init xive_spapr_init(void) const __be32 *reg; int i;
- if (xive_cmdline_disabled) + if (xive_spapr_disabled()) return false;
pr_devel("%s()\n", __func__);
From: Nathan Lynch nathanl@linux.ibm.com
commit 0aa82c482ab2ece530a6f44897b63b274bb43c8e upstream.
During post-migration device tree updates, we can oops in pseries_update_drconf_memory() if the source device tree has an ibm,dynamic-memory-v2 property and the destination has a ibm,dynamic_memory (v1) property. The notifier processes an "update" for the ibm,dynamic-memory property but it's really an add in this scenario. So make sure the old property object is there before dereferencing it.
Fixes: 2b31e3aec1db ("powerpc/drmem: Add support for ibm, dynamic-memory-v2 property") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Nathan Lynch nathanl@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/platforms/pseries/hotplug-memory.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -976,6 +976,9 @@ static int pseries_update_drconf_memory( if (!memblock_size) return -EINVAL;
+ if (!pr->old_prop) + return 0; + p = (__be32 *) pr->old_prop->value; if (!p) return -EINVAL;
From: Jorge Ramirez-Ortiz jorge.ramirez-ortiz@linaro.org
commit 5e6b6651d22de109ebf48ca00d0373bc2c0cc080 upstream.
mutexes can sleep and therefore should not be taken while holding a spinlock. move clk_get_rate (can sleep) outside the spinlock protected region.
Fixes: 83736352e0ca ("mmc: sdhci-msm: Update DLL reset sequence") Cc: stable@vger.kernel.org Signed-off-by: Jorge Ramirez-Ortiz jorge.ramirez-ortiz@linaro.org Reviewed-by: Bjorn Andersson bjorn.andersson@linaro.org Reviewed-by: Vinod Koul vkoul@kernel.org Acked-by: Adrian Hunter adrian.hunter@intel.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mmc/host/sdhci-msm.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/mmc/host/sdhci-msm.c +++ b/drivers/mmc/host/sdhci-msm.c @@ -575,11 +575,14 @@ static int msm_init_cm_dll(struct sdhci_ struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host); struct sdhci_msm_host *msm_host = sdhci_pltfm_priv(pltfm_host); int wait_cnt = 50; - unsigned long flags; + unsigned long flags, xo_clk = 0; u32 config; const struct sdhci_msm_offset *msm_offset = msm_host->offset;
+ if (msm_host->use_14lpp_dll_reset && !IS_ERR_OR_NULL(msm_host->xo_clk)) + xo_clk = clk_get_rate(msm_host->xo_clk); + spin_lock_irqsave(&host->lock, flags);
/* @@ -627,10 +630,10 @@ static int msm_init_cm_dll(struct sdhci_ config &= CORE_FLL_CYCLE_CNT; if (config) mclk_freq = DIV_ROUND_CLOSEST_ULL((host->clock * 8), - clk_get_rate(msm_host->xo_clk)); + xo_clk); else mclk_freq = DIV_ROUND_CLOSEST_ULL((host->clock * 4), - clk_get_rate(msm_host->xo_clk)); + xo_clk);
config = readl_relaxed(host->ioaddr + msm_offset->core_dll_config_2);
From: Dan Carpenter dan.carpenter@oracle.com
commit 0bdf8a8245fdea6f075a5fede833a5fcf1b3466c upstream.
ECRYPTFS_SIZE_AND_MARKER_BYTES is type size_t, so if "rc" is negative that gets type promoted to a high positive value and treated as success.
Fixes: 778aeb42a708 ("eCryptfs: Cleanup and optimize ecryptfs_lookup_interpose()") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com [tyhicks: Use "if/else if" rather than "if/if"] Cc: stable@vger.kernel.org Signed-off-by: Tyler Hicks tyhicks@canonical.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ecryptfs/crypto.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
--- a/fs/ecryptfs/crypto.c +++ b/fs/ecryptfs/crypto.c @@ -1004,8 +1004,10 @@ int ecryptfs_read_and_validate_header_re
rc = ecryptfs_read_lower(file_size, 0, ECRYPTFS_SIZE_AND_MARKER_BYTES, inode); - if (rc < ECRYPTFS_SIZE_AND_MARKER_BYTES) - return rc >= 0 ? -EINVAL : rc; + if (rc < 0) + return rc; + else if (rc < ECRYPTFS_SIZE_AND_MARKER_BYTES) + return -EINVAL; rc = ecryptfs_validate_marker(marker); if (!rc) ecryptfs_i_size_init(file_size, inode); @@ -1367,8 +1369,10 @@ int ecryptfs_read_and_validate_xattr_reg ecryptfs_inode_to_lower(inode), ECRYPTFS_XATTR_NAME, file_size, ECRYPTFS_SIZE_AND_MARKER_BYTES); - if (rc < ECRYPTFS_SIZE_AND_MARKER_BYTES) - return rc >= 0 ? -EINVAL : rc; + if (rc < 0) + return rc; + else if (rc < ECRYPTFS_SIZE_AND_MARKER_BYTES) + return -EINVAL; rc = ecryptfs_validate_marker(marker); if (!rc) ecryptfs_i_size_init(file_size, inode);
From: Xiaolei Li xiaolei.li@mediatek.com
commit e1884ffddacc0424d7e785e6f8087bd12f7196db upstream.
At present, the flow of calculating AC timing of read/write cycle in SDR mode is that: At first, calculate high hold time which is valid for both read and write cycle using the max value between tREH_min and tWH_min. Secondly, calculate WE# pulse width using tWP_min. Thridly, calculate RE# pulse width using the bigger one between tREA_max and tRP_min.
But NAND SPEC shows that Controller should also meet write/read cycle time. That is write cycle time should be more than tWC_min and read cycle should be more than tRC_min. Obviously, we do not achieve that now.
This patch corrects the low level time calculation to meet minimum read/write cycle time required. After getting the high hold time, WE# low level time will be promised to meet tWP_min and tWC_min requirement, and RE# low level time will be promised to meet tREA_max, tRP_min and tRC_min requirement.
Fixes: edfee3619c49 ("mtd: nand: mtk: add ->setup_data_interface() hook") Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Xiaolei Li xiaolei.li@mediatek.com Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mtd/nand/raw/mtk_nand.c | 24 +++++++++++++++++++++--- 1 file changed, 21 insertions(+), 3 deletions(-)
--- a/drivers/mtd/nand/raw/mtk_nand.c +++ b/drivers/mtd/nand/raw/mtk_nand.c @@ -500,7 +500,8 @@ static int mtk_nfc_setup_data_interface( { struct mtk_nfc *nfc = nand_get_controller_data(chip); const struct nand_sdr_timings *timings; - u32 rate, tpoecs, tprecs, tc2r, tw2r, twh, twst, trlt; + u32 rate, tpoecs, tprecs, tc2r, tw2r, twh, twst = 0, trlt = 0; + u32 thold;
timings = nand_get_sdr_timings(conf); if (IS_ERR(timings)) @@ -536,11 +537,28 @@ static int mtk_nfc_setup_data_interface( twh = DIV_ROUND_UP(twh * rate, 1000000) - 1; twh &= 0xf;
- twst = timings->tWP_min / 1000; + /* Calculate real WE#/RE# hold time in nanosecond */ + thold = (twh + 1) * 1000000 / rate; + /* nanosecond to picosecond */ + thold *= 1000; + + /* + * WE# low level time should be expaned to meet WE# pulse time + * and WE# cycle time at the same time. + */ + if (thold < timings->tWC_min) + twst = timings->tWC_min - thold; + twst = max(timings->tWP_min, twst) / 1000; twst = DIV_ROUND_UP(twst * rate, 1000000) - 1; twst &= 0xf;
- trlt = max(timings->tREA_max, timings->tRP_min) / 1000; + /* + * RE# low level time should be expaned to meet RE# pulse time, + * RE# access time and RE# cycle time at the same time. + */ + if (thold < timings->tRC_min) + trlt = timings->tRC_min - thold; + trlt = max3(trlt, timings->tREA_max, timings->tRP_min) / 1000; trlt = DIV_ROUND_UP(trlt * rate, 1000000) - 1; trlt &= 0xf;
From: liaoweixiong liaoweixiong@allwinnertech.com
commit b83408b580eccf8d2797cd6cb9ae42c2a28656a7 upstream.
In case of the last page containing bitflips (ret > 0), spinand_mtd_read() will return that number of bitflips for the last page while it should instead return max_bitflips like it does when the last page read returns with 0.
Signed-off-by: Weixiong Liao liaoweixiong@allwinnertech.com Reviewed-by: Boris Brezillon boris.brezillon@collabora.com Reviewed-by: Frieder Schrempf frieder.schrempf@kontron.de Cc: stable@vger.kernel.org Fixes: 7529df465248 ("mtd: nand: Add core infrastructure to support SPI NANDs") Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mtd/nand/spi/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mtd/nand/spi/core.c +++ b/drivers/mtd/nand/spi/core.c @@ -511,12 +511,12 @@ static int spinand_mtd_read(struct mtd_i if (ret == -EBADMSG) { ecc_failed = true; mtd->ecc_stats.failed++; - ret = 0; } else { mtd->ecc_stats.corrected += ret; max_bitflips = max_t(unsigned int, max_bitflips, ret); }
+ ret = 0; ops->retlen += iter.req.datalen; ops->oobretlen += iter.req.ooblen; }
From: YueHaibing yuehaibing@huawei.com
commit 9800db282dff675dd700d5985d90b605c34b5ccd upstream.
Commit aad14ad3cf3a ("intel_th: msu: Add current window tracking") added the following gcc warning:
drivers/hwtracing/intel_th/msu.c: In function msc_win_switch: drivers/hwtracing/intel_th/msu.c:1389:21: warning: variable last set but not used [-Wunused-but-set-variable]
Fix it by removing the variable.
Signed-off-by: YueHaibing yuehaibing@huawei.com Fixes: aad14ad3cf3a ("intel_th: msu: Add current window tracking") Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20190621161930.60785-3-alexander.shishkin@linux.in... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwtracing/intel_th/msu.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/hwtracing/intel_th/msu.c +++ b/drivers/hwtracing/intel_th/msu.c @@ -1400,10 +1400,9 @@ static int intel_th_msc_init(struct msc
static void msc_win_switch(struct msc *msc) { - struct msc_window *last, *first; + struct msc_window *first;
first = list_first_entry(&msc->win_list, struct msc_window, entry); - last = list_last_entry(&msc->win_list, struct msc_window, entry);
if (msc_is_last_win(msc->cur_win)) msc->cur_win = first;
From: Alexander Shishkin alexander.shishkin@linux.intel.com
commit 918b8646497b5dba6ae82d4a7325f01b258972b9 upstream.
Commit 4e0eaf239fb3 ("intel_th: msu: Fix single mode with IOMMU") switched the single mode code to use dma mapping pages obtained from the page allocator, but with IOMMU disabled, that may lead to using SWIOTLB bounce buffers and without additional sync'ing, produces empty trace buffers.
Fix this by using a DMA32 GFP flag to the page allocation in single mode, as the device supports full 32-bit DMA addressing.
Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Fixes: 4e0eaf239fb3 ("intel_th: msu: Fix single mode with IOMMU") Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reported-by: Ammy Yi ammy.yi@intel.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20190621161930.60785-4-alexander.shishkin@linux.in... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hwtracing/intel_th/msu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/hwtracing/intel_th/msu.c +++ b/drivers/hwtracing/intel_th/msu.c @@ -667,7 +667,7 @@ static int msc_buffer_contig_alloc(struc goto err_out;
ret = -ENOMEM; - page = alloc_pages(GFP_KERNEL | __GFP_ZERO, order); + page = alloc_pages(GFP_KERNEL | __GFP_ZERO | GFP_DMA32, order); if (!page) goto err_free_sgt;
From: Szymon Janc szymon.janc@codecoup.pl
commit 1d87b88ba26eabd4745e158ecfd87c93a9b51dc2 upstream.
Microsoft Surface Precision Mouse provides bogus identity address when pairing. It connects with Static Random address but provides Public Address in SMP Identity Address Information PDU. Address has same value but type is different. Workaround this by dropping IRK if ID address discrepancy is detected.
HCI Event: LE Meta Event (0x3e) plen 19
LE Connection Complete (0x01) Status: Success (0x00) Handle: 75 Role: Master (0x00) Peer address type: Random (0x01) Peer address: E0:52:33:93:3B:21 (Static) Connection interval: 50.00 msec (0x0028) Connection latency: 0 (0x0000) Supervision timeout: 420 msec (0x002a) Master clock accuracy: 0x00
....
ACL Data RX: Handle 75 flags 0x02 dlen 12
SMP: Identity Address Information (0x09) len 7 Address type: Public (0x00) Address: E0:52:33:93:3B:21
Signed-off-by: Szymon Janc szymon.janc@codecoup.pl Tested-by: Maarten Fonville maarten.fonville@gmail.com Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199461 Cc: stable@vger.kernel.org Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/bluetooth/smp.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/net/bluetooth/smp.c +++ b/net/bluetooth/smp.c @@ -2579,6 +2579,19 @@ static int smp_cmd_ident_addr_info(struc goto distribute; }
+ /* Drop IRK if peer is using identity address during pairing but is + * providing different address as identity information. + * + * Microsoft Surface Precision Mouse is known to have this bug. + */ + if (hci_is_identity_address(&hcon->dst, hcon->dst_type) && + (bacmp(&info->bdaddr, &hcon->dst) || + info->addr_type != hcon->dst_type)) { + bt_dev_err(hcon->hdev, + "ignoring IRK with invalid identity address"); + goto distribute; + } + bacpy(&smp->id_addr, &info->bdaddr); smp->id_addr_type = info->addr_type;
From: Matthew Wilcox (Oracle) willy@infradead.org
commit 23c84eb7837514e16d79ed6d849b13745e0ce688 upstream.
RocksDB can hang indefinitely when using a DAX file. This is due to a bug in the XArray conversion when handling a PMD fault and finding a PTE entry. We use the wrong index in the hash and end up waiting on the wrong waitqueue.
There's actually no need to wait; if we find a PTE entry while looking for a PMD entry, we can return immediately as we know we should fall back to a PTE fault (which may not conflict with the lock held).
We reuse the XA_RETRY_ENTRY to signal a conflicting entry was found. This value can never be found in an XArray while holding its lock, so it does not create an ambiguity.
Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/CAPcyv4hwHpX-MkUEqxwdTj7wCCZCN4RV-L4jsnuwLGyL_UEG4A... Fixes: b15cd800682f ("dax: Convert page fault handlers to XArray") Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Tested-by: Dan Williams dan.j.williams@intel.com Reported-by: Robert Barror robert.barror@intel.com Reported-by: Seema Pandit seema.pandit@intel.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/dax.c | 53 +++++++++++++++++++++++++++++++++-------------------- 1 file changed, 33 insertions(+), 20 deletions(-)
--- a/fs/dax.c +++ b/fs/dax.c @@ -124,6 +124,15 @@ static int dax_is_empty_entry(void *entr }
/* + * true if the entry that was found is of a smaller order than the entry + * we were looking for + */ +static bool dax_is_conflict(void *entry) +{ + return entry == XA_RETRY_ENTRY; +} + +/* * DAX page cache entry locking */ struct exceptional_entry_key { @@ -195,11 +204,13 @@ static void dax_wake_entry(struct xa_sta * Look up entry in page cache, wait for it to become unlocked if it * is a DAX entry and return it. The caller must subsequently call * put_unlocked_entry() if it did not lock the entry or dax_unlock_entry() - * if it did. + * if it did. The entry returned may have a larger order than @order. + * If @order is larger than the order of the entry found in i_pages, this + * function returns a dax_is_conflict entry. * * Must be called with the i_pages lock held. */ -static void *get_unlocked_entry(struct xa_state *xas) +static void *get_unlocked_entry(struct xa_state *xas, unsigned int order) { void *entry; struct wait_exceptional_entry_queue ewait; @@ -210,6 +221,8 @@ static void *get_unlocked_entry(struct x
for (;;) { entry = xas_find_conflict(xas); + if (dax_entry_order(entry) < order) + return XA_RETRY_ENTRY; if (!entry || WARN_ON_ONCE(!xa_is_value(entry)) || !dax_is_locked(entry)) return entry; @@ -254,7 +267,7 @@ static void wait_entry_unlocked(struct x static void put_unlocked_entry(struct xa_state *xas, void *entry) { /* If we were the only waiter woken, wake the next one */ - if (entry) + if (entry && dax_is_conflict(entry)) dax_wake_entry(xas, entry, false); }
@@ -461,7 +474,7 @@ void dax_unlock_page(struct page *page, * overlap with xarray value entries. */ static void *grab_mapping_entry(struct xa_state *xas, - struct address_space *mapping, unsigned long size_flag) + struct address_space *mapping, unsigned int order) { unsigned long index = xas->xa_index; bool pmd_downgrade = false; /* splitting PMD entry into PTE entries? */ @@ -469,20 +482,17 @@ static void *grab_mapping_entry(struct x
retry: xas_lock_irq(xas); - entry = get_unlocked_entry(xas); + entry = get_unlocked_entry(xas, order);
if (entry) { + if (dax_is_conflict(entry)) + goto fallback; if (!xa_is_value(entry)) { xas_set_err(xas, EIO); goto out_unlock; }
- if (size_flag & DAX_PMD) { - if (dax_is_pte_entry(entry)) { - put_unlocked_entry(xas, entry); - goto fallback; - } - } else { /* trying to grab a PTE entry */ + if (order == 0) { if (dax_is_pmd_entry(entry) && (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))) { @@ -523,7 +533,11 @@ retry: if (entry) { dax_lock_entry(xas, entry); } else { - entry = dax_make_entry(pfn_to_pfn_t(0), size_flag | DAX_EMPTY); + unsigned long flags = DAX_EMPTY; + + if (order > 0) + flags |= DAX_PMD; + entry = dax_make_entry(pfn_to_pfn_t(0), flags); dax_lock_entry(xas, entry); if (xas_error(xas)) goto out_unlock; @@ -594,7 +608,7 @@ struct page *dax_layout_busy_page(struct if (WARN_ON_ONCE(!xa_is_value(entry))) continue; if (unlikely(dax_is_locked(entry))) - entry = get_unlocked_entry(&xas); + entry = get_unlocked_entry(&xas, 0); if (entry) page = dax_busy_page(entry); put_unlocked_entry(&xas, entry); @@ -621,7 +635,7 @@ static int __dax_invalidate_entry(struct void *entry;
xas_lock_irq(&xas); - entry = get_unlocked_entry(&xas); + entry = get_unlocked_entry(&xas, 0); if (!entry || WARN_ON_ONCE(!xa_is_value(entry))) goto out; if (!trunc && @@ -848,7 +862,7 @@ static int dax_writeback_one(struct xa_s if (unlikely(dax_is_locked(entry))) { void *old_entry = entry;
- entry = get_unlocked_entry(xas); + entry = get_unlocked_entry(xas, 0);
/* Entry got punched out / reallocated? */ if (!entry || WARN_ON_ONCE(!xa_is_value(entry))) @@ -1509,7 +1523,7 @@ static vm_fault_t dax_iomap_pmd_fault(st * entry is already in the array, for instance), it will return * VM_FAULT_FALLBACK. */ - entry = grab_mapping_entry(&xas, mapping, DAX_PMD); + entry = grab_mapping_entry(&xas, mapping, PMD_ORDER); if (xa_is_internal(entry)) { result = xa_to_internal(entry); goto fallback; @@ -1658,11 +1672,10 @@ dax_insert_pfn_mkwrite(struct vm_fault * vm_fault_t ret;
xas_lock_irq(&xas); - entry = get_unlocked_entry(&xas); + entry = get_unlocked_entry(&xas, order); /* Did we race with someone splitting entry or so? */ - if (!entry || - (order == 0 && !dax_is_pte_entry(entry)) || - (order == PMD_ORDER && !dax_is_pmd_entry(entry))) { + if (!entry || dax_is_conflict(entry) || + (order == 0 && !dax_is_pte_entry(entry))) { put_unlocked_entry(&xas, entry); xas_unlock_irq(&xas); trace_dax_insert_pfn_mkwrite_no_entry(mapping->host, vmf,
From: Lee, Chiasheng chiasheng.lee@intel.com
commit e244c4699f859cf7149b0781b1894c7996a8a1df upstream.
With Link Power Management (LPM) enabled USB3 links transition to low power U1/U2 link states from U0 state automatically.
Current hub code detects USB3 remote wakeups by checking if the software state still shows suspended, but the link has transitioned from suspended U3 to enabled U0 state.
As it takes some time before the hub thread reads the port link state after a USB3 wake notification, the link may have transitioned from U0 to U1/U2, and wake is not detected by hub code.
Fix this by handling U1/U2 states in the same way as U0 in USB3 wakeup handling
This patch should be added to stable kernels since 4.13 where LPM was kept enabled during suspend/resume
Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Lee, Chiasheng chiasheng.lee@intel.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/core/hub.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -3617,6 +3617,7 @@ static int hub_handle_remote_wakeup(stru struct usb_device *hdev; struct usb_device *udev; int connect_change = 0; + u16 link_state; int ret;
hdev = hub->hdev; @@ -3626,9 +3627,11 @@ static int hub_handle_remote_wakeup(stru return 0; usb_clear_port_feature(hdev, port, USB_PORT_FEAT_C_SUSPEND); } else { + link_state = portstatus & USB_PORT_STAT_LINK_STATE; if (!udev || udev->state != USB_STATE_SUSPENDED || - (portstatus & USB_PORT_STAT_LINK_STATE) != - USB_SS_PORT_LS_U0) + (link_state != USB_SS_PORT_LS_U0 && + link_state != USB_SS_PORT_LS_U1 && + link_state != USB_SS_PORT_LS_U2)) return 0; }
From: Konstantin Khlebnikov khlebnikov@yandex-team.ru
commit 3a10f999ffd464d01c5a05592a15470a3c4bbc36 upstream.
After commit 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops limit is enforced") wait time could be zero even if group is throttled and cannot issue requests right now. As a result throtl_select_dispatch() turns into busy-loop under irq-safe queue spinlock.
Fix is simple: always round up target time to the next throttle slice.
Fixes: 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops limit is enforced") Signed-off-by: Konstantin Khlebnikov khlebnikov@yandex-team.ru Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- block/blk-throttle.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
--- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -881,13 +881,10 @@ static bool tg_with_in_iops_limit(struct unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd; u64 tmp;
- jiffy_elapsed = jiffy_elapsed_rnd = jiffies - tg->slice_start[rw]; + jiffy_elapsed = jiffies - tg->slice_start[rw];
- /* Slice has just started. Consider one slice interval */ - if (!jiffy_elapsed) - jiffy_elapsed_rnd = tg->td->throtl_slice; - - jiffy_elapsed_rnd = roundup(jiffy_elapsed_rnd, tg->td->throtl_slice); + /* Round up to the next throttle slice, wait time must be nonzero */ + jiffy_elapsed_rnd = roundup(jiffy_elapsed + 1, tg->td->throtl_slice);
/* * jiffy_elapsed_rnd should not be a big value as minimum iops can be
From: Peng Fan peng.fan@nxp.com
commit 5b933e28d8b1fbdc7fbac4bfc569f3b152c3dd59 upstream.
There is no audio_pll2_clk registered, it should be audio_pll2_out.
Cc: stable@vger.kernel.org Fixes: ba5625c3e272 ("clk: imx: Add clock driver support for imx8mm") Signed-off-by: Peng Fan peng.fan@nxp.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/clk/imx/clk-imx8mm.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/clk/imx/clk-imx8mm.c +++ b/drivers/clk/imx/clk-imx8mm.c @@ -325,7 +325,7 @@ static const char *imx8mm_dsi_dbi_sels[] "sys_pll2_1000m", "sys_pll3_out", "audio_pll2_out", "video_pll1_out", };
static const char *imx8mm_usdhc3_sels[] = {"osc_24m", "sys_pll1_400m", "sys_pll1_800m", "sys_pll2_500m", - "sys_pll3_out", "sys_pll1_266m", "audio_pll2_clk", "sys_pll1_100m", }; + "sys_pll3_out", "sys_pll1_266m", "audio_pll2_out", "sys_pll1_100m", };
static const char *imx8mm_csi1_core_sels[] = {"osc_24m", "sys_pll1_266m", "sys_pll2_250m", "sys_pll1_800m", "sys_pll2_1000m", "sys_pll3_out", "audio_pll2_out", "video_pll1_out", }; @@ -361,11 +361,11 @@ static const char *imx8mm_pdm_sels[] = { "sys_pll2_1000m", "sys_pll3_out", "clk_ext3", "audio_pll2_out", };
static const char *imx8mm_vpu_h1_sels[] = {"osc_24m", "vpu_pll_out", "sys_pll1_800m", "sys_pll2_1000m", - "audio_pll2_clk", "sys_pll2_125m", "sys_pll3_clk", "audio_pll1_out", }; + "audio_pll2_out", "sys_pll2_125m", "sys_pll3_clk", "audio_pll1_out", };
static const char *imx8mm_dram_core_sels[] = {"dram_pll_out", "dram_alt_root", };
-static const char *imx8mm_clko1_sels[] = {"osc_24m", "sys_pll1_800m", "osc_27m", "sys_pll1_200m", "audio_pll2_clk", +static const char *imx8mm_clko1_sels[] = {"osc_24m", "sys_pll1_800m", "osc_27m", "sys_pll1_200m", "audio_pll2_out", "vpu_pll", "sys_pll1_80m", };
static struct clk *clks[IMX8MM_CLK_END];
From: Tejun Heo tj@kernel.org
commit 5de0073fcd50cc1f150895a7bb04d3cf8067b1d7 upstream.
If use_delay was non-zero when the latency target of a cgroup was set to zero, it will stay stuck until io.latency is enabled on the cgroup again. This keeps readahead disabled for the cgroup impacting performance negatively.
Signed-off-by: Tejun Heo tj@kernel.org Cc: Josef Bacik jbacik@fb.com Fixes: d70675121546 ("block: introduce blk-iolatency io controller") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- block/blk-iolatency.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/block/blk-iolatency.c +++ b/block/blk-iolatency.c @@ -759,8 +759,10 @@ static int iolatency_set_min_lat_nsec(st
if (!oldval && val) return 1; - if (oldval && !val) + if (oldval && !val) { + blkcg_clear_delay(blkg); return -1; + } return 0; }
From: Tejun Heo tj@kernel.org
commit f539da82f2158916e154d206054e0efd5df7ab61 upstream.
Depending on the number of devices, blkcg stats can go over the default seqfile buf size. seqfile normally retries with a larger buffer but since the ->pd_stat() addition, blkcg_print_stat() doesn't tell seqfile that overflow has happened and the output gets printed truncated. Fix it by calling seq_commit() w/ -1 on possible overflows.
Signed-off-by: Tejun Heo tj@kernel.org Fixes: 903d23f0a354 ("blk-cgroup: allow controllers to output their own stats") Cc: stable@vger.kernel.org # v4.19+ Cc: Josef Bacik jbacik@fb.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- block/blk-cgroup.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -1006,8 +1006,12 @@ static int blkcg_print_stat(struct seq_f } next: if (has_stats) { - off += scnprintf(buf+off, size-off, "\n"); - seq_commit(sf, off); + if (off < size - 1) { + off += scnprintf(buf+off, size-off, "\n"); + seq_commit(sf, off); + } else { + seq_commit(sf, -1); + } } }
From: Josua Mayer josua@solid-run.com
commit 4aabed699c400810981d3dda170f05fa4d782905 upstream.
Allow up to four clocks to be specified and enabled for the orion-mdio interface, which are required by the Armada 8k and defined in armada-cp110.dtsi.
Fixes a hang in probing the mvmdio driver that was encountered on the Clearfog GT 8K with all drivers built as modules, but also affects other boards such as the MacchiatoBIN.
Cc: stable@vger.kernel.org Fixes: 96cb43423822 ("net: mvmdio: allow up to three clocks to be specified for orion-mdio") Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Josua Mayer josua@solid-run.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/ethernet/marvell/mvmdio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/marvell/mvmdio.c +++ b/drivers/net/ethernet/marvell/mvmdio.c @@ -64,7 +64,7 @@
struct orion_mdio_dev { void __iomem *regs; - struct clk *clk[3]; + struct clk *clk[4]; /* * If we have access to the error interrupt pin (which is * somewhat misnamed as it not only reflects internal errors
From: Josua Mayer josua@solid-run.com
commit 80785f5a22e9073e2ded5958feb7f220e066d17b upstream.
Armada 8040 needs four clocks to be enabled for MDIO accesses to work. Update the binding to allow the extra clock to be specified.
Cc: stable@vger.kernel.org Fixes: 6d6a331f44a1 ("dt-bindings: allow up to three clocks for orion-mdio") Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Josua Mayer josua@solid-run.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Documentation/devicetree/bindings/net/marvell-orion-mdio.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/Documentation/devicetree/bindings/net/marvell-orion-mdio.txt +++ b/Documentation/devicetree/bindings/net/marvell-orion-mdio.txt @@ -16,7 +16,7 @@ Required properties:
Optional properties: - interrupts: interrupt line number for the SMI error/done interrupt -- clocks: phandle for up to three required clocks for the MDIO instance +- clocks: phandle for up to four required clocks for the MDIO instance
The child nodes of the MDIO driver are the individual PHY devices connected to this MDIO bus. They must have a "reg" property given the
From: Norbert Manthey nmanthey@amazon.de
commit 4c6d80e1144bdf48cae6b602ae30d41f3e5c76a9 upstream.
The pstore_mkfile() function is passed a pointer to a struct pstore_record. On success it consumes this 'record' pointer and references it from the created inode.
On failure, however, it may or may not free the record. There are even two different code paths which return -ENOMEM -- one of which does and the other doesn't free the record.
Make the behaviour deterministic by never consuming and freeing the record when returning failure, allowing the caller to do the cleanup consistently.
Signed-off-by: Norbert Manthey nmanthey@amazon.de Link: https://lore.kernel.org/r/1562331960-26198-1-git-send-email-nmanthey@amazon.... Fixes: 83f70f0769ddd ("pstore: Do not duplicate record metadata") Fixes: 1dfff7dd67d1a ("pstore: Pass record contents instead of copying") Cc: stable@vger.kernel.org [kees: also move "private" allocation location, rename inode cleanup label] Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/pstore/inode.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
--- a/fs/pstore/inode.c +++ b/fs/pstore/inode.c @@ -318,22 +318,21 @@ int pstore_mkfile(struct dentry *root, s goto fail; inode->i_mode = S_IFREG | 0444; inode->i_fop = &pstore_file_operations; - private = kzalloc(sizeof(*private), GFP_KERNEL); - if (!private) - goto fail_alloc; - private->record = record; - scnprintf(name, sizeof(name), "%s-%s-%llu%s", pstore_type_to_name(record->type), record->psi->name, record->id, record->compressed ? ".enc.z" : "");
+ private = kzalloc(sizeof(*private), GFP_KERNEL); + if (!private) + goto fail_inode; + dentry = d_alloc_name(root, name); if (!dentry) goto fail_private;
+ private->record = record; inode->i_size = private->total_size = size; - inode->i_private = private;
if (record->time.tv_sec) @@ -349,7 +348,7 @@ int pstore_mkfile(struct dentry *root, s
fail_private: free_pstore_private(private); -fail_alloc: +fail_inode: iput(inode);
fail:
From: Bjorn Andersson bjorn.andersson@linaro.org
commit 885bd765963b42c380db442db7f1c0f2a26076fa upstream.
After issuing a PHY_START request to the QMP, the hardware documentation states that the software should wait for the PCS_READY_STATUS to become 1.
With the introduction of commit c9b589791fc1 ("phy: qcom: Utilize UFS reset controller") an additional 1ms delay was introduced between the start request and the check of the status bit. This greatly increases the chances for the hardware to actually becoming ready before the status bit is read.
The result can be seen in that UFS PHY enabling is now reported as a failure in 10% of the boots on SDM845, which is a clear regression from the previous rare/occasional failure.
This patch fixes the "break condition" of the poll to check for the correct state of the status bit.
Unfortunately PCIe on 8996 and 8998 does not specify the mask_pcs_ready register, which means that the code checks a bit that's always 0. So the patch also fixes these, in order to not regress these targets.
Fixes: 73d7ec899bd8 ("phy: qcom-qmp: Add msm8998 PCIe QMP PHY support") Fixes: e78f3d15e115 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets") Cc: stable@vger.kernel.org Cc: Evan Green evgreen@chromium.org Cc: Marc Gonzalez marc.w.gonzalez@free.fr Cc: Vivek Gautam vivek.gautam@codeaurora.org Reviewed-by: Evan Green evgreen@chromium.org Reviewed-by: Niklas Cassel niklas.cassel@linaro.org Reviewed-by: Marc Gonzalez marc.w.gonzalez@free.fr Tested-by: Marc Gonzalez marc.w.gonzalez@free.fr Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Kishon Vijay Abraham I kishon@ti.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/phy/qualcomm/phy-qcom-qmp.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/phy/qualcomm/phy-qcom-qmp.c +++ b/drivers/phy/qualcomm/phy-qcom-qmp.c @@ -1074,6 +1074,7 @@ static const struct qmp_phy_cfg msm8996_
.start_ctrl = PCS_START | PLL_READY_GATE_EN, .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, + .mask_pcs_ready = PHYSTATUS, .mask_com_pcs_ready = PCS_READY,
.has_phy_com_ctrl = true, @@ -1253,6 +1254,7 @@ static const struct qmp_phy_cfg msm8998_
.start_ctrl = SERDES_START | PCS_START, .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, + .mask_pcs_ready = PHYSTATUS, .mask_com_pcs_ready = PCS_READY, };
@@ -1547,7 +1549,7 @@ static int qcom_qmp_phy_enable(struct ph status = pcs + cfg->regs[QPHY_PCS_READY_STATUS]; mask = cfg->mask_pcs_ready;
- ret = readl_poll_timeout(status, val, !(val & mask), 1, + ret = readl_poll_timeout(status, val, val & mask, 1, PHY_INIT_COMPLETE_TIMEOUT); if (ret) { dev_err(qmp->dev, "phy initialization timed-out\n");
From: Mike Snitzer snitzer@redhat.com
commit 54fa16ee532705985e6c946da455856f18f63ee1 upstream.
Check if in fail_io mode at start of dm_pool_metadata_set_needs_check(). Otherwise dm_pool_metadata_set_needs_check()'s superblock_lock() can crash in dm_bm_write_lock() while accessing the block manager object that was previously destroyed as part of a failed dm_pool_abort_metadata() that ultimately set fail_io to begin with.
Also, update DMERR() message to more accurately describe superblock_lock() failure.
Cc: stable@vger.kernel.org Reported-by: Zdenek Kabelac zkabelac@redhat.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-thin-metadata.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/md/dm-thin-metadata.c +++ b/drivers/md/dm-thin-metadata.c @@ -2046,16 +2046,19 @@ int dm_pool_register_metadata_threshold(
int dm_pool_metadata_set_needs_check(struct dm_pool_metadata *pmd) { - int r; + int r = -EINVAL; struct dm_block *sblock; struct thin_disk_superblock *disk_super;
pmd_write_lock(pmd); + if (pmd->fail_io) + goto out; + pmd->flags |= THIN_METADATA_NEEDS_CHECK_FLAG;
r = superblock_lock(pmd, &sblock); if (r) { - DMERR("couldn't read superblock"); + DMERR("couldn't lock superblock"); goto out; }
From: Junxiao Bi junxiao.bi@oracle.com
commit bd293d071ffe65e645b4d8104f9d8fe15ea13862 upstream.
When thin-volume is built on loop device, if available memory is low, the following deadlock can be triggered:
One process P1 allocates memory with GFP_FS flag, direct alloc fails, memory reclaim invokes memory shrinker in dm_bufio, dm_bufio_shrink_scan() runs, mutex dm_bufio_client->lock is acquired, then P1 waits for dm_buffer IO to complete in __try_evict_buffer().
But this IO may never complete if issued to an underlying loop device that forwards it using direct-IO, which allocates memory using GFP_KERNEL (see: do_blockdev_direct_IO()). If allocation fails, memory reclaim will invoke memory shrinker in dm_bufio, dm_bufio_shrink_scan() will be invoked, and since the mutex is already held by P1 the loop thread will hang, and IO will never complete. Resulting in ABBA deadlock.
Cc: stable@vger.kernel.org Signed-off-by: Junxiao Bi junxiao.bi@oracle.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-bufio.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/md/dm-bufio.c +++ b/drivers/md/dm-bufio.c @@ -1599,9 +1599,7 @@ dm_bufio_shrink_scan(struct shrinker *sh unsigned long freed;
c = container_of(shrink, struct dm_bufio_client, shrinker); - if (sc->gfp_mask & __GFP_FS) - dm_bufio_lock(c); - else if (!dm_bufio_trylock(c)) + if (!dm_bufio_trylock(c)) return SHRINK_STOP;
freed = __scan(c, sc->nr_to_scan, sc->gfp_mask);
stable-rc/linux-5.2.y boot: 139 boots: 1 failed, 136 passed with 1 offline, 1 untried/unknown (v5.2.2-414-ga4059e390eb8)
Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-5.2.y/kernel/v5.2.2... Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-5.2.y/kernel/v5.2.2-414-ga...
Tree: stable-rc Branch: linux-5.2.y Git Describe: v5.2.2-414-ga4059e390eb8 Git Commit: a4059e390eb842ee95dcb0b856eee5cc422a815b Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git Tested: 81 unique boards, 28 SoC families, 17 builds out of 209
Boot Failure Detected:
arm64: defconfig: gcc-8: meson-gxl-s905x-nexbox-a95x: 1 failed lab
Offline Platforms:
arm64:
defconfig: gcc-8 meson-gxbb-odroidc2: 1 offline lab
--- For more info write to info@kernelci.org
On 24/07/2019 20:14, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.2.3 release. There are 413 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Jul 2019 07:13:35 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.2.3-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.2.y and the diffstat can be found below.
thanks,
greg k-h
All tests are passing for Tegra ...
Test results for stable-v5.2: 12 builds: 12 pass, 0 fail 22 boots: 22 pass, 0 fail 38 tests: 38 pass, 0 fail
Linux version: 5.2.3-rc1-gdb628fe0e67f Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Cheers Jon
On Thu, Jul 25, 2019 at 10:04:43AM +0100, Jon Hunter wrote:
On 24/07/2019 20:14, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.2.3 release. There are 413 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Jul 2019 07:13:35 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.2.3-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.2.y and the diffstat can be found below.
thanks,
greg k-h
All tests are passing for Tegra ...
Test results for stable-v5.2: 12 builds: 12 pass, 0 fail 22 boots: 22 pass, 0 fail 38 tests: 38 pass, 0 fail
Linux version: 5.2.3-rc1-gdb628fe0e67f Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Great, thanks for testing all of these and letting me know.
greg k-h
On Wed, 24 Jul 2019 at 21:25, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.2.3 release. There are 413 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Jul 2019 07:13:35 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.2.3-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.2.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. Regressions detected.
Summary ------------------------------------------------------------------------
kernel: 5.2.3-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-5.2.y git commit: db628fe0e67ff8c66e8c6ba76e5e4becfa75fe21 git describe: v5.2.2-414-gdb628fe0e67f Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.2-oe/build/v5.2.2-414-g...
Regressions (compared to build v5.2.2) ------------------------------------------------------------------------
x86: kvm-unit-tests: * vmx
TESTNAME=vmx TIMEOUT=90s ACCEL= ./x86/run x86/vmx.flat -smp 1 -cpu host,+vmx -append "-exit_monitor_from_l2_test -ept_access* -vmx_smp* -vmx_vmcs_shadow_test" [ 155.670748] kvm [6062]: vcpu0, guest rIP: 0x4050cb kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x1, nop [ 155.681027] kvm [6062]: vcpu0, guest rIP: 0x408911 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x3, nop [ 155.690749] kvm [6062]: vcpu0, guest rIP: 0x40bb39 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x1, nop [ 155.700595] kvm [6062]: vcpu0, guest rIP: 0x4089b2 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x3, nop [ 158.349308] nested_vmx_exit_reflected failed vm entry 7 [ 158.363737] nested_vmx_exit_reflected failed vm entry 7 [ 158.378010] nested_vmx_exit_reflected failed vm entry 7 [ 158.392480] nested_vmx_exit_reflected failed vm entry 7 [ 158.406920] nested_vmx_exit_reflected failed vm entry 7 [ 158.421390] nested_vmx_exit_reflected failed vm entry 7 [ 158.435795] nested_vmx_exit_reflected failed vm entry 7 [ 158.450276] nested_vmx_exit_reflected failed vm entry 7 [ 158.464674] nested_vmx_exit_reflected failed vm entry 7 [ 158.479030] nested_vmx_exit_reflected failed vm entry 7 [ 161.044379] set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state. FAIL vmx (timeout; duration=90s)
kernel-config: http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-corei7-64/lkft... Full log: https://lkft.validation.linaro.org/scheduler/job/836289#L1597
No fixes (compared to build v5.2.2)
Ran 22506 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - hi6220-hikey - i386 - juno-r2 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - x86
Test Suites ----------- * build * install-android-platform-tools-r2600 * kselftest * libgpiod * libhugetlbfs * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests * network-basic-tests * perf * spectre-meltdown-checker-test * v4l2-compliance * ltp-fs-tests * ltp-open-posix-tests * kvm-unit-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
On Thu, Jul 25, 2019 at 01:16:19PM +0200, Anders Roxell wrote:
On Wed, 24 Jul 2019 at 21:25, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.2.3 release. There are 413 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Jul 2019 07:13:35 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.2.3-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.2.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. Regressions detected.
Summary
kernel: 5.2.3-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-5.2.y git commit: db628fe0e67ff8c66e8c6ba76e5e4becfa75fe21 git describe: v5.2.2-414-gdb628fe0e67f Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.2-oe/build/v5.2.2-414-g...
Regressions (compared to build v5.2.2)
x86: kvm-unit-tests: * vmx
TESTNAME=vmx TIMEOUT=90s ACCEL= ./x86/run x86/vmx.flat -smp 1 -cpu host,+vmx -append "-exit_monitor_from_l2_test -ept_access* -vmx_smp* -vmx_vmcs_shadow_test" [ 155.670748] kvm [6062]: vcpu0, guest rIP: 0x4050cb kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x1, nop [ 155.681027] kvm [6062]: vcpu0, guest rIP: 0x408911 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x3, nop [ 155.690749] kvm [6062]: vcpu0, guest rIP: 0x40bb39 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x1, nop [ 155.700595] kvm [6062]: vcpu0, guest rIP: 0x4089b2 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x3, nop [ 158.349308] nested_vmx_exit_reflected failed vm entry 7 [ 158.363737] nested_vmx_exit_reflected failed vm entry 7 [ 158.378010] nested_vmx_exit_reflected failed vm entry 7 [ 158.392480] nested_vmx_exit_reflected failed vm entry 7 [ 158.406920] nested_vmx_exit_reflected failed vm entry 7 [ 158.421390] nested_vmx_exit_reflected failed vm entry 7 [ 158.435795] nested_vmx_exit_reflected failed vm entry 7 [ 158.450276] nested_vmx_exit_reflected failed vm entry 7 [ 158.464674] nested_vmx_exit_reflected failed vm entry 7 [ 158.479030] nested_vmx_exit_reflected failed vm entry 7 [ 161.044379] set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state. FAIL vmx (timeout; duration=90s)
kernel-config: http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-corei7-64/lkft... Full log: https://lkft.validation.linaro.org/scheduler/job/836289#L1597
Ick.
Any chance you can run 'git bisect' to find the offending patch? Or just try reverting a few, you can ignore the ppc ones, so that only leaves you 7 different commits.
Does this same test pass in 5.3-rc1?
thanks,
greg k-h
Regressions (compared to build v5.2.2)
x86: kvm-unit-tests: * vmx
TESTNAME=vmx TIMEOUT=90s ACCEL= ./x86/run x86/vmx.flat -smp 1 -cpu host,+vmx -append "-exit_monitor_from_l2_test -ept_access* -vmx_smp* -vmx_vmcs_shadow_test" [ 155.670748] kvm [6062]: vcpu0, guest rIP: 0x4050cb
...
[ 158.479030] nested_vmx_exit_reflected failed vm entry 7 [ 161.044379] set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state. FAIL vmx (timeout; duration=90s)
kernel-config: http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-corei7-64/lkft... Full log: https://lkft.validation.linaro.org/scheduler/job/836289#L1597
Ick.
Any chance you can run 'git bisect' to find the offending patch? Or just try reverting a few, you can ignore the ppc ones, so that only leaves you 7 different commits.
We have started 'git bisect' please allow sometime.
Does this same test pass in 5.3-rc1?
yes. kvm-unit-tests: vmx test getting PASS on 5.3-rc1 mainline kernel [1].
ref: [1] https://qa-reports.linaro.org/lkft/linux-mainline-oe/tests/kvm-unit-tests/vm...
- Naresh
thanks,
greg k-h
On 25/07/19 13:34, Greg Kroah-Hartman wrote:
Any chance you can run 'git bisect' to find the offending patch? Or just try reverting a few, you can ignore the ppc ones, so that only leaves you 7 different commits.
Does this same test pass in 5.3-rc1?
Anders, are you running the same kvm-unit-tests commit that passed for 5.2.2? My suspicion is that your previous test didn't have this commit
commit 95d6d2c3228891537ee8e35d2e2984964ee0cf6b Author: Krish Sadhukhan krish.sadhukhan@oracle.com AuthorDate: Fri Jun 28 18:14:47 2019 -0400 Commit: Paolo Bonzini pbonzini@redhat.com CommitDate: Thu Jul 11 14:26:53 2019 +0200
nVMX: Test Host Segment Registers and Descriptor Tables on vmentry of nested guests
since the symptoms match and the corresponding fix was made in 5.3.
I think Linaro's tests would be helped by making kvm-unit-tests.git a submodule of Linux, but I'm a bit wary since it would be the first submodule and I wouldn't know where to put it...
Paolo
Paolo,
On Thu, 25 Jul 2019 at 19:17, Paolo Bonzini pbonzini@redhat.com wrote:
On 25/07/19 13:34, Greg Kroah-Hartman wrote:
Any chance you can run 'git bisect' to find the offending patch? Or just try reverting a few, you can ignore the ppc ones, so that only leaves you 7 different commits.
Does this same test pass in 5.3-rc1?
Yes. same test pass on 5.3-rc1 kvm unit test always fetching master branch and at tip runs the latest test code on all branches mainline 5.3-rc1 and stable-rc-5.2 branch
Anders, are you running the same kvm-unit-tests commit that passed for 5.2.2? My suspicion is that your previous test didn't have this commit
No. I see two extra test code commits for 5.2.3 Re-tested 5.2.2 with tip of kvm unit tests sources and vmx test FAILED [1].
Greg, This investigation confirms it is a new test code failure on stable-rc 5.2.3
since the symptoms match and the corresponding fix was made in 5.3.
Thanks for your findings.
Paolo
- Naresh [1] https://lkft.validation.linaro.org/scheduler/job/837811
On Thu, Jul 25, 2019 at 07:35:13PM +0530, Naresh Kamboju wrote:
Paolo,
On Thu, 25 Jul 2019 at 19:17, Paolo Bonzini pbonzini@redhat.com wrote:
On 25/07/19 13:34, Greg Kroah-Hartman wrote:
Any chance you can run 'git bisect' to find the offending patch? Or just try reverting a few, you can ignore the ppc ones, so that only leaves you 7 different commits.
Does this same test pass in 5.3-rc1?
Yes. same test pass on 5.3-rc1 kvm unit test always fetching master branch and at tip runs the latest test code on all branches mainline 5.3-rc1 and stable-rc-5.2 branch
Anders, are you running the same kvm-unit-tests commit that passed for 5.2.2? My suspicion is that your previous test didn't have this commit
No. I see two extra test code commits for 5.2.3 Re-tested 5.2.2 with tip of kvm unit tests sources and vmx test FAILED [1].
Greg, This investigation confirms it is a new test code failure on stable-rc 5.2.3
No, it only confirms that kvm-unit-tests/master fails on 5.2.*. To confirm a new failure in 5.2.3 you would need to show a test that passes on 5.2.2 and fails on 5.2.3.
As Paolo suspected, kvm-unit-tests/master fails on 5.2.* and passes if commit 95d6d2c ("nVMX: Test Host Segment Registers and Descriptor Tables on vmentry of nested guests") is reverted (from kvm-unit-tests).
The failures are quite clearly in the new test(s).
PASS: HOST_SEL_CS 8: vmlaunch succeeds FAIL: HOST_SEL_CS 9: vmlaunch fails FAIL: HOST_SEL_CS c: vmlaunch fails PASS: HOST_SEL_SS 10: vmlaunch succeeds FAIL: HOST_SEL_SS 11: vmlaunch fails FAIL: HOST_SEL_SS 14: vmlaunch fails PASS: HOST_SEL_DS 10: vmlaunch succeeds FAIL: HOST_SEL_DS 11: vmlaunch fails FAIL: HOST_SEL_DS 14: vmlaunch fails PASS: HOST_SEL_ES 10: vmlaunch succeeds FAIL: HOST_SEL_ES 11: vmlaunch fails FAIL: HOST_SEL_ES 14: vmlaunch fails PASS: HOST_SEL_FS 10: vmlaunch succeeds FAIL: HOST_SEL_FS 11: vmlaunch fails FAIL: HOST_SEL_FS 14: vmlaunch fails PASS: HOST_SEL_GS 10: vmlaunch succeeds FAIL: HOST_SEL_GS 11: vmlaunch fails FAIL: HOST_SEL_GS 14: vmlaunch fails PASS: HOST_SEL_TR 80: vmlaunch succeeds FAIL: HOST_SEL_TR 81: vmlaunch fails KVM: entry failed, hardware error 0x80000021
On 25/07/19 18:09, Sean Christopherson wrote:
This investigation confirms it is a new test code failure on stable-rc 5.2.3
No, it only confirms that kvm-unit-tests/master fails on 5.2.*. To confirm a new failure in 5.2.3 you would need to show a test that passes on 5.2.2 and fails on 5.2.3.
I think he meant "a failure in new test code". :)
Paolo
As Paolo suspected, kvm-unit-tests/master fails on 5.2.* and passes if commit 95d6d2c ("nVMX: Test Host Segment Registers and Descriptor Tables on vmentry of nested guests") is reverted (from kvm-unit-tests).
On Thu, Jul 25, 2019 at 06:10:37PM +0200, Paolo Bonzini wrote:
On 25/07/19 18:09, Sean Christopherson wrote:
This investigation confirms it is a new test code failure on stable-rc 5.2.3
No, it only confirms that kvm-unit-tests/master fails on 5.2.*. To confirm a new failure in 5.2.3 you would need to show a test that passes on 5.2.2 and fails on 5.2.3.
I think he meant "a failure in new test code". :)
Ah, that does appear to be the case. So just to be clear, we're good, right?
On 25/07/19 18:20, Sean Christopherson wrote:
On Thu, Jul 25, 2019 at 06:10:37PM +0200, Paolo Bonzini wrote:
On 25/07/19 18:09, Sean Christopherson wrote:
This investigation confirms it is a new test code failure on stable-rc 5.2.3
No, it only confirms that kvm-unit-tests/master fails on 5.2.*. To confirm a new failure in 5.2.3 you would need to show a test that passes on 5.2.2 and fails on 5.2.3.
I think he meant "a failure in new test code". :)
Ah, that does appear to be the case. So just to be clear, we're good, right?
Yes. I'm happy to gather ideas on how to avoid this (i.e. 1) if a submodule would be useful; 2) where to stick it).
Paolo
On Thu, Jul 25, 2019 at 06:30:10PM +0200, Paolo Bonzini wrote:
On 25/07/19 18:20, Sean Christopherson wrote:
On Thu, Jul 25, 2019 at 06:10:37PM +0200, Paolo Bonzini wrote:
On 25/07/19 18:09, Sean Christopherson wrote:
This investigation confirms it is a new test code failure on stable-rc 5.2.3
No, it only confirms that kvm-unit-tests/master fails on 5.2.*. To confirm a new failure in 5.2.3 you would need to show a test that passes on 5.2.2 and fails on 5.2.3.
I think he meant "a failure in new test code". :)
Ah, that does appear to be the case. So just to be clear, we're good, right?
Yes. I'm happy to gather ideas on how to avoid this (i.e. 1) if a submodule would be useful; 2) where to stick it).
Hi!
First, to be clear: from LKFT perspective there are no kernel regressions here.
To your point Paolo - reporting 'fail' because of a missing kernel feature is a generic problem we see across test suites, and causes tons of pain and misery for CI people. As a general rule, I'd avoid submodules, and even branches that track specific kernels. Rather, and I don't know if it's possible in this case, but the best way to manage it from both a test author and a test runner POV is to wrap the test in kernel feature checks, kernel version checks, kernel config checks, etc. Report 'skip' if the environment in which the test is running isn't sufficient to run the test. Then, you only have to maintain one version of the test suite, users can always use the latest, and critically: all failures are actual failures.
Dan
Paolo
On 25/07/19 18:39, Dan Rue wrote:
To your point Paolo - reporting 'fail' because of a missing kernel feature is a generic problem we see across test suites, and causes tons of pain and misery for CI people. As a general rule, I'd avoid submodules, and even branches that track specific kernels. Rather, and I don't know if it's possible in this case, but the best way to manage it from both a test author and a test runner POV is to wrap the test in kernel feature checks, kernel version checks, kernel config checks, etc. Report 'skip' if the environment in which the test is running isn't sufficient to run the test. Then, you only have to maintain one version of the test suite, users can always use the latest, and critically: all failures are actual failures.
Note that kvm-unit-tests are not really testing new kernel features; those are covered by tools/testing/selftests/kvm. For some of these kvm-unit-tests there are some CPU features that we can check from the virtual machine, but those are easy to handle and they produce SKIP results just fine.
The problematic ones are tests that cover emulation accuracy. These are effectively bugfixes, so the failures you see _are_ actual failures. At the same time, the bugs are usually inoffensive(*), while the fixes are invasive and a bad source of cause conflicts in older Linux versions. This combines so that backporting to stable is not feasible.
Passing the host kernel version would be really ugly because 1) the tests can run on other hypervisor or emulators or even bare metal, and of course the host kernel version has no bearing if you're using userspace emulation 2) there are thousands of tests that would be littered with kernel version checks of little significance.
So this is why I suggested a submodule: using a submodule effectively ignores all tests that were added after a given Linus release, and thus all the failures for which backports are just not going to happen. However, if Sean's idea of creating a linux-M.N branch in kvm-unit-tests.git works for you, we can also do that as a stopgap measure to ease your testing.
Thanks,
Paolo
(*) if they aren't, we *do* mark them for backport!
On Thu, Jul 25, 2019 at 07:06:19PM +0200, Paolo Bonzini wrote:
On 25/07/19 18:39, Dan Rue wrote:
To your point Paolo - reporting 'fail' because of a missing kernel feature is a generic problem we see across test suites, and causes tons of pain and misery for CI people. As a general rule, I'd avoid submodules, and even branches that track specific kernels. Rather, and I don't know if it's possible in this case, but the best way to manage it from both a test author and a test runner POV is to wrap the test in kernel feature checks, kernel version checks, kernel config checks, etc. Report 'skip' if the environment in which the test is running isn't sufficient to run the test. Then, you only have to maintain one version of the test suite, users can always use the latest, and critically: all failures are actual failures.
Note that kvm-unit-tests are not really testing new kernel features; those are covered by tools/testing/selftests/kvm. For some of these kvm-unit-tests there are some CPU features that we can check from the virtual machine, but those are easy to handle and they produce SKIP results just fine.
The problematic ones are tests that cover emulation accuracy. These are effectively bugfixes, so the failures you see _are_ actual failures. At the same time, the bugs are usually inoffensive(*), while the fixes are invasive and a bad source of cause conflicts in older Linux versions. This combines so that backporting to stable is not feasible.
In this case, a fail result seems correct then. The thing we're doing that we need to fix is to run against a pinned version of kvm-unit-tests and upgrade it independently so that we can identify such failures and mark them as known issues.
Passing the host kernel version would be really ugly because 1) the tests can run on other hypervisor or emulators or even bare metal, and of course the host kernel version has no bearing if you're using userspace emulation 2) there are thousands of tests that would be littered with kernel version checks of little significance.
So this is why I suggested a submodule: using a submodule effectively ignores all tests that were added after a given Linus release, and thus all the failures for which backports are just not going to happen. However, if Sean's idea of creating a linux-M.N branch in kvm-unit-tests.git works for you, we can also do that as a stopgap measure to ease your testing.
I would still prefer to run the latest tests against all kernel versions (but better control when we upgrade it). Like I said, we can handle expected failures, and it would even help to validate backports for fixes that do get backported. I'm afraid on your behalf that snapping (and maintaining) branches per kernel branch is going to be a lot to manage.
In any case, _thank you so much_ for jumping on this and helping us run these tests. Is there anything else we can do to make this better for you?
Dan
Thanks,
Paolo
(*) if they aren't, we *do* mark them for backport!
On Thu, Jul 25, 2019 at 03:19:33PM -0500, Dan Rue wrote:
I would still prefer to run the latest tests against all kernel versions (but better control when we upgrade it). Like I said, we can handle expected failures, and it would even help to validate backports for fixes that do get backported. I'm afraid on your behalf that snapping (and maintaining) branches per kernel branch is going to be a lot to manage.
Having the branches would be beneficial for kernel developers as well, e.g. on multiple occasions I've spent time hunting down non-existent KVM bugs, only to realize my base kernel was stale with respect to kvm-unit-tests.
My thought was to have a mostly-unmaintained branch for each major kernel version, e.g. snapshot a working version of kvm_unit_tests when the KVM pull request for the merge window is sent, and for the most part leave it at that. I don't think it would introduce much overhead, but then again, I'm not the person who would be maintaining this :-)
On 25/07/19 22:57, Sean Christopherson wrote:
On Thu, Jul 25, 2019 at 03:19:33PM -0500, Dan Rue wrote:
I would still prefer to run the latest tests against all kernel versions (but better control when we upgrade it). Like I said, we can handle expected failures, and it would even help to validate backports for fixes that do get backported. I'm afraid on your behalf that snapping (and maintaining) branches per kernel branch is going to be a lot to manage.
Having the branches would be beneficial for kernel developers as well, e.g. on multiple occasions I've spent time hunting down non-existent KVM bugs, only to realize my base kernel was stale with respect to kvm-unit-tests.
My thought was to have a mostly-unmaintained branch for each major kernel version, e.g. snapshot a working version of kvm_unit_tests when the KVM pull request for the merge window is sent, and for the most part leave it at that. I don't think it would introduce much overhead, but then again, I'm not the person who would be maintaining this :-)
Yes, I agree. Stable backports that have fixes in kvm-unit-tests are relatively rare, so the branch would hardly move after a release is cut.
Paolo
On Thu, Jul 25, 2019 at 06:30:10PM +0200, Paolo Bonzini wrote:
On 25/07/19 18:20, Sean Christopherson wrote:
On Thu, Jul 25, 2019 at 06:10:37PM +0200, Paolo Bonzini wrote:
On 25/07/19 18:09, Sean Christopherson wrote:
This investigation confirms it is a new test code failure on stable-rc 5.2.3
No, it only confirms that kvm-unit-tests/master fails on 5.2.*. To confirm a new failure in 5.2.3 you would need to show a test that passes on 5.2.2 and fails on 5.2.3.
I think he meant "a failure in new test code". :)
Ah, that does appear to be the case. So just to be clear, we're good, right?
Yes. I'm happy to gather ideas on how to avoid this (i.e. 1) if a submodule would be useful; 2) where to stick it).
As a starting point, what about adding "stable" branches for each kernel release to kvm-unit-tests, e.g. linux-5.2.y? I assume we'd need something similar for the submodules anyways.
On 7/24/19 1:14 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.2.3 release. There are 413 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Jul 2019 07:13:35 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.2.3-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.2.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
On Thu, Jul 25, 2019 at 09:35:09AM -0600, shuah wrote:
On 7/24/19 1:14 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.2.3 release. There are 413 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Jul 2019 07:13:35 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.2.3-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.2.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Thanks for testing all of these and letting me know.
greg k-h
On Wed, Jul 24, 2019 at 09:14:51PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.2.3 release. There are 413 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Jul 2019 07:13:35 PM UTC. Anything received after that time might be too late.
Build results: total: 159 pass: 159 fail: 0 Qemu test results: total: 364 pass: 364 fail: 0
Guenter
On Wed, Jul 24, 2019 at 09:14:51PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.2.3 release. There are 413 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Jul 2019 07:13:35 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.2.3-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.2.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted with no regressions on my system.
Cheers, Kelsey
On Fri, Jul 26, 2019 at 12:18:54AM -0600, Kelsey Skunberg wrote:
On Wed, Jul 24, 2019 at 09:14:51PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.2.3 release. There are 413 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri 26 Jul 2019 07:13:35 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.2.3-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.2.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted with no regressions on my system.
Great, thanks for testing and letting me know.
greg k-h
linux-stable-mirror@lists.linaro.org