This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 23 Feb 2022 08:48:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.25-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.15.25-rc1
Cheng Jui Wang cheng-jui.wang@mediatek.com lockdep: Correct lock_classes index mapping
Rafał Miłecki rafal@milecki.pl i2c: brcmstb: fix support for DSL and CM variants
Jesse Brandeburg jesse.brandeburg@intel.com ice: enable parsing IPSEC SPI headers for RSS
Mike Christie michael.christie@oracle.com scsi: qedi: Fix ABBA deadlock in qedi_process_tmf_resp() and qedi_process_cmd_cleanup_resp()
Waiman Long longman@redhat.com copy_process(): Move fd_install() out of sighand->siglock critical section
Christophe JAILLET christophe.jaillet@wanadoo.fr dmaengine: ptdma: Fix the error handling path in pt_core_init()
Vladimir Zapolskiy vladimir.zapolskiy@linaro.org i2c: qcom-cci: don't put a device tree node before i2c_add_adapter()
Vladimir Zapolskiy vladimir.zapolskiy@linaro.org i2c: qcom-cci: don't delete an unregistered adapter
Christian Brauner brauner@kernel.org tests: fix idmapped mount_setattr test
Jiasheng Jiang jiasheng@iscas.ac.cn dmaengine: sh: rcar-dmac: Check for error num after dma_set_max_seg_size
Miaoqian Lin linmq006@gmail.com dmaengine: stm32-dmamux: Fix PM disable depth imbalance in stm32_dmamux_probe
Jiasheng Jiang jiasheng@iscas.ac.cn dmaengine: sh: rcar-dmac: Check for error num after setting mask
Eric Dumazet edumazet@google.com net: sched: limit TC_ACT_REPEAT loops
Eric W. Biederman ebiederm@xmission.com ucounts: Move RLIMIT_NPROC handling after set_user
Eric W. Biederman ebiederm@xmission.com rlimit: Fix RLIMIT_NPROC enforcement failure caused by capability calls in set_user
Eric W. Biederman ebiederm@xmission.com ucounts: Enforce RLIMIT_NPROC not RLIMIT_NPROC+1
Eric W. Biederman ebiederm@xmission.com ucounts: Base set_cred_ucounts changes on the real user
Eric W. Biederman ebiederm@xmission.com ucounts: In set_cred_ucounts assume new->ucounts is non-NULL
Eric W. Biederman ebiederm@xmission.com ucounts: Handle wrapping in is_ucounts_overlimit
Eliav Farber farbere@amazon.com EDAC: Fix calculation of returned address and next offset in edac_align_ptr()
James Smart jsmart2021@gmail.com scsi: lpfc: Fix pt2pt NVMe PRLI reject LOGO loop
Jing Leng jleng@ambarella.com kconfig: fix failing to generate auto.conf
Marc St-Amand mstamand@ciena.com net: macb: Align the dma and coherent dma masks
Slark Xiao slark_xiao@163.com net: usb: qmi_wwan: Add support for Dell DW5829e
Dmytro Laktyushkin Dmytro.Laktyushkin@amd.com drm/amd/display: fix yellow carp wm clamping
Roman Li Roman.Li@amd.com drm/amd/display: Cap pflip irqs per max otg number
Mario Limonciello mario.limonciello@amd.com display/amd: decrease message verbosity about watermarks table failure
JaeSang Yoo js.yoo.5b@gmail.com tracing: Fix tp_printk option related with tp_printk_stop_on_boot
Sascha Hauer s.hauer@pengutronix.de drm/rockchip: dw_hdmi: Do not leave clock enabled in error case
Dan Aloni dan.aloni@vastdata.com xprtrdma: fix pointer derefs in error cases of rpcrdma_ep_create
Jae Hyun Yoo jae.hyun.yoo@linux.intel.com soc: aspeed: lpc-ctrl: Block error printing on probe defer cases
Zoltán Böszörményi zboszor@gmail.com ata: libata-core: Disable TRIM on M88V29
Brenda Streiff brenda.streiff@ni.com kconfig: let 'shell' return enough output for deep path names
Mario Limonciello mario.limonciello@amd.com ACPI: PM: Revert "Only mark EC GPE for wakeup on Intel systems"
Shakeel Butt shakeelb@google.com mm: io_uring: allow oom-killer from io_uring_setup
Axel Rasmussen axelrasmussen@google.com selftests: fixup build warnings in pidfd / clone3 tests
Axel Rasmussen axelrasmussen@google.com pidfd: fix test failure due to stack overflow on some arches
Christian Hewitt christianshewitt@gmail.com arm64: dts: meson-g12: drop BL32 region from SEI510/SEI610
Christian Hewitt christianshewitt@gmail.com arm64: dts: meson-g12: add ATF BL32 reserved-memory region
Christian Hewitt christianshewitt@gmail.com arm64: dts: meson-gx: add ATF BL32 reserved-memory region
Namjae Jeon linkinjeon@kernel.org ksmbd: don't align last entry offset in smb2 query directory
Namjae Jeon linkinjeon@kernel.org ksmbd: fix same UniqueId for dot and dotdot entries
Florian Westphal fw@strlen.de netfilter: conntrack: don't refresh sctp entries in closed state
Nick Desaulniers ndesaulniers@google.com x86/bug: Merge annotate_reachable() into _BUG_FLAGS() asm
Guo Ren guoren@linux.alibaba.com irqchip/sifive-plic: Add missing thead,c900-plic match string
Wan Jiabing wanjiabing@vivo.com phy: phy-mtk-tphy: Fix duplicated argument in phy-mtk-tphy
Padmanabha Srinivasaiah treasure4paddy@gmail.com staging: vc04_services: Fix RCU dereference check
Al Cooper alcooperx@gmail.com phy: usb: Leave some clocks running during suspend
Ye Guojin ye.guojin@zte.com.cn ARM: OMAP2+: adjust the location of put_device() call in omapdss_init_of
Wan Jiabing wanjiabing@vivo.com ARM: OMAP2+: hwmod: Add of_node_put() before break
Jim Mattson jmattson@google.com KVM: x86/pmu: Use AMD64_RAW_EVENT_MASK for PERF_TYPE_RAW
Jim Mattson jmattson@google.com KVM: x86/pmu: Don't truncate the PerfEvtSeln MSR when creating a perf event
Like Xu likexu@tencent.com KVM: x86/pmu: Refactoring find_arch_event() to pmc_perf_hw_id()
Miaoqian Lin linmq006@gmail.com Drivers: hv: vmbus: Fix memory leak in vmbus_add_channel_kobj
Miaoqian Lin linmq006@gmail.com mtd: rawnand: ingenic: Fix missing put_device in ingenic_ecc_get
Dongliang Mu mudongliangabcd@gmail.com HID: elo: fix memory leak in elo_probe
david regan dregan@mail.com mtd: rawnand: brcmnand: Fixed incorrect sub-page ECC status
Dan Carpenter dan.carpenter@oracle.com mtd: phram: Prevent divide by zero bug in phram_setup()
Ansuel Smith ansuelsmth@gmail.com mtd: parsers: qcom: Fix missing free for pparts in cleanup
Ansuel Smith ansuelsmth@gmail.com mtd: parsers: qcom: Fix kernel panic on skipped partition
Bryan O'Donoghue bryan.odonoghue@linaro.org mtd: rawnand: qcom: Fix clock sequencing in qcom_nandc_probe()
Christoph Hellwig hch@lst.de block: fix surprise removal for drivers calling blk_set_queue_dying
Linus Torvalds torvalds@linux-foundation.org tty: n_tty: do not look ahead for EOL character past the end of the buffer
Trond Myklebust trond.myklebust@hammerspace.com NFS: Do not report writeback errors in nfs_getattr()
Trond Myklebust trond.myklebust@hammerspace.com NFS: LOOKUP_DIRECTORY is also ok with symlinks
Trond Myklebust trond.myklebust@hammerspace.com NFS: Remove an incorrect revalidation in nfs4_update_changeattr_locked()
Laibin Qiu qiulaibin@huawei.com block/wbt: fix negative inflight counter when remove scsi device
Stephen Boyd swboyd@chromium.org ASoC: qcom: Actually clear DMA interrupt register for HDMI
Martin Povišer povik+lin@cutebit.org ASoC: tas2770: Insert post reset delay
Bart Van Assche bvanassche@acm.org scsi: ufs: Fix a deadlock in the error handler
Bart Van Assche bvanassche@acm.org scsi: ufs: Remove dead code
Jon Maloy jmaloy@redhat.com tipc: fix wrong notification node addresses
Steve French stfrench@microsoft.com smb3: fix snapshot mount option
Christian Eggers ceggers@arri.de mtd: rawnand: gpmi: don't leak PM reference in error path
Anders Roxell anders.roxell@linaro.org powerpc/lib/sstep: fix 'ptesync' build error
Christophe Leroy christophe.leroy@csgroup.eu powerpc/603: Fix boot failure with DEBUG_PAGEALLOC and KFENCE
Amir Goldstein amir73il@gmail.com cifs: fix set of group SID via NTSD xattrs
Mark Brown broonie@kernel.org ASoC: ops: Fix stereo change notifications in snd_soc_put_xr_sx()
Mark Brown broonie@kernel.org ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw_sx()
Mark Brown broonie@kernel.org ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw_range()
Mark Brown broonie@kernel.org ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw()
Takashi Iwai tiwai@suse.de ALSA: hda: Fix missing codec probe on Shenker Dock 15
Takashi Iwai tiwai@suse.de ALSA: hda: Fix regression on forced probe mask option
Takashi Iwai tiwai@suse.de ALSA: hda/realtek: Fix deadlock by COEF mutex
Yu Huang diwang90@gmail.com ALSA: hda/realtek: Add quirk for Legion Y9000X 2019
Matteo Martelli matteomartelli3@gmail.com ALSA: usb-audio: revert to IMPLICIT_FB_FIXED_DEV for M-Audio FastTrack Ultra
Joakim Tjernlund joakim.tjernlund@infinera.com arm64: Correct wrong label in macro __init_el2_gicv3
Muhammad Usama Anjum usama.anjum@collabora.com selftests/exec: Add non-regular to TEST_GEN_PROGS
Arnaldo Carvalho de Melo acme@redhat.com perf bpf: Defer freeing string after possible strlen() on it
Oleksandr Mazur oleksandr.mazur@plvision.eu net: bridge: multicast: notify switchdev driver whenever MC processing gets disabled
Radu Bulie radu-andrei.bulie@nxp.com dpaa2-eth: Initialize mutex used in one step timestamping path
Tom Rix trix@redhat.com dpaa2-switch: fix default return of dpaa2_switch_flower_parse_mirror_key
Jon Maloy jmaloy@redhat.com tipc: fix wrong publisher node address in link publications
Gatis Peisenieks gatis@mikrotik.com atl1c: fix tx timeout after link flap on Mikrotik 10/25G NIC
DENG Qingfang dqfext@gmail.com net: phy: mediatek: remove PHY mode check on MT7531
Wen Gu guwen@linux.alibaba.com net/smc: Avoid overwriting the copies of clcsock callback functions
Kees Cook keescook@chromium.org libsubcmd: Fix use-after-free for realloc(..., 0)
Eric Dumazet edumazet@google.com bonding: fix data-races around agg_select_timer
Eric Dumazet edumazet@google.com net_sched: add __rcu annotation to netdev->qdisc
Eric Dumazet edumazet@google.com drop_monitor: fix data-race in dropmon_net_event / trace_napi_poll_hit
Zhang Changzhong zhangchangzhong@huawei.com bonding: force carrier update when releasing slave
Xin Long lucien.xin@gmail.com ping: fix the dif and sdif check in ping_lookup
Miquel Raynal miquel.raynal@bootlin.com net: ieee802154: ca8210: Fix lifs/sifs periods
Mans Rullgard mans@mansr.com net: dsa: lan9303: add VLAN IDs to master device
Mans Rullgard mans@mansr.com net: dsa: lan9303: handle hwaccel VLAN tags
Alexey Khoroshilov khoroshilov@ispras.ru net: dsa: lantiq_gswip: fix use after free in gswip_remove()
Vladimir Oltean vladimir.oltean@nxp.com net: dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN
Mans Rullgard mans@mansr.com net: dsa: lan9303: fix reset on probe
Johannes Berg johannes.berg@intel.com cfg80211: fix race in netlink owner interface destruction
Phil Elwell phil@raspberrypi.com brcmfmac: firmware: Fix crash in brcm_alt_fw_path
Jiasheng Jiang jiasheng@iscas.ac.cn mac80211: mlme: check for null after calling kmemdup
Jonas Gorski jonas.gorski@gmail.com Revert "net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname"
Willem de Bruijn willemb@google.com ipv6: per-netns exclusive flowlabel checks
Ignat Korchagin ignat@cloudflare.com ipv6: mcast: use rcu-safe version of ipv6_get_lladdr()
Eric Dumazet edumazet@google.com ipv6: fix data-race in fib6_info_hw_flags_set / fib6_purge_rt
Eric Dumazet edumazet@google.com ipv4: fix data races in fib_alias_hw_flags_set
Hangbin Liu liuhangbin@gmail.com selftests: netfilter: disable rp_filter on router
Pablo Neira Ayuso pablo@netfilter.org netfilter: nft_synproxy: unregister hooks on init error path
Hangbin Liu liuhangbin@gmail.com selftests: netfilter: fix exit value for nft_concat_range
Eric Dumazet edumazet@google.com netfilter: xt_socket: fix a typo in socket_mt_destroy()
Luca Coelho luciano.coelho@intel.com iwlwifi: mvm: don't send SAR GEO command for 3160 devices
Johannes Berg johannes.berg@intel.com iwlwifi: pcie: gen2: fix locking when "HW not ready"
Johannes Berg johannes.berg@intel.com iwlwifi: pcie: fix locking when "HW not ready"
Matthew Auld matthew.auld@intel.com drm/i915/ttm: tweak priority hint selection
Siva Mullati siva.mullati@intel.com drm/i915/gvt: Make DRM_I915_GVT depend on X86
Robin Murphy robin.murphy@arm.com drm/cma-helper: Set VM_DONTEXPAND for mmap
Jens Wiklander jens.wiklander@linaro.org optee: use driver internal tee_context for some rpc
Jens Wiklander jens.wiklander@linaro.org tee: export teedev_open() and teedev_close_context()
Seth Forshee sforshee@digitalocean.com vsock: remove vsock from connected table when connect is interrupted by a signal
Ville Syrjälä ville.syrjala@linux.intel.com drm/i915: Fix mbus join config lookup
Ville Syrjälä ville.syrjala@linux.intel.com drm/i915: Fix dbuf slice config lookup
Jani Nikula jani.nikula@intel.com drm/i915/opregion: check port number bounds for SWSCI display power state
Rajib Mahapatra rajib.mahapatra@amd.com drm/amdgpu: skipping SDMA hw_init and hw_fini for S0ix.
Yifan Zhang yifan1.zhang@amd.com drm/amd/pm: correct the sequence of sending gpu reset msg
Ville Syrjälä ville.syrjala@linux.intel.com drm/atomic: Don't pollute crtc_state->mode_blob with error pointers
Nicholas Bishop nicholasbishop@google.com drm/radeon: Fix backlight control on iMac 12,1
Johannes Berg johannes.berg@intel.com iwlwifi: fix use-after-free
Maxim Levitsky mlevitsk@redhat.com KVM: x86: nSVM: mark vmcb01 as dirty when restoring SMM saved state
Maxim Levitsky mlevitsk@redhat.com KVM: x86: nSVM: fix potential NULL derefernce on nested migration
Maxim Levitsky mlevitsk@redhat.com KVM: x86: SVM: don't passthrough SMAP/SMEP/PKE bits in !NPT && !gCR0.PG case
Maxim Levitsky mlevitsk@redhat.com KVM: x86: nSVM/nVMX: set nested_run_pending on VM entry which is a result of RSM
David Woodhouse dwmw@amazon.co.uk KVM: x86/xen: Fix runstate updates to be atomic when preempting vCPU
Jason A. Donenfeld Jason@zx2c4.com random: wake up /dev/random writers after zap
Kees Cook keescook@chromium.org gcc-plugins/stackleak: Use noinstr in favor of notrace
Igor Pylypiv ipylypiv@google.com Revert "module, async: async_synchronize_full() on module init iff async is used"
Jan Beulich jbeulich@suse.com x86/Xen: streamline (and fix) PV CPU enumeration
Christian König christian.koenig@amd.com drm/amdgpu: fix logic inversion in check
Mario Limonciello mario.limonciello@amd.com drm/amd: Only run s3 or s0ix if system is configured properly
Mario Limonciello mario.limonciello@amd.com drm/amd: add support to check whether the system is set to s3
Steen Hegelund steen.hegelund@microchip.com net: sparx5: do not refer to skb after passing it on
Sagi Grimberg sagi@grimberg.me nvme-rdma: fix possible use-after-free in transport error_recovery work
Sagi Grimberg sagi@grimberg.me nvme-tcp: fix possible use-after-free in transport error_recovery work
Sagi Grimberg sagi@grimberg.me nvme: fix a possible use-after-free in controller reset during load
Mario Limonciello mario.limonciello@amd.com drm/amd: Warn users about potential s0ix problems
John Garry john.garry@huawei.com scsi: pm8001: Fix use-after-free for aborted SSP/STP sas_task
John Garry john.garry@huawei.com scsi: pm8001: Fix use-after-free for aborted TMF sas_task
Ming Lei ming.lei@redhat.com scsi: core: Reallocate device's budget map on queue depth change
Vincenzo Frascino vincenzo.frascino@arm.com kselftest: Fix vdso_test_abi return status
Ajish Koshy Ajish.Koshy@microchip.com scsi: pm80xx: Fix double completion for SATA devices
Darrick J. Wong djwong@kernel.org quota: make dquot_quota_sync return errors from ->sync_fs
Darrick J. Wong djwong@kernel.org vfs: make freeze_super abort when sync_filesystem returns error
Julian Braha julianbraha@gmail.com pinctrl: bcm63xx: fix unmet dependency on REGMAP for GPIO_REGMAP
Duoming Zhou duoming@zju.edu.cn ax25: improve the incomplete fix to avoid UAF and NPD bugs
Cristian Marussi cristian.marussi@arm.com selftests: skip mincore.check_file_mmap when fs lacks needed support
Cristian Marussi cristian.marussi@arm.com selftests: openat2: Skip testcases that fail with EOPNOTSUPP
Cristian Marussi cristian.marussi@arm.com selftests: openat2: Add missing dependency in Makefile
Cristian Marussi cristian.marussi@arm.com selftests: openat2: Print also errno in failure messages
Yang Xu xuyang2018.jy@fujitsu.com selftests/zram: Adapt the situation that /dev/zram0 is being used
Yang Xu xuyang2018.jy@fujitsu.com selftests/zram01.sh: Fix compression ratio calculation
Yang Xu xuyang2018.jy@fujitsu.com selftests/zram: Skip max_comp_streams interface on newer kernel
Miquel Raynal miquel.raynal@bootlin.com net: ieee802154: at86rf230: Stop leaking skb's
Li Zhijian lizhijian@cn.fujitsu.com kselftest: signal all child processes
Nícolas F. R. A. Prado nfraprado@collabora.com selftests: rtc: Increase test timeout so that all tests run
Michał Winiarski michal.winiarski@intel.com kunit: tool: Import missing importlib.abc
Srinivas Pandruvada srinivas.pandruvada@linux.intel.com platform/x86: ISST: Fix possible circular locking dependency detected
Yuka Kawajiri yukx00@gmail.com platform/x86: touchscreen_dmi: Add info for the RWC NANOTE P8 AY07J 2-in-1
Dāvis Mosāns davispuh@gmail.com btrfs: send: in case of IO error log it
Andy Shevchenko andriy.shevchenko@linux.intel.com parisc: Add ioread64_lo_hi() and iowrite64_lo_hi()
Long Li longli@microsoft.com PCI: hv: Fix NUMA node assignment when kernel boots with custom NUMA topology
Basavaraj Natikar Basavaraj.Natikar@amd.com HID: amd_sfh: Correct the structure field name
Basavaraj Natikar Basavaraj.Natikar@amd.com HID: amd_sfh: Increase sensor command timeout
Daniel Thompson daniel.thompson@linaro.org HID: i2c-hid: goodix: Fix a lockdep splat
Basavaraj Natikar Basavaraj.Natikar@amd.com HID: amd_sfh: Add illuminance mask to limit ALS max value
Linus Torvalds torvalds@linux-foundation.org mm: don't try to NUMA-migrate COW pages that have other uses
Christian Löhle CLoehle@hyperstone.com mmc: block: fix read single on recovery logic
John David Anglin dave.anglin@bell.net parisc: Fix sglist access in ccio-dma.c
John David Anglin dave.anglin@bell.net parisc: Fix data TLB miss in sba_unmap_sg
John David Anglin dave.anglin@bell.net parisc: Drop __init from map_pages declaration
Randy Dunlap rdunlap@infradead.org serial: parisc: GSC: fix build when IOSAPIC is not set
Helge Deller deller@gmx.de parisc: Show error if wrong 32/64-bit compiler is being used
Sean Christopherson seanjc@google.com Revert "svm: Add warning message for AVIC IPI invalid target"
Sergio Costas rastersoft@gmail.com HID:Add support for UGTABLET WP5540
James Smart jsmart2021@gmail.com scsi: lpfc: Fix mailbox command failure during driver initialization
Naohiro Aota naohiro.aota@wdc.com btrfs: zoned: cache reported zone during mount
Yang Shi shy828301@gmail.com fs/proc: task_mmu.c: don't read mapcount for migration entry
Ben Skeggs bskeggs@redhat.com drm/nouveau/pmu/gm200-: use alternate falcon reset sequence
-------------
Diffstat:
Makefile | 4 +- arch/arm/mach-omap2/display.c | 2 +- arch/arm/mach-omap2/omap_hwmod.c | 4 +- arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi | 6 + arch/arm64/boot/dts/amlogic/meson-g12a-sei510.dts | 8 -- arch/arm64/boot/dts/amlogic/meson-gx.dtsi | 6 + arch/arm64/boot/dts/amlogic/meson-sm1-sei610.dts | 8 -- arch/arm64/include/asm/el2_setup.h | 2 +- arch/parisc/include/asm/bitops.h | 8 ++ arch/parisc/lib/iomap.c | 18 +++ arch/parisc/mm/init.c | 9 +- arch/powerpc/kernel/head_book3s_32.S | 4 +- arch/powerpc/lib/sstep.c | 2 + arch/x86/include/asm/bug.h | 20 +-- arch/x86/kvm/pmu.c | 15 +-- arch/x86/kvm/pmu.h | 3 +- arch/x86/kvm/svm/avic.c | 2 - arch/x86/kvm/svm/nested.c | 26 ++-- arch/x86/kvm/svm/pmu.c | 8 +- arch/x86/kvm/svm/svm.c | 19 ++- arch/x86/kvm/vmx/pmu_intel.c | 9 +- arch/x86/kvm/vmx/vmx.c | 1 + arch/x86/kvm/xen.c | 97 ++++++++++----- arch/x86/xen/enlighten_pv.c | 4 - arch/x86/xen/smp_pv.c | 26 +--- block/bfq-iosched.c | 2 + block/blk-core.c | 10 +- block/elevator.c | 2 - block/genhd.c | 14 +++ drivers/acpi/x86/s2idle.c | 12 +- drivers/ata/libata-core.c | 1 + drivers/block/mtip32xx/mtip32xx.c | 2 +- drivers/block/rbd.c | 2 +- drivers/block/xen-blkfront.c | 2 +- drivers/char/random.c | 5 +- drivers/dma/ptdma/ptdma-dev.c | 17 +-- drivers/dma/sh/rcar-dmac.c | 9 +- drivers/dma/stm32-dmamux.c | 4 +- drivers/edac/edac_mc.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 10 +- drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 37 +++++- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 8 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +- drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 8 ++ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- .../drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c | 6 +- drivers/gpu/drm/amd/display/dc/core/dc.c | 2 + drivers/gpu/drm/amd/display/dc/dc.h | 1 + .../gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c | 61 +++++----- .../gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c | 9 +- drivers/gpu/drm/drm_atomic_uapi.c | 14 ++- drivers/gpu/drm/drm_gem_cma_helper.c | 1 + drivers/gpu/drm/i915/Kconfig | 1 + drivers/gpu/drm/i915/display/intel_opregion.c | 15 +++ drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 6 +- drivers/gpu/drm/i915/intel_pm.c | 4 +- drivers/gpu/drm/nouveau/nvkm/falcon/base.c | 8 +- drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm200.c | 31 ++++- drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.c | 2 +- drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp102.c | 2 +- drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp10b.c | 2 +- drivers/gpu/drm/nouveau/nvkm/subdev/pmu/priv.h | 2 + drivers/gpu/drm/radeon/atombios_encoders.c | 3 +- drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c | 14 +-- drivers/hid/amd-sfh-hid/amd_sfh_pcie.c | 4 +- drivers/hid/amd-sfh-hid/amd_sfh_pcie.h | 2 +- .../amd-sfh-hid/hid_descriptor/amd_sfh_hid_desc.c | 4 +- drivers/hid/hid-elo.c | 1 + drivers/hid/hid-ids.h | 1 + drivers/hid/hid-quirks.c | 1 + drivers/hid/i2c-hid/i2c-hid-of-goodix.c | 28 ++--- drivers/hv/vmbus_drv.c | 5 +- drivers/i2c/busses/i2c-brcmstb.c | 2 +- drivers/i2c/busses/i2c-qcom-cci.c | 16 ++- drivers/irqchip/irq-sifive-plic.c | 1 + drivers/md/dm.c | 2 +- drivers/mmc/core/block.c | 28 ++--- drivers/mtd/devices/phram.c | 12 +- drivers/mtd/nand/raw/brcmnand/brcmnand.c | 2 +- drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 3 +- drivers/mtd/nand/raw/ingenic/ingenic_ecc.c | 7 +- drivers/mtd/nand/raw/qcom_nandc.c | 14 +-- drivers/mtd/parsers/qcomsmempart.c | 33 +++-- drivers/net/bonding/bond_3ad.c | 30 ++++- drivers/net/bonding/bond_main.c | 5 +- drivers/net/dsa/Kconfig | 1 + drivers/net/dsa/lan9303-core.c | 13 +- drivers/net/dsa/lantiq_gswip.c | 2 +- drivers/net/dsa/mv88e6xxx/chip.c | 7 ++ drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 2 +- drivers/net/ethernet/broadcom/bgmac-platform.c | 23 ++-- drivers/net/ethernet/cadence/macb_main.c | 2 +- drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 2 +- .../ethernet/freescale/dpaa2/dpaa2-switch-flower.c | 4 +- drivers/net/ethernet/intel/ice/ice_lib.c | 6 + .../net/ethernet/microchip/sparx5/sparx5_packet.c | 2 +- drivers/net/ieee802154/at86rf230.c | 13 +- drivers/net/ieee802154/ca8210.c | 4 +- drivers/net/netdevsim/fib.c | 4 +- drivers/net/phy/mediatek-ge.c | 3 - drivers/net/usb/qmi_wwan.c | 2 + .../broadcom/brcm80211/brcmfmac/firmware.c | 6 +- drivers/net/wireless/intel/iwlwifi/fw/acpi.c | 11 +- drivers/net/wireless/intel/iwlwifi/iwl-csr.h | 3 +- drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 2 + drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 2 +- .../net/wireless/intel/iwlwifi/pcie/trans-gen2.c | 3 +- drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 3 +- drivers/nvme/host/core.c | 11 +- drivers/nvme/host/multipath.c | 2 +- drivers/nvme/host/rdma.c | 1 + drivers/nvme/host/tcp.c | 1 + drivers/parisc/ccio-dma.c | 3 +- drivers/parisc/sba_iommu.c | 3 +- drivers/pci/controller/pci-hyperv.c | 13 +- drivers/phy/broadcom/phy-brcm-usb.c | 38 ++++++ drivers/phy/mediatek/phy-mtk-tphy.c | 2 +- drivers/pinctrl/bcm/Kconfig | 1 + .../x86/intel/speed_select_if/isst_if_common.c | 97 +++++++++------ drivers/platform/x86/touchscreen_dmi.c | 24 ++++ drivers/scsi/lpfc/lpfc.h | 1 + drivers/scsi/lpfc/lpfc_attr.c | 3 + drivers/scsi/lpfc/lpfc_els.c | 20 ++- drivers/scsi/lpfc/lpfc_nportdisc.c | 5 +- drivers/scsi/lpfc/lpfc_sli.c | 15 ++- drivers/scsi/pm8001/pm8001_hwi.c | 18 --- drivers/scsi/pm8001/pm8001_sas.c | 5 + drivers/scsi/pm8001/pm80xx_hwi.c | 30 +---- drivers/scsi/qedi/qedi_fw.c | 6 +- drivers/scsi/scsi_scan.c | 55 ++++++++- drivers/scsi/ufs/ufshcd.c | 58 +++------ drivers/scsi/ufs/ufshcd.h | 2 + drivers/soc/aspeed/aspeed-lpc-ctrl.c | 7 +- .../vc04_services/interface/vchiq_arm/vchiq_arm.c | 20 ++- drivers/tee/optee/core.c | 8 ++ drivers/tee/optee/optee_private.h | 2 + drivers/tee/optee/rpc.c | 8 +- drivers/tee/tee_core.c | 6 +- drivers/tty/n_tty.c | 6 +- drivers/tty/serial/8250/8250_gsc.c | 2 +- fs/btrfs/dev-replace.c | 2 +- fs/btrfs/disk-io.c | 2 + fs/btrfs/send.c | 4 + fs/btrfs/volumes.c | 2 +- fs/btrfs/zoned.c | 85 +++++++++++-- fs/btrfs/zoned.h | 8 +- fs/cifs/fs_context.c | 4 +- fs/cifs/xattr.c | 2 + fs/io_uring.c | 5 +- fs/ksmbd/smb2pdu.c | 7 +- fs/ksmbd/smb_common.c | 5 +- fs/ksmbd/vfs.h | 1 + fs/nfs/dir.c | 4 +- fs/nfs/inode.c | 9 +- fs/nfs/nfs4proc.c | 3 +- fs/proc/task_mmu.c | 40 ++++-- fs/quota/dquot.c | 11 +- fs/super.c | 19 +-- include/linux/blkdev.h | 3 +- include/linux/compiler.h | 21 +--- include/linux/netdevice.h | 2 +- include/linux/sched.h | 1 - include/linux/tee_drv.h | 14 +++ include/net/addrconf.h | 2 - include/net/bond_3ad.h | 2 +- include/net/dsa.h | 1 + include/net/ip6_fib.h | 10 +- include/net/ipv6.h | 5 +- include/net/netns/ipv6.h | 3 +- kernel/async.c | 3 - kernel/cred.c | 12 +- kernel/fork.c | 17 ++- kernel/locking/lockdep.c | 4 +- kernel/module.c | 25 +--- kernel/stackleak.c | 5 +- kernel/sys.c | 20 ++- kernel/trace/trace.c | 4 + kernel/ucount.c | 3 +- mm/mprotect.c | 2 +- net/ax25/af_ax25.c | 9 +- net/bridge/br_multicast.c | 4 + net/core/drop_monitor.c | 11 +- net/core/rtnetlink.c | 6 +- net/dsa/dsa.c | 1 + net/dsa/dsa_priv.h | 1 - net/dsa/tag_lan9303.c | 21 ++-- net/ipv4/fib_lookup.h | 7 +- net/ipv4/fib_semantics.c | 6 +- net/ipv4/fib_trie.c | 22 ++-- net/ipv4/ping.c | 11 +- net/ipv4/route.c | 4 +- net/ipv6/addrconf.c | 4 +- net/ipv6/ip6_flowlabel.c | 4 +- net/ipv6/mcast.c | 2 +- net/ipv6/route.c | 19 +-- net/mac80211/mlme.c | 29 +++-- net/netfilter/nf_conntrack_proto_sctp.c | 9 ++ net/netfilter/nft_synproxy.c | 4 +- net/netfilter/xt_socket.c | 2 +- net/sched/act_api.c | 13 +- net/sched/cls_api.c | 6 +- net/sched/sch_api.c | 22 ++-- net/sched/sch_generic.c | 29 +++-- net/smc/af_smc.c | 10 +- net/sunrpc/xprtrdma/verbs.c | 3 + net/tipc/node.c | 13 +- net/vmw_vsock/af_vsock.c | 1 + net/wireless/core.c | 17 +-- scripts/kconfig/confdata.c | 13 +- scripts/kconfig/preprocess.c | 2 +- sound/pci/hda/hda_intel.c | 5 +- sound/pci/hda/patch_realtek.c | 40 +++--- sound/soc/codecs/tas2770.c | 7 +- sound/soc/qcom/lpass-platform.c | 8 +- sound/soc/soc-ops.c | 41 +++++-- sound/usb/implicit.c | 4 +- tools/lib/subcmd/subcmd-util.h | 11 +- tools/perf/util/bpf-loader.c | 3 +- tools/testing/kunit/kunit_kernel.py | 1 + tools/testing/selftests/clone3/clone3.c | 2 - tools/testing/selftests/exec/Makefile | 4 +- tools/testing/selftests/kselftest_harness.h | 4 +- tools/testing/selftests/mincore/mincore_selftest.c | 20 ++- .../selftests/mount_setattr/mount_setattr_test.c | 4 +- .../selftests/netfilter/nft_concat_range.sh | 2 +- tools/testing/selftests/netfilter/nft_fib.sh | 1 + tools/testing/selftests/openat2/Makefile | 2 +- tools/testing/selftests/openat2/helpers.h | 12 +- tools/testing/selftests/openat2/openat2_test.c | 12 +- tools/testing/selftests/pidfd/pidfd.h | 13 +- tools/testing/selftests/pidfd/pidfd_fdinfo_test.c | 22 +++- tools/testing/selftests/pidfd/pidfd_test.c | 6 +- tools/testing/selftests/pidfd/pidfd_wait.c | 5 +- tools/testing/selftests/rtc/settings | 2 +- tools/testing/selftests/vDSO/vdso_test_abi.c | 135 ++++++++++----------- tools/testing/selftests/zram/zram.sh | 15 +-- tools/testing/selftests/zram/zram01.sh | 33 ++--- tools/testing/selftests/zram/zram02.sh | 1 - tools/testing/selftests/zram/zram_lib.sh | 134 +++++++++++++------- 239 files changed, 1667 insertions(+), 999 deletions(-)
From: Ben Skeggs bskeggs@redhat.com
commit 4cdd2450bf739bada353e82d27b00db9af8c3001 upstream.
Signed-off-by: Ben Skeggs bskeggs@redhat.com Reviewed-by: Karol Herbst kherbst@redhat.com Signed-off-by: Karol Herbst kherbst@redhat.com Link: https://gitlab.freedesktop.org/drm/nouveau/-/merge_requests/10 Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/nouveau/nvkm/falcon/base.c | 8 ++++-- drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm200.c | 31 +++++++++++++++++++++++- drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.c | 2 - drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp102.c | 2 - drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp10b.c | 2 - drivers/gpu/drm/nouveau/nvkm/subdev/pmu/priv.h | 2 + 6 files changed, 41 insertions(+), 6 deletions(-)
--- a/drivers/gpu/drm/nouveau/nvkm/falcon/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/falcon/base.c @@ -117,8 +117,12 @@ nvkm_falcon_disable(struct nvkm_falcon * int nvkm_falcon_reset(struct nvkm_falcon *falcon) { - nvkm_falcon_disable(falcon); - return nvkm_falcon_enable(falcon); + if (!falcon->func->reset) { + nvkm_falcon_disable(falcon); + return nvkm_falcon_enable(falcon); + } + + return falcon->func->reset(falcon); }
int --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm200.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm200.c @@ -23,9 +23,38 @@ */ #include "priv.h"
+static int +gm200_pmu_flcn_reset(struct nvkm_falcon *falcon) +{ + struct nvkm_pmu *pmu = container_of(falcon, typeof(*pmu), falcon); + + nvkm_falcon_wr32(falcon, 0x014, 0x0000ffff); + pmu->func->reset(pmu); + return nvkm_falcon_enable(falcon); +} + +const struct nvkm_falcon_func +gm200_pmu_flcn = { + .debug = 0xc08, + .fbif = 0xe00, + .load_imem = nvkm_falcon_v1_load_imem, + .load_dmem = nvkm_falcon_v1_load_dmem, + .read_dmem = nvkm_falcon_v1_read_dmem, + .bind_context = nvkm_falcon_v1_bind_context, + .wait_for_halt = nvkm_falcon_v1_wait_for_halt, + .clear_interrupt = nvkm_falcon_v1_clear_interrupt, + .set_start_addr = nvkm_falcon_v1_set_start_addr, + .start = nvkm_falcon_v1_start, + .enable = nvkm_falcon_v1_enable, + .disable = nvkm_falcon_v1_disable, + .reset = gm200_pmu_flcn_reset, + .cmdq = { 0x4a0, 0x4b0, 4 }, + .msgq = { 0x4c8, 0x4cc, 0 }, +}; + static const struct nvkm_pmu_func gm200_pmu = { - .flcn = >215_pmu_flcn, + .flcn = &gm200_pmu_flcn, .enabled = gf100_pmu_enabled, .reset = gf100_pmu_reset, }; --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.c @@ -211,7 +211,7 @@ gm20b_pmu_recv(struct nvkm_pmu *pmu)
static const struct nvkm_pmu_func gm20b_pmu = { - .flcn = >215_pmu_flcn, + .flcn = &gm200_pmu_flcn, .enabled = gf100_pmu_enabled, .intr = gt215_pmu_intr, .recv = gm20b_pmu_recv, --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp102.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp102.c @@ -39,7 +39,7 @@ gp102_pmu_enabled(struct nvkm_pmu *pmu)
static const struct nvkm_pmu_func gp102_pmu = { - .flcn = >215_pmu_flcn, + .flcn = &gm200_pmu_flcn, .enabled = gp102_pmu_enabled, .reset = gp102_pmu_reset, }; --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp10b.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp10b.c @@ -78,7 +78,7 @@ gp10b_pmu_acr = {
static const struct nvkm_pmu_func gp10b_pmu = { - .flcn = >215_pmu_flcn, + .flcn = &gm200_pmu_flcn, .enabled = gf100_pmu_enabled, .intr = gt215_pmu_intr, .recv = gm20b_pmu_recv, --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/priv.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/priv.h @@ -44,6 +44,8 @@ void gf100_pmu_reset(struct nvkm_pmu *);
void gk110_pmu_pgob(struct nvkm_pmu *, bool);
+extern const struct nvkm_falcon_func gm200_pmu_flcn; + void gm20b_pmu_acr_bld_patch(struct nvkm_acr *, u32, s64); void gm20b_pmu_acr_bld_write(struct nvkm_acr *, u32, struct nvkm_acr_lsfw *); int gm20b_pmu_acr_boot(struct nvkm_falcon *);
From: Yang Shi shy828301@gmail.com
commit 24d7275ce2791829953ed4e72f68277ceb2571c6 upstream.
The syzbot reported the below BUG:
kernel BUG at include/linux/page-flags.h:785! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 4392 Comm: syz-executor560 Not tainted 5.16.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:PageDoubleMap include/linux/page-flags.h:785 [inline] RIP: 0010:__page_mapcount+0x2d2/0x350 mm/util.c:744 Call Trace: page_mapcount include/linux/mm.h:837 [inline] smaps_account+0x470/0xb10 fs/proc/task_mmu.c:466 smaps_pte_entry fs/proc/task_mmu.c:538 [inline] smaps_pte_range+0x611/0x1250 fs/proc/task_mmu.c:601 walk_pmd_range mm/pagewalk.c:128 [inline] walk_pud_range mm/pagewalk.c:205 [inline] walk_p4d_range mm/pagewalk.c:240 [inline] walk_pgd_range mm/pagewalk.c:277 [inline] __walk_page_range+0xe23/0x1ea0 mm/pagewalk.c:379 walk_page_vma+0x277/0x350 mm/pagewalk.c:530 smap_gather_stats.part.0+0x148/0x260 fs/proc/task_mmu.c:768 smap_gather_stats fs/proc/task_mmu.c:741 [inline] show_smap+0xc6/0x440 fs/proc/task_mmu.c:822 seq_read_iter+0xbb0/0x1240 fs/seq_file.c:272 seq_read+0x3e0/0x5b0 fs/seq_file.c:162 vfs_read+0x1b5/0x600 fs/read_write.c:479 ksys_read+0x12d/0x250 fs/read_write.c:619 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
The reproducer was trying to read /proc/$PID/smaps when calling MADV_FREE at the mean time. MADV_FREE may split THPs if it is called for partial THP. It may trigger the below race:
CPU A CPU B ----- ----- smaps walk: MADV_FREE: page_mapcount() PageCompound() split_huge_page() page = compound_head(page) PageDoubleMap(page)
When calling PageDoubleMap() this page is not a tail page of THP anymore so the BUG is triggered.
This could be fixed by elevated refcount of the page before calling mapcount, but that would prevent it from counting migration entries, and it seems overkilling because the race just could happen when PMD is split so all PTE entries of tail pages are actually migration entries, and smaps_account() does treat migration entries as mapcount == 1 as Kirill pointed out.
Add a new parameter for smaps_account() to tell this entry is migration entry then skip calling page_mapcount(). Don't skip getting mapcount for device private entries since they do track references with mapcount.
Pagemap also has the similar issue although it was not reported. Fixed it as well.
[shy828301@gmail.com: v4] Link: https://lkml.kernel.org/r/20220203182641.824731-1-shy828301@gmail.com [nathan@kernel.org: avoid unused variable warning in pagemap_pmd_range()] Link: https://lkml.kernel.org/r/20220207171049.1102239-1-nathan@kernel.org Link: https://lkml.kernel.org/r/20220120202805.3369-1-shy828301@gmail.com Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()") Signed-off-by: Yang Shi shy828301@gmail.com Signed-off-by: Nathan Chancellor nathan@kernel.org Reported-by: syzbot+1f52b3a18d5633fa7f82@syzkaller.appspotmail.com Acked-by: David Hildenbrand david@redhat.com Cc: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com Cc: Jann Horn jannh@google.com Cc: Matthew Wilcox willy@infradead.org Cc: Alexey Dobriyan adobriyan@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/proc/task_mmu.c | 40 +++++++++++++++++++++++++++++++--------- 1 file changed, 31 insertions(+), 9 deletions(-)
--- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -430,7 +430,8 @@ static void smaps_page_accumulate(struct }
static void smaps_account(struct mem_size_stats *mss, struct page *page, - bool compound, bool young, bool dirty, bool locked) + bool compound, bool young, bool dirty, bool locked, + bool migration) { int i, nr = compound ? compound_nr(page) : 1; unsigned long size = nr * PAGE_SIZE; @@ -457,8 +458,15 @@ static void smaps_account(struct mem_siz * page_count(page) == 1 guarantees the page is mapped exactly once. * If any subpage of the compound page mapped with PTE it would elevate * page_count(). + * + * The page_mapcount() is called to get a snapshot of the mapcount. + * Without holding the page lock this snapshot can be slightly wrong as + * we cannot always read the mapcount atomically. It is not safe to + * call page_mapcount() even with PTL held if the page is not mapped, + * especially for migration entries. Treat regular migration entries + * as mapcount == 1. */ - if (page_count(page) == 1) { + if ((page_count(page) == 1) || migration) { smaps_page_accumulate(mss, page, size, size << PSS_SHIFT, dirty, locked, true); return; @@ -495,6 +503,7 @@ static void smaps_pte_entry(pte_t *pte, struct vm_area_struct *vma = walk->vma; bool locked = !!(vma->vm_flags & VM_LOCKED); struct page *page = NULL; + bool migration = false;
if (pte_present(*pte)) { page = vm_normal_page(vma, addr, *pte); @@ -514,8 +523,11 @@ static void smaps_pte_entry(pte_t *pte, } else { mss->swap_pss += (u64)PAGE_SIZE << PSS_SHIFT; } - } else if (is_pfn_swap_entry(swpent)) + } else if (is_pfn_swap_entry(swpent)) { + if (is_migration_entry(swpent)) + migration = true; page = pfn_swap_entry_to_page(swpent); + } } else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && mss->check_shmem_swap && pte_none(*pte))) { page = xa_load(&vma->vm_file->f_mapping->i_pages, @@ -528,7 +540,8 @@ static void smaps_pte_entry(pte_t *pte, if (!page) return;
- smaps_account(mss, page, false, pte_young(*pte), pte_dirty(*pte), locked); + smaps_account(mss, page, false, pte_young(*pte), pte_dirty(*pte), + locked, migration); }
#ifdef CONFIG_TRANSPARENT_HUGEPAGE @@ -539,6 +552,7 @@ static void smaps_pmd_entry(pmd_t *pmd, struct vm_area_struct *vma = walk->vma; bool locked = !!(vma->vm_flags & VM_LOCKED); struct page *page = NULL; + bool migration = false;
if (pmd_present(*pmd)) { /* FOLL_DUMP will return -EFAULT on huge zero page */ @@ -546,8 +560,10 @@ static void smaps_pmd_entry(pmd_t *pmd, } else if (unlikely(thp_migration_supported() && is_swap_pmd(*pmd))) { swp_entry_t entry = pmd_to_swp_entry(*pmd);
- if (is_migration_entry(entry)) + if (is_migration_entry(entry)) { + migration = true; page = pfn_swap_entry_to_page(entry); + } } if (IS_ERR_OR_NULL(page)) return; @@ -559,7 +575,9 @@ static void smaps_pmd_entry(pmd_t *pmd, /* pass */; else mss->file_thp += HPAGE_PMD_SIZE; - smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd), locked); + + smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd), + locked, migration); } #else static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr, @@ -1363,6 +1381,7 @@ static pagemap_entry_t pte_to_pagemap_en { u64 frame = 0, flags = 0; struct page *page = NULL; + bool migration = false;
if (pte_present(pte)) { if (pm->show_pfn) @@ -1384,13 +1403,14 @@ static pagemap_entry_t pte_to_pagemap_en frame = swp_type(entry) | (swp_offset(entry) << MAX_SWAPFILES_SHIFT); flags |= PM_SWAP; + migration = is_migration_entry(entry); if (is_pfn_swap_entry(entry)) page = pfn_swap_entry_to_page(entry); }
if (page && !PageAnon(page)) flags |= PM_FILE; - if (page && page_mapcount(page) == 1) + if (page && !migration && page_mapcount(page) == 1) flags |= PM_MMAP_EXCLUSIVE; if (vma->vm_flags & VM_SOFTDIRTY) flags |= PM_SOFT_DIRTY; @@ -1406,8 +1426,9 @@ static int pagemap_pmd_range(pmd_t *pmdp spinlock_t *ptl; pte_t *pte, *orig_pte; int err = 0; - #ifdef CONFIG_TRANSPARENT_HUGEPAGE + bool migration = false; + ptl = pmd_trans_huge_lock(pmdp, vma); if (ptl) { u64 flags = 0, frame = 0; @@ -1446,11 +1467,12 @@ static int pagemap_pmd_range(pmd_t *pmdp if (pmd_swp_uffd_wp(pmd)) flags |= PM_UFFD_WP; VM_BUG_ON(!is_pmd_migration_entry(pmd)); + migration = is_migration_entry(entry); page = pfn_swap_entry_to_page(entry); } #endif
- if (page && page_mapcount(page) == 1) + if (page && !migration && page_mapcount(page) == 1) flags |= PM_MMAP_EXCLUSIVE;
for (; addr != end; addr += PAGE_SIZE) {
From: Naohiro Aota naohiro.aota@wdc.com
commit 16beac87e95e2fb278b552397c8260637f8a63f7 upstream.
When mounting a device, we are reporting the zones twice: once for checking the zone attributes in btrfs_get_dev_zone_info and once for loading block groups' zone info in btrfs_load_block_group_zone_info(). With a lot of block groups, that leads to a lot of REPORT ZONE commands and slows down the mount process.
This patch introduces a zone info cache in struct btrfs_zoned_device_info. The cache is populated while in btrfs_get_dev_zone_info() and used for btrfs_load_block_group_zone_info() to reduce the number of REPORT ZONE commands. The zone cache is then released after loading the block groups, as it will not be much effective during the run time.
Benchmark: Mount an HDD with 57,007 block groups Before patch: 171.368 seconds After patch: 64.064 seconds
While it still takes a minute due to the slowness of loading all the block groups, the patch reduces the mount time by 1/3.
Link: https://lore.kernel.org/linux-btrfs/CAHQ7scUiLtcTqZOMMY5kbWUBOhGRwKo6J6wYPT5... Fixes: 5b316468983d ("btrfs: get zone information of zoned block devices") CC: stable@vger.kernel.org Signed-off-by: Naohiro Aota naohiro.aota@wdc.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/dev-replace.c | 2 - fs/btrfs/disk-io.c | 2 + fs/btrfs/volumes.c | 2 - fs/btrfs/zoned.c | 85 ++++++++++++++++++++++++++++++++++++++++++++----- fs/btrfs/zoned.h | 8 +++- 5 files changed, 87 insertions(+), 12 deletions(-)
--- a/fs/btrfs/dev-replace.c +++ b/fs/btrfs/dev-replace.c @@ -325,7 +325,7 @@ static int btrfs_init_dev_replace_tgtdev set_blocksize(device->bdev, BTRFS_BDEV_BLOCKSIZE); device->fs_devices = fs_info->fs_devices;
- ret = btrfs_get_dev_zone_info(device); + ret = btrfs_get_dev_zone_info(device, false); if (ret) goto error;
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3565,6 +3565,8 @@ int __cold open_ctree(struct super_block goto fail_sysfs; }
+ btrfs_free_zone_cache(fs_info); + if (!sb_rdonly(sb) && fs_info->fs_devices->missing_devices && !btrfs_check_rw_degradable(fs_info, NULL)) { btrfs_warn(fs_info, --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2596,7 +2596,7 @@ int btrfs_init_new_device(struct btrfs_f device->fs_info = fs_info; device->bdev = bdev;
- ret = btrfs_get_dev_zone_info(device); + ret = btrfs_get_dev_zone_info(device, false); if (ret) goto error_free_device;
--- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -4,6 +4,7 @@ #include <linux/slab.h> #include <linux/blkdev.h> #include <linux/sched/mm.h> +#include <linux/vmalloc.h> #include "ctree.h" #include "volumes.h" #include "zoned.h" @@ -195,6 +196,8 @@ static int emulate_report_zones(struct b static int btrfs_get_dev_zones(struct btrfs_device *device, u64 pos, struct blk_zone *zones, unsigned int *nr_zones) { + struct btrfs_zoned_device_info *zinfo = device->zone_info; + u32 zno; int ret;
if (!*nr_zones) @@ -206,6 +209,34 @@ static int btrfs_get_dev_zones(struct bt return 0; }
+ /* Check cache */ + if (zinfo->zone_cache) { + unsigned int i; + + ASSERT(IS_ALIGNED(pos, zinfo->zone_size)); + zno = pos >> zinfo->zone_size_shift; + /* + * We cannot report zones beyond the zone end. So, it is OK to + * cap *nr_zones to at the end. + */ + *nr_zones = min_t(u32, *nr_zones, zinfo->nr_zones - zno); + + for (i = 0; i < *nr_zones; i++) { + struct blk_zone *zone_info; + + zone_info = &zinfo->zone_cache[zno + i]; + if (!zone_info->len) + break; + } + + if (i == *nr_zones) { + /* Cache hit on all the zones */ + memcpy(zones, zinfo->zone_cache + zno, + sizeof(*zinfo->zone_cache) * *nr_zones); + return 0; + } + } + ret = blkdev_report_zones(device->bdev, pos >> SECTOR_SHIFT, *nr_zones, copy_zone_info_cb, zones); if (ret < 0) { @@ -219,6 +250,11 @@ static int btrfs_get_dev_zones(struct bt if (!ret) return -EIO;
+ /* Populate cache */ + if (zinfo->zone_cache) + memcpy(zinfo->zone_cache + zno, zones, + sizeof(*zinfo->zone_cache) * *nr_zones); + return 0; }
@@ -282,7 +318,7 @@ int btrfs_get_dev_zone_info_all_devices( if (!device->bdev) continue;
- ret = btrfs_get_dev_zone_info(device); + ret = btrfs_get_dev_zone_info(device, true); if (ret) break; } @@ -291,7 +327,7 @@ int btrfs_get_dev_zone_info_all_devices( return ret; }
-int btrfs_get_dev_zone_info(struct btrfs_device *device) +int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache) { struct btrfs_fs_info *fs_info = device->fs_info; struct btrfs_zoned_device_info *zone_info = NULL; @@ -318,6 +354,8 @@ int btrfs_get_dev_zone_info(struct btrfs if (!zone_info) return -ENOMEM;
+ device->zone_info = zone_info; + if (!bdev_is_zoned(bdev)) { if (!fs_info->zone_size) { ret = calculate_emulated_zone_size(fs_info); @@ -369,6 +407,23 @@ int btrfs_get_dev_zone_info(struct btrfs goto out; }
+ /* + * Enable zone cache only for a zoned device. On a non-zoned device, we + * fill the zone info with emulated CONVENTIONAL zones, so no need to + * use the cache. + */ + if (populate_cache && bdev_is_zoned(device->bdev)) { + zone_info->zone_cache = vzalloc(sizeof(struct blk_zone) * + zone_info->nr_zones); + if (!zone_info->zone_cache) { + btrfs_err_in_rcu(device->fs_info, + "zoned: failed to allocate zone cache for %s", + rcu_str_deref(device->name)); + ret = -ENOMEM; + goto out; + } + } + /* Get zones type */ while (sector < nr_sectors) { nr_zones = BTRFS_REPORT_NR_ZONES; @@ -444,8 +499,6 @@ int btrfs_get_dev_zone_info(struct btrfs
kfree(zones);
- device->zone_info = zone_info; - switch (bdev_zoned_model(bdev)) { case BLK_ZONED_HM: model = "host-managed zoned"; @@ -478,10 +531,7 @@ int btrfs_get_dev_zone_info(struct btrfs out: kfree(zones); out_free_zone_info: - bitmap_free(zone_info->empty_zones); - bitmap_free(zone_info->seq_zones); - kfree(zone_info); - device->zone_info = NULL; + btrfs_destroy_dev_zone_info(device);
return ret; } @@ -495,6 +545,7 @@ void btrfs_destroy_dev_zone_info(struct
bitmap_free(zone_info->seq_zones); bitmap_free(zone_info->empty_zones); + vfree(zone_info->zone_cache); kfree(zone_info); device->zone_info = NULL; } @@ -1551,3 +1602,21 @@ void btrfs_clear_data_reloc_bg(struct bt fs_info->data_reloc_bg = 0; spin_unlock(&fs_info->relocation_bg_lock); } + +void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info) +{ + struct btrfs_fs_devices *fs_devices = fs_info->fs_devices; + struct btrfs_device *device; + + if (!btrfs_is_zoned(fs_info)) + return; + + mutex_lock(&fs_devices->device_list_mutex); + list_for_each_entry(device, &fs_devices->devices, dev_list) { + if (device->zone_info) { + vfree(device->zone_info->zone_cache); + device->zone_info->zone_cache = NULL; + } + } + mutex_unlock(&fs_devices->device_list_mutex); +} --- a/fs/btrfs/zoned.h +++ b/fs/btrfs/zoned.h @@ -25,6 +25,7 @@ struct btrfs_zoned_device_info { u32 nr_zones; unsigned long *seq_zones; unsigned long *empty_zones; + struct blk_zone *zone_cache; struct blk_zone sb_zones[2 * BTRFS_SUPER_MIRROR_MAX]; };
@@ -32,7 +33,7 @@ struct btrfs_zoned_device_info { int btrfs_get_dev_zone(struct btrfs_device *device, u64 pos, struct blk_zone *zone); int btrfs_get_dev_zone_info_all_devices(struct btrfs_fs_info *fs_info); -int btrfs_get_dev_zone_info(struct btrfs_device *device); +int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache); void btrfs_destroy_dev_zone_info(struct btrfs_device *device); int btrfs_check_zoned_mode(struct btrfs_fs_info *fs_info); int btrfs_check_mountopts_zoned(struct btrfs_fs_info *info); @@ -67,6 +68,7 @@ int btrfs_sync_zone_write_pointer(struct struct btrfs_device *btrfs_zoned_get_device(struct btrfs_fs_info *fs_info, u64 logical, u64 length); void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg); +void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info); #else /* CONFIG_BLK_DEV_ZONED */ static inline int btrfs_get_dev_zone(struct btrfs_device *device, u64 pos, struct blk_zone *zone) @@ -79,7 +81,8 @@ static inline int btrfs_get_dev_zone_inf return 0; }
-static inline int btrfs_get_dev_zone_info(struct btrfs_device *device) +static inline int btrfs_get_dev_zone_info(struct btrfs_device *device, + bool populate_cache) { return 0; } @@ -202,6 +205,7 @@ static inline struct btrfs_device *btrfs
static inline void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg) { }
+static inline void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info) { } #endif
static inline bool btrfs_dev_is_sequential(struct btrfs_device *device, u64 pos)
From: James Smart jsmart2021@gmail.com
commit efe1dc571a5b808baa26682eef16561be2e356fd upstream.
Contention for the mailbox interface may occur during driver initialization (immediately after a function reset), between mailbox commands initiated via ioctl (bsg) and those driver requested by the driver.
After setting SLI_ACTIVE flag for a port, there is a window in which the driver will allow an ioctl to be initiated while the adapter is initializing and issuing mailbox commands via polling. The polling logic then gets confused.
Correct by having thread setting SLI_ACTIVE spot an active mailbox command and allow it complete before proceeding.
Link: https://lore.kernel.org/r/20210921143008.64212-1-jsmart2021@gmail.com Co-developed-by: Nigel Kirkland nkirkland2304@gmail.com Signed-off-by: Nigel Kirkland nkirkland2304@gmail.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/lpfc/lpfc_sli.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)
--- a/drivers/scsi/lpfc/lpfc_sli.c +++ b/drivers/scsi/lpfc/lpfc_sli.c @@ -8147,6 +8147,7 @@ lpfc_sli4_hba_setup(struct lpfc_hba *phb struct lpfc_vport *vport = phba->pport; struct lpfc_dmabuf *mp; struct lpfc_rqb *rqbp; + u32 flg;
/* Perform a PCI function reset to start from clean */ rc = lpfc_pci_function_reset(phba); @@ -8160,7 +8161,17 @@ lpfc_sli4_hba_setup(struct lpfc_hba *phb else { spin_lock_irq(&phba->hbalock); phba->sli.sli_flag |= LPFC_SLI_ACTIVE; + flg = phba->sli.sli_flag; spin_unlock_irq(&phba->hbalock); + /* Allow a little time after setting SLI_ACTIVE for any polled + * MBX commands to complete via BSG. + */ + for (i = 0; i < 50 && (flg & LPFC_SLI_MBOX_ACTIVE); i++) { + msleep(20); + spin_lock_irq(&phba->hbalock); + flg = phba->sli.sli_flag; + spin_unlock_irq(&phba->hbalock); + } }
lpfc_sli4_dip(phba); @@ -9744,7 +9755,7 @@ lpfc_sli_issue_mbox_s4(struct lpfc_hba * "(%d):2541 Mailbox command x%x " "(x%x/x%x) failure: " "mqe_sta: x%x mcqe_sta: x%x/x%x " - "Data: x%x x%x\n,", + "Data: x%x x%x\n", mboxq->vport ? mboxq->vport->vpi : 0, mboxq->u.mb.mbxCommand, lpfc_sli_config_mbox_subsys_get(phba, @@ -9778,7 +9789,7 @@ lpfc_sli_issue_mbox_s4(struct lpfc_hba * "(%d):2597 Sync Mailbox command " "x%x (x%x/x%x) failure: " "mqe_sta: x%x mcqe_sta: x%x/x%x " - "Data: x%x x%x\n,", + "Data: x%x x%x\n", mboxq->vport ? mboxq->vport->vpi : 0, mboxq->u.mb.mbxCommand, lpfc_sli_config_mbox_subsys_get(phba,
From: Sergio Costas rastersoft@gmail.com
commit fd5dd6acd8f823ea804f76d3af64fa1be9d5fb78 upstream.
This patch adds support for the UGTABLET WP5540 digitizer tablet devices. Without it, the pen moves the cursor, but neither the buttons nor the tap sensor in the tip do work.
Signed-off-by: Sergio Costas rastersoft@gmail.com Link: https://lore.kernel.org/r/63dece1d-91ca-1b1b-d90d-335be66896be@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Benjamin Tissoires benjamin.tissoires@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/hid-ids.h | 1 + drivers/hid/hid-quirks.c | 1 + 2 files changed, 2 insertions(+)
--- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -1353,6 +1353,7 @@ #define USB_VENDOR_ID_UGTIZER 0x2179 #define USB_DEVICE_ID_UGTIZER_TABLET_GP0610 0x0053 #define USB_DEVICE_ID_UGTIZER_TABLET_GT5040 0x0077 +#define USB_DEVICE_ID_UGTIZER_TABLET_WP5540 0x0004
#define USB_VENDOR_ID_VIEWSONIC 0x0543 #define USB_DEVICE_ID_VIEWSONIC_PD1011 0xe621 --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -187,6 +187,7 @@ static const struct hid_device_id hid_qu { HID_USB_DEVICE(USB_VENDOR_ID_TURBOX, USB_DEVICE_ID_TURBOX_KEYBOARD), HID_QUIRK_NOGET }, { HID_USB_DEVICE(USB_VENDOR_ID_UCLOGIC, USB_DEVICE_ID_UCLOGIC_TABLET_KNA5), HID_QUIRK_MULTI_INPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_UCLOGIC, USB_DEVICE_ID_UCLOGIC_TABLET_TWA60), HID_QUIRK_MULTI_INPUT }, + { HID_USB_DEVICE(USB_VENDOR_ID_UGTIZER, USB_DEVICE_ID_UGTIZER_TABLET_WP5540), HID_QUIRK_MULTI_INPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_WALTOP, USB_DEVICE_ID_WALTOP_MEDIA_TABLET_10_6_INCH), HID_QUIRK_MULTI_INPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_WALTOP, USB_DEVICE_ID_WALTOP_MEDIA_TABLET_14_1_INCH), HID_QUIRK_MULTI_INPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_WALTOP, USB_DEVICE_ID_WALTOP_SIRIUS_BATTERY_FREE_TABLET), HID_QUIRK_MULTI_INPUT },
From: Sean Christopherson seanjc@google.com
commit dd4589eee99db8f61f7b8f7df1531cad3f74a64d upstream.
Remove a WARN on an "AVIC IPI invalid target" exit, the WARN is trivial to trigger from guest as it will fail on any destination APIC ID that doesn't exist from the guest's perspective.
Don't bother recording anything in the kernel log, the common tracepoint for kvm_avic_incomplete_ipi() is sufficient for debugging.
This reverts commit 37ef0c4414c9743ba7f1af4392f0a27a99649f2a.
Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20220204214205.3306634-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/avic.c | 2 -- 1 file changed, 2 deletions(-)
--- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -342,8 +342,6 @@ int avic_incomplete_ipi_interception(str avic_kick_target_vcpus(vcpu->kvm, apic, icrl, icrh); break; case AVIC_IPI_FAILURE_INVALID_TARGET: - WARN_ONCE(1, "Invalid IPI target: index=%u, vcpu=%d, icr=%#0x:%#0x\n", - index, vcpu->vcpu_id, icrh, icrl); break; case AVIC_IPI_FAILURE_INVALID_BACKING_PAGE: WARN_ONCE(1, "Invalid backing page\n");
From: Helge Deller deller@gmx.de
commit b160628e9ebcdc85d0db9d7f423c26b3c7c179d0 upstream.
It happens quite often that people use the wrong compiler to build the kernel:
make ARCH=parisc -> builds the 32-bit kernel make ARCH=parisc64 -> builds the 64-bit kernel
This patch adds a sanity check which errors out with an instruction how use the correct ARCH= option.
Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/include/asm/bitops.h | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/arch/parisc/include/asm/bitops.h +++ b/arch/parisc/include/asm/bitops.h @@ -12,6 +12,14 @@ #include <asm/barrier.h> #include <linux/atomic.h>
+/* compiler build environment sanity checks: */ +#if !defined(CONFIG_64BIT) && defined(__LP64__) +#error "Please use 'ARCH=parisc' to build the 32-bit kernel." +#endif +#if defined(CONFIG_64BIT) && !defined(__LP64__) +#error "Please use 'ARCH=parisc64' to build the 64-bit kernel." +#endif + /* See http://marc.theaimsgroup.com/?t=108826637900003 for discussion * on use of volatile and __*_bit() (set/clear/change): * *_bit() want use of volatile.
From: Randy Dunlap rdunlap@infradead.org
commit 6e8793674bb0d1135ca0e5c9f7e16fecbf815926 upstream.
There is a build error when using a kernel .config file from 'kernel test robot' for a different build problem:
hppa64-linux-ld: drivers/tty/serial/8250/8250_gsc.o: in function `.LC3': (.data.rel.ro+0x18): undefined reference to `iosapic_serial_irq'
when: CONFIG_GSC=y CONFIG_SERIO_GSCPS2=y CONFIG_SERIAL_8250_GSC=y CONFIG_PCI is not set and hence PCI_LBA is not set. IOSAPIC depends on PCI_LBA, so IOSAPIC is not set/enabled.
Make the use of iosapic_serial_irq() conditional to fix the build error.
Signed-off-by: Randy Dunlap rdunlap@infradead.org Reported-by: kernel test robot lkp@intel.com Cc: "James E.J. Bottomley" James.Bottomley@HansenPartnership.com Cc: Helge Deller deller@gmx.de Cc: linux-parisc@vger.kernel.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: linux-serial@vger.kernel.org Cc: Jiri Slaby jirislaby@kernel.org Cc: Johan Hovold johan@kernel.org Suggested-by: Helge Deller deller@gmx.de Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/8250/8250_gsc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/tty/serial/8250/8250_gsc.c +++ b/drivers/tty/serial/8250/8250_gsc.c @@ -26,7 +26,7 @@ static int __init serial_init_chip(struc unsigned long address; int err;
-#ifdef CONFIG_64BIT +#if defined(CONFIG_64BIT) && defined(CONFIG_IOSAPIC) if (!dev->irq && (dev->id.sversion == 0xad)) dev->irq = iosapic_serial_irq(dev); #endif
From: John David Anglin dave.anglin@bell.net
commit 9129886b88185962538180625ca8051362b01327 upstream.
With huge kernel pages, we randomly eat a SPARC in map_pages(). This is fixed by dropping __init from the declaration.
However, map_pages references the __init routine memblock_alloc_try_nid via memblock_alloc. Thus, it needs to be marked with __ref.
memblock_alloc is only called before the kernel text is set to readonly.
The __ref on free_initmem is no longer needed.
Comment regarding map_pages being in the init section is removed.
Signed-off-by: John David Anglin dave.anglin@bell.net Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/mm/init.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
--- a/arch/parisc/mm/init.c +++ b/arch/parisc/mm/init.c @@ -341,9 +341,9 @@ static void __init setup_bootmem(void)
static bool kernel_set_to_readonly;
-static void __init map_pages(unsigned long start_vaddr, - unsigned long start_paddr, unsigned long size, - pgprot_t pgprot, int force) +static void __ref map_pages(unsigned long start_vaddr, + unsigned long start_paddr, unsigned long size, + pgprot_t pgprot, int force) { pmd_t *pmd; pte_t *pg_table; @@ -453,7 +453,7 @@ void __init set_kernel_text_rw(int enabl flush_tlb_all(); }
-void __ref free_initmem(void) +void free_initmem(void) { unsigned long init_begin = (unsigned long)__init_begin; unsigned long init_end = (unsigned long)__init_end; @@ -467,7 +467,6 @@ void __ref free_initmem(void) /* The init text pages are marked R-X. We have to * flush the icache and mark them RW- * - * This is tricky, because map_pages is in the init section. * Do a dummy remap of the data section first (the data * section is already PAGE_KERNEL) to pull in the TLB entries * for map_kernel */
From: John David Anglin dave.anglin@bell.net
commit b7d6f44a0fa716a82969725516dc0b16bc7cd514 upstream.
Rolf Eike Beer reported the following bug:
[1274934.746891] Bad Address (null pointer deref?): Code=15 (Data TLB miss fault) at addr 0000004140000018 [1274934.746891] CPU: 3 PID: 5549 Comm: cmake Not tainted 5.15.4-gentoo-parisc64 #4 [1274934.746891] Hardware name: 9000/785/C8000 [1274934.746891] [1274934.746891] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI [1274934.746891] PSW: 00001000000001001111111000001110 Not tainted [1274934.746891] r00-03 000000ff0804fe0e 0000000040bc9bc0 00000000406760e4 0000004140000000 [1274934.746891] r04-07 0000000040b693c0 0000004140000000 000000004a2b08b0 0000000000000001 [1274934.746891] r08-11 0000000041f98810 0000000000000000 000000004a0a7000 0000000000000001 [1274934.746891] r12-15 0000000040bddbc0 0000000040c0cbc0 0000000040bddbc0 0000000040bddbc0 [1274934.746891] r16-19 0000000040bde3c0 0000000040bddbc0 0000000040bde3c0 0000000000000007 [1274934.746891] r20-23 0000000000000006 000000004a368950 0000000000000000 0000000000000001 [1274934.746891] r24-27 0000000000001fff 000000000800000e 000000004a1710f0 0000000040b693c0 [1274934.746891] r28-31 0000000000000001 0000000041f988b0 0000000041f98840 000000004a171118 [1274934.746891] sr00-03 00000000066e5800 0000000000000000 0000000000000000 00000000066e5800 [1274934.746891] sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [1274934.746891] [1274934.746891] IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000406760e8 00000000406760ec [1274934.746891] IIR: 48780030 ISR: 0000000000000000 IOR: 0000004140000018 [1274934.746891] CPU: 3 CR30: 00000040e3a9c000 CR31: ffffffffffffffff [1274934.746891] ORIG_R28: 0000000040acdd58 [1274934.746891] IAOQ[0]: sba_unmap_sg+0xb0/0x118 [1274934.746891] IAOQ[1]: sba_unmap_sg+0xb4/0x118 [1274934.746891] RP(r2): sba_unmap_sg+0xac/0x118 [1274934.746891] Backtrace: [1274934.746891] [<00000000402740cc>] dma_unmap_sg_attrs+0x6c/0x70 [1274934.746891] [<000000004074d6bc>] scsi_dma_unmap+0x54/0x60 [1274934.746891] [<00000000407a3488>] mptscsih_io_done+0x150/0xd70 [1274934.746891] [<0000000040798600>] mpt_interrupt+0x168/0xa68 [1274934.746891] [<0000000040255a48>] __handle_irq_event_percpu+0xc8/0x278 [1274934.746891] [<0000000040255c34>] handle_irq_event_percpu+0x3c/0xd8 [1274934.746891] [<000000004025ecb4>] handle_percpu_irq+0xb4/0xf0 [1274934.746891] [<00000000402548e0>] generic_handle_irq+0x50/0x70 [1274934.746891] [<000000004019a254>] call_on_stack+0x18/0x24 [1274934.746891] [1274934.746891] Kernel panic - not syncing: Bad Address (null pointer deref?)
The bug is caused by overrunning the sglist and incorrectly testing sg_dma_len(sglist) before nents. Normally this doesn't cause a crash, but in this case sglist crossed a page boundary. This occurs in the following code:
while (sg_dma_len(sglist) && nents--) {
The fix is simply to test nents first and move the decrement of nents into the loop.
Reported-by: Rolf Eike Beer eike-kernel@sf-tec.de Signed-off-by: John David Anglin dave.anglin@bell.net Cc: stable@vger.kernel.org Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/parisc/sba_iommu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/parisc/sba_iommu.c +++ b/drivers/parisc/sba_iommu.c @@ -1047,7 +1047,7 @@ sba_unmap_sg(struct device *dev, struct spin_unlock_irqrestore(&ioc->res_lock, flags); #endif
- while (sg_dma_len(sglist) && nents--) { + while (nents && sg_dma_len(sglist)) {
sba_unmap_page(dev, sg_dma_address(sglist), sg_dma_len(sglist), direction, 0); @@ -1056,6 +1056,7 @@ sba_unmap_sg(struct device *dev, struct ioc->usingle_calls--; /* kluge since call is unmap_sg() */ #endif ++sglist; + nents--; }
DBG_RUN_SG("%s() DONE (nents %d)\n", __func__, nents);
From: John David Anglin dave.anglin@bell.net
commit d7da660cab47183cded65e11b64497d0f56c6edf upstream.
This patch implements the same bug fix to ccio-dma.c as to sba_iommu.c. It ensures that only the allocated entries of the sglist are accessed.
Signed-off-by: John David Anglin dave.anglin@bell.net Cc: stable@vger.kernel.org Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/parisc/ccio-dma.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/parisc/ccio-dma.c +++ b/drivers/parisc/ccio-dma.c @@ -1003,7 +1003,7 @@ ccio_unmap_sg(struct device *dev, struct ioc->usg_calls++; #endif
- while(sg_dma_len(sglist) && nents--) { + while (nents && sg_dma_len(sglist)) {
#ifdef CCIO_COLLECT_STATS ioc->usg_pages += sg_dma_len(sglist) >> PAGE_SHIFT; @@ -1011,6 +1011,7 @@ ccio_unmap_sg(struct device *dev, struct ccio_unmap_page(dev, sg_dma_address(sglist), sg_dma_len(sglist), direction, 0); ++sglist; + nents--; }
DBG_RUN_SG("%s() DONE (nents %d)\n", __func__, nents);
From: Christian Löhle CLoehle@hyperstone.com
commit 54309fde1a352ad2674ebba004a79f7d20b9f037 upstream.
On reads with MMC_READ_MULTIPLE_BLOCK that fail, the recovery handler will use MMC_READ_SINGLE_BLOCK for each of the blocks, up to MMC_READ_SINGLE_RETRIES times each. The logic for this is fixed to never report unsuccessful reads as success to the block layer.
On command error with retries remaining, blk_update_request was called with whatever value error was set last to. In case it was last set to BLK_STS_OK (default), the read will be reported as success, even though there was no data read from the device. This could happen on a CRC mismatch for the response, a card rejecting the command (e.g. again due to a CRC mismatch). In case it was last set to BLK_STS_IOERR, the error is reported correctly, but no retries will be attempted.
Fixes: 81196976ed946c ("mmc: block: Add blk-mq support") Cc: stable@vger.kernel.org Signed-off-by: Christian Loehle cloehle@hyperstone.com Reviewed-by: Adrian Hunter adrian.hunter@intel.com Link: https://lore.kernel.org/r/bc706a6ab08c4fe2834ba0c05a804672@hyperstone.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/block.c | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-)
--- a/drivers/mmc/core/block.c +++ b/drivers/mmc/core/block.c @@ -1682,31 +1682,31 @@ static void mmc_blk_read_single(struct m struct mmc_card *card = mq->card; struct mmc_host *host = card->host; blk_status_t error = BLK_STS_OK; - int retries = 0;
do { u32 status; int err; + int retries = 0;
- mmc_blk_rw_rq_prep(mqrq, card, 1, mq); + while (retries++ <= MMC_READ_SINGLE_RETRIES) { + mmc_blk_rw_rq_prep(mqrq, card, 1, mq);
- mmc_wait_for_req(host, mrq); + mmc_wait_for_req(host, mrq);
- err = mmc_send_status(card, &status); - if (err) - goto error_exit; - - if (!mmc_host_is_spi(host) && - !mmc_ready_for_data(status)) { - err = mmc_blk_fix_state(card, req); + err = mmc_send_status(card, &status); if (err) goto error_exit; - }
- if (mrq->cmd->error && retries++ < MMC_READ_SINGLE_RETRIES) - continue; + if (!mmc_host_is_spi(host) && + !mmc_ready_for_data(status)) { + err = mmc_blk_fix_state(card, req); + if (err) + goto error_exit; + }
- retries = 0; + if (!mrq->cmd->error) + break; + }
if (mrq->cmd->error || mrq->data->error ||
From: Linus Torvalds torvalds@linux-foundation.org
commit 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6 upstream.
Oded Gabbay reports that enabling NUMA balancing causes corruption with his Gaudi accelerator test load:
"All the details are in the bug, but the bottom line is that somehow, this patch causes corruption when the numa balancing feature is enabled AND we don't use process affinity AND we use GUP to pin pages so our accelerator can DMA to/from system memory.
Either disabling numa balancing, using process affinity to bind to specific numa-node or reverting this patch causes the bug to disappear"
and Oded bisected the issue to commit 09854ba94c6a ("mm: do_wp_page() simplification").
Now, the NUMA balancing shouldn't actually be changing the writability of a page, and as such shouldn't matter for COW. But it appears it does. Suspicious.
However, regardless of that, the condition for enabling NUMA faults in change_pte_range() is nonsensical. It uses "page_mapcount(page)" to decide if a COW page should be NUMA-protected or not, and that makes absolutely no sense.
The number of mappings a page has is irrelevant: not only does GUP get a reference to a page as in Oded's case, but the other mappings migth be paged out and the only reference to them would be in the page count.
Since we should never try to NUMA-balance a page that we can't move anyway due to other references, just fix the code to use 'page_count()'. Oded confirms that that fixes his issue.
Now, this does imply that something in NUMA balancing ends up changing page protections (other than the obvious one of making the page inaccessible to get the NUMA faulting information). Otherwise the COW simplification wouldn't matter - since doing the GUP on the page would make sure it's writable.
The cause of that permission change would be good to figure out too, since it clearly results in spurious COW events - but fixing the nonsensical test that just happened to work before is obviously the CorrectThing(tm) to do regardless.
Fixes: 09854ba94c6a ("mm: do_wp_page() simplification") Link: https://bugzilla.kernel.org/show_bug.cgi?id=215616 Link: https://lore.kernel.org/all/CAFCwf10eNmwq2wD71xjUhqkvv5+_pJMR1nPug2RqNDcFT4H... Reported-and-tested-by: Oded Gabbay oded.gabbay@gmail.com Cc: David Hildenbrand david@redhat.com Cc: Peter Xu peterx@redhat.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/mprotect.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -94,7 +94,7 @@ static unsigned long change_pte_range(st
/* Also skip shared copy-on-write pages */ if (is_cow_mapping(vma->vm_flags) && - page_mapcount(page) != 1) + page_count(page) != 1) continue;
/*
From: Basavaraj Natikar Basavaraj.Natikar@amd.com
commit 91aaea527bc3b707c5d3208cde035421ed54f79c upstream.
ALS illuminance value present only in first 15 bits from SFH firmware for V2 platforms. Hence added a mask of 15 bit to limit ALS max illuminance values to get correct illuminance value.
Fixes: 0aad9c95eb9a ("HID: amd_sfh: Extend ALS support for newer AMD platform") Signed-off-by: Basavaraj Natikar Basavaraj.Natikar@amd.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/amd-sfh-hid/hid_descriptor/amd_sfh_hid_desc.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/hid/amd-sfh-hid/hid_descriptor/amd_sfh_hid_desc.c +++ b/drivers/hid/amd-sfh-hid/hid_descriptor/amd_sfh_hid_desc.c @@ -26,6 +26,7 @@ #define HID_USAGE_SENSOR_STATE_READY_ENUM 0x02 #define HID_USAGE_SENSOR_STATE_INITIALIZING_ENUM 0x05 #define HID_USAGE_SENSOR_EVENT_DATA_UPDATED_ENUM 0x04 +#define ILLUMINANCE_MASK GENMASK(14, 0)
int get_report_descriptor(int sensor_idx, u8 *rep_desc) { @@ -245,7 +246,8 @@ u8 get_input_report(u8 current_index, in get_common_inputs(&als_input.common_property, report_id); /* For ALS ,V2 Platforms uses C2P_MSG5 register instead of DRAM access method */ if (supported_input == V2_STATUS) - als_input.illuminance_value = (int)readl(privdata->mmio + AMD_C2P_MSG(5)); + als_input.illuminance_value = + readl(privdata->mmio + AMD_C2P_MSG(5)) & ILLUMINANCE_MASK; else als_input.illuminance_value = (int)sensor_virt_addr[0] / AMD_SFH_FW_MULTIPLIER;
From: Daniel Thompson daniel.thompson@linaro.org
commit 2787710f73fcce4a9bdab540aaf1aef778a27462 upstream.
I'm was on the receiving end of a lockdep splat from this driver and after scratching my head I couldn't be entirely sure it was a false positive given we would also have to think about whether the regulator locking is safe (since the notifier is called whilst holding regulator locks which are also needed for regulator_is_enabled() ).
Regardless of whether it is a real bug or not, the mutex isn't needed. We can use reference counting tricks instead to avoid races with the notifier calls.
The observed splat follows:
------------------------------------------------------ kworker/u16:3/127 is trying to acquire lock: ffff00008021fb20 (&ihid_goodix->regulator_mutex){+.+.}-{4:4}, at: ihid_goodix_vdd_notify+0x30/0x94
but task is already holding lock: ffff0000835c60c0 (&(&rdev->notifier)->rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x30/0x70
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&(&rdev->notifier)->rwsem){++++}-{4:4}: down_write+0x68/0x8c blocking_notifier_chain_register+0x54/0x70 regulator_register_notifier+0x1c/0x24 devm_regulator_register_notifier+0x58/0x98 i2c_hid_of_goodix_probe+0xdc/0x158 i2c_device_probe+0x25d/0x270 really_probe+0x174/0x2cc __driver_probe_device+0xc0/0xd8 driver_probe_device+0x50/0xe4 __device_attach_driver+0xa8/0xc0 bus_for_each_drv+0x9c/0xc0 __device_attach_async_helper+0x6c/0xbc async_run_entry_fn+0x38/0x100 process_one_work+0x294/0x438 worker_thread+0x180/0x258 kthread+0x120/0x130 ret_from_fork+0x10/0x20
-> #0 (&ihid_goodix->regulator_mutex){+.+.}-{4:4}: __lock_acquire+0xd24/0xfe8 lock_acquire+0x288/0x2f4 __mutex_lock+0xa0/0x338 mutex_lock_nested+0x3c/0x5c ihid_goodix_vdd_notify+0x30/0x94 notifier_call_chain+0x6c/0x8c blocking_notifier_call_chain+0x48/0x70 _notifier_call_chain.isra.0+0x18/0x20 _regulator_enable+0xc0/0x178 regulator_enable+0x40/0x7c goodix_i2c_hid_power_up+0x18/0x20 i2c_hid_core_power_up.isra.0+0x1c/0x2c i2c_hid_core_probe+0xd8/0x3d4 i2c_hid_of_goodix_probe+0x14c/0x158 i2c_device_probe+0x25c/0x270 really_probe+0x174/0x2cc __driver_probe_device+0xc0/0xd8 driver_probe_device+0x50/0xe4 __device_attach_driver+0xa8/0xc0 bus_for_each_drv+0x9c/0xc0 __device_attach_async_helper+0x6c/0xbc async_run_entry_fn+0x38/0x100 process_one_work+0x294/0x438 worker_thread+0x180/0x258 kthread+0x120/0x130 ret_from_fork+0x10/0x20
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&(&rdev->notifier)->rwsem); lock(&ihid_goodix->regulator_mutex); lock(&(&rdev->notifier)->rwsem); lock(&ihid_goodix->regulator_mutex);
*** DEADLOCK ***
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Fixes: 18eeef46d359 ("HID: i2c-hid: goodix: Tie the reset line to true state of the regulator") Reviewed-by: Douglas Anderson dianders@chromium.org Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/i2c-hid/i2c-hid-of-goodix.c | 28 ++++++++++++---------------- 1 file changed, 12 insertions(+), 16 deletions(-)
--- a/drivers/hid/i2c-hid/i2c-hid-of-goodix.c +++ b/drivers/hid/i2c-hid/i2c-hid-of-goodix.c @@ -27,7 +27,6 @@ struct i2c_hid_of_goodix {
struct regulator *vdd; struct notifier_block nb; - struct mutex regulator_mutex; struct gpio_desc *reset_gpio; const struct goodix_i2c_hid_timing_data *timings; }; @@ -67,8 +66,6 @@ static int ihid_goodix_vdd_notify(struct container_of(nb, struct i2c_hid_of_goodix, nb); int ret = NOTIFY_OK;
- mutex_lock(&ihid_goodix->regulator_mutex); - switch (event) { case REGULATOR_EVENT_PRE_DISABLE: gpiod_set_value_cansleep(ihid_goodix->reset_gpio, 1); @@ -87,8 +84,6 @@ static int ihid_goodix_vdd_notify(struct break; }
- mutex_unlock(&ihid_goodix->regulator_mutex); - return ret; }
@@ -102,8 +97,6 @@ static int i2c_hid_of_goodix_probe(struc if (!ihid_goodix) return -ENOMEM;
- mutex_init(&ihid_goodix->regulator_mutex); - ihid_goodix->ops.power_up = goodix_i2c_hid_power_up; ihid_goodix->ops.power_down = goodix_i2c_hid_power_down;
@@ -130,25 +123,28 @@ static int i2c_hid_of_goodix_probe(struc * long. Holding the controller in reset apparently draws extra * power. */ - mutex_lock(&ihid_goodix->regulator_mutex); ihid_goodix->nb.notifier_call = ihid_goodix_vdd_notify; ret = devm_regulator_register_notifier(ihid_goodix->vdd, &ihid_goodix->nb); - if (ret) { - mutex_unlock(&ihid_goodix->regulator_mutex); + if (ret) return dev_err_probe(&client->dev, ret, "regulator notifier request failed\n"); - }
/* * If someone else is holding the regulator on (or the regulator is * an always-on one) we might never be told to deassert reset. Do it - * now. Here we'll assume that someone else might have _just - * barely_ turned the regulator on so we'll do the full - * "post_power_delay" just in case. + * now... and temporarily bump the regulator reference count just to + * make sure it is impossible for this to race with our own notifier! + * We also assume that someone else might have _just barely_ turned + * the regulator on so we'll do the full "post_power_delay" just in + * case. */ - if (ihid_goodix->reset_gpio && regulator_is_enabled(ihid_goodix->vdd)) + if (ihid_goodix->reset_gpio && regulator_is_enabled(ihid_goodix->vdd)) { + ret = regulator_enable(ihid_goodix->vdd); + if (ret) + return ret; goodix_i2c_hid_deassert_reset(ihid_goodix, true); - mutex_unlock(&ihid_goodix->regulator_mutex); + regulator_disable(ihid_goodix->vdd); + }
return i2c_hid_core_probe(client, &ihid_goodix->ops, 0x0001, 0); }
From: Basavaraj Natikar Basavaraj.Natikar@amd.com
commit a7072c01c3ac3ae6ecd08fa7b43431cfc8ed331f upstream.
HPD sensors take more time to initialize. Hence increasing sensor command timeout to get response with status within a max timeout.
Fixes: 173709f50e98 ("HID: amd_sfh: Add command response to check command status") Signed-off-by: Basavaraj Natikar Basavaraj.Natikar@amd.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/amd-sfh-hid/amd_sfh_pcie.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c +++ b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c @@ -36,11 +36,11 @@ static int amd_sfh_wait_response_v2(stru { union cmd_response cmd_resp;
- /* Get response with status within a max of 800 ms timeout */ + /* Get response with status within a max of 1600 ms timeout */ if (!readl_poll_timeout(mp2->mmio + AMD_P2C_MSG(0), cmd_resp.resp, (cmd_resp.response_v2.response == sensor_sts && cmd_resp.response_v2.status == 0 && (sid == 0xff || - cmd_resp.response_v2.sensor_id == sid)), 500, 800000)) + cmd_resp.response_v2.sensor_id == sid)), 500, 1600000)) return cmd_resp.response_v2.response;
return SENSOR_DISABLED;
From: Basavaraj Natikar Basavaraj.Natikar@amd.com
commit aa0b724a2bf041036e56cbb3b4b3afde7c5e7c9e upstream.
Misinterpreted intr_enable field name. Hence correct the structure field name accordingly to reflect the functionality.
Fixes: f264481ad614 ("HID: amd_sfh: Extend driver capabilities for multi-generation support") Signed-off-by: Basavaraj Natikar Basavaraj.Natikar@amd.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/amd-sfh-hid/amd_sfh_pcie.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.h +++ b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.h @@ -48,7 +48,7 @@ union sfh_cmd_base { } s; struct { u32 cmd_id : 4; - u32 intr_enable : 1; + u32 intr_disable : 1; u32 rsvd1 : 3; u32 length : 7; u32 mem_type : 1;
From: Long Li longli@microsoft.com
commit 3149efcdf2c6314420c418dfc94de53bfd076b1f upstream.
When kernel boots with a NUMA topology with some NUMA nodes offline, the PCI driver should only set an online NUMA node on the device. This can happen during KDUMP where some NUMA nodes are not made online by the KDUMP kernel.
This patch also fixes the case where kernel is booting with "numa=off".
Fixes: 999dd956d838 ("PCI: hv: Add support for protocol 1.3 and support PCI_BUS_RELATIONS2") Signed-off-by: Long Li longli@microsoft.com Reviewed-by: Michael Kelley mikelley@microsoft.com Tested-by: Purna Pavan Chandra Aekkaladevi paekkaladevi@microsoft.com Acked-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Link: https://lore.kernel.org/r/1643247814-15184-1-git-send-email-longli@linuxonhy... Signed-off-by: Wei Liu wei.liu@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-hyperv.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
--- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -1899,8 +1899,17 @@ static void hv_pci_assign_numa_node(stru if (!hv_dev) continue;
- if (hv_dev->desc.flags & HV_PCI_DEVICE_FLAG_NUMA_AFFINITY) - set_dev_node(&dev->dev, hv_dev->desc.virtual_numa_node); + if (hv_dev->desc.flags & HV_PCI_DEVICE_FLAG_NUMA_AFFINITY && + hv_dev->desc.virtual_numa_node < num_possible_nodes()) + /* + * The kernel may boot with some NUMA nodes offline + * (e.g. in a KDUMP kernel) or with NUMA disabled via + * "numa=off". In those cases, adjust the host provided + * NUMA node to a valid NUMA node used by the kernel. + */ + set_dev_node(&dev->dev, + numa_map_to_online_node( + hv_dev->desc.virtual_numa_node));
put_pcichild(hv_dev); }
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit 18a1d5e1945385d9b5adc3fe11427ce4a9d2826e upstream.
It's a followup to the previous commit f15309d7ad5d ("parisc: Add ioread64_hi_lo() and iowrite64_hi_lo()") which does only half of the job. Add the rest, so we won't get a new kernel test robot reports.
Fixes: f15309d7ad5d ("parisc: Add ioread64_hi_lo() and iowrite64_hi_lo()") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/lib/iomap.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
--- a/arch/parisc/lib/iomap.c +++ b/arch/parisc/lib/iomap.c @@ -346,6 +346,16 @@ u64 ioread64be(const void __iomem *addr) return *((u64 *)addr); }
+u64 ioread64_lo_hi(const void __iomem *addr) +{ + u32 low, high; + + low = ioread32(addr); + high = ioread32(addr + sizeof(u32)); + + return low + ((u64)high << 32); +} + u64 ioread64_hi_lo(const void __iomem *addr) { u32 low, high; @@ -419,6 +429,12 @@ void iowrite64be(u64 datum, void __iomem } }
+void iowrite64_lo_hi(u64 val, void __iomem *addr) +{ + iowrite32(val, addr); + iowrite32(val >> 32, addr + sizeof(u32)); +} + void iowrite64_hi_lo(u64 val, void __iomem *addr) { iowrite32(val >> 32, addr + sizeof(u32)); @@ -530,6 +546,7 @@ EXPORT_SYMBOL(ioread32); EXPORT_SYMBOL(ioread32be); EXPORT_SYMBOL(ioread64); EXPORT_SYMBOL(ioread64be); +EXPORT_SYMBOL(ioread64_lo_hi); EXPORT_SYMBOL(ioread64_hi_lo); EXPORT_SYMBOL(iowrite8); EXPORT_SYMBOL(iowrite16); @@ -538,6 +555,7 @@ EXPORT_SYMBOL(iowrite32); EXPORT_SYMBOL(iowrite32be); EXPORT_SYMBOL(iowrite64); EXPORT_SYMBOL(iowrite64be); +EXPORT_SYMBOL(iowrite64_lo_hi); EXPORT_SYMBOL(iowrite64_hi_lo); EXPORT_SYMBOL(ioread8_rep); EXPORT_SYMBOL(ioread16_rep);
From: Dāvis Mosāns davispuh@gmail.com
commit 2e7be9db125a0bf940c5d65eb5c40d8700f738b5 upstream.
Currently if we get IO error while doing send then we abort without logging information about which file caused issue. So log it to help with debugging.
CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Dāvis Mosāns davispuh@gmail.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/send.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -4978,6 +4978,10 @@ static int put_file_data(struct send_ctx lock_page(page); if (!PageUptodate(page)) { unlock_page(page); + btrfs_err(fs_info, + "send: IO error at offset %llu for inode %llu root %llu", + page_offset(page), sctx->cur_ino, + sctx->send_root->root_key.objectid); put_page(page); ret = -EIO; break;
From: Yuka Kawajiri yukx00@gmail.com
[ Upstream commit 512eb73cfd1208898cf10cb06094e0ee0bb53b58 ]
Add touchscreen info for RWC NANOTE P8 (AY07J) 2-in-1.
Signed-off-by: Yuka Kawajiri yukx00@gmail.com Link: https://lore.kernel.org/r/20220111154019.4599-1-yukx00@gmail.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/touchscreen_dmi.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
diff --git a/drivers/platform/x86/touchscreen_dmi.c b/drivers/platform/x86/touchscreen_dmi.c index 033f797861d8a..c608078538a79 100644 --- a/drivers/platform/x86/touchscreen_dmi.c +++ b/drivers/platform/x86/touchscreen_dmi.c @@ -773,6 +773,21 @@ static const struct ts_dmi_data predia_basic_data = { .properties = predia_basic_props, };
+static const struct property_entry rwc_nanote_p8_props[] = { + PROPERTY_ENTRY_U32("touchscreen-min-y", 46), + PROPERTY_ENTRY_U32("touchscreen-size-x", 1728), + PROPERTY_ENTRY_U32("touchscreen-size-y", 1140), + PROPERTY_ENTRY_BOOL("touchscreen-inverted-y"), + PROPERTY_ENTRY_STRING("firmware-name", "gsl1680-rwc-nanote-p8.fw"), + PROPERTY_ENTRY_U32("silead,max-fingers", 10), + { } +}; + +static const struct ts_dmi_data rwc_nanote_p8_data = { + .acpi_name = "MSSL1680:00", + .properties = rwc_nanote_p8_props, +}; + static const struct property_entry schneider_sct101ctm_props[] = { PROPERTY_ENTRY_U32("touchscreen-size-x", 1715), PROPERTY_ENTRY_U32("touchscreen-size-y", 1140), @@ -1379,6 +1394,15 @@ const struct dmi_system_id touchscreen_dmi_table[] = { DMI_EXACT_MATCH(DMI_BOARD_NAME, "0E57"), }, }, + { + /* RWC NANOTE P8 */ + .driver_data = (void *)&rwc_nanote_p8_data, + .matches = { + DMI_MATCH(DMI_BOARD_VENDOR, "Default string"), + DMI_MATCH(DMI_PRODUCT_NAME, "AY07J"), + DMI_MATCH(DMI_PRODUCT_SKU, "0001") + }, + }, { /* Schneider SCT101CTM */ .driver_data = (void *)&schneider_sct101ctm_data,
From: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com
[ Upstream commit 17da2d5f93692086dd096a975225ffd5622d0bf8 ]
As reported:
[ 256.104522] ====================================================== [ 256.113783] WARNING: possible circular locking dependency detected [ 256.120093] 5.16.0-rc6-yocto-standard+ #99 Not tainted [ 256.125362] ------------------------------------------------------ [ 256.131673] intel-speed-sel/844 is trying to acquire lock: [ 256.137290] ffffffffc036f0d0 (punit_misc_dev_lock){+.+.}-{3:3}, at: isst_if_open+0x18/0x90 [isst_if_common] [ 256.147171] [ 256.147171] but task is already holding lock: [ 256.153135] ffffffff8ee7cb50 (misc_mtx){+.+.}-{3:3}, at: misc_open+0x2a/0x170 [ 256.160407] [ 256.160407] which lock already depends on the new lock. [ 256.160407] [ 256.168712] [ 256.168712] the existing dependency chain (in reverse order) is: [ 256.176327] [ 256.176327] -> #1 (misc_mtx){+.+.}-{3:3}: [ 256.181946] lock_acquire+0x1e6/0x330 [ 256.186265] __mutex_lock+0x9b/0x9b0 [ 256.190497] mutex_lock_nested+0x1b/0x20 [ 256.195075] misc_register+0x32/0x1a0 [ 256.199390] isst_if_cdev_register+0x65/0x180 [isst_if_common] [ 256.205878] isst_if_probe+0x144/0x16e [isst_if_mmio] ... [ 256.241976] [ 256.241976] -> #0 (punit_misc_dev_lock){+.+.}-{3:3}: [ 256.248552] validate_chain+0xbc6/0x1750 [ 256.253131] __lock_acquire+0x88c/0xc10 [ 256.257618] lock_acquire+0x1e6/0x330 [ 256.261933] __mutex_lock+0x9b/0x9b0 [ 256.266165] mutex_lock_nested+0x1b/0x20 [ 256.270739] isst_if_open+0x18/0x90 [isst_if_common] [ 256.276356] misc_open+0x100/0x170 [ 256.280409] chrdev_open+0xa5/0x1e0 ...
The call sequence suggested that misc_device /dev file can be opened before misc device is yet to be registered, which is done only once.
Here punit_misc_dev_lock was used as common lock, to protect the registration by multiple ISST HW drivers, one time setup, prevent duplicate registry of misc device and prevent load/unload when device is open.
We can split into locks: - One which just prevent duplicate call to misc_register() and one time setup. Also never call again if the misc_register() failed or required one time setup is failed. This lock is not shared with any misc device callbacks.
- The other lock protects registry, load and unload of HW drivers.
Sequence in isst_if_cdev_register() - Register callbacks under punit_misc_dev_open_lock - Call isst_misc_reg() which registers misc_device on the first registry which is under punit_misc_dev_reg_lock, which is not shared with callbacks.
Sequence in isst_if_cdev_unregister Just opposite of isst_if_cdev_register
Reported-and-tested-by: Liwei Song liwei.song@windriver.com Signed-off-by: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com Link: https://lore.kernel.org/r/20220112022521.54669-1-srinivas.pandruvada@linux.i... Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../intel/speed_select_if/isst_if_common.c | 97 ++++++++++++------- 1 file changed, 63 insertions(+), 34 deletions(-)
diff --git a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c index c9a85eb2e8600..e8424e70d81d2 100644 --- a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c +++ b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c @@ -596,7 +596,10 @@ static long isst_if_def_ioctl(struct file *file, unsigned int cmd, return ret; }
-static DEFINE_MUTEX(punit_misc_dev_lock); +/* Lock to prevent module registration when already opened by user space */ +static DEFINE_MUTEX(punit_misc_dev_open_lock); +/* Lock to allow one share misc device for all ISST interace */ +static DEFINE_MUTEX(punit_misc_dev_reg_lock); static int misc_usage_count; static int misc_device_ret; static int misc_device_open; @@ -606,7 +609,7 @@ static int isst_if_open(struct inode *inode, struct file *file) int i, ret = 0;
/* Fail open, if a module is going away */ - mutex_lock(&punit_misc_dev_lock); + mutex_lock(&punit_misc_dev_open_lock); for (i = 0; i < ISST_IF_DEV_MAX; ++i) { struct isst_if_cmd_cb *cb = &punit_callbacks[i];
@@ -628,7 +631,7 @@ static int isst_if_open(struct inode *inode, struct file *file) } else { misc_device_open++; } - mutex_unlock(&punit_misc_dev_lock); + mutex_unlock(&punit_misc_dev_open_lock);
return ret; } @@ -637,7 +640,7 @@ static int isst_if_relase(struct inode *inode, struct file *f) { int i;
- mutex_lock(&punit_misc_dev_lock); + mutex_lock(&punit_misc_dev_open_lock); misc_device_open--; for (i = 0; i < ISST_IF_DEV_MAX; ++i) { struct isst_if_cmd_cb *cb = &punit_callbacks[i]; @@ -645,7 +648,7 @@ static int isst_if_relase(struct inode *inode, struct file *f) if (cb->registered) module_put(cb->owner); } - mutex_unlock(&punit_misc_dev_lock); + mutex_unlock(&punit_misc_dev_open_lock);
return 0; } @@ -662,6 +665,43 @@ static struct miscdevice isst_if_char_driver = { .fops = &isst_if_char_driver_ops, };
+static int isst_misc_reg(void) +{ + mutex_lock(&punit_misc_dev_reg_lock); + if (misc_device_ret) + goto unlock_exit; + + if (!misc_usage_count) { + misc_device_ret = isst_if_cpu_info_init(); + if (misc_device_ret) + goto unlock_exit; + + misc_device_ret = misc_register(&isst_if_char_driver); + if (misc_device_ret) { + isst_if_cpu_info_exit(); + goto unlock_exit; + } + } + misc_usage_count++; + +unlock_exit: + mutex_unlock(&punit_misc_dev_reg_lock); + + return misc_device_ret; +} + +static void isst_misc_unreg(void) +{ + mutex_lock(&punit_misc_dev_reg_lock); + if (misc_usage_count) + misc_usage_count--; + if (!misc_usage_count && !misc_device_ret) { + misc_deregister(&isst_if_char_driver); + isst_if_cpu_info_exit(); + } + mutex_unlock(&punit_misc_dev_reg_lock); +} + /** * isst_if_cdev_register() - Register callback for IOCTL * @device_type: The device type this callback handling. @@ -679,38 +719,31 @@ static struct miscdevice isst_if_char_driver = { */ int isst_if_cdev_register(int device_type, struct isst_if_cmd_cb *cb) { - if (misc_device_ret) - return misc_device_ret; + int ret;
if (device_type >= ISST_IF_DEV_MAX) return -EINVAL;
- mutex_lock(&punit_misc_dev_lock); + mutex_lock(&punit_misc_dev_open_lock); + /* Device is already open, we don't want to add new callbacks */ if (misc_device_open) { - mutex_unlock(&punit_misc_dev_lock); + mutex_unlock(&punit_misc_dev_open_lock); return -EAGAIN; } - if (!misc_usage_count) { - int ret; - - misc_device_ret = misc_register(&isst_if_char_driver); - if (misc_device_ret) - goto unlock_exit; - - ret = isst_if_cpu_info_init(); - if (ret) { - misc_deregister(&isst_if_char_driver); - misc_device_ret = ret; - goto unlock_exit; - } - } memcpy(&punit_callbacks[device_type], cb, sizeof(*cb)); punit_callbacks[device_type].registered = 1; - misc_usage_count++; -unlock_exit: - mutex_unlock(&punit_misc_dev_lock); + mutex_unlock(&punit_misc_dev_open_lock);
- return misc_device_ret; + ret = isst_misc_reg(); + if (ret) { + /* + * No need of mutex as the misc device register failed + * as no one can open device yet. Hence no contention. + */ + punit_callbacks[device_type].registered = 0; + return ret; + } + return 0; } EXPORT_SYMBOL_GPL(isst_if_cdev_register);
@@ -725,16 +758,12 @@ EXPORT_SYMBOL_GPL(isst_if_cdev_register); */ void isst_if_cdev_unregister(int device_type) { - mutex_lock(&punit_misc_dev_lock); - misc_usage_count--; + isst_misc_unreg(); + mutex_lock(&punit_misc_dev_open_lock); punit_callbacks[device_type].registered = 0; if (device_type == ISST_IF_DEV_MBOX) isst_delete_hash(); - if (!misc_usage_count && !misc_device_ret) { - misc_deregister(&isst_if_char_driver); - isst_if_cpu_info_exit(); - } - mutex_unlock(&punit_misc_dev_lock); + mutex_unlock(&punit_misc_dev_open_lock); } EXPORT_SYMBOL_GPL(isst_if_cdev_unregister);
From: Michał Winiarski michal.winiarski@intel.com
[ Upstream commit 235528072f28b3b0a1446279b7eaddda36dbf743 ]
Python 3.10.0 contains: 9e09849d20 ("bpo-41006: importlib.util no longer imports typing (GH-20938)")
It causes importlib.util to no longer import importlib.abs, which leads to the following error when trying to use kunit with qemu: AttributeError: module 'importlib' has no attribute 'abc'. Did you mean: '_abc'?
Add the missing import.
Signed-off-by: Michał Winiarski michal.winiarski@intel.com Reviewed-by: Daniel Latypov dlatypov@google.com Reviewed-by: Brendan Higgins brendanhiggins@google.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/kunit/kunit_kernel.py | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py index 2c6f916ccbafa..0874e512d109b 100644 --- a/tools/testing/kunit/kunit_kernel.py +++ b/tools/testing/kunit/kunit_kernel.py @@ -6,6 +6,7 @@ # Author: Felix Guo felixguoxiuping@gmail.com # Author: Brendan Higgins brendanhiggins@google.com
+import importlib.abc import importlib.util import logging import subprocess
From: Nícolas F. R. A. Prado nfraprado@collabora.com
[ Upstream commit f034cc1301e7d83d4ec428dd6b8ffb57ca446efb ]
The timeout setting for the rtc kselftest is currently 90 seconds. This setting is used by the kselftest runner to stop running a test if it takes longer than the assigned value.
However, two of the test cases inside rtc set alarms. These alarms are set to the next beginning of the minute, so each of these test cases may take up to, in the worst case, 60 seconds.
In order to allow for all test cases in rtc to run, even in the worst case, when using the kselftest runner, the timeout value should be increased to at least 120. Set it to 180, so there's some additional slack.
Correct operation can be tested by running the following command right after the start of a minute (low second count), and checking that all test cases run:
./run_kselftest.sh -c rtc
Signed-off-by: Nícolas F. R. A. Prado nfraprado@collabora.com Acked-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/rtc/settings | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/rtc/settings b/tools/testing/selftests/rtc/settings index ba4d85f74cd6b..a953c96aa16e1 100644 --- a/tools/testing/selftests/rtc/settings +++ b/tools/testing/selftests/rtc/settings @@ -1 +1 @@ -timeout=90 +timeout=180
From: Li Zhijian lizhijian@cn.fujitsu.com
[ Upstream commit 92d25637a3a45904292c93f1863c6bbda4e3e38f ]
We have some many cases that will create child process as well, such as pidfd_wait. Previously, we will signal/kill the parent process when it is time out, but this signal will not be sent to its child process. In such case, if child process doesn't terminate itself, ksefltest framework will hang forever.
Here we group all its child processes so that kill() can signal all of them in timeout.
Fixed change log: Shuah Khan skhan@linuxfoundation.org
Suggested-by: yang xu xuyang2018.jy@cn.fujitsu.com Signed-off-by: Li Zhijian lizhijian@cn.fujitsu.com Acked-by: Christian Brauner christian.brauner@ubuntu.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/kselftest_harness.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h index 79a182cfa43ad..78e59620d28de 100644 --- a/tools/testing/selftests/kselftest_harness.h +++ b/tools/testing/selftests/kselftest_harness.h @@ -875,7 +875,8 @@ static void __timeout_handler(int sig, siginfo_t *info, void *ucontext) }
t->timed_out = true; - kill(t->pid, SIGKILL); + // signal process group + kill(-(t->pid), SIGKILL); }
void __wait_for_test(struct __test_metadata *t) @@ -985,6 +986,7 @@ void __run_test(struct __fixture_metadata *f, ksft_print_msg("ERROR SPAWNING TEST CHILD\n"); t->passed = 0; } else if (t->pid == 0) { + setpgrp(); t->fn(t, variant); if (t->skip) _exit(255);
From: Miquel Raynal miquel.raynal@bootlin.com
[ Upstream commit e5ce576d45bf72fd0e3dc37eff897bfcc488f6a9 ]
Upon error the ieee802154_xmit_complete() helper is not called. Only ieee802154_wake_queue() is called manually. In the Tx case we then leak the skb structure.
Free the skb structure upon error before returning when appropriate.
As the 'is_tx = 0' cannot be moved in the complete handler because of a possible race between the delay in switching to STATE_RX_AACK_ON and a new interrupt, we introduce an intermediate 'was_tx' boolean just for this purpose.
There is no Fixes tag applying here, many changes have been made on this area and the issue kind of always existed.
Suggested-by: Alexander Aring alex.aring@gmail.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Acked-by: Alexander Aring aahringo@redhat.com Link: https://lore.kernel.org/r/20220125121426.848337-4-miquel.raynal@bootlin.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ieee802154/at86rf230.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ieee802154/at86rf230.c b/drivers/net/ieee802154/at86rf230.c index 7d67f41387f55..4f5ef8a9a9a87 100644 --- a/drivers/net/ieee802154/at86rf230.c +++ b/drivers/net/ieee802154/at86rf230.c @@ -100,6 +100,7 @@ struct at86rf230_local { unsigned long cal_timeout; bool is_tx; bool is_tx_from_off; + bool was_tx; u8 tx_retry; struct sk_buff *tx_skb; struct at86rf230_state_change tx; @@ -343,7 +344,11 @@ at86rf230_async_error_recover_complete(void *context) if (ctx->free) kfree(ctx);
- ieee802154_wake_queue(lp->hw); + if (lp->was_tx) { + lp->was_tx = 0; + dev_kfree_skb_any(lp->tx_skb); + ieee802154_wake_queue(lp->hw); + } }
static void @@ -352,7 +357,11 @@ at86rf230_async_error_recover(void *context) struct at86rf230_state_change *ctx = context; struct at86rf230_local *lp = ctx->lp;
- lp->is_tx = 0; + if (lp->is_tx) { + lp->was_tx = 1; + lp->is_tx = 0; + } + at86rf230_async_state_change(lp, ctx, STATE_RX_AACK_ON, at86rf230_async_error_recover_complete); }
From: Yang Xu xuyang2018.jy@fujitsu.com
[ Upstream commit fc4eb486a59d70bd35cf1209f0e68c2d8b979193 ]
Since commit 43209ea2d17a ("zram: remove max_comp_streams internals"), zram has switched to per-cpu streams. Even kernel still keep this interface for some reasons, but writing to max_comp_stream doesn't take any effect. So skip it on newer kernel ie 4.7.
The code that comparing kernel version is from xfstests testsuite ext4/053.
Signed-off-by: Yang Xu xuyang2018.jy@fujitsu.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/zram/zram_lib.sh | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
diff --git a/tools/testing/selftests/zram/zram_lib.sh b/tools/testing/selftests/zram/zram_lib.sh index 6f872f266fd11..f47fc0f27e99e 100755 --- a/tools/testing/selftests/zram/zram_lib.sh +++ b/tools/testing/selftests/zram/zram_lib.sh @@ -11,6 +11,9 @@ dev_mounted=-1
# Kselftest framework requirement - SKIP code is 4. ksft_skip=4 +kernel_version=`uname -r | cut -d'.' -f1,2` +kernel_major=${kernel_version%.*} +kernel_minor=${kernel_version#*.}
trap INT
@@ -25,6 +28,20 @@ check_prereqs() fi }
+kernel_gte() +{ + major=${1%.*} + minor=${1#*.} + + if [ $kernel_major -gt $major ]; then + return 0 + elif [[ $kernel_major -eq $major && $kernel_minor -ge $minor ]]; then + return 0 + fi + + return 1 +} + zram_cleanup() { echo "zram cleanup" @@ -86,6 +103,13 @@ zram_max_streams() { echo "set max_comp_streams to zram device(s)"
+ kernel_gte 4.7 + if [ $? -eq 0 ]; then + echo "The device attribute max_comp_streams was"\ + "deprecated in 4.7" + return 0 + fi + local i=0 for max_s in $zram_max_streams; do local sys_path="/sys/block/zram${i}/max_comp_streams"
From: Yang Xu xuyang2018.jy@fujitsu.com
[ Upstream commit d18da7ec3719559d6e74937266d0416e6c7e0b31 ]
zram01 uses `free -m` to measure zram memory usage. The results are no sense because they are polluted by all running processes on the system.
We Should only calculate the free memory delta for the current process. So use the third field of /sys/block/zram<id>/mm_stat to measure memory usage instead. The file is available since kernel 4.1.
orig_data_size(first): uncompressed size of data stored in this disk. compr_data_size(second): compressed size of data stored in this disk mem_used_total(third): the amount of memory allocated for this disk
Also remove useless zram cleanup call in zram_fill_fs and so we don't need to cleanup zram twice if fails.
Signed-off-by: Yang Xu xuyang2018.jy@fujitsu.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/zram/zram01.sh | 30 +++++++------------------- 1 file changed, 8 insertions(+), 22 deletions(-)
diff --git a/tools/testing/selftests/zram/zram01.sh b/tools/testing/selftests/zram/zram01.sh index 114863d9fb876..e9e9eb777e2c7 100755 --- a/tools/testing/selftests/zram/zram01.sh +++ b/tools/testing/selftests/zram/zram01.sh @@ -33,8 +33,6 @@ zram_algs="lzo"
zram_fill_fs() { - local mem_free0=$(free -m | awk 'NR==2 {print $4}') - for i in $(seq 0 $(($dev_num - 1))); do echo "fill zram$i..." local b=0 @@ -45,29 +43,17 @@ zram_fill_fs() b=$(($b + 1)) done echo "zram$i can be filled with '$b' KB" - done
- local mem_free1=$(free -m | awk 'NR==2 {print $4}') - local used_mem=$(($mem_free0 - $mem_free1)) + local mem_used_total=`awk '{print $3}' "/sys/block/zram$i/mm_stat"` + local v=$((100 * 1024 * $b / $mem_used_total)) + if [ "$v" -lt 100 ]; then + echo "FAIL compression ratio: 0.$v:1" + ERR_CODE=-1 + return + fi
- local total_size=0 - for sm in $zram_sizes; do - local s=$(echo $sm | sed 's/M//') - total_size=$(($total_size + $s)) + echo "zram compression ratio: $(echo "scale=2; $v / 100 " | bc):1: OK" done - - echo "zram used ${used_mem}M, zram disk sizes ${total_size}M" - - local v=$((100 * $total_size / $used_mem)) - - if [ "$v" -lt 100 ]; then - echo "FAIL compression ratio: 0.$v:1" - ERR_CODE=-1 - zram_cleanup - return - fi - - echo "zram compression ratio: $(echo "scale=2; $v / 100 " | bc):1: OK" }
check_prereqs
From: Yang Xu xuyang2018.jy@fujitsu.com
[ Upstream commit 01dabed20573804750af5c7bf8d1598a6bf7bf6e ]
If zram-generator package is installed and works, then we can not remove zram module because zram swap is being used. This case needs a clean zram environment, change this test by using hot_add/hot_remove interface. So even zram device is being used, we still can add zram device and remove them in cleanup.
The two interface was introduced since kernel commit 6566d1a32bf7("zram: add dynamic device add/remove functionality") in v4.2-rc1. If kernel supports these two interface, we use hot_add/hot_remove to slove this problem, if not, just check whether zram is being used or built in, then skip it on old kernel.
Signed-off-by: Yang Xu xuyang2018.jy@fujitsu.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/zram/zram.sh | 15 +--- tools/testing/selftests/zram/zram01.sh | 3 +- tools/testing/selftests/zram/zram02.sh | 1 - tools/testing/selftests/zram/zram_lib.sh | 110 +++++++++++++---------- 4 files changed, 66 insertions(+), 63 deletions(-)
diff --git a/tools/testing/selftests/zram/zram.sh b/tools/testing/selftests/zram/zram.sh index 232e958ec4547..b0b91d9b0dc21 100755 --- a/tools/testing/selftests/zram/zram.sh +++ b/tools/testing/selftests/zram/zram.sh @@ -2,9 +2,6 @@ # SPDX-License-Identifier: GPL-2.0 TCID="zram.sh"
-# Kselftest framework requirement - SKIP code is 4. -ksft_skip=4 - . ./zram_lib.sh
run_zram () { @@ -18,14 +15,4 @@ echo ""
check_prereqs
-# check zram module exists -MODULE_PATH=/lib/modules/`uname -r`/kernel/drivers/block/zram/zram.ko -if [ -f $MODULE_PATH ]; then - run_zram -elif [ -b /dev/zram0 ]; then - run_zram -else - echo "$TCID : No zram.ko module or /dev/zram0 device file not found" - echo "$TCID : CONFIG_ZRAM is not set" - exit $ksft_skip -fi +run_zram diff --git a/tools/testing/selftests/zram/zram01.sh b/tools/testing/selftests/zram/zram01.sh index e9e9eb777e2c7..8f4affe34f3e4 100755 --- a/tools/testing/selftests/zram/zram01.sh +++ b/tools/testing/selftests/zram/zram01.sh @@ -33,7 +33,7 @@ zram_algs="lzo"
zram_fill_fs() { - for i in $(seq 0 $(($dev_num - 1))); do + for i in $(seq $dev_start $dev_end); do echo "fill zram$i..." local b=0 while [ true ]; do @@ -67,7 +67,6 @@ zram_mount
zram_fill_fs zram_cleanup -zram_unload
if [ $ERR_CODE -ne 0 ]; then echo "$TCID : [FAIL]" diff --git a/tools/testing/selftests/zram/zram02.sh b/tools/testing/selftests/zram/zram02.sh index e83b404807c09..2418b0c4ed136 100755 --- a/tools/testing/selftests/zram/zram02.sh +++ b/tools/testing/selftests/zram/zram02.sh @@ -36,7 +36,6 @@ zram_set_memlimit zram_makeswap zram_swapoff zram_cleanup -zram_unload
if [ $ERR_CODE -ne 0 ]; then echo "$TCID : [FAIL]" diff --git a/tools/testing/selftests/zram/zram_lib.sh b/tools/testing/selftests/zram/zram_lib.sh index f47fc0f27e99e..21ec1966de76c 100755 --- a/tools/testing/selftests/zram/zram_lib.sh +++ b/tools/testing/selftests/zram/zram_lib.sh @@ -5,10 +5,12 @@ # Author: Alexey Kodanev alexey.kodanev@oracle.com # Modified: Naresh Kamboju naresh.kamboju@linaro.org
-MODULE=0 dev_makeswap=-1 dev_mounted=-1 - +dev_start=0 +dev_end=-1 +module_load=-1 +sys_control=-1 # Kselftest framework requirement - SKIP code is 4. ksft_skip=4 kernel_version=`uname -r | cut -d'.' -f1,2` @@ -46,57 +48,72 @@ zram_cleanup() { echo "zram cleanup" local i= - for i in $(seq 0 $dev_makeswap); do + for i in $(seq $dev_start $dev_makeswap); do swapoff /dev/zram$i done
- for i in $(seq 0 $dev_mounted); do + for i in $(seq $dev_start $dev_mounted); do umount /dev/zram$i done
- for i in $(seq 0 $(($dev_num - 1))); do + for i in $(seq $dev_start $dev_end); do echo 1 > /sys/block/zram${i}/reset rm -rf zram$i done
-} + if [ $sys_control -eq 1 ]; then + for i in $(seq $dev_start $dev_end); do + echo $i > /sys/class/zram-control/hot_remove + done + fi
-zram_unload() -{ - if [ $MODULE -ne 0 ] ; then - echo "zram rmmod zram" + if [ $module_load -eq 1 ]; then rmmod zram > /dev/null 2>&1 fi }
zram_load() { - # check zram module exists - MODULE_PATH=/lib/modules/`uname -r`/kernel/drivers/block/zram/zram.ko - if [ -f $MODULE_PATH ]; then - MODULE=1 - echo "create '$dev_num' zram device(s)" - modprobe zram num_devices=$dev_num - if [ $? -ne 0 ]; then - echo "failed to insert zram module" - exit 1 - fi - - dev_num_created=$(ls /dev/zram* | wc -w) + echo "create '$dev_num' zram device(s)" + + # zram module loaded, new kernel + if [ -d "/sys/class/zram-control" ]; then + echo "zram modules already loaded, kernel supports" \ + "zram-control interface" + dev_start=$(ls /dev/zram* | wc -w) + dev_end=$(($dev_start + $dev_num - 1)) + sys_control=1 + + for i in $(seq $dev_start $dev_end); do + cat /sys/class/zram-control/hot_add > /dev/null + done + + echo "all zram devices (/dev/zram$dev_start~$dev_end" \ + "successfully created" + return 0 + fi
- if [ "$dev_num_created" -ne "$dev_num" ]; then - echo "unexpected num of devices: $dev_num_created" - ERR_CODE=-1 + # detect old kernel or built-in + modprobe zram num_devices=$dev_num + if [ ! -d "/sys/class/zram-control" ]; then + if grep -q '^zram' /proc/modules; then + rmmod zram > /dev/null 2>&1 + if [ $? -ne 0 ]; then + echo "zram module is being used on old kernel" \ + "without zram-control interface" + exit $ksft_skip + fi else - echo "zram load module successful" + echo "test needs CONFIG_ZRAM=m on old kernel without" \ + "zram-control interface" + exit $ksft_skip fi - elif [ -b /dev/zram0 ]; then - echo "/dev/zram0 device file found: OK" - else - echo "ERROR: No zram.ko module or no /dev/zram0 device found" - echo "$TCID : CONFIG_ZRAM is not set" - exit 1 + modprobe zram num_devices=$dev_num fi + + module_load=1 + dev_end=$(($dev_num - 1)) + echo "all zram devices (/dev/zram0~$dev_end) successfully created" }
zram_max_streams() @@ -110,7 +127,7 @@ zram_max_streams() return 0 fi
- local i=0 + local i=$dev_start for max_s in $zram_max_streams; do local sys_path="/sys/block/zram${i}/max_comp_streams" echo $max_s > $sys_path || \ @@ -122,7 +139,7 @@ zram_max_streams() echo "FAIL can't set max_streams '$max_s', get $max_stream"
i=$(($i + 1)) - echo "$sys_path = '$max_streams' ($i/$dev_num)" + echo "$sys_path = '$max_streams'" done
echo "zram max streams: OK" @@ -132,15 +149,16 @@ zram_compress_alg() { echo "test that we can set compression algorithm"
- local algs=$(cat /sys/block/zram0/comp_algorithm) + local i=$dev_start + local algs=$(cat /sys/block/zram${i}/comp_algorithm) echo "supported algs: $algs" - local i=0 + for alg in $zram_algs; do local sys_path="/sys/block/zram${i}/comp_algorithm" echo "$alg" > $sys_path || \ echo "FAIL can't set '$alg' to $sys_path" i=$(($i + 1)) - echo "$sys_path = '$alg' ($i/$dev_num)" + echo "$sys_path = '$alg'" done
echo "zram set compression algorithm: OK" @@ -149,14 +167,14 @@ zram_compress_alg() zram_set_disksizes() { echo "set disk size to zram device(s)" - local i=0 + local i=$dev_start for ds in $zram_sizes; do local sys_path="/sys/block/zram${i}/disksize" echo "$ds" > $sys_path || \ echo "FAIL can't set '$ds' to $sys_path"
i=$(($i + 1)) - echo "$sys_path = '$ds' ($i/$dev_num)" + echo "$sys_path = '$ds'" done
echo "zram set disksizes: OK" @@ -166,14 +184,14 @@ zram_set_memlimit() { echo "set memory limit to zram device(s)"
- local i=0 + local i=$dev_start for ds in $zram_mem_limits; do local sys_path="/sys/block/zram${i}/mem_limit" echo "$ds" > $sys_path || \ echo "FAIL can't set '$ds' to $sys_path"
i=$(($i + 1)) - echo "$sys_path = '$ds' ($i/$dev_num)" + echo "$sys_path = '$ds'" done
echo "zram set memory limit: OK" @@ -182,8 +200,8 @@ zram_set_memlimit() zram_makeswap() { echo "make swap with zram device(s)" - local i=0 - for i in $(seq 0 $(($dev_num - 1))); do + local i=$dev_start + for i in $(seq $dev_start $dev_end); do mkswap /dev/zram$i > err.log 2>&1 if [ $? -ne 0 ]; then cat err.log @@ -206,7 +224,7 @@ zram_makeswap() zram_swapoff() { local i= - for i in $(seq 0 $dev_makeswap); do + for i in $(seq $dev_start $dev_end); do swapoff /dev/zram$i > err.log 2>&1 if [ $? -ne 0 ]; then cat err.log @@ -220,7 +238,7 @@ zram_swapoff()
zram_makefs() { - local i=0 + local i=$dev_start for fs in $zram_filesystems; do # if requested fs not supported default it to ext2 which mkfs.$fs > /dev/null 2>&1 || fs=ext2 @@ -239,7 +257,7 @@ zram_makefs() zram_mount() { local i=0 - for i in $(seq 0 $(($dev_num - 1))); do + for i in $(seq $dev_start $dev_end); do echo "mount /dev/zram$i" mkdir zram$i mount /dev/zram$i zram$i > /dev/null || \
From: Cristian Marussi cristian.marussi@arm.com
[ Upstream commit e051cdf655fa016692008a446a060eff06222bb5 ]
In E_func() macro, on error, print also errno in order to aid debugging.
Cc: Aleksa Sarai cyphar@cyphar.com Signed-off-by: Cristian Marussi cristian.marussi@arm.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/openat2/helpers.h | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/openat2/helpers.h b/tools/testing/selftests/openat2/helpers.h index a6ea27344db2d..ad5d0ba5b6ce9 100644 --- a/tools/testing/selftests/openat2/helpers.h +++ b/tools/testing/selftests/openat2/helpers.h @@ -62,11 +62,12 @@ bool needs_openat2(const struct open_how *how); (similar to chroot(2)). */ #endif /* RESOLVE_IN_ROOT */
-#define E_func(func, ...) \ - do { \ - if (func(__VA_ARGS__) < 0) \ - ksft_exit_fail_msg("%s:%d %s failed\n", \ - __FILE__, __LINE__, #func);\ +#define E_func(func, ...) \ + do { \ + errno = 0; \ + if (func(__VA_ARGS__) < 0) \ + ksft_exit_fail_msg("%s:%d %s failed - errno:%d\n", \ + __FILE__, __LINE__, #func, errno); \ } while (0)
#define E_asprintf(...) E_func(asprintf, __VA_ARGS__)
From: Cristian Marussi cristian.marussi@arm.com
[ Upstream commit ea3396725aa143dd42fe388cb67e44c90d2fb719 ]
Add a dependency on header helpers.h to the main target; while at that add to helpers.h also a missing include for bool types.
Cc: Aleksa Sarai cyphar@cyphar.com Signed-off-by: Cristian Marussi cristian.marussi@arm.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/openat2/Makefile | 2 +- tools/testing/selftests/openat2/helpers.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/openat2/Makefile b/tools/testing/selftests/openat2/Makefile index 4b93b1417b862..843ba56d8e49e 100644 --- a/tools/testing/selftests/openat2/Makefile +++ b/tools/testing/selftests/openat2/Makefile @@ -5,4 +5,4 @@ TEST_GEN_PROGS := openat2_test resolve_test rename_attack_test
include ../lib.mk
-$(TEST_GEN_PROGS): helpers.c +$(TEST_GEN_PROGS): helpers.c helpers.h diff --git a/tools/testing/selftests/openat2/helpers.h b/tools/testing/selftests/openat2/helpers.h index ad5d0ba5b6ce9..7056340b9339e 100644 --- a/tools/testing/selftests/openat2/helpers.h +++ b/tools/testing/selftests/openat2/helpers.h @@ -9,6 +9,7 @@
#define _GNU_SOURCE #include <stdint.h> +#include <stdbool.h> #include <errno.h> #include <linux/types.h> #include "../kselftest.h"
From: Cristian Marussi cristian.marussi@arm.com
[ Upstream commit ac9e0a250bb155078601a5b999aab05f2a04d1ab ]
Skip testcases that fail since the requested valid flags combination is not supported by the underlying filesystem.
Cc: Aleksa Sarai cyphar@cyphar.com Signed-off-by: Cristian Marussi cristian.marussi@arm.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/openat2/openat2_test.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/openat2/openat2_test.c b/tools/testing/selftests/openat2/openat2_test.c index 1bddbe934204c..7fb902099de45 100644 --- a/tools/testing/selftests/openat2/openat2_test.c +++ b/tools/testing/selftests/openat2/openat2_test.c @@ -259,6 +259,16 @@ void test_openat2_flags(void) unlink(path);
fd = sys_openat2(AT_FDCWD, path, &test->how); + if (fd < 0 && fd == -EOPNOTSUPP) { + /* + * Skip the testcase if it failed because not supported + * by FS. (e.g. a valid O_TMPFILE combination on NFS) + */ + ksft_test_result_skip("openat2 with %s fails with %d (%s)\n", + test->name, fd, strerror(-fd)); + goto next; + } + if (test->err >= 0) failed = (fd < 0); else @@ -303,7 +313,7 @@ void test_openat2_flags(void) else resultfn("openat2 with %s fails with %d (%s)\n", test->name, test->err, strerror(-test->err)); - +next: free(fdpath); fflush(stdout); }
From: Cristian Marussi cristian.marussi@arm.com
[ Upstream commit dae1d8ac31896988e7313384c0370176a75e9b45 ]
Report mincore.check_file_mmap as SKIP instead of FAIL if the underlying filesystem lacks support of O_TMPFILE or fallocate since such failures are not really related to mincore functionality.
Cc: Ricardo Cañuelo ricardo.canuelo@collabora.com Signed-off-by: Cristian Marussi cristian.marussi@arm.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../selftests/mincore/mincore_selftest.c | 20 +++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/mincore/mincore_selftest.c b/tools/testing/selftests/mincore/mincore_selftest.c index e54106643337b..4c88238fc8f05 100644 --- a/tools/testing/selftests/mincore/mincore_selftest.c +++ b/tools/testing/selftests/mincore/mincore_selftest.c @@ -207,15 +207,21 @@ TEST(check_file_mmap)
errno = 0; fd = open(".", O_TMPFILE | O_RDWR, 0600); - ASSERT_NE(-1, fd) { - TH_LOG("Can't create temporary file: %s", - strerror(errno)); + if (fd < 0) { + ASSERT_EQ(errno, EOPNOTSUPP) { + TH_LOG("Can't create temporary file: %s", + strerror(errno)); + } + SKIP(goto out_free, "O_TMPFILE not supported by filesystem."); } errno = 0; retval = fallocate(fd, 0, 0, FILE_SIZE); - ASSERT_EQ(0, retval) { - TH_LOG("Error allocating space for the temporary file: %s", - strerror(errno)); + if (retval) { + ASSERT_EQ(errno, EOPNOTSUPP) { + TH_LOG("Error allocating space for the temporary file: %s", + strerror(errno)); + } + SKIP(goto out_close, "fallocate not supported by filesystem."); }
/* @@ -271,7 +277,9 @@ TEST(check_file_mmap) }
munmap(addr, FILE_SIZE); +out_close: close(fd); +out_free: free(vec); }
From: Duoming Zhou duoming@zju.edu.cn
[ Upstream commit 4e0f718daf97d47cf7dec122da1be970f145c809 ]
The previous commit 1ade48d0c27d ("ax25: NPD bug when detaching AX25 device") introduce lock_sock() into ax25_kill_by_device to prevent NPD bug. But the concurrency NPD or UAF bug will occur, when lock_sock() or release_sock() dereferences the ax25_cb->sock.
The NULL pointer dereference bug can be shown as below:
ax25_kill_by_device() | ax25_release() | ax25_destroy_socket() | ax25_cb_del() ... | ... | ax25->sk=NULL; lock_sock(s->sk); //(1) | s->ax25_dev = NULL; | ... release_sock(s->sk); //(2) | ... |
The root cause is that the sock is set to null before dereference site (1) or (2). Therefore, this patch extracts the ax25_cb->sock in advance, and uses ax25_list_lock to protect it, which can synchronize with ax25_cb_del() and ensure the value of sock is not null before dereference sites.
The concurrency UAF bug can be shown as below:
ax25_kill_by_device() | ax25_release() | ax25_destroy_socket() ... | ... | sock_put(sk); //FREE lock_sock(s->sk); //(1) | s->ax25_dev = NULL; | ... release_sock(s->sk); //(2) | ... |
The root cause is that the sock is released before dereference site (1) or (2). Therefore, this patch uses sock_hold() to increase the refcount of sock and uses ax25_list_lock to protect it, which can synchronize with ax25_cb_del() in ax25_destroy_socket() and ensure the sock wil not be released before dereference sites.
Signed-off-by: Duoming Zhou duoming@zju.edu.cn Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ax25/af_ax25.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c index 7473e0cc6d469..ea3431ac46a14 100644 --- a/net/ax25/af_ax25.c +++ b/net/ax25/af_ax25.c @@ -77,6 +77,7 @@ static void ax25_kill_by_device(struct net_device *dev) { ax25_dev *ax25_dev; ax25_cb *s; + struct sock *sk;
if ((ax25_dev = ax25_dev_ax25dev(dev)) == NULL) return; @@ -85,13 +86,15 @@ static void ax25_kill_by_device(struct net_device *dev) again: ax25_for_each(s, &ax25_list) { if (s->ax25_dev == ax25_dev) { + sk = s->sk; + sock_hold(sk); spin_unlock_bh(&ax25_list_lock); - lock_sock(s->sk); + lock_sock(sk); s->ax25_dev = NULL; - release_sock(s->sk); + release_sock(sk); ax25_disconnect(s, ENETUNREACH); spin_lock_bh(&ax25_list_lock); - + sock_put(sk); /* The entry could have been deleted from the * list meanwhile and thus the next pointer is * no longer valid. Play it safe and restart
From: Julian Braha julianbraha@gmail.com
[ Upstream commit 3a5286955bf5febc3d151bcb2c5e272e383b64aa ]
When PINCTRL_BCM63XX is selected, and REGMAP is not selected, Kbuild gives the following warning:
WARNING: unmet direct dependencies detected for GPIO_REGMAP Depends on [n]: GPIOLIB [=y] && REGMAP [=n] Selected by [y]: - PINCTRL_BCM63XX [=y] && PINCTRL [=y]
This is because PINCTRL_BCM63XX selects GPIO_REGMAP without selecting or depending on REGMAP, despite GPIO_REGMAP depending on REGMAP.
This unmet dependency bug was detected by Kismet, a static analysis tool for Kconfig. Please advise if this is not the appropriate solution.
Signed-off-by: Julian Braha julianbraha@gmail.com Link: https://lore.kernel.org/r/20220117062557.89568-1-julianbraha@gmail.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/bcm/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/pinctrl/bcm/Kconfig b/drivers/pinctrl/bcm/Kconfig index c9c5efc927311..5973a279e6b8c 100644 --- a/drivers/pinctrl/bcm/Kconfig +++ b/drivers/pinctrl/bcm/Kconfig @@ -35,6 +35,7 @@ config PINCTRL_BCM63XX select PINCONF select GENERIC_PINCONF select GPIOLIB + select REGMAP select GPIO_REGMAP
config PINCTRL_BCM6318
From: Darrick J. Wong djwong@kernel.org
[ Upstream commit 2719c7160dcfaae1f73a1c0c210ad3281c19022e ]
If we fail to synchronize the filesystem while preparing to freeze the fs, abort the freeze.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Christoph Hellwig hch@lst.de Acked-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/super.c | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/fs/super.c b/fs/super.c index a1f82dfd1b39a..87379bb1f7a30 100644 --- a/fs/super.c +++ b/fs/super.c @@ -1616,11 +1616,9 @@ static void lockdep_sb_freeze_acquire(struct super_block *sb) percpu_rwsem_acquire(sb->s_writers.rw_sem + level, 0, _THIS_IP_); }
-static void sb_freeze_unlock(struct super_block *sb) +static void sb_freeze_unlock(struct super_block *sb, int level) { - int level; - - for (level = SB_FREEZE_LEVELS - 1; level >= 0; level--) + for (level--; level >= 0; level--) percpu_up_write(sb->s_writers.rw_sem + level); }
@@ -1691,7 +1689,14 @@ int freeze_super(struct super_block *sb) sb_wait_write(sb, SB_FREEZE_PAGEFAULT);
/* All writers are done so after syncing there won't be dirty data */ - sync_filesystem(sb); + ret = sync_filesystem(sb); + if (ret) { + sb->s_writers.frozen = SB_UNFROZEN; + sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT); + wake_up(&sb->s_writers.wait_unfrozen); + deactivate_locked_super(sb); + return ret; + }
/* Now wait for internal filesystem counter */ sb->s_writers.frozen = SB_FREEZE_FS; @@ -1703,7 +1708,7 @@ int freeze_super(struct super_block *sb) printk(KERN_ERR "VFS:Filesystem freeze failed\n"); sb->s_writers.frozen = SB_UNFROZEN; - sb_freeze_unlock(sb); + sb_freeze_unlock(sb, SB_FREEZE_FS); wake_up(&sb->s_writers.wait_unfrozen); deactivate_locked_super(sb); return ret; @@ -1748,7 +1753,7 @@ static int thaw_super_locked(struct super_block *sb) }
sb->s_writers.frozen = SB_UNFROZEN; - sb_freeze_unlock(sb); + sb_freeze_unlock(sb, SB_FREEZE_FS); out: wake_up(&sb->s_writers.wait_unfrozen); deactivate_locked_super(sb);
From: Darrick J. Wong djwong@kernel.org
[ Upstream commit dd5532a4994bfda0386eb2286ec00758cee08444 ]
Strangely, dquot_quota_sync ignores the return code from the ->sync_fs call, which means that quotacalls like Q_SYNC never see the error. This doesn't seem right, so fix that.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Christoph Hellwig hch@lst.de Acked-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/quota/dquot.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c index 22d904bde6ab9..a74aef99bd3d6 100644 --- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -690,9 +690,14 @@ int dquot_quota_sync(struct super_block *sb, int type) /* This is not very clever (and fast) but currently I don't know about * any other simple way of getting quota data to disk and we must get * them there for userspace to be visible... */ - if (sb->s_op->sync_fs) - sb->s_op->sync_fs(sb, 1); - sync_blockdev(sb->s_bdev); + if (sb->s_op->sync_fs) { + ret = sb->s_op->sync_fs(sb, 1); + if (ret) + return ret; + } + ret = sync_blockdev(sb->s_bdev); + if (ret) + return ret;
/* * Now when everything is written we can discard the pagecache so
From: Ajish Koshy Ajish.Koshy@microchip.com
[ Upstream commit c26b85ea16365079be8d206b20556a60a0c69ad4 ]
Current code handles completions for SATA devices in mpi_sata_completion() and mpi_sata_event().
However, at the time when any SATA event happens, for almost all the event types, the command is still in the target. It is therefore incorrect to complete the task in sata_event().
There are some events for which we get sata_completions, some need recovery procedure and others abort. All the tasks must be completed via sata_completion() path.
Removed the task done related code from sata_events(). For tasks where we don't get completions, let top layer call abort() to abort the command post timeout.
Link: https://lore.kernel.org/r/20220124082255.86223-1-Ajish.Koshy@microchip.com Acked-by: Jack Wang jinpu.wang@ionos.com Co-developed-by: Viswas G Viswas.G@microchip.com Signed-off-by: Viswas G Viswas.G@microchip.com Signed-off-by: Ajish Koshy Ajish.Koshy@microchip.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/pm8001/pm8001_hwi.c | 18 ------------------ drivers/scsi/pm8001/pm80xx_hwi.c | 26 -------------------------- 2 files changed, 44 deletions(-)
diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c index 880e1f356defc..5e6b23da4157c 100644 --- a/drivers/scsi/pm8001/pm8001_hwi.c +++ b/drivers/scsi/pm8001/pm8001_hwi.c @@ -2695,7 +2695,6 @@ static void mpi_sata_event(struct pm8001_hba_info *pm8001_ha, void *piomb) u32 tag = le32_to_cpu(psataPayload->tag); u32 port_id = le32_to_cpu(psataPayload->port_id); u32 dev_id = le32_to_cpu(psataPayload->device_id); - unsigned long flags;
ccb = &pm8001_ha->ccb_info[tag];
@@ -2735,8 +2734,6 @@ static void mpi_sata_event(struct pm8001_hba_info *pm8001_ha, void *piomb) ts->resp = SAS_TASK_COMPLETE; ts->stat = SAS_DATA_OVERRUN; ts->residual = 0; - if (pm8001_dev) - atomic_dec(&pm8001_dev->running_req); break; case IO_XFER_ERROR_BREAK: pm8001_dbg(pm8001_ha, IO, "IO_XFER_ERROR_BREAK\n"); @@ -2778,7 +2775,6 @@ static void mpi_sata_event(struct pm8001_hba_info *pm8001_ha, void *piomb) IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS); ts->resp = SAS_TASK_COMPLETE; ts->stat = SAS_QUEUE_FULL; - pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag); return; } break; @@ -2864,20 +2860,6 @@ static void mpi_sata_event(struct pm8001_hba_info *pm8001_ha, void *piomb) ts->stat = SAS_OPEN_TO; break; } - spin_lock_irqsave(&t->task_state_lock, flags); - t->task_state_flags &= ~SAS_TASK_STATE_PENDING; - t->task_state_flags &= ~SAS_TASK_AT_INITIATOR; - t->task_state_flags |= SAS_TASK_STATE_DONE; - if (unlikely((t->task_state_flags & SAS_TASK_STATE_ABORTED))) { - spin_unlock_irqrestore(&t->task_state_lock, flags); - pm8001_dbg(pm8001_ha, FAIL, - "task 0x%p done with io_status 0x%x resp 0x%x stat 0x%x but aborted by upper layer!\n", - t, event, ts->resp, ts->stat); - pm8001_ccb_task_free(pm8001_ha, t, ccb, tag); - } else { - spin_unlock_irqrestore(&t->task_state_lock, flags); - pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag); - } }
/*See the comments for mpi_ssp_completion */ diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c index ed13e0e044b74..2cd100c062aba 100644 --- a/drivers/scsi/pm8001/pm80xx_hwi.c +++ b/drivers/scsi/pm8001/pm80xx_hwi.c @@ -2828,7 +2828,6 @@ static void mpi_sata_event(struct pm8001_hba_info *pm8001_ha, u32 tag = le32_to_cpu(psataPayload->tag); u32 port_id = le32_to_cpu(psataPayload->port_id); u32 dev_id = le32_to_cpu(psataPayload->device_id); - unsigned long flags;
ccb = &pm8001_ha->ccb_info[tag];
@@ -2866,8 +2865,6 @@ static void mpi_sata_event(struct pm8001_hba_info *pm8001_ha, ts->resp = SAS_TASK_COMPLETE; ts->stat = SAS_DATA_OVERRUN; ts->residual = 0; - if (pm8001_dev) - atomic_dec(&pm8001_dev->running_req); break; case IO_XFER_ERROR_BREAK: pm8001_dbg(pm8001_ha, IO, "IO_XFER_ERROR_BREAK\n"); @@ -2916,11 +2913,6 @@ static void mpi_sata_event(struct pm8001_hba_info *pm8001_ha, IO_OPEN_CNX_ERROR_IT_NEXUS_LOSS); ts->resp = SAS_TASK_COMPLETE; ts->stat = SAS_QUEUE_FULL; - spin_unlock_irqrestore(&circularQ->oq_lock, - circularQ->lock_flags); - pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag); - spin_lock_irqsave(&circularQ->oq_lock, - circularQ->lock_flags); return; } break; @@ -3020,24 +3012,6 @@ static void mpi_sata_event(struct pm8001_hba_info *pm8001_ha, ts->stat = SAS_OPEN_TO; break; } - spin_lock_irqsave(&t->task_state_lock, flags); - t->task_state_flags &= ~SAS_TASK_STATE_PENDING; - t->task_state_flags &= ~SAS_TASK_AT_INITIATOR; - t->task_state_flags |= SAS_TASK_STATE_DONE; - if (unlikely((t->task_state_flags & SAS_TASK_STATE_ABORTED))) { - spin_unlock_irqrestore(&t->task_state_lock, flags); - pm8001_dbg(pm8001_ha, FAIL, - "task 0x%p done with io_status 0x%x resp 0x%x stat 0x%x but aborted by upper layer!\n", - t, event, ts->resp, ts->stat); - pm8001_ccb_task_free(pm8001_ha, t, ccb, tag); - } else { - spin_unlock_irqrestore(&t->task_state_lock, flags); - spin_unlock_irqrestore(&circularQ->oq_lock, - circularQ->lock_flags); - pm8001_ccb_task_free_done(pm8001_ha, t, ccb, tag); - spin_lock_irqsave(&circularQ->oq_lock, - circularQ->lock_flags); - } }
/*See the comments for mpi_ssp_completion */
From: Vincenzo Frascino vincenzo.frascino@arm.com
[ Upstream commit ec049891b2dc16591813eacaddc476b3d27c8c14 ]
vdso_test_abi contains a batch of tests that verify the validity of the vDSO ABI.
When a vDSO symbol is not found the relevant test is skipped reporting KSFT_SKIP. All the tests return values are then added in a single variable which is checked to verify failures. This approach can have side effects which result in reporting the wrong kselftest exit status.
Fix vdso_test_abi verifying the return code of each test separately.
Cc: Shuah Khan shuah@kernel.org Cc: Andy Lutomirski luto@kernel.org Cc: Thomas Gleixner tglx@linutronix.de Reported-by: Cristian Marussi cristian.marussi@arm.com Signed-off-by: Vincenzo Frascino vincenzo.frascino@arm.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/vDSO/vdso_test_abi.c | 135 +++++++++---------- 1 file changed, 62 insertions(+), 73 deletions(-)
diff --git a/tools/testing/selftests/vDSO/vdso_test_abi.c b/tools/testing/selftests/vDSO/vdso_test_abi.c index 3d603f1394af4..883ca85424bc5 100644 --- a/tools/testing/selftests/vDSO/vdso_test_abi.c +++ b/tools/testing/selftests/vDSO/vdso_test_abi.c @@ -33,110 +33,114 @@ typedef long (*vdso_clock_gettime_t)(clockid_t clk_id, struct timespec *ts); typedef long (*vdso_clock_getres_t)(clockid_t clk_id, struct timespec *ts); typedef time_t (*vdso_time_t)(time_t *t);
-static int vdso_test_gettimeofday(void) +#define VDSO_TEST_PASS_MSG() "\n%s(): PASS\n", __func__ +#define VDSO_TEST_FAIL_MSG(x) "\n%s(): %s FAIL\n", __func__, x +#define VDSO_TEST_SKIP_MSG(x) "\n%s(): SKIP: Could not find %s\n", __func__, x + +static void vdso_test_gettimeofday(void) { /* Find gettimeofday. */ vdso_gettimeofday_t vdso_gettimeofday = (vdso_gettimeofday_t)vdso_sym(version, name[0]);
if (!vdso_gettimeofday) { - printf("Could not find %s\n", name[0]); - return KSFT_SKIP; + ksft_test_result_skip(VDSO_TEST_SKIP_MSG(name[0])); + return; }
struct timeval tv; long ret = vdso_gettimeofday(&tv, 0);
if (ret == 0) { - printf("The time is %lld.%06lld\n", - (long long)tv.tv_sec, (long long)tv.tv_usec); + ksft_print_msg("The time is %lld.%06lld\n", + (long long)tv.tv_sec, (long long)tv.tv_usec); + ksft_test_result_pass(VDSO_TEST_PASS_MSG()); } else { - printf("%s failed\n", name[0]); - return KSFT_FAIL; + ksft_test_result_fail(VDSO_TEST_FAIL_MSG(name[0])); } - - return KSFT_PASS; }
-static int vdso_test_clock_gettime(clockid_t clk_id) +static void vdso_test_clock_gettime(clockid_t clk_id) { /* Find clock_gettime. */ vdso_clock_gettime_t vdso_clock_gettime = (vdso_clock_gettime_t)vdso_sym(version, name[1]);
if (!vdso_clock_gettime) { - printf("Could not find %s\n", name[1]); - return KSFT_SKIP; + ksft_test_result_skip(VDSO_TEST_SKIP_MSG(name[1])); + return; }
struct timespec ts; long ret = vdso_clock_gettime(clk_id, &ts);
if (ret == 0) { - printf("The time is %lld.%06lld\n", - (long long)ts.tv_sec, (long long)ts.tv_nsec); + ksft_print_msg("The time is %lld.%06lld\n", + (long long)ts.tv_sec, (long long)ts.tv_nsec); + ksft_test_result_pass(VDSO_TEST_PASS_MSG()); } else { - printf("%s failed\n", name[1]); - return KSFT_FAIL; + ksft_test_result_fail(VDSO_TEST_FAIL_MSG(name[1])); } - - return KSFT_PASS; }
-static int vdso_test_time(void) +static void vdso_test_time(void) { /* Find time. */ vdso_time_t vdso_time = (vdso_time_t)vdso_sym(version, name[2]);
if (!vdso_time) { - printf("Could not find %s\n", name[2]); - return KSFT_SKIP; + ksft_test_result_skip(VDSO_TEST_SKIP_MSG(name[2])); + return; }
long ret = vdso_time(NULL);
if (ret > 0) { - printf("The time in hours since January 1, 1970 is %lld\n", + ksft_print_msg("The time in hours since January 1, 1970 is %lld\n", (long long)(ret / 3600)); + ksft_test_result_pass(VDSO_TEST_PASS_MSG()); } else { - printf("%s failed\n", name[2]); - return KSFT_FAIL; + ksft_test_result_fail(VDSO_TEST_FAIL_MSG(name[2])); } - - return KSFT_PASS; }
-static int vdso_test_clock_getres(clockid_t clk_id) +static void vdso_test_clock_getres(clockid_t clk_id) { + int clock_getres_fail = 0; + /* Find clock_getres. */ vdso_clock_getres_t vdso_clock_getres = (vdso_clock_getres_t)vdso_sym(version, name[3]);
if (!vdso_clock_getres) { - printf("Could not find %s\n", name[3]); - return KSFT_SKIP; + ksft_test_result_skip(VDSO_TEST_SKIP_MSG(name[3])); + return; }
struct timespec ts, sys_ts; long ret = vdso_clock_getres(clk_id, &ts);
if (ret == 0) { - printf("The resolution is %lld %lld\n", - (long long)ts.tv_sec, (long long)ts.tv_nsec); + ksft_print_msg("The vdso resolution is %lld %lld\n", + (long long)ts.tv_sec, (long long)ts.tv_nsec); } else { - printf("%s failed\n", name[3]); - return KSFT_FAIL; + clock_getres_fail++; }
ret = syscall(SYS_clock_getres, clk_id, &sys_ts);
- if ((sys_ts.tv_sec != ts.tv_sec) || (sys_ts.tv_nsec != ts.tv_nsec)) { - printf("%s failed\n", name[3]); - return KSFT_FAIL; - } + ksft_print_msg("The syscall resolution is %lld %lld\n", + (long long)sys_ts.tv_sec, (long long)sys_ts.tv_nsec);
- return KSFT_PASS; + if ((sys_ts.tv_sec != ts.tv_sec) || (sys_ts.tv_nsec != ts.tv_nsec)) + clock_getres_fail++; + + if (clock_getres_fail > 0) { + ksft_test_result_fail(VDSO_TEST_FAIL_MSG(name[3])); + } else { + ksft_test_result_pass(VDSO_TEST_PASS_MSG()); + } }
const char *vdso_clock_name[12] = { @@ -158,36 +162,23 @@ const char *vdso_clock_name[12] = { * This function calls vdso_test_clock_gettime and vdso_test_clock_getres * with different values for clock_id. */ -static inline int vdso_test_clock(clockid_t clock_id) +static inline void vdso_test_clock(clockid_t clock_id) { - int ret0, ret1; - - ret0 = vdso_test_clock_gettime(clock_id); - /* A skipped test is considered passed */ - if (ret0 == KSFT_SKIP) - ret0 = KSFT_PASS; - - ret1 = vdso_test_clock_getres(clock_id); - /* A skipped test is considered passed */ - if (ret1 == KSFT_SKIP) - ret1 = KSFT_PASS; + ksft_print_msg("\nclock_id: %s\n", vdso_clock_name[clock_id]);
- ret0 += ret1; + vdso_test_clock_gettime(clock_id);
- printf("clock_id: %s", vdso_clock_name[clock_id]); - - if (ret0 > 0) - printf(" [FAIL]\n"); - else - printf(" [PASS]\n"); - - return ret0; + vdso_test_clock_getres(clock_id); }
+#define VDSO_TEST_PLAN 16 + int main(int argc, char **argv) { unsigned long sysinfo_ehdr = getauxval(AT_SYSINFO_EHDR); - int ret; + + ksft_print_header(); + ksft_set_plan(VDSO_TEST_PLAN);
if (!sysinfo_ehdr) { printf("AT_SYSINFO_EHDR is not present!\n"); @@ -201,44 +192,42 @@ int main(int argc, char **argv)
vdso_init_from_sysinfo_ehdr(getauxval(AT_SYSINFO_EHDR));
- ret = vdso_test_gettimeofday(); + vdso_test_gettimeofday();
#if _POSIX_TIMERS > 0
#ifdef CLOCK_REALTIME - ret += vdso_test_clock(CLOCK_REALTIME); + vdso_test_clock(CLOCK_REALTIME); #endif
#ifdef CLOCK_BOOTTIME - ret += vdso_test_clock(CLOCK_BOOTTIME); + vdso_test_clock(CLOCK_BOOTTIME); #endif
#ifdef CLOCK_TAI - ret += vdso_test_clock(CLOCK_TAI); + vdso_test_clock(CLOCK_TAI); #endif
#ifdef CLOCK_REALTIME_COARSE - ret += vdso_test_clock(CLOCK_REALTIME_COARSE); + vdso_test_clock(CLOCK_REALTIME_COARSE); #endif
#ifdef CLOCK_MONOTONIC - ret += vdso_test_clock(CLOCK_MONOTONIC); + vdso_test_clock(CLOCK_MONOTONIC); #endif
#ifdef CLOCK_MONOTONIC_RAW - ret += vdso_test_clock(CLOCK_MONOTONIC_RAW); + vdso_test_clock(CLOCK_MONOTONIC_RAW); #endif
#ifdef CLOCK_MONOTONIC_COARSE - ret += vdso_test_clock(CLOCK_MONOTONIC_COARSE); + vdso_test_clock(CLOCK_MONOTONIC_COARSE); #endif
#endif
- ret += vdso_test_time(); - - if (ret > 0) - return KSFT_FAIL; + vdso_test_time();
- return KSFT_PASS; + ksft_print_cnts(); + return ksft_get_fail_cnt() == 0 ? KSFT_PASS : KSFT_FAIL; }
From: Ming Lei ming.lei@redhat.com
[ Upstream commit edb854a3680bacc9ef9b91ec0c5ff6105886f6f3 ]
We currently use ->cmd_per_lun as initial queue depth for setting up the budget_map. Martin Wilck reported that it is common for the queue_depth to be subsequently updated in slave_configure() based on detected hardware characteristics.
As a result, for some drivers, the static host template settings for cmd_per_lun and can_queue won't actually get used in practice. And if the default values are used to allocate the budget_map, memory may be consumed unnecessarily.
Fix the issue by reallocating the budget_map after ->slave_configure() returns. At that time the device queue_depth should accurately reflect what the hardware needs.
Link: https://lore.kernel.org/r/20220127153733.409132-1-ming.lei@redhat.com Cc: Bart Van Assche bvanassche@acm.org Reported-by: Martin Wilck martin.wilck@suse.com Suggested-by: Martin Wilck martin.wilck@suse.com Tested-by: Martin Wilck mwilck@suse.com Reviewed-by: Martin Wilck mwilck@suse.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Ming Lei ming.lei@redhat.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_scan.c | 55 ++++++++++++++++++++++++++++++++++++---- 1 file changed, 50 insertions(+), 5 deletions(-)
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index fe22191522a3b..7266880c70c21 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -198,6 +198,48 @@ static void scsi_unlock_floptical(struct scsi_device *sdev, SCSI_TIMEOUT, 3, NULL); }
+static int scsi_realloc_sdev_budget_map(struct scsi_device *sdev, + unsigned int depth) +{ + int new_shift = sbitmap_calculate_shift(depth); + bool need_alloc = !sdev->budget_map.map; + bool need_free = false; + int ret; + struct sbitmap sb_backup; + + /* + * realloc if new shift is calculated, which is caused by setting + * up one new default queue depth after calling ->slave_configure + */ + if (!need_alloc && new_shift != sdev->budget_map.shift) + need_alloc = need_free = true; + + if (!need_alloc) + return 0; + + /* + * Request queue has to be frozen for reallocating budget map, + * and here disk isn't added yet, so freezing is pretty fast + */ + if (need_free) { + blk_mq_freeze_queue(sdev->request_queue); + sb_backup = sdev->budget_map; + } + ret = sbitmap_init_node(&sdev->budget_map, + scsi_device_max_queue_depth(sdev), + new_shift, GFP_KERNEL, + sdev->request_queue->node, false, true); + if (need_free) { + if (ret) + sdev->budget_map = sb_backup; + else + sbitmap_free(&sb_backup); + ret = 0; + blk_mq_unfreeze_queue(sdev->request_queue); + } + return ret; +} + /** * scsi_alloc_sdev - allocate and setup a scsi_Device * @starget: which target to allocate a &scsi_device for @@ -291,11 +333,7 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget, * default device queue depth to figure out sbitmap shift * since we use this queue depth most of times. */ - if (sbitmap_init_node(&sdev->budget_map, - scsi_device_max_queue_depth(sdev), - sbitmap_calculate_shift(depth), - GFP_KERNEL, sdev->request_queue->node, - false, true)) { + if (scsi_realloc_sdev_budget_map(sdev, depth)) { put_device(&starget->dev); kfree(sdev); goto out; @@ -1001,6 +1039,13 @@ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result, } return SCSI_SCAN_NO_RESPONSE; } + + /* + * The queue_depth is often changed in ->slave_configure. + * Set up budget map again since memory consumption of + * the map depends on actual queue depth. + */ + scsi_realloc_sdev_budget_map(sdev, sdev->queue_depth); }
if (sdev->scsi_level >= SCSI_3)
From: John Garry john.garry@huawei.com
[ Upstream commit 61f162aa4381845acbdc7f2be4dfb694d027c018 ]
Currently a use-after-free may occur if a TMF sas_task is aborted before we handle the IO completion in mpi_ssp_completion(). The abort occurs due to timeout.
When the timeout occurs, the SAS_TASK_STATE_ABORTED flag is set and the sas_task is freed in pm8001_exec_internal_tmf_task().
However, if the I/O completion occurs later, the I/O completion still thinks that the sas_task is available. Fix this by clearing the ccb->task if the TMF times out - the I/O completion handler does nothing if this pointer is cleared.
Link: https://lore.kernel.org/r/1643289172-165636-3-git-send-email-john.garry@huaw... Reviewed-by: Damien Le Moal damien.lemoal@opensource.wdc.com Acked-by: Jack Wang jinpu.wang@ionos.com Signed-off-by: John Garry john.garry@huawei.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/pm8001/pm8001_sas.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c index 32e60f0c3b148..491cecbbe1aa7 100644 --- a/drivers/scsi/pm8001/pm8001_sas.c +++ b/drivers/scsi/pm8001/pm8001_sas.c @@ -753,8 +753,13 @@ static int pm8001_exec_internal_tmf_task(struct domain_device *dev, res = -TMF_RESP_FUNC_FAILED; /* Even TMF timed out, return direct. */ if (task->task_state_flags & SAS_TASK_STATE_ABORTED) { + struct pm8001_ccb_info *ccb = task->lldd_task; + pm8001_dbg(pm8001_ha, FAIL, "TMF task[%x]timeout.\n", tmf->tmf); + + if (ccb) + ccb->task = NULL; goto ex_err; }
From: John Garry john.garry@huawei.com
[ Upstream commit df7abcaa1246e2537ab4016077b5443bb3c09378 ]
Currently a use-after-free may occur if a sas_task is aborted by the upper layer before we handle the I/O completion in mpi_ssp_completion() or mpi_sata_completion().
In this case, the following are the two steps in handling those I/O completions:
- Call complete() to inform the upper layer handler of completion of the I/O.
- Release driver resources associated with the sas_task in pm8001_ccb_task_free() call.
When complete() is called, the upper layer may free the sas_task. As such, we should not touch the associated sas_task afterwards, but we do so in the pm8001_ccb_task_free() call.
Fix by swapping the complete() and pm8001_ccb_task_free() calls ordering.
Link: https://lore.kernel.org/r/1643289172-165636-4-git-send-email-john.garry@huaw... Reviewed-by: Damien Le Moal damien.lemoal@opensource.wdc.com Acked-by: Jack Wang jinpu.wang@ionos.com Signed-off-by: John Garry john.garry@huawei.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/pm8001/pm80xx_hwi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c index 2cd100c062aba..3056f3615ab8a 100644 --- a/drivers/scsi/pm8001/pm80xx_hwi.c +++ b/drivers/scsi/pm8001/pm80xx_hwi.c @@ -2184,9 +2184,9 @@ mpi_ssp_completion(struct pm8001_hba_info *pm8001_ha, void *piomb) pm8001_dbg(pm8001_ha, FAIL, "task 0x%p done with io_status 0x%x resp 0x%x stat 0x%x but aborted by upper layer!\n", t, status, ts->resp, ts->stat); + pm8001_ccb_task_free(pm8001_ha, t, ccb, tag); if (t->slow_task) complete(&t->slow_task->completion); - pm8001_ccb_task_free(pm8001_ha, t, ccb, tag); } else { spin_unlock_irqrestore(&t->task_state_lock, flags); pm8001_ccb_task_free(pm8001_ha, t, ccb, tag); @@ -2801,9 +2801,9 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha, pm8001_dbg(pm8001_ha, FAIL, "task 0x%p done with io_status 0x%x resp 0x%x stat 0x%x but aborted by upper layer!\n", t, status, ts->resp, ts->stat); + pm8001_ccb_task_free(pm8001_ha, t, ccb, tag); if (t->slow_task) complete(&t->slow_task->completion); - pm8001_ccb_task_free(pm8001_ha, t, ccb, tag); } else { spin_unlock_irqrestore(&t->task_state_lock, flags); spin_unlock_irqrestore(&circularQ->oq_lock,
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit a6ed2035878e5ad2e43ed175d8812ac9399d6c40 ]
On some OEM setups users can configure the BIOS for S3 or S2idle. When configured to S3 users can still choose 's2idle' in the kernel by using `/sys/power/mem_sleep`. Before commit 6dc8265f9803 ("drm/amdgpu: always reset the asic in suspend (v2)"), the GPU would crash. Now when configured this way, the system should resume but will use more power.
As such, adjust the `amdpu_acpi_is_s0ix function` to warn users about potential power consumption issues during their first attempt at suspending.
Reported-by: Bjoren Dasse bjoern.daase@gmail.com Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1824 Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 8 ++++++-- drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 24 +++++++++++++++++++----- 2 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index f428f94b43c0a..c8d31a22176f3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1397,12 +1397,10 @@ int amdgpu_acpi_smart_shift_update(struct drm_device *dev, enum amdgpu_ss ss_sta int amdgpu_acpi_pcie_notify_device_ready(struct amdgpu_device *adev);
void amdgpu_acpi_get_backlight_caps(struct amdgpu_dm_backlight_caps *caps); -bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev); void amdgpu_acpi_detect(void); #else static inline int amdgpu_acpi_init(struct amdgpu_device *adev) { return 0; } static inline void amdgpu_acpi_fini(struct amdgpu_device *adev) { } -static inline bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev) { return false; } static inline void amdgpu_acpi_detect(void) { } static inline bool amdgpu_acpi_is_power_shift_control_supported(void) { return false; } static inline int amdgpu_acpi_power_shift_control(struct amdgpu_device *adev, @@ -1411,6 +1409,12 @@ static inline int amdgpu_acpi_smart_shift_update(struct drm_device *dev, enum amdgpu_ss ss_state) { return 0; } #endif
+#if defined(CONFIG_ACPI) && defined(CONFIG_SUSPEND) +bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev); +#else +static inline bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev) { return false; } +#endif + int amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser, uint64_t addr, struct amdgpu_bo **bo, struct amdgpu_bo_va_mapping **mapping); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c index 4811b0faafd9a..b19d407518024 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c @@ -1031,6 +1031,7 @@ void amdgpu_acpi_detect(void) } }
+#if IS_ENABLED(CONFIG_SUSPEND) /** * amdgpu_acpi_is_s0ix_active * @@ -1040,11 +1041,24 @@ void amdgpu_acpi_detect(void) */ bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev) { -#if IS_ENABLED(CONFIG_AMD_PMC) && IS_ENABLED(CONFIG_SUSPEND) - if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0) { - if (adev->flags & AMD_IS_APU) - return pm_suspend_target_state == PM_SUSPEND_TO_IDLE; + if (!(adev->flags & AMD_IS_APU) || + (pm_suspend_target_state != PM_SUSPEND_TO_IDLE)) + return false; + + if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)) { + dev_warn_once(adev->dev, + "Power consumption will be higher as BIOS has not been configured for suspend-to-idle.\n" + "To use suspend-to-idle change the sleep mode in BIOS setup.\n"); + return false; } -#endif + +#if !IS_ENABLED(CONFIG_AMD_PMC) + dev_warn_once(adev->dev, + "Power consumption will be higher as the kernel has not been compiled with CONFIG_AMD_PMC.\n"); return false; +#else + return true; +#endif /* CONFIG_AMD_PMC */ } + +#endif /* CONFIG_SUSPEND */
From: Sagi Grimberg sagi@grimberg.me
[ Upstream commit 0fa0f99fc84e41057cbdd2efbfe91c6b2f47dd9d ]
Unlike .queue_rq, in .submit_async_event drivers may not check the ctrl readiness for AER submission. This may lead to a use-after-free condition that was observed with nvme-tcp.
The race condition may happen in the following scenario: 1. driver executes its reset_ctrl_work 2. -> nvme_stop_ctrl - flushes ctrl async_event_work 3. ctrl sends AEN which is received by the host, which in turn schedules AEN handling 4. teardown admin queue (which releases the queue socket) 5. AEN processed, submits another AER, calling the driver to submit 6. driver attempts to send the cmd ==> use-after-free
In order to fix that, add ctrl state check to validate the ctrl is actually able to accept the AER submission.
This addresses the above race in controller resets because the driver during teardown should: 1. change ctrl state to RESETTING 2. flush async_event_work (as well as other async work elements)
So after 1,2, any other AER command will find the ctrl state to be RESETTING and bail out without submitting the AER.
Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index f8dd664b2eda5..8aa92ebb8b7c1 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -4187,7 +4187,14 @@ static void nvme_async_event_work(struct work_struct *work) container_of(work, struct nvme_ctrl, async_event_work);
nvme_aen_uevent(ctrl); - ctrl->ops->submit_async_event(ctrl); + + /* + * The transport drivers must guarantee AER submission here is safe by + * flushing ctrl async_event_work after changing the controller state + * from LIVE and before freeing the admin queue. + */ + if (ctrl->state == NVME_CTRL_LIVE) + ctrl->ops->submit_async_event(ctrl); }
static bool nvme_ctrl_pp_status(struct nvme_ctrl *ctrl)
From: Sagi Grimberg sagi@grimberg.me
[ Upstream commit ff9fc7ebf5c06de1ef72a69f9b1ab40af8b07f9e ]
While nvme_tcp_submit_async_event_work is checking the ctrl and queue state before preparing the AER command and scheduling io_work, in order to fully prevent a race where this check is not reliable the error recovery work must flush async_event_work before continuing to destroy the admin queue after setting the ctrl state to RESETTING such that there is no race .submit_async_event and the error recovery handler itself changing the ctrl state.
Tested-by: Chris Leech cleech@redhat.com Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/tcp.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index efa9037da53c9..ef65d24639c44 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -2105,6 +2105,7 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work) struct nvme_ctrl *ctrl = &tcp_ctrl->ctrl;
nvme_stop_keep_alive(ctrl); + flush_work(&ctrl->async_event_work); nvme_tcp_teardown_io_queues(ctrl, false); /* unquiesce to fail fast pending requests */ nvme_start_queues(ctrl);
From: Sagi Grimberg sagi@grimberg.me
[ Upstream commit b6bb1722f34bbdbabed27acdceaf585d300c5fd2 ]
While nvme_rdma_submit_async_event_work is checking the ctrl and queue state before preparing the AER command and scheduling io_work, in order to fully prevent a race where this check is not reliable the error recovery work must flush async_event_work before continuing to destroy the admin queue after setting the ctrl state to RESETTING such that there is no race .submit_async_event and the error recovery handler itself changing the ctrl state.
Signed-off-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/rdma.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c index 0498801542eb6..d51f52e296f50 100644 --- a/drivers/nvme/host/rdma.c +++ b/drivers/nvme/host/rdma.c @@ -1192,6 +1192,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work) struct nvme_rdma_ctrl, err_work);
nvme_stop_keep_alive(&ctrl->ctrl); + flush_work(&ctrl->ctrl.async_event_work); nvme_rdma_teardown_io_queues(ctrl, false); nvme_start_queues(&ctrl->ctrl); nvme_rdma_teardown_admin_queue(ctrl, false);
From: Steen Hegelund steen.hegelund@microchip.com
[ Upstream commit 81eb8b0b18789e647e65579303529fd52d861cc2 ]
Do not try to use any SKB fields after the packet has been passed up in the receive stack.
Reported-by: kernel test robot lkp@intel.com Reported-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Steen Hegelund steen.hegelund@microchip.com Link: https://lore.kernel.org/r/20220202083039.3774851-1-steen.hegelund@microchip.... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/microchip/sparx5/sparx5_packet.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_packet.c b/drivers/net/ethernet/microchip/sparx5/sparx5_packet.c index dc7e5ea6ec158..148d431fcde42 100644 --- a/drivers/net/ethernet/microchip/sparx5/sparx5_packet.c +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_packet.c @@ -145,9 +145,9 @@ static void sparx5_xtr_grp(struct sparx5 *sparx5, u8 grp, bool byte_swap) skb_put(skb, byte_cnt - ETH_FCS_LEN); eth_skb_pad(skb); skb->protocol = eth_type_trans(skb, netdev); - netif_rx(skb); netdev->stats.rx_bytes += skb->len; netdev->stats.rx_packets++; + netif_rx(skb); }
static int sparx5_inject(struct sparx5 *sparx5,
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit f52a2b8badbd24faf73a13c9c07fdb9d07352944 ]
This will be used to help make decisions on what to do in misconfigured systems.
v2: squash in semicolon fix from Stephen Rothwell
Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 13 +++++++++++++ 2 files changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index c8d31a22176f3..7e73ac6fb21db 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1410,9 +1410,11 @@ static inline int amdgpu_acpi_smart_shift_update(struct drm_device *dev, #endif
#if defined(CONFIG_ACPI) && defined(CONFIG_SUSPEND) +bool amdgpu_acpi_is_s3_active(struct amdgpu_device *adev); bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev); #else static inline bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev) { return false; } +static inline bool amdgpu_acpi_is_s3_active(struct amdgpu_device *adev) { return false; } #endif
int amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c index b19d407518024..0e12315fa0cb8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c @@ -1032,6 +1032,19 @@ void amdgpu_acpi_detect(void) }
#if IS_ENABLED(CONFIG_SUSPEND) +/** + * amdgpu_acpi_is_s3_active + * + * @adev: amdgpu_device_pointer + * + * returns true if supported, false if not. + */ +bool amdgpu_acpi_is_s3_active(struct amdgpu_device *adev) +{ + return !(adev->flags & AMD_IS_APU) || + (pm_suspend_target_state == PM_SUSPEND_MEM); +} + /** * amdgpu_acpi_is_s0ix_active *
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 04ef860469fda6a646dc841190d05b31fae68e8c ]
This will cause misconfigured systems to not run the GPU suspend routines.
* In APUs that are properly configured system will go into s2idle. * In APUs that are intended to be S3 but user selects s2idle the GPU will stay fully powered for the suspend. * In APUs that are intended to be s2idle and system misconfigured the GPU will stay fully powered for the suspend. * In systems that are intended to be s2idle, but AMD dGPU is also present, the dGPU will go through S3
Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 30059b7db0b25..b7509d3f7c1c7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1499,6 +1499,7 @@ static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work) static int amdgpu_pmops_prepare(struct device *dev) { struct drm_device *drm_dev = dev_get_drvdata(dev); + struct amdgpu_device *adev = drm_to_adev(drm_dev);
/* Return a positive number here so * DPM_FLAG_SMART_SUSPEND works properly @@ -1506,6 +1507,13 @@ static int amdgpu_pmops_prepare(struct device *dev) if (amdgpu_device_supports_boco(drm_dev)) return pm_runtime_suspended(dev);
+ /* if we will not support s3 or s2i for the device + * then skip suspend + */ + if (!amdgpu_acpi_is_s0ix_active(adev) && + !amdgpu_acpi_is_s3_active(adev)) + return 1; + return 0; }
From: Christian König christian.koenig@amd.com
[ Upstream commit e8ae38720e1a685fd98cfa5ae118c9d07b45ca79 ]
We probably never trigger this, but the logic inside the check is inverted.
Signed-off-by: Christian König christian.koenig@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 94126dc396888..8132f66177c27 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -1892,7 +1892,7 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t src_offset, unsigned i; int r;
- if (direct_submit && !ring->sched.ready) { + if (!direct_submit && !ring->sched.ready) { DRM_ERROR("Trying to move memory with ring turned off.\n"); return -EINVAL; }
From: Jan Beulich jbeulich@suse.com
[ Upstream commit e25a8d959992f61b64a58fc62fb7951dc6f31d1f ]
This started out with me noticing that "dom0_max_vcpus=<N>" with <N> larger than the number of physical CPUs reported through ACPI tables would not bring up the "excess" vCPU-s. Addressing this is the primary purpose of the change; CPU maps handling is being tidied only as far as is necessary for the change here (with the effect of also avoiding the setting up of too much per-CPU infrastructure, i.e. for CPUs which can never come online).
Noticing that xen_fill_possible_map() is called way too early, whereas xen_filter_cpu_maps() is called too late (after per-CPU areas were already set up), and further observing that each of the functions serves only one of Dom0 or DomU, it looked like it was better to simplify this. Use the .get_smp_config hook instead, uniformly for Dom0 and DomU. xen_fill_possible_map() can be dropped altogether, while xen_filter_cpu_maps() is re-purposed but not otherwise changed.
Signed-off-by: Jan Beulich jbeulich@suse.com Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Link: https://lore.kernel.org/r/2dbd5f0a-9859-ca2d-085e-a02f7166c610@suse.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/xen/enlighten_pv.c | 4 ---- arch/x86/xen/smp_pv.c | 26 ++++++-------------------- 2 files changed, 6 insertions(+), 24 deletions(-)
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index a7b7d674f5005..133ef31639df1 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -1364,10 +1364,6 @@ asmlinkage __visible void __init xen_start_kernel(void)
xen_acpi_sleep_register();
- /* Avoid searching for BIOS MP tables */ - x86_init.mpparse.find_smp_config = x86_init_noop; - x86_init.mpparse.get_smp_config = x86_init_uint_noop; - xen_boot_params_init_edd();
#ifdef CONFIG_ACPI diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c index 7ed56c6075b0c..477c484eb202c 100644 --- a/arch/x86/xen/smp_pv.c +++ b/arch/x86/xen/smp_pv.c @@ -148,28 +148,12 @@ int xen_smp_intr_init_pv(unsigned int cpu) return rc; }
-static void __init xen_fill_possible_map(void) -{ - int i, rc; - - if (xen_initial_domain()) - return; - - for (i = 0; i < nr_cpu_ids; i++) { - rc = HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL); - if (rc >= 0) { - num_processors++; - set_cpu_possible(i, true); - } - } -} - -static void __init xen_filter_cpu_maps(void) +static void __init _get_smp_config(unsigned int early) { int i, rc; unsigned int subtract = 0;
- if (!xen_initial_domain()) + if (early) return;
num_processors = 0; @@ -210,7 +194,6 @@ static void __init xen_pv_smp_prepare_boot_cpu(void) * sure the old memory can be recycled. */ make_lowmem_page_readwrite(xen_initial_gdt);
- xen_filter_cpu_maps(); xen_setup_vcpu_info_placement();
/* @@ -486,5 +469,8 @@ static const struct smp_ops xen_smp_ops __initconst = { void __init xen_smp_init(void) { smp_ops = xen_smp_ops; - xen_fill_possible_map(); + + /* Avoid searching for BIOS MP tables */ + x86_init.mpparse.find_smp_config = x86_init_noop; + x86_init.mpparse.get_smp_config = _get_smp_config; }
From: Igor Pylypiv ipylypiv@google.com
[ Upstream commit 67d6212afda218d564890d1674bab28e8612170f ]
This reverts commit 774a1221e862b343388347bac9b318767336b20b.
We need to finish all async code before the module init sequence is done. In the reverted commit the PF_USED_ASYNC flag was added to mark a thread that called async_schedule(). Then the PF_USED_ASYNC flag was used to determine whether or not async_synchronize_full() needs to be invoked. This works when modprobe thread is calling async_schedule(), but it does not work if module dispatches init code to a worker thread which then calls async_schedule().
For example, PCI driver probing is invoked from a worker thread based on a node where device is attached:
if (cpu < nr_cpu_ids) error = work_on_cpu(cpu, local_pci_probe, &ddi); else error = local_pci_probe(&ddi);
We end up in a situation where a worker thread gets the PF_USED_ASYNC flag set instead of the modprobe thread. As a result, async_synchronize_full() is not invoked and modprobe completes without waiting for the async code to finish.
The issue was discovered while loading the pm80xx driver: (scsi_mod.scan=async)
modprobe pm80xx worker ... do_init_module() ... pci_call_probe() work_on_cpu(local_pci_probe) local_pci_probe() pm8001_pci_probe() scsi_scan_host() async_schedule() worker->flags |= PF_USED_ASYNC; ... < return from worker > ... if (current->flags & PF_USED_ASYNC) <--- false async_synchronize_full();
Commit 21c3c5d28007 ("block: don't request module during elevator init") fixed the deadlock issue which the reverted commit 774a1221e862 ("module, async: async_synchronize_full() on module init iff async is used") tried to fix.
Since commit 0fdff3ec6d87 ("async, kmod: warn on synchronous request_module() from async workers") synchronous module loading from async is not allowed.
Given that the original deadlock issue is fixed and it is no longer allowed to call synchronous request_module() from async we can remove PF_USED_ASYNC flag to make module init consistently invoke async_synchronize_full() unless async module probe is requested.
Signed-off-by: Igor Pylypiv ipylypiv@google.com Reviewed-by: Changyuan Lyu changyuanl@google.com Reviewed-by: Luis Chamberlain mcgrof@kernel.org Acked-by: Tejun Heo tj@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/sched.h | 1 - kernel/async.c | 3 --- kernel/module.c | 25 +++++-------------------- 3 files changed, 5 insertions(+), 24 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h index c1a927ddec646..76e8695506465 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1675,7 +1675,6 @@ extern struct pid *cad_pid; #define PF_MEMALLOC 0x00000800 /* Allocating memory */ #define PF_NPROC_EXCEEDED 0x00001000 /* set_user() noticed that RLIMIT_NPROC was exceeded */ #define PF_USED_MATH 0x00002000 /* If unset the fpu must be initialized before use */ -#define PF_USED_ASYNC 0x00004000 /* Used async_schedule*(), used by module init */ #define PF_NOFREEZE 0x00008000 /* This thread should not be frozen */ #define PF_FROZEN 0x00010000 /* Frozen for system suspend */ #define PF_KSWAPD 0x00020000 /* I am kswapd */ diff --git a/kernel/async.c b/kernel/async.c index b8d7a663497f9..b2c4ba5686ee4 100644 --- a/kernel/async.c +++ b/kernel/async.c @@ -205,9 +205,6 @@ async_cookie_t async_schedule_node_domain(async_func_t func, void *data, atomic_inc(&entry_count); spin_unlock_irqrestore(&async_lock, flags);
- /* mark that this task has queued an async job, used by module init */ - current->flags |= PF_USED_ASYNC; - /* schedule for execution */ queue_work_node(node, system_unbound_wq, &entry->work);
diff --git a/kernel/module.c b/kernel/module.c index 5c26a76e800b5..83991c2d5af9e 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -3683,12 +3683,6 @@ static noinline int do_init_module(struct module *mod) } freeinit->module_init = mod->init_layout.base;
- /* - * We want to find out whether @mod uses async during init. Clear - * PF_USED_ASYNC. async_schedule*() will set it. - */ - current->flags &= ~PF_USED_ASYNC; - do_mod_ctors(mod); /* Start the module */ if (mod->init != NULL) @@ -3714,22 +3708,13 @@ static noinline int do_init_module(struct module *mod)
/* * We need to finish all async code before the module init sequence - * is done. This has potential to deadlock. For example, a newly - * detected block device can trigger request_module() of the - * default iosched from async probing task. Once userland helper - * reaches here, async_synchronize_full() will wait on the async - * task waiting on request_module() and deadlock. - * - * This deadlock is avoided by perfomring async_synchronize_full() - * iff module init queued any async jobs. This isn't a full - * solution as it will deadlock the same if module loading from - * async jobs nests more than once; however, due to the various - * constraints, this hack seems to be the best option for now. - * Please refer to the following thread for details. + * is done. This has potential to deadlock if synchronous module + * loading is requested from async (which is not allowed!). * - * http://thread.gmane.org/gmane.linux.kernel/1420814 + * See commit 0fdff3ec6d87 ("async, kmod: warn on synchronous + * request_module() from async workers") for more details. */ - if (!mod->async_probe_requested && (current->flags & PF_USED_ASYNC)) + if (!mod->async_probe_requested) async_synchronize_full();
ftrace_free_mem(mod, mod->init_layout.base, mod->init_layout.base +
From: Kees Cook keescook@chromium.org
[ Upstream commit dcb85f85fa6f142aae1fe86f399d4503d49f2b60 ]
While the stackleak plugin was already using notrace, objtool is now a bit more picky. Update the notrace uses to noinstr. Silences the following objtool warnings when building with:
CONFIG_DEBUG_ENTRY=y CONFIG_STACK_VALIDATION=y CONFIG_VMLINUX_VALIDATION=y CONFIG_GCC_PLUGIN_STACKLEAK=y
vmlinux.o: warning: objtool: do_syscall_64()+0x9: call to stackleak_track_stack() leaves .noinstr.text section vmlinux.o: warning: objtool: do_int80_syscall_32()+0x9: call to stackleak_track_stack() leaves .noinstr.text section vmlinux.o: warning: objtool: exc_general_protection()+0x22: call to stackleak_track_stack() leaves .noinstr.text section vmlinux.o: warning: objtool: fixup_bad_iret()+0x20: call to stackleak_track_stack() leaves .noinstr.text section vmlinux.o: warning: objtool: do_machine_check()+0x27: call to stackleak_track_stack() leaves .noinstr.text section vmlinux.o: warning: objtool: .text+0x5346e: call to stackleak_erase() leaves .noinstr.text section vmlinux.o: warning: objtool: .entry.text+0x143: call to stackleak_erase() leaves .noinstr.text section vmlinux.o: warning: objtool: .entry.text+0x10eb: call to stackleak_erase() leaves .noinstr.text section vmlinux.o: warning: objtool: .entry.text+0x17f9: call to stackleak_erase() leaves .noinstr.text section
Note that the plugin's addition of calls to stackleak_track_stack() from noinstr functions is expected to be safe, as it isn't runtime instrumentation and is self-contained.
Cc: Alexander Popov alex.popov@linux.com Suggested-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/stackleak.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/kernel/stackleak.c b/kernel/stackleak.c index ce161a8e8d975..dd07239ddff9f 100644 --- a/kernel/stackleak.c +++ b/kernel/stackleak.c @@ -48,7 +48,7 @@ int stack_erasing_sysctl(struct ctl_table *table, int write, #define skip_erasing() false #endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
-asmlinkage void notrace stackleak_erase(void) +asmlinkage void noinstr stackleak_erase(void) { /* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */ unsigned long kstack_ptr = current->lowest_stack; @@ -102,9 +102,8 @@ asmlinkage void notrace stackleak_erase(void) /* Reset the 'lowest_stack' value for the next syscall */ current->lowest_stack = current_top_of_stack() - THREAD_SIZE/64; } -NOKPROBE_SYMBOL(stackleak_erase);
-void __used __no_caller_saved_registers notrace stackleak_track_stack(void) +void __used __no_caller_saved_registers noinstr stackleak_track_stack(void) { unsigned long sp = current_stack_pointer;
From: Jason A. Donenfeld Jason@zx2c4.com
[ Upstream commit 042e293e16e3aa9794ce60c29f5b7b0c8170f933 ]
When account() is called, and the amount of entropy dips below random_write_wakeup_bits, we wake up the random writers, so that they can write some more in. However, the RNDZAPENTCNT/RNDCLEARPOOL ioctl sets the entropy count to zero -- a potential reduction just like account() -- but does not unblock writers. This commit adds the missing logic to that ioctl to unblock waiting writers.
Reviewed-by: Dominik Brodowski linux@dominikbrodowski.net Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/random.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/char/random.c b/drivers/char/random.c index a27ae3999ff32..ebe86de9d0acc 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -1963,7 +1963,10 @@ static long random_ioctl(struct file *f, unsigned int cmd, unsigned long arg) */ if (!capable(CAP_SYS_ADMIN)) return -EPERM; - input_pool.entropy_count = 0; + if (xchg(&input_pool.entropy_count, 0) && random_write_wakeup_bits) { + wake_up_interruptible(&random_write_wait); + kill_fasync(&fasync, SIGIO, POLL_OUT); + } return 0; case RNDRESEEDCRNG: if (!capable(CAP_SYS_ADMIN))
From: David Woodhouse dwmw@amazon.co.uk
commit fcb732d8f8cf6084f8480015ad41d25fb023a4dd upstream.
There are circumstances whem kvm_xen_update_runstate_guest() should not sleep because it ends up being called from __schedule() when the vCPU is preempted:
[ 222.830825] kvm_xen_update_runstate_guest+0x24/0x100 [ 222.830878] kvm_arch_vcpu_put+0x14c/0x200 [ 222.830920] kvm_sched_out+0x30/0x40 [ 222.830960] __schedule+0x55c/0x9f0
To handle this, make it use the same trick as __kvm_xen_has_interrupt(), of using the hva from the gfn_to_hva_cache directly. Then it can use pagefault_disable() around the accesses and just bail out if the page is absent (which is unlikely).
I almost switched to using a gfn_to_pfn_cache here and bailing out if kvm_map_gfn() fails, like kvm_steal_time_set_preempted() does — but on closer inspection it looks like kvm_map_gfn() will *always* fail in atomic context for a page in IOMEM, which means it will silently fail to make the update every single time for such guests, AFAICT. So I didn't do it that way after all. And will probably fix that one too.
Cc: stable@vger.kernel.org Fixes: 30b5c851af79 ("KVM: x86/xen: Add support for vCPU runstate information") Signed-off-by: David Woodhouse dwmw@amazon.co.uk Message-Id: b17a93e5ff4561e57b1238e3e7ccd0b613eb827e.camel@infradead.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/xen.c | 97 ++++++++++++++++++++++++++++++++++++----------------- 1 file changed, 67 insertions(+), 30 deletions(-)
--- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -93,32 +93,57 @@ static void kvm_xen_update_runstate(stru void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state) { struct kvm_vcpu_xen *vx = &v->arch.xen; + struct gfn_to_hva_cache *ghc = &vx->runstate_cache; + struct kvm_memslots *slots = kvm_memslots(v->kvm); + bool atomic = (state == RUNSTATE_runnable); uint64_t state_entry_time; - unsigned int offset; + int __user *user_state; + uint64_t __user *user_times;
kvm_xen_update_runstate(v, state);
if (!vx->runstate_set) return;
- BUILD_BUG_ON(sizeof(struct compat_vcpu_runstate_info) != 0x2c); + if (unlikely(slots->generation != ghc->generation || kvm_is_error_hva(ghc->hva)) && + kvm_gfn_to_hva_cache_init(v->kvm, ghc, ghc->gpa, ghc->len)) + return; + + /* We made sure it fits in a single page */ + BUG_ON(!ghc->memslot); + + if (atomic) + pagefault_disable();
- offset = offsetof(struct compat_vcpu_runstate_info, state_entry_time); -#ifdef CONFIG_X86_64 /* - * The only difference is alignment of uint64_t in 32-bit. - * So the first field 'state' is accessed directly using - * offsetof() (where its offset happens to be zero), while the - * remaining fields which are all uint64_t, start at 'offset' - * which we tweak here by adding 4. + * The only difference between 32-bit and 64-bit versions of the + * runstate struct us the alignment of uint64_t in 32-bit, which + * means that the 64-bit version has an additional 4 bytes of + * padding after the first field 'state'. + * + * So we use 'int __user *user_state' to point to the state field, + * and 'uint64_t __user *user_times' for runstate_entry_time. So + * the actual array of time[] in each state starts at user_times[1]. */ + BUILD_BUG_ON(offsetof(struct vcpu_runstate_info, state) != 0); + BUILD_BUG_ON(offsetof(struct compat_vcpu_runstate_info, state) != 0); + user_state = (int __user *)ghc->hva; + + BUILD_BUG_ON(sizeof(struct compat_vcpu_runstate_info) != 0x2c); + + user_times = (uint64_t __user *)(ghc->hva + + offsetof(struct compat_vcpu_runstate_info, + state_entry_time)); +#ifdef CONFIG_X86_64 BUILD_BUG_ON(offsetof(struct vcpu_runstate_info, state_entry_time) != offsetof(struct compat_vcpu_runstate_info, state_entry_time) + 4); BUILD_BUG_ON(offsetof(struct vcpu_runstate_info, time) != offsetof(struct compat_vcpu_runstate_info, time) + 4);
if (v->kvm->arch.xen.long_mode) - offset = offsetof(struct vcpu_runstate_info, state_entry_time); + user_times = (uint64_t __user *)(ghc->hva + + offsetof(struct vcpu_runstate_info, + state_entry_time)); #endif /* * First write the updated state_entry_time at the appropriate @@ -132,10 +157,8 @@ void kvm_xen_update_runstate_guest(struc BUILD_BUG_ON(sizeof(((struct compat_vcpu_runstate_info *)0)->state_entry_time) != sizeof(state_entry_time));
- if (kvm_write_guest_offset_cached(v->kvm, &v->arch.xen.runstate_cache, - &state_entry_time, offset, - sizeof(state_entry_time))) - return; + if (__put_user(state_entry_time, user_times)) + goto out; smp_wmb();
/* @@ -149,11 +172,8 @@ void kvm_xen_update_runstate_guest(struc BUILD_BUG_ON(sizeof(((struct compat_vcpu_runstate_info *)0)->state) != sizeof(vx->current_runstate));
- if (kvm_write_guest_offset_cached(v->kvm, &v->arch.xen.runstate_cache, - &vx->current_runstate, - offsetof(struct vcpu_runstate_info, state), - sizeof(vx->current_runstate))) - return; + if (__put_user(vx->current_runstate, user_state)) + goto out;
/* * Write the actual runstate times immediately after the @@ -168,24 +188,23 @@ void kvm_xen_update_runstate_guest(struc BUILD_BUG_ON(sizeof(((struct vcpu_runstate_info *)0)->time) != sizeof(vx->runstate_times));
- if (kvm_write_guest_offset_cached(v->kvm, &v->arch.xen.runstate_cache, - &vx->runstate_times[0], - offset + sizeof(u64), - sizeof(vx->runstate_times))) - return; - + if (__copy_to_user(user_times + 1, vx->runstate_times, sizeof(vx->runstate_times))) + goto out; smp_wmb();
/* * Finally, clear the XEN_RUNSTATE_UPDATE bit in the guest's * runstate_entry_time field. */ - state_entry_time &= ~XEN_RUNSTATE_UPDATE; - if (kvm_write_guest_offset_cached(v->kvm, &v->arch.xen.runstate_cache, - &state_entry_time, offset, - sizeof(state_entry_time))) - return; + __put_user(state_entry_time, user_times); + smp_wmb(); + + out: + mark_page_dirty_in_slot(v->kvm, ghc->memslot, ghc->gpa >> PAGE_SHIFT); + + if (atomic) + pagefault_enable(); }
int __kvm_xen_has_interrupt(struct kvm_vcpu *v) @@ -337,6 +356,12 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcp break; }
+ /* It must fit within a single page */ + if ((data->u.gpa & ~PAGE_MASK) + sizeof(struct vcpu_info) > PAGE_SIZE) { + r = -EINVAL; + break; + } + r = kvm_gfn_to_hva_cache_init(vcpu->kvm, &vcpu->arch.xen.vcpu_info_cache, data->u.gpa, @@ -354,6 +379,12 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcp break; }
+ /* It must fit within a single page */ + if ((data->u.gpa & ~PAGE_MASK) + sizeof(struct pvclock_vcpu_time_info) > PAGE_SIZE) { + r = -EINVAL; + break; + } + r = kvm_gfn_to_hva_cache_init(vcpu->kvm, &vcpu->arch.xen.vcpu_time_info_cache, data->u.gpa, @@ -375,6 +406,12 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcp break; }
+ /* It must fit within a single page */ + if ((data->u.gpa & ~PAGE_MASK) + sizeof(struct vcpu_runstate_info) > PAGE_SIZE) { + r = -EINVAL; + break; + } + r = kvm_gfn_to_hva_cache_init(vcpu->kvm, &vcpu->arch.xen.runstate_cache, data->u.gpa,
From: Maxim Levitsky mlevitsk@redhat.com
commit 759cbd59674a6c0aec616a3f4f0740ebd3f5fbef upstream.
While RSM induced VM entries are not full VM entries, they still need to be followed by actual VM entry to complete it, unlike setting the nested state.
This patch fixes boot of hyperv and SMM enabled windows VM running nested on KVM, which fail due to this issue combined with lack of dirty bit setting.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com Cc: stable@vger.kernel.org Message-Id: 20220207155447.840194-5-mlevitsk@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/svm.c | 5 +++++ arch/x86/kvm/vmx/vmx.c | 1 + 2 files changed, 6 insertions(+)
--- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4388,6 +4388,11 @@ static int svm_leave_smm(struct kvm_vcpu nested_load_control_from_vmcb12(svm, &vmcb12->control); ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, false);
+ if (ret) + goto unmap_save; + + svm->nested.nested_run_pending = 1; + unmap_save: kvm_vcpu_unmap(vcpu, &map_save, true); unmap_map: --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7532,6 +7532,7 @@ static int vmx_leave_smm(struct kvm_vcpu if (ret) return ret;
+ vmx->nested.nested_run_pending = 1; vmx->nested.smm.guest_mode = false; } return 0;
From: Maxim Levitsky mlevitsk@redhat.com
commit c53bbe2145f51d3bc0438c2db02e737b9b598bf3 upstream.
When the guest doesn't enable paging, and NPT/EPT is disabled, we use guest't paging CR3's as KVM's shadow paging pointer and we are technically in direct mode as if we were to use NPT/EPT.
In direct mode we create SPTEs with user mode permissions because usually in the direct mode the NPT/EPT doesn't need to restrict access based on guest CPL (there are MBE/GMET extenstions for that but KVM doesn't use them).
In this special "use guest paging as direct" mode however, and if CR4.SMAP/CR4.SMEP are enabled, that will make the CPU fault on each access and KVM will enter endless loop of page faults.
Since page protection doesn't have any meaning in !PG case, just don't passthrough these bits.
The fix is the same as was done for VMX in commit: commit 656ec4a4928a ("KVM: VMX: fix SMEP and SMAP without EPT")
This fixes the boot of windows 10 without NPT for good. (Without this patch, BSP boots, but APs were stuck in endless loop of page faults, causing the VM boot with 1 CPU)
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com Cc: stable@vger.kernel.org Message-Id: 20220207155447.840194-2-mlevitsk@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/svm.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1727,6 +1727,7 @@ void svm_set_cr0(struct kvm_vcpu *vcpu, { struct vcpu_svm *svm = to_svm(vcpu); u64 hcr0 = cr0; + bool old_paging = is_paging(vcpu);
#ifdef CONFIG_X86_64 if (vcpu->arch.efer & EFER_LME && !vcpu->arch.guest_state_protected) { @@ -1743,8 +1744,11 @@ void svm_set_cr0(struct kvm_vcpu *vcpu, #endif vcpu->arch.cr0 = cr0;
- if (!npt_enabled) + if (!npt_enabled) { hcr0 |= X86_CR0_PG | X86_CR0_WP; + if (old_paging != is_paging(vcpu)) + svm_set_cr4(vcpu, kvm_read_cr4(vcpu)); + }
/* * re-enable caching here because the QEMU bios @@ -1788,8 +1792,12 @@ void svm_set_cr4(struct kvm_vcpu *vcpu, svm_flush_tlb(vcpu);
vcpu->arch.cr4 = cr4; - if (!npt_enabled) + if (!npt_enabled) { cr4 |= X86_CR4_PAE; + + if (!is_paging(vcpu)) + cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE); + } cr4 |= host_cr4_mce; to_svm(vcpu)->vmcb->save.cr4 = cr4; vmcb_mark_dirty(to_svm(vcpu)->vmcb, VMCB_CR);
From: Maxim Levitsky mlevitsk@redhat.com
commit e1779c2714c3023e4629825762bcbc43a3b943df upstream.
Turns out that due to review feedback and/or rebases I accidentally moved the call to nested_svm_load_cr3 to be too early, before the NPT is enabled, which is very wrong to do.
KVM can't even access guest memory at that point as nested NPT is needed for that, and of course it won't initialize the walk_mmu, which is main issue the patch was addressing.
Fix this for real.
Fixes: 232f75d3b4b5 ("KVM: nSVM: call nested_svm_load_cr3 on nested state load") Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com Message-Id: 20220207155447.840194-3-mlevitsk@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/nested.c | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-)
--- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -1357,18 +1357,6 @@ static int svm_set_nested_state(struct k !nested_vmcb_valid_sregs(vcpu, save)) goto out_free;
- /* - * While the nested guest CR3 is already checked and set by - * KVM_SET_SREGS, it was set when nested state was yet loaded, - * thus MMU might not be initialized correctly. - * Set it again to fix this. - */ - - ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3, - nested_npt_enabled(svm), false); - if (WARN_ON_ONCE(ret)) - goto out_free; -
/* * All checks done, we can enter guest mode. Userspace provides @@ -1394,6 +1382,20 @@ static int svm_set_nested_state(struct k
svm_switch_vmcb(svm, &svm->nested.vmcb02); nested_vmcb02_prepare_control(svm); + + /* + * While the nested guest CR3 is already checked and set by + * KVM_SET_SREGS, it was set when nested state was yet loaded, + * thus MMU might not be initialized correctly. + * Set it again to fix this. + */ + + ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3, + nested_npt_enabled(svm), false); + if (WARN_ON_ONCE(ret)) + goto out_free; + + kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); ret = 0; out_free:
From: Maxim Levitsky mlevitsk@redhat.com
commit e8efa4ff00374d2e6f47f6e4628ca3b541c001af upstream.
While usually, restoring the smm state makes the KVM enter the nested guest thus a different vmcb (vmcb02 vs vmcb01), KVM should still mark it as dirty, since hardware can in theory cache multiple vmcbs.
Failure to do so, combined with lack of setting the nested_run_pending (which is fixed in the next patch), might make KVM re-enter vmcb01, which was just exited from, with completely different set of guest state registers (SMM vs non SMM) and without proper dirty bits set, which results in the CPU reusing stale IDTR pointer which leads to a guest shutdown on any interrupt.
On the real hardware this usually doesn't happen, but when running nested, L0's KVM does check and honour few dirty bits, causing this issue to happen.
This patch fixes boot of hyperv and SMM enabled windows VM running nested on KVM.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com Cc: stable@vger.kernel.org Message-Id: 20220207155447.840194-4-mlevitsk@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/svm.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4392,6 +4392,8 @@ static int svm_leave_smm(struct kvm_vcpu * Enter the nested guest now */
+ vmcb_mark_all_dirty(svm->vmcb01.ptr); + vmcb12 = map.hva; nested_load_control_from_vmcb12(svm, &vmcb12->control); ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, false);
From: Johannes Berg johannes.berg@intel.com
commit bea2662e7818e15d7607d17d57912ac984275d94 upstream.
If no firmware was present at all (or, presumably, all of the firmware files failed to parse), we end up unbinding by calling device_release_driver(), which calls remove(), which then in iwlwifi calls iwl_drv_stop(), freeing the 'drv' struct. However the new code I added will still erroneously access it after it was freed.
Set 'failure=false' in this case to avoid the access, all data was already freed anyway.
Cc: stable@vger.kernel.org Reported-by: Stefan Agner stefan@agner.ch Reported-by: Wolfgang Walter linux@stwm.de Reported-by: Jason Self jason@bluehome.net Reported-by: Dominik Behr dominik@dominikbehr.com Reported-by: Marek Marczykowski-Górecki marmarek@invisiblethingslab.com Fixes: ab07506b0454 ("iwlwifi: fix leaks/bad data after failed firmware load") Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20220208114728.e6b514cf4c85.Iffb575ca2a623d7859b54... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c +++ b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c @@ -1614,6 +1614,8 @@ static void iwl_req_fw_callback(const st out_unbind: complete(&drv->request_firmware_complete); device_release_driver(drv->trans->dev); + /* drv has just been freed by the release */ + failure = false; free: if (failure) iwl_dealloc_ucode(drv);
From: Nicholas Bishop nicholasbishop@google.com
commit 364438fd629f7611a84c8e6d7de91659300f1502 upstream.
The iMac 12,1 does not use the gmux driver for backlight, so the radeon backlight device is needed to set the brightness.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1838 Signed-off-by: Nicholas Bishop nicholasbishop@google.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/radeon/atombios_encoders.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/radeon/atombios_encoders.c +++ b/drivers/gpu/drm/radeon/atombios_encoders.c @@ -198,7 +198,8 @@ void radeon_atom_backlight_init(struct r * so don't register a backlight device */ if ((rdev->pdev->subsystem_vendor == PCI_VENDOR_ID_APPLE) && - (rdev->pdev->device == 0x6741)) + (rdev->pdev->device == 0x6741) && + !dmi_match(DMI_PRODUCT_NAME, "iMac12,1")) return;
if (!radeon_encoder->enc_priv)
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit 439cf34c8e0a8a33d8c15a31be1b7423426bc765 upstream.
Make sure we don't assign an error pointer to crtc_state->mode_blob as that will break all kinds of places that assume either NULL or a valid pointer (eg. drm_property_blob_put()).
Cc: stable@vger.kernel.org Reported-by: fuyufan fuyufan@huawei.com Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20220209091928.14766-1-ville.s... Acked-by: Maxime Ripard maxime@cerno.tech Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_atomic_uapi.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
--- a/drivers/gpu/drm/drm_atomic_uapi.c +++ b/drivers/gpu/drm/drm_atomic_uapi.c @@ -76,15 +76,17 @@ int drm_atomic_set_mode_for_crtc(struct state->mode_blob = NULL;
if (mode) { + struct drm_property_blob *blob; + drm_mode_convert_to_umode(&umode, mode); - state->mode_blob = - drm_property_create_blob(state->crtc->dev, - sizeof(umode), - &umode); - if (IS_ERR(state->mode_blob)) - return PTR_ERR(state->mode_blob); + blob = drm_property_create_blob(crtc->dev, + sizeof(umode), &umode); + if (IS_ERR(blob)) + return PTR_ERR(blob);
drm_mode_copy(&state->mode, mode); + + state->mode_blob = blob; state->enable = true; drm_dbg_atomic(crtc->dev, "Set [MODE:%s] for [CRTC:%d:%s] state %p\n",
From: Yifan Zhang yifan1.zhang@amd.com
commit 9c4f59ea3f865693150edf0c91d1cc6b451360dd upstream.
the 2nd parameter should be smu msg type rather than asic msg index.
Fixes: 7d38d9dc4ecc ("drm/amdgpu: add mode2 reset support for yellow carp") Signed-off-by: Yifan Zhang yifan1.zhang@amd.com Acked-by: Aaron Liu aaron.liu@amd.com Reviewed-by: Huang Rui ray.huang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c @@ -291,14 +291,9 @@ static int yellow_carp_post_smu_init(str
static int yellow_carp_mode_reset(struct smu_context *smu, int type) { - int ret = 0, index = 0; + int ret = 0;
- index = smu_cmn_to_asic_specific_index(smu, CMN2ASIC_MAPPING_MSG, - SMU_MSG_GfxDeviceDriverReset); - if (index < 0) - return index == -EACCES ? 0 : index; - - ret = smu_cmn_send_smc_msg_with_param(smu, (uint16_t)index, type, NULL); + ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_GfxDeviceDriverReset, type, NULL); if (ret) dev_err(smu->adev->dev, "Failed to mode reset!\n");
From: Rajib Mahapatra rajib.mahapatra@amd.com
commit f8f4e2a518347063179def4e64580b2d28233d03 upstream.
[Why] SDMA ring buffer test failed if suspend is aborted during S0i3 resume.
[How] If suspend is aborted for some reason during S0i3 resume cycle, it follows SDMA ring test failing and errors in amdgpu resume. For RN/CZN/Picasso, SMU saves and restores SDMA registers during S0ix cycle. So, skipping SDMA suspend and resume from driver solves the issue. This time, the system is able to resume gracefully even the suspend is aborted.
Reviewed-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Rajib Mahapatra rajib.mahapatra@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c @@ -2062,6 +2062,10 @@ static int sdma_v4_0_suspend(void *handl { struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+ /* SMU saves SDMA state for us */ + if (adev->in_s0ix) + return 0; + return sdma_v4_0_hw_fini(adev); }
@@ -2069,6 +2073,10 @@ static int sdma_v4_0_resume(void *handle { struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+ /* SMU restores SDMA state for us */ + if (adev->in_s0ix) + return 0; + return sdma_v4_0_hw_init(adev); }
From: Jani Nikula jani.nikula@intel.com
commit ea958422291de248b9e2eaaeea36004e84b64043 upstream.
The mapping from enum port to whatever port numbering scheme is used by the SWSCI Display Power State Notification is odd, and the memory of it has faded. In any case, the parameter only has space for ports numbered [0..4], and UBSAN reports bit shift beyond it when the platform has port F or more.
Since the SWSCI functionality is supposed to be obsolete for new platforms (i.e. ones that might have port F or more), just bail out early if the mapped and mangled port number is beyond what the Display Power State Notification can support.
Fixes: 9c4b0a683193 ("drm/i915: add opregion function to notify bios of encoder enable/disable") Cc: stable@vger.kernel.org # v3.13+ Cc: Ville Syrjälä ville.syrjala@linux.intel.com Cc: Lucas De Marchi lucas.demarchi@intel.com Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4800 Signed-off-by: Jani Nikula jani.nikula@intel.com Reviewed-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/cc363f42d6b5a5932b6d218fefcc8b... (cherry picked from commit 24a644ebbfd3b13cda702f98907f9dd123e34bf9) Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/display/intel_opregion.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
--- a/drivers/gpu/drm/i915/display/intel_opregion.c +++ b/drivers/gpu/drm/i915/display/intel_opregion.c @@ -361,6 +361,21 @@ int intel_opregion_notify_encoder(struct port++; }
+ /* + * The port numbering and mapping here is bizarre. The now-obsolete + * swsci spec supports ports numbered [0..4]. Port E is handled as a + * special case, but port F and beyond are not. The functionality is + * supposed to be obsolete for new platforms. Just bail out if the port + * number is out of bounds after mapping. + */ + if (port > 4) { + drm_dbg_kms(&dev_priv->drm, + "[ENCODER:%d:%s] port %c (index %u) out of bounds for display power state notification\n", + intel_encoder->base.base.id, intel_encoder->base.name, + port_name(intel_encoder->port), port); + return -EINVAL; + } + if (!enable) parm |= 4 << 8;
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit 698bef8ff5d2edea5d1c9d6e5adf1bfed1e8a106 upstream.
Apparently I totally fumbled the loop condition when I removed the ARRAY_SIZE() stuff from the dbuf slice config lookup. Comparing the loop index with the active_pipes bitmask is utter nonsense, what we want to do is check to see if the mask is zero or not.
Note that the code actually ended up working correctly despite the fumble, up until commit eef173954432 ("drm/i915: Allow !join_mbus cases for adlp+ dbuf configuration") when things broke for real.
Cc: stable@vger.kernel.org Fixes: 05e8155afe35 ("drm/i915: Use a sentinel to terminate the dbuf slice arrays") Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20220207132700.481-1-ville.syr... Reviewed-by: Jani Nikula jani.nikula@intel.com (cherry picked from commit a28fde308c3c1c174249ff9559b57f24e6850086) Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/intel_pm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4861,7 +4861,7 @@ static u8 compute_dbuf_slices(enum pipe { int i;
- for (i = 0; i < dbuf_slices[i].active_pipes; i++) { + for (i = 0; dbuf_slices[i].active_pipes != 0; i++) { if (dbuf_slices[i].active_pipes == active_pipes && dbuf_slices[i].join_mbus == join_mbus) return dbuf_slices[i].dbuf_mask[pipe];
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit 8d9d2a723d64b650f2e6423024ccb4a33f0cdc40 upstream.
The bogus loop from compute_dbuf_slices() was copied into check_mbus_joined() as well. So this lookup is wrong as well. Fix it.
Cc: stable@vger.kernel.org Fixes: f4dc00863226 ("drm/i915/adl_p: MBUS programming") Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20220207132700.481-2-ville.syr... Reviewed-by: Jani Nikula jani.nikula@intel.com (cherry picked from commit 053f2b85631316a9226f6340c1c0fd95634f7a5b) Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/intel_pm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4844,7 +4844,7 @@ static bool check_mbus_joined(u8 active_ { int i;
- for (i = 0; i < dbuf_slices[i].active_pipes; i++) { + for (i = 0; dbuf_slices[i].active_pipes != 0; i++) { if (dbuf_slices[i].active_pipes == active_pipes) return dbuf_slices[i].join_mbus; }
From: Seth Forshee sforshee@digitalocean.com
commit b9208492fcaecff8f43915529ae34b3bcb03877c upstream.
vsock_connect() expects that the socket could already be in the TCP_ESTABLISHED state when the connecting task wakes up with a signal pending. If this happens the socket will be in the connected table, and it is not removed when the socket state is reset. In this situation it's common for the process to retry connect(), and if the connection is successful the socket will be added to the connected table a second time, corrupting the list.
Prevent this by calling vsock_remove_connected() if a signal is received while waiting for a connection. This is harmless if the socket is not in the connected table, and if it is in the table then removing it will prevent list corruption from a double add.
Note for backporting: this patch requires d5afa82c977e ("vsock: correct removal of socket from the list"), which is in all current stable trees except 4.9.y.
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Signed-off-by: Seth Forshee sforshee@digitalocean.com Reviewed-by: Stefano Garzarella sgarzare@redhat.com Link: https://lore.kernel.org/r/20220217141312.2297547-1-sforshee@digitalocean.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/vmw_vsock/af_vsock.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1400,6 +1400,7 @@ static int vsock_connect(struct socket * sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; sock->state = SS_UNCONNECTED; vsock_transport_cancel_pkt(vsk); + vsock_remove_connected(vsk); goto out_wait; } else if (timeout == 0) { err = -ETIMEDOUT;
From: Jens Wiklander jens.wiklander@linaro.org
[ Upstream commit 1e2c3ef0496e72ba9001da5fd1b7ed56ccb30597 ]
Exports the two functions teedev_open() and teedev_close_context() in order to make it easier to create a driver internal struct tee_context.
Reviewed-by: Sumit Garg sumit.garg@linaro.org Signed-off-by: Jens Wiklander jens.wiklander@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tee/tee_core.c | 6 ++++-- include/linux/tee_drv.h | 14 ++++++++++++++ 2 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/tee/tee_core.c b/drivers/tee/tee_core.c index 85102d12d7169..3fc426dad2df3 100644 --- a/drivers/tee/tee_core.c +++ b/drivers/tee/tee_core.c @@ -43,7 +43,7 @@ static DEFINE_SPINLOCK(driver_lock); static struct class *tee_class; static dev_t tee_devt;
-static struct tee_context *teedev_open(struct tee_device *teedev) +struct tee_context *teedev_open(struct tee_device *teedev) { int rc; struct tee_context *ctx; @@ -70,6 +70,7 @@ static struct tee_context *teedev_open(struct tee_device *teedev) return ERR_PTR(rc);
} +EXPORT_SYMBOL_GPL(teedev_open);
void teedev_ctx_get(struct tee_context *ctx) { @@ -96,13 +97,14 @@ void teedev_ctx_put(struct tee_context *ctx) kref_put(&ctx->refcount, teedev_ctx_release); }
-static void teedev_close_context(struct tee_context *ctx) +void teedev_close_context(struct tee_context *ctx) { struct tee_device *teedev = ctx->teedev;
teedev_ctx_put(ctx); tee_device_put(teedev); } +EXPORT_SYMBOL_GPL(teedev_close_context);
static int tee_open(struct inode *inode, struct file *filp) { diff --git a/include/linux/tee_drv.h b/include/linux/tee_drv.h index feda1dc7f98ee..38b701b7af4cf 100644 --- a/include/linux/tee_drv.h +++ b/include/linux/tee_drv.h @@ -582,4 +582,18 @@ struct tee_client_driver { #define to_tee_client_driver(d) \ container_of(d, struct tee_client_driver, driver)
+/** + * teedev_open() - Open a struct tee_device + * @teedev: Device to open + * + * @return a pointer to struct tee_context on success or an ERR_PTR on failure. + */ +struct tee_context *teedev_open(struct tee_device *teedev); + +/** + * teedev_close_context() - closes a struct tee_context + * @ctx: The struct tee_context to close + */ +void teedev_close_context(struct tee_context *ctx); + #endif /*__TEE_DRV_H*/
From: Jens Wiklander jens.wiklander@linaro.org
commit aceeafefff736057e8f93f19bbfbef26abd94604 upstream.
Adds a driver private tee_context by moving the tee_context in struct optee_notif to struct optee. This tee_context was previously used when doing internal calls to secure world to deliver notification.
The new driver internal tee_context is now also when allocating driver private shared memory. This decouples the shared memory object from its original tee_context. This is needed when the life time of such a memory allocation outlives the client tee_context.
This patch fixes the problem described below:
The addition of a shutdown hook by commit f25889f93184 ("optee: fix tee out of memory failure seen during kexec reboot") introduced a kernel shutdown regression that can be triggered after running the OP-TEE xtest suites.
Once the shutdown hook is called it is not possible to communicate any more with the supplicant process because the system is not scheduling task any longer. Thus if the optee driver shutdown path receives a supplicant RPC request from the OP-TEE we will deadlock the kernel's shutdown.
Fixes: f25889f93184 ("optee: fix tee out of memory failure seen during kexec reboot") Fixes: 217e0250cccb ("tee: use reference counting for tee_context") Reported-by: Lars Persson larper@axis.com Cc: stable@vger.kernel.org Reviewed-by: Sumit Garg sumit.garg@linaro.org [JW: backport to 5.15-stable + update commit message] Signed-off-by: Jens Wiklander jens.wiklander@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tee/optee/core.c | 8 ++++++++ drivers/tee/optee/optee_private.h | 2 ++ drivers/tee/optee/rpc.c | 8 +++++--- 3 files changed, 15 insertions(+), 3 deletions(-)
--- a/drivers/tee/optee/core.c +++ b/drivers/tee/optee/core.c @@ -588,6 +588,7 @@ static int optee_remove(struct platform_ /* Unregister OP-TEE specific client devices on TEE bus */ optee_unregister_devices();
+ teedev_close_context(optee->ctx); /* * Ask OP-TEE to free all cached shared memory objects to decrease * reference counters and also avoid wild pointers in secure world @@ -633,6 +634,7 @@ static int optee_probe(struct platform_d struct optee *optee = NULL; void *memremaped_shm = NULL; struct tee_device *teedev; + struct tee_context *ctx; u32 sec_caps; int rc;
@@ -719,6 +721,12 @@ static int optee_probe(struct platform_d optee_supp_init(&optee->supp); optee->memremaped_shm = memremaped_shm; optee->pool = pool; + ctx = teedev_open(optee->teedev); + if (IS_ERR(ctx)) { + rc = rc = PTR_ERR(ctx); + goto err; + } + optee->ctx = ctx;
/* * Ensure that there are no pre-existing shm objects before enabling --- a/drivers/tee/optee/optee_private.h +++ b/drivers/tee/optee/optee_private.h @@ -70,6 +70,7 @@ struct optee_supp { * struct optee - main service struct * @supp_teedev: supplicant device * @teedev: client device + * @ctx: driver internal TEE context * @invoke_fn: function to issue smc or hvc * @call_queue: queue of threads waiting to call @invoke_fn * @wait_queue: queue of threads from secure world waiting for a @@ -87,6 +88,7 @@ struct optee { struct tee_device *supp_teedev; struct tee_device *teedev; optee_invoke_fn *invoke_fn; + struct tee_context *ctx; struct optee_call_queue call_queue; struct optee_wait_queue wait_queue; struct optee_supp supp; --- a/drivers/tee/optee/rpc.c +++ b/drivers/tee/optee/rpc.c @@ -285,6 +285,7 @@ static struct tee_shm *cmd_alloc_suppl(s }
static void handle_rpc_func_cmd_shm_alloc(struct tee_context *ctx, + struct optee *optee, struct optee_msg_arg *arg, struct optee_call_ctx *call_ctx) { @@ -314,7 +315,8 @@ static void handle_rpc_func_cmd_shm_allo shm = cmd_alloc_suppl(ctx, sz); break; case OPTEE_RPC_SHM_TYPE_KERNEL: - shm = tee_shm_alloc(ctx, sz, TEE_SHM_MAPPED | TEE_SHM_PRIV); + shm = tee_shm_alloc(optee->ctx, sz, + TEE_SHM_MAPPED | TEE_SHM_PRIV); break; default: arg->ret = TEEC_ERROR_BAD_PARAMETERS; @@ -471,7 +473,7 @@ static void handle_rpc_func_cmd(struct t break; case OPTEE_RPC_CMD_SHM_ALLOC: free_pages_list(call_ctx); - handle_rpc_func_cmd_shm_alloc(ctx, arg, call_ctx); + handle_rpc_func_cmd_shm_alloc(ctx, optee, arg, call_ctx); break; case OPTEE_RPC_CMD_SHM_FREE: handle_rpc_func_cmd_shm_free(ctx, arg); @@ -502,7 +504,7 @@ void optee_handle_rpc(struct tee_context
switch (OPTEE_SMC_RETURN_GET_RPC_FUNC(param->a0)) { case OPTEE_SMC_RPC_FUNC_ALLOC: - shm = tee_shm_alloc(ctx, param->a1, + shm = tee_shm_alloc(optee->ctx, param->a1, TEE_SHM_MAPPED | TEE_SHM_PRIV); if (!IS_ERR(shm) && !tee_shm_get_pa(shm, 0, &pa)) { reg_pair_from_64(¶m->a1, ¶m->a2, pa);
From: Robin Murphy robin.murphy@arm.com
commit 59f39bfa6553d598cb22f694d45e89547f420d85 upstream.
drm_gem_cma_mmap() cannot assume every implementation of dma_mmap_wc() will end up calling remap_pfn_range() (which happens to set the relevant vma flag, among others), so in order to make sure expectations around VM_DONTEXPAND are met, let it explicitly set the flag like most other GEM mmap implementations do.
This avoids repeated warnings on a small minority of systems where the display is behind an IOMMU, and has a simple driver which does not override drm_gem_cma_default_funcs. Arm hdlcd is an in-tree affected driver. Out-of-tree, the Apple DCP driver is affected; this fix is required for DCP to be mainlined.
[Alyssa: Update commit message.]
Fixes: c40069cb7bd6 ("drm: add mmap() to drm_gem_object_funcs") Acked-by: Daniel Vetter daniel@ffwll.ch Signed-off-by: Robin Murphy robin.murphy@arm.com Signed-off-by: Alyssa Rosenzweig alyssa.rosenzweig@collabora.com Link: https://patchwork.freedesktop.org/patch/msgid/20211013143654.39031-1-alyssa@... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_gem_cma_helper.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/drm_gem_cma_helper.c +++ b/drivers/gpu/drm/drm_gem_cma_helper.c @@ -515,6 +515,7 @@ int drm_gem_cma_mmap(struct drm_gem_obje */ vma->vm_pgoff -= drm_vma_node_start(&obj->vma_node); vma->vm_flags &= ~VM_PFNMAP; + vma->vm_flags |= VM_DONTEXPAND;
cma_obj = to_drm_gem_cma_obj(obj);
From: Siva Mullati siva.mullati@intel.com
commit d72d69abfdb6e0375981cfdda8eb45143f12c77d upstream.
GVT is not supported on non-x86 platforms, So add dependency of X86 on config parameter DRM_I915_GVT.
Fixes: 0ad35fed618c ("drm/i915: gvt: Introduce the basic architecture of GVT-g") Signed-off-by: Siva Mullati siva.mullati@intel.com Signed-off-by: Zhi Wang zhi.a.wang@intel.com Link: http://patchwork.freedesktop.org/patch/msgid/20220107095235.243448-1-siva.mu... Reviewed-by: Zhi Wang zhi.a.wang@intel.com Signed-off-by: Zhi Wang zhi.a.wang@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/Kconfig | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -101,6 +101,7 @@ config DRM_I915_USERPTR config DRM_I915_GVT bool "Enable Intel GVT-g graphics virtualization host support" depends on DRM_I915 + depends on X86 depends on 64BIT default n help
From: Matthew Auld matthew.auld@intel.com
commit 0bdc0a0699929c814a8aecd55d2accb8c11beae2 upstream.
For some reason we are selecting PRIO_HAS_PAGES when we don't have mm.pages, and vice versa.
v2(Thomas): - Add missing fixes tag
Fixes: 213d50927763 ("drm/i915/ttm: Introduce a TTM i915 gem object backend") Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Reviewed-by: Thomas Hellström thomas.hellstrom@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20220209111652.468762-1-matthe... (cherry picked from commit ba2c5d15022a565da187d90e2fe44768e33e5034) Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -759,11 +759,9 @@ static void i915_ttm_adjust_lru(struct d if (obj->mm.madv != I915_MADV_WILLNEED) { bo->priority = I915_TTM_PRIO_PURGE; } else if (!i915_gem_object_has_pages(obj)) { - if (bo->priority < I915_TTM_PRIO_HAS_PAGES) - bo->priority = I915_TTM_PRIO_HAS_PAGES; + bo->priority = I915_TTM_PRIO_NO_PAGES; } else { - if (bo->priority > I915_TTM_PRIO_NO_PAGES) - bo->priority = I915_TTM_PRIO_NO_PAGES; + bo->priority = I915_TTM_PRIO_HAS_PAGES; }
ttm_bo_move_to_lru_tail(bo, bo->resource, NULL);
From: Johannes Berg johannes.berg@intel.com
commit e9848aed147708a06193b40d78493b0ef6abccf2 upstream.
If we run into this error path, we shouldn't unlock the mutex since it's not locked since. Fix this.
Fixes: a6bd005fe92d ("iwlwifi: pcie: fix RF-Kill vs. firmware load race") Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/iwlwifi.20220128142706.5d16821d1433.Id259699ddf980... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c @@ -1273,8 +1273,7 @@ static int iwl_trans_pcie_start_fw(struc /* This may fail if AMT took ownership of the device */ if (iwl_pcie_prepare_card_hw(trans)) { IWL_WARN(trans, "Exit HW not ready\n"); - ret = -EIO; - goto out; + return -EIO; }
iwl_enable_rfkill_int(trans);
From: Johannes Berg johannes.berg@intel.com
commit 4c29c1e27a1e178a219b3877d055e6dd643bdfda upstream.
If we run into this error path, we shouldn't unlock the mutex since it's not locked since. Fix this in the gen2 code as well.
Fixes: eda50cde58de ("iwlwifi: pcie: add context information support") Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/iwlwifi.20220128142706.b8b0dfce16ef.Ie20f0f7b23e59... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c @@ -408,8 +408,7 @@ int iwl_trans_pcie_gen2_start_fw(struct /* This may fail if AMT took ownership of the device */ if (iwl_pcie_prepare_card_hw(trans)) { IWL_WARN(trans, "Exit HW not ready\n"); - ret = -EIO; - goto out; + return -EIO; }
iwl_enable_rfkill_int(trans);
From: Luca Coelho luciano.coelho@intel.com
commit 5f06f6bf8d816578c390a2b8a485d40adcca4749 upstream.
SAR GEO offsets are not supported on 3160 devices. The code was refactored and caused us to start sending the command anyway, which causes a FW assertion failure. Fix that only considering this feature supported on FW API with major version is 17 if the device is not 3160.
Additionally, fix the caller of iwl_mvm_sar_geo_init() so that it checks for the return value, which it was ignoring.
Reported-by: Len Brown lenb@kernel.org Signed-off-by: Luca Coelho luciano.coelho@intel.com Fixes: 78a19d5285d9 ("iwlwifi: mvm: Read the PPAG and SAR tables at INIT stage") Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/iwlwifi.20220128144623.96f683a89b42.I14e2985bfd7dd... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/intel/iwlwifi/fw/acpi.c | 11 ++++++----- drivers/net/wireless/intel/iwlwifi/iwl-csr.h | 3 ++- drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 2 +- 3 files changed, 9 insertions(+), 7 deletions(-)
--- a/drivers/net/wireless/intel/iwlwifi/fw/acpi.c +++ b/drivers/net/wireless/intel/iwlwifi/fw/acpi.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause /* * Copyright (C) 2017 Intel Deutschland GmbH - * Copyright (C) 2019-2021 Intel Corporation + * Copyright (C) 2019-2022 Intel Corporation */ #include <linux/uuid.h> #include "iwl-drv.h" @@ -814,10 +814,11 @@ bool iwl_sar_geo_support(struct iwl_fw_r * only one using version 36, so skip this version entirely. */ return IWL_UCODE_SERIAL(fwrt->fw->ucode_ver) >= 38 || - IWL_UCODE_SERIAL(fwrt->fw->ucode_ver) == 17 || - (IWL_UCODE_SERIAL(fwrt->fw->ucode_ver) == 29 && - ((fwrt->trans->hw_rev & CSR_HW_REV_TYPE_MSK) == - CSR_HW_REV_TYPE_7265D)); + (IWL_UCODE_SERIAL(fwrt->fw->ucode_ver) == 17 && + fwrt->trans->hw_rev != CSR_HW_REV_TYPE_3160) || + (IWL_UCODE_SERIAL(fwrt->fw->ucode_ver) == 29 && + ((fwrt->trans->hw_rev & CSR_HW_REV_TYPE_MSK) == + CSR_HW_REV_TYPE_7265D)); } IWL_EXPORT_SYMBOL(iwl_sar_geo_support);
--- a/drivers/net/wireless/intel/iwlwifi/iwl-csr.h +++ b/drivers/net/wireless/intel/iwlwifi/iwl-csr.h @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */ /* - * Copyright (C) 2005-2014, 2018-2021 Intel Corporation + * Copyright (C) 2005-2014, 2018-2022 Intel Corporation * Copyright (C) 2013-2014 Intel Mobile Communications GmbH * Copyright (C) 2016 Intel Deutschland GmbH */ @@ -319,6 +319,7 @@ enum { #define CSR_HW_REV_TYPE_2x00 (0x0000100) #define CSR_HW_REV_TYPE_105 (0x0000110) #define CSR_HW_REV_TYPE_135 (0x0000120) +#define CSR_HW_REV_TYPE_3160 (0x0000164) #define CSR_HW_REV_TYPE_7265D (0x0000210) #define CSR_HW_REV_TYPE_NONE (0x00001F0) #define CSR_HW_REV_TYPE_QNJ (0x0000360) --- a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c @@ -1572,7 +1572,7 @@ int iwl_mvm_up(struct iwl_mvm *mvm) ret = iwl_mvm_sar_init(mvm); if (ret == 0) ret = iwl_mvm_sar_geo_init(mvm); - else if (ret < 0) + if (ret < 0) goto error;
iwl_mvm_tas_init(mvm);
From: Eric Dumazet edumazet@google.com
commit 75063c9294fb239bbe64eb72141b6871fe526d29 upstream.
Calling nf_defrag_ipv4_disable() instead of nf_defrag_ipv6_disable() was probably not the intent.
I found this by code inspection, while chasing a possible issue in TPROXY.
Fixes: de8c12110a13 ("netfilter: disable defrag once its no longer needed") Signed-off-by: Eric Dumazet edumazet@google.com Reviewed-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/xt_socket.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netfilter/xt_socket.c +++ b/net/netfilter/xt_socket.c @@ -221,7 +221,7 @@ static void socket_mt_destroy(const stru if (par->family == NFPROTO_IPV4) nf_defrag_ipv4_disable(par->net); else if (par->family == NFPROTO_IPV6) - nf_defrag_ipv4_disable(par->net); + nf_defrag_ipv6_disable(par->net); }
static struct xt_match socket_mt_reg[] __read_mostly = {
From: Hangbin Liu liuhangbin@gmail.com
commit 2e71ec1a725a794a16e3862791ed43fe5ba6a06b upstream.
When the nft_concat_range test failed, it exit 1 in the code specifically.
But when part of, or all of the test passed, it will failed the [ ${passed} -eq 0 ] check and thus exit with 1, which is the same exit value with failure result. Fix it by exit 0 when passed is not 0.
Fixes: 611973c1e06f ("selftests: netfilter: Introduce tests for sets with range concatenation") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Reviewed-by: Stefano Brivio sbrivio@redhat.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/netfilter/nft_concat_range.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/netfilter/nft_concat_range.sh +++ b/tools/testing/selftests/netfilter/nft_concat_range.sh @@ -1583,4 +1583,4 @@ for name in ${TESTS}; do done done
-[ ${passed} -eq 0 ] && exit ${KSELFTEST_SKIP} +[ ${passed} -eq 0 ] && exit ${KSELFTEST_SKIP} || exit 0
From: Pablo Neira Ayuso pablo@netfilter.org
commit 2b4e5fb4d3776c391e40fb33673ba946dd96012d upstream.
Disable the IPv4 hooks if the IPv6 hooks fail to be registered.
Fixes: ad49d86e07a4 ("netfilter: nf_tables: Add synproxy support") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nft_synproxy.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/netfilter/nft_synproxy.c +++ b/net/netfilter/nft_synproxy.c @@ -191,8 +191,10 @@ static int nft_synproxy_do_init(const st if (err) goto nf_ct_failure; err = nf_synproxy_ipv6_init(snet, ctx->net); - if (err) + if (err) { + nf_synproxy_ipv4_fini(snet, ctx->net); goto nf_ct_failure; + } break; }
From: Hangbin Liu liuhangbin@gmail.com
commit bbe4c0896d25009a7c86285d2ab024eed4374eea upstream.
Some distros may enable rp_filter by default. After ns1 change addr to 10.0.2.99 and set default router to 10.0.2.1, while the connected router address is still 10.0.1.1. The router will not reply the arp request from ns1. Fix it by setting the router's veth0 rp_filter to 0.
Before the fix: # ./nft_fib.sh PASS: fib expression did not cause unwanted packet drops Netns nsrouter-HQkDORO2 fib counter doesn't match expected packet count of 1 for 1.1.1.1 table inet filter { chain prerouting { type filter hook prerouting priority filter; policy accept; ip daddr 1.1.1.1 fib saddr . iif oif missing counter packets 0 bytes 0 drop ip6 daddr 1c3::c01d fib saddr . iif oif missing counter packets 0 bytes 0 drop } }
After the fix: # ./nft_fib.sh PASS: fib expression did not cause unwanted packet drops PASS: fib expression did drop packets for 1.1.1.1 PASS: fib expression did drop packets for 1c3::c01d
Fixes: 82944421243e ("selftests: netfilter: add fib test case") Signed-off-by: Yi Chen yiche@redhat.com Signed-off-by: Hangbin Liu liuhangbin@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/netfilter/nft_fib.sh | 1 + 1 file changed, 1 insertion(+)
--- a/tools/testing/selftests/netfilter/nft_fib.sh +++ b/tools/testing/selftests/netfilter/nft_fib.sh @@ -174,6 +174,7 @@ test_ping() { ip netns exec ${nsrouter} sysctl net.ipv6.conf.all.forwarding=1 > /dev/null ip netns exec ${nsrouter} sysctl net.ipv4.conf.veth0.forwarding=1 > /dev/null ip netns exec ${nsrouter} sysctl net.ipv4.conf.veth1.forwarding=1 > /dev/null +ip netns exec ${nsrouter} sysctl net.ipv4.conf.veth0.rp_filter=0 > /dev/null
sleep 3
From: Eric Dumazet edumazet@google.com
commit 9fcf986cc4bc6a3a39f23fbcbbc3a9e52d3c24fd upstream.
fib_alias_hw_flags_set() can be used by concurrent threads, and is only RCU protected.
We need to annotate accesses to following fields of struct fib_alias:
offload, trap, offload_failed
Because of READ_ONCE()WRITE_ONCE() limitations, make these field u8.
BUG: KCSAN: data-race in fib_alias_hw_flags_set / fib_alias_hw_flags_set
read to 0xffff888134224a6a of 1 bytes by task 2013 on cpu 1: fib_alias_hw_flags_set+0x28a/0x470 net/ipv4/fib_trie.c:1050 nsim_fib4_rt_hw_flags_set drivers/net/netdevsim/fib.c:350 [inline] nsim_fib4_rt_add drivers/net/netdevsim/fib.c:367 [inline] nsim_fib4_rt_insert drivers/net/netdevsim/fib.c:429 [inline] nsim_fib4_event drivers/net/netdevsim/fib.c:461 [inline] nsim_fib_event drivers/net/netdevsim/fib.c:881 [inline] nsim_fib_event_work+0x1852/0x2cf0 drivers/net/netdevsim/fib.c:1477 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307 process_scheduled_works kernel/workqueue.c:2370 [inline] worker_thread+0x7df/0xa70 kernel/workqueue.c:2456 kthread+0x1bf/0x1e0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30
write to 0xffff888134224a6a of 1 bytes by task 4872 on cpu 0: fib_alias_hw_flags_set+0x2d5/0x470 net/ipv4/fib_trie.c:1054 nsim_fib4_rt_hw_flags_set drivers/net/netdevsim/fib.c:350 [inline] nsim_fib4_rt_add drivers/net/netdevsim/fib.c:367 [inline] nsim_fib4_rt_insert drivers/net/netdevsim/fib.c:429 [inline] nsim_fib4_event drivers/net/netdevsim/fib.c:461 [inline] nsim_fib_event drivers/net/netdevsim/fib.c:881 [inline] nsim_fib_event_work+0x1852/0x2cf0 drivers/net/netdevsim/fib.c:1477 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307 process_scheduled_works kernel/workqueue.c:2370 [inline] worker_thread+0x7df/0xa70 kernel/workqueue.c:2456 kthread+0x1bf/0x1e0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30
value changed: 0x00 -> 0x02
Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 4872 Comm: kworker/0:0 Not tainted 5.17.0-rc3-syzkaller-00188-g1d41d2e82623-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events nsim_fib_event_work
Fixes: 90b93f1b31f8 ("ipv4: Add "offload" and "trap" indications to routes") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Reviewed-by: Ido Schimmel idosch@nvidia.com Link: https://lore.kernel.org/r/20220216173217.3792411-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/fib_lookup.h | 7 +++---- net/ipv4/fib_semantics.c | 6 +++--- net/ipv4/fib_trie.c | 22 +++++++++++++--------- net/ipv4/route.c | 4 ++-- 4 files changed, 21 insertions(+), 18 deletions(-)
--- a/net/ipv4/fib_lookup.h +++ b/net/ipv4/fib_lookup.h @@ -16,10 +16,9 @@ struct fib_alias { u8 fa_slen; u32 tb_id; s16 fa_default; - u8 offload:1, - trap:1, - offload_failed:1, - unused:5; + u8 offload; + u8 trap; + u8 offload_failed; struct rcu_head rcu; };
--- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -524,9 +524,9 @@ void rtmsg_fib(int event, __be32 key, st fri.dst_len = dst_len; fri.tos = fa->fa_tos; fri.type = fa->fa_type; - fri.offload = fa->offload; - fri.trap = fa->trap; - fri.offload_failed = fa->offload_failed; + fri.offload = READ_ONCE(fa->offload); + fri.trap = READ_ONCE(fa->trap); + fri.offload_failed = READ_ONCE(fa->offload_failed); err = fib_dump_info(skb, info->portid, seq, event, &fri, nlm_flags); if (err < 0) { /* -EMSGSIZE implies BUG in fib_nlmsg_size() */ --- a/net/ipv4/fib_trie.c +++ b/net/ipv4/fib_trie.c @@ -1047,19 +1047,23 @@ void fib_alias_hw_flags_set(struct net * if (!fa_match) goto out;
- if (fa_match->offload == fri->offload && fa_match->trap == fri->trap && - fa_match->offload_failed == fri->offload_failed) + /* These are paired with the WRITE_ONCE() happening in this function. + * The reason is that we are only protected by RCU at this point. + */ + if (READ_ONCE(fa_match->offload) == fri->offload && + READ_ONCE(fa_match->trap) == fri->trap && + READ_ONCE(fa_match->offload_failed) == fri->offload_failed) goto out;
- fa_match->offload = fri->offload; - fa_match->trap = fri->trap; + WRITE_ONCE(fa_match->offload, fri->offload); + WRITE_ONCE(fa_match->trap, fri->trap);
/* 2 means send notifications only if offload_failed was changed. */ if (net->ipv4.sysctl_fib_notify_on_flag_change == 2 && - fa_match->offload_failed == fri->offload_failed) + READ_ONCE(fa_match->offload_failed) == fri->offload_failed) goto out;
- fa_match->offload_failed = fri->offload_failed; + WRITE_ONCE(fa_match->offload_failed, fri->offload_failed);
if (!net->ipv4.sysctl_fib_notify_on_flag_change) goto out; @@ -2297,9 +2301,9 @@ static int fn_trie_dump_leaf(struct key_ fri.dst_len = KEYLENGTH - fa->fa_slen; fri.tos = fa->fa_tos; fri.type = fa->fa_type; - fri.offload = fa->offload; - fri.trap = fa->trap; - fri.offload_failed = fa->offload_failed; + fri.offload = READ_ONCE(fa->offload); + fri.trap = READ_ONCE(fa->trap); + fri.offload_failed = READ_ONCE(fa->offload_failed); err = fib_dump_info(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -3401,8 +3401,8 @@ static int inet_rtm_getroute(struct sk_b fa->fa_tos == fri.tos && fa->fa_info == res.fi && fa->fa_type == fri.type) { - fri.offload = fa->offload; - fri.trap = fa->trap; + fri.offload = READ_ONCE(fa->offload); + fri.trap = READ_ONCE(fa->trap); break; } }
From: Eric Dumazet edumazet@google.com
commit d95d6320ba7a51d61c097ffc3bcafcf70283414e upstream.
Because fib6_info_hw_flags_set() is called without any synchronization, all accesses to gi6->offload, fi->trap and fi->offload_failed need some basic protection like READ_ONCE()/WRITE_ONCE().
BUG: KCSAN: data-race in fib6_info_hw_flags_set / fib6_purge_rt
read to 0xffff8881087d5886 of 1 bytes by task 13953 on cpu 0: fib6_drop_pcpu_from net/ipv6/ip6_fib.c:1007 [inline] fib6_purge_rt+0x4f/0x580 net/ipv6/ip6_fib.c:1033 fib6_del_route net/ipv6/ip6_fib.c:1983 [inline] fib6_del+0x696/0x890 net/ipv6/ip6_fib.c:2028 __ip6_del_rt net/ipv6/route.c:3876 [inline] ip6_del_rt+0x83/0x140 net/ipv6/route.c:3891 __ipv6_dev_ac_dec+0x2b5/0x370 net/ipv6/anycast.c:374 ipv6_dev_ac_dec net/ipv6/anycast.c:387 [inline] __ipv6_sock_ac_close+0x141/0x200 net/ipv6/anycast.c:207 ipv6_sock_ac_close+0x79/0x90 net/ipv6/anycast.c:220 inet6_release+0x32/0x50 net/ipv6/af_inet6.c:476 __sock_release net/socket.c:650 [inline] sock_close+0x6c/0x150 net/socket.c:1318 __fput+0x295/0x520 fs/file_table.c:280 ____fput+0x11/0x20 fs/file_table.c:313 task_work_run+0x8e/0x110 kernel/task_work.c:164 tracehook_notify_resume include/linux/tracehook.h:189 [inline] exit_to_user_mode_loop kernel/entry/common.c:175 [inline] exit_to_user_mode_prepare+0x160/0x190 kernel/entry/common.c:207 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline] syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:300 do_syscall_64+0x50/0xd0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x44/0xae
write to 0xffff8881087d5886 of 1 bytes by task 1912 on cpu 1: fib6_info_hw_flags_set+0x155/0x3b0 net/ipv6/route.c:6230 nsim_fib6_rt_hw_flags_set drivers/net/netdevsim/fib.c:668 [inline] nsim_fib6_rt_add drivers/net/netdevsim/fib.c:691 [inline] nsim_fib6_rt_insert drivers/net/netdevsim/fib.c:756 [inline] nsim_fib6_event drivers/net/netdevsim/fib.c:853 [inline] nsim_fib_event drivers/net/netdevsim/fib.c:886 [inline] nsim_fib_event_work+0x284f/0x2cf0 drivers/net/netdevsim/fib.c:1477 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307 worker_thread+0x616/0xa70 kernel/workqueue.c:2454 kthread+0x2c7/0x2e0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30
value changed: 0x22 -> 0x2a
Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 1912 Comm: kworker/1:3 Not tainted 5.16.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events nsim_fib_event_work
Fixes: 0c5fcf9e249e ("IPv6: Add "offload failed" indication to routes") Fixes: bb3c4ab93e44 ("ipv6: Add "offload" and "trap" indications to routes") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Amit Cohen amcohen@nvidia.com Cc: Ido Schimmel idosch@nvidia.com Reported-by: syzbot syzkaller@googlegroups.com Link: https://lore.kernel.org/r/20220216173217.3792411-2-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/netdevsim/fib.c | 4 ++-- include/net/ip6_fib.h | 10 ++++++---- net/ipv6/route.c | 19 ++++++++++--------- 3 files changed, 18 insertions(+), 15 deletions(-)
--- a/drivers/net/netdevsim/fib.c +++ b/drivers/net/netdevsim/fib.c @@ -623,14 +623,14 @@ static int nsim_fib6_rt_append(struct ns if (err) goto err_fib6_rt_nh_del;
- fib6_event->rt_arr[i]->trap = true; + WRITE_ONCE(fib6_event->rt_arr[i]->trap, true); }
return 0;
err_fib6_rt_nh_del: for (i--; i >= 0; i--) { - fib6_event->rt_arr[i]->trap = false; + WRITE_ONCE(fib6_event->rt_arr[i]->trap, false); nsim_fib6_rt_nh_del(fib6_rt, fib6_event->rt_arr[i]); } return err; --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -189,14 +189,16 @@ struct fib6_info { u32 fib6_metric; u8 fib6_protocol; u8 fib6_type; + + u8 offload; + u8 trap; + u8 offload_failed; + u8 should_flush:1, dst_nocount:1, dst_nopolicy:1, fib6_destroying:1, - offload:1, - trap:1, - offload_failed:1, - unused:1; + unused:4;
struct rcu_head rcu; struct nexthop *nh; --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -5767,11 +5767,11 @@ static int rt6_fill_node(struct net *net }
if (!dst) { - if (rt->offload) + if (READ_ONCE(rt->offload)) rtm->rtm_flags |= RTM_F_OFFLOAD; - if (rt->trap) + if (READ_ONCE(rt->trap)) rtm->rtm_flags |= RTM_F_TRAP; - if (rt->offload_failed) + if (READ_ONCE(rt->offload_failed)) rtm->rtm_flags |= RTM_F_OFFLOAD_FAILED; }
@@ -6229,19 +6229,20 @@ void fib6_info_hw_flags_set(struct net * struct sk_buff *skb; int err;
- if (f6i->offload == offload && f6i->trap == trap && - f6i->offload_failed == offload_failed) + if (READ_ONCE(f6i->offload) == offload && + READ_ONCE(f6i->trap) == trap && + READ_ONCE(f6i->offload_failed) == offload_failed) return;
- f6i->offload = offload; - f6i->trap = trap; + WRITE_ONCE(f6i->offload, offload); + WRITE_ONCE(f6i->trap, trap);
/* 2 means send notifications only if offload_failed was changed. */ if (net->ipv6.sysctl.fib_notify_on_flag_change == 2 && - f6i->offload_failed == offload_failed) + READ_ONCE(f6i->offload_failed) == offload_failed) return;
- f6i->offload_failed = offload_failed; + WRITE_ONCE(f6i->offload_failed, offload_failed);
if (!rcu_access_pointer(f6i->fib6_node)) /* The route was removed from the tree, do not send
From: Ignat Korchagin ignat@cloudflare.com
commit 26394fc118d6115390bd5b3a0fb17096271da227 upstream.
Some time ago 8965779d2c0e ("ipv6,mcast: always hold idev->lock before mca_lock") switched ipv6_get_lladdr() to __ipv6_get_lladdr(), which is rcu-unsafe version. That was OK, because idev->lock was held for these codepaths.
In 88e2ca308094 ("mld: convert ifmcaddr6 to RCU") these external locks were removed, so we probably need to restore the original rcu-safe call.
Otherwise, we occasionally get a machine crashed/stalled with the following in dmesg:
[ 3405.966610][T230589] general protection fault, probably for non-canonical address 0xdead00000000008c: 0000 [#1] SMP NOPTI [ 3405.982083][T230589] CPU: 44 PID: 230589 Comm: kworker/44:3 Tainted: G O 5.15.19-cloudflare-2022.2.1 #1 [ 3405.998061][T230589] Hardware name: SUPA-COOL-SERV [ 3406.009552][T230589] Workqueue: mld mld_ifc_work [ 3406.017224][T230589] RIP: 0010:__ipv6_get_lladdr+0x34/0x60 [ 3406.025780][T230589] Code: 57 10 48 83 c7 08 48 89 e5 48 39 d7 74 3e 48 8d 82 38 ff ff ff eb 13 48 8b 90 d0 00 00 00 48 8d 82 38 ff ff ff 48 39 d7 74 22 <66> 83 78 32 20 77 1b 75 e4 89 ca 23 50 2c 75 dd 48 8b 50 08 48 8b [ 3406.055748][T230589] RSP: 0018:ffff94e4b3fc3d10 EFLAGS: 00010202 [ 3406.065617][T230589] RAX: dead00000000005a RBX: ffff94e4b3fc3d30 RCX: 0000000000000040 [ 3406.077477][T230589] RDX: dead000000000122 RSI: ffff94e4b3fc3d30 RDI: ffff8c3a31431008 [ 3406.089389][T230589] RBP: ffff94e4b3fc3d10 R08: 0000000000000000 R09: 0000000000000000 [ 3406.101445][T230589] R10: ffff8c3a31430000 R11: 000000000000000b R12: ffff8c2c37887100 [ 3406.113553][T230589] R13: ffff8c3a39537000 R14: 00000000000005dc R15: ffff8c3a31431000 [ 3406.125730][T230589] FS: 0000000000000000(0000) GS:ffff8c3b9fc80000(0000) knlGS:0000000000000000 [ 3406.138992][T230589] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3406.149895][T230589] CR2: 00007f0dfea1db60 CR3: 000000387b5f2000 CR4: 0000000000350ee0 [ 3406.162421][T230589] Call Trace: [ 3406.170235][T230589] <TASK> [ 3406.177736][T230589] mld_newpack+0xfe/0x1a0 [ 3406.186686][T230589] add_grhead+0x87/0xa0 [ 3406.195498][T230589] add_grec+0x485/0x4e0 [ 3406.204310][T230589] ? newidle_balance+0x126/0x3f0 [ 3406.214024][T230589] mld_ifc_work+0x15d/0x450 [ 3406.223279][T230589] process_one_work+0x1e6/0x380 [ 3406.232982][T230589] worker_thread+0x50/0x3a0 [ 3406.242371][T230589] ? rescuer_thread+0x360/0x360 [ 3406.252175][T230589] kthread+0x127/0x150 [ 3406.261197][T230589] ? set_kthread_struct+0x40/0x40 [ 3406.271287][T230589] ret_from_fork+0x22/0x30 [ 3406.280812][T230589] </TASK> [ 3406.288937][T230589] Modules linked in: ... [last unloaded: kheaders] [ 3406.476714][T230589] ---[ end trace 3525a7655f2f3b9e ]---
Fixes: 88e2ca308094 ("mld: convert ifmcaddr6 to RCU") Reported-by: David Pinilla Caparros dpini@cloudflare.com Signed-off-by: Ignat Korchagin ignat@cloudflare.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/addrconf.h | 2 -- net/ipv6/addrconf.c | 4 ++-- net/ipv6/mcast.c | 2 +- 3 files changed, 3 insertions(+), 5 deletions(-)
--- a/include/net/addrconf.h +++ b/include/net/addrconf.h @@ -109,8 +109,6 @@ struct inet6_ifaddr *ipv6_get_ifaddr(str int ipv6_dev_get_saddr(struct net *net, const struct net_device *dev, const struct in6_addr *daddr, unsigned int srcprefs, struct in6_addr *saddr); -int __ipv6_get_lladdr(struct inet6_dev *idev, struct in6_addr *addr, - u32 banned_flags); int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *addr, u32 banned_flags); bool inet_rcv_saddr_equal(const struct sock *sk, const struct sock *sk2, --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1837,8 +1837,8 @@ out: } EXPORT_SYMBOL(ipv6_dev_get_saddr);
-int __ipv6_get_lladdr(struct inet6_dev *idev, struct in6_addr *addr, - u32 banned_flags) +static int __ipv6_get_lladdr(struct inet6_dev *idev, struct in6_addr *addr, + u32 banned_flags) { struct inet6_ifaddr *ifp; int err = -EADDRNOTAVAIL; --- a/net/ipv6/mcast.c +++ b/net/ipv6/mcast.c @@ -1759,7 +1759,7 @@ static struct sk_buff *mld_newpack(struc skb_reserve(skb, hlen); skb_tailroom_reserve(skb, mtu, tlen);
- if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) { + if (ipv6_get_lladdr(dev, &addr_buf, IFA_F_TENTATIVE)) { /* <draft-ietf-magma-mld-source-05.txt>: * use unspecified address as the source address * when a valid link-local address is not available.
From: Willem de Bruijn willemb@google.com
commit 0b0dff5b3b98c5c7ce848151df9da0b3cdf0cc8b upstream.
Ipv6 flowlabels historically require a reservation before use. Optionally in exclusive mode (e.g., user-private).
Commit 59c820b2317f ("ipv6: elide flowlabel check if no exclusive leases exist") introduced a fastpath that avoids this check when no exclusive leases exist in the system, and thus any flowlabel use will be granted.
That allows skipping the control operation to reserve a flowlabel entirely. Though with a warning if the fast path fails:
This is an optimization. Robust applications still have to revert to requesting leases if the fast path fails due to an exclusive lease.
Still, this is subtle. Better isolate network namespaces from each other. Flowlabels are per-netns. Also record per-netns whether exclusive leases are in use. Then behavior does not change based on activity in other netns.
Changes v2 - wrap in IS_ENABLED(CONFIG_IPV6) to avoid breakage if disabled
Fixes: 59c820b2317f ("ipv6: elide flowlabel check if no exclusive leases exist") Link: https://lore.kernel.org/netdev/MWHPR2201MB1072BCCCFCE779E4094837ACD0329@MWHP... Reported-by: Congyu Liu liu3101@purdue.edu Signed-off-by: Willem de Bruijn willemb@google.com Tested-by: Congyu Liu liu3101@purdue.edu Link: https://lore.kernel.org/r/20220215160037.1976072-1-willemdebruijn.kernel@gma... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/ipv6.h | 5 ++++- include/net/netns/ipv6.h | 3 ++- net/ipv6/ip6_flowlabel.c | 4 +++- 3 files changed, 9 insertions(+), 3 deletions(-)
--- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -391,17 +391,20 @@ static inline void txopt_put(struct ipv6 kfree_rcu(opt, rcu); }
+#if IS_ENABLED(CONFIG_IPV6) struct ip6_flowlabel *__fl6_sock_lookup(struct sock *sk, __be32 label);
extern struct static_key_false_deferred ipv6_flowlabel_exclusive; static inline struct ip6_flowlabel *fl6_sock_lookup(struct sock *sk, __be32 label) { - if (static_branch_unlikely(&ipv6_flowlabel_exclusive.key)) + if (static_branch_unlikely(&ipv6_flowlabel_exclusive.key) && + READ_ONCE(sock_net(sk)->ipv6.flowlabel_has_excl)) return __fl6_sock_lookup(sk, label) ? : ERR_PTR(-ENOENT);
return NULL; } +#endif
struct ipv6_txoptions *fl6_merge_options(struct ipv6_txoptions *opt_space, struct ip6_flowlabel *fl, --- a/include/net/netns/ipv6.h +++ b/include/net/netns/ipv6.h @@ -77,9 +77,10 @@ struct netns_ipv6 { spinlock_t fib6_gc_lock; unsigned int ip6_rt_gc_expire; unsigned long ip6_rt_last_gc; + unsigned char flowlabel_has_excl; #ifdef CONFIG_IPV6_MULTIPLE_TABLES - unsigned int fib6_rules_require_fldissect; bool fib6_has_custom_rules; + unsigned int fib6_rules_require_fldissect; #ifdef CONFIG_IPV6_SUBTREES unsigned int fib6_routes_require_src; #endif --- a/net/ipv6/ip6_flowlabel.c +++ b/net/ipv6/ip6_flowlabel.c @@ -450,8 +450,10 @@ fl_create(struct net *net, struct sock * err = -EINVAL; goto done; } - if (fl_shared_exclusive(fl) || fl->opt) + if (fl_shared_exclusive(fl) || fl->opt) { + WRITE_ONCE(sock_net(sk)->ipv6.flowlabel_has_excl, 1); static_branch_deferred_inc(&ipv6_flowlabel_exclusive); + } return fl;
done:
From: Jonas Gorski jonas.gorski@gmail.com
commit 6aba04ee3263669b335458c4cf4c7d97d6940229 upstream.
This reverts commit 3710e80952cf2dc48257ac9f145b117b5f74e0a5.
Since idm_base and nicpm_base are still optional resources not present on all platforms, this breaks the driver for everything except Northstar 2 (which has both).
The same change was already reverted once with 755f5738ff98 ("net: broadcom: fix a mistake about ioremap resource").
So let's do it again.
Fixes: 3710e80952cf ("net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com [florian: Added comments to explain the resources are optional] Signed-off-by: Florian Fainelli f.fainelli@gmail.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20220216184634.2032460-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bgmac-platform.c | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-)
--- a/drivers/net/ethernet/broadcom/bgmac-platform.c +++ b/drivers/net/ethernet/broadcom/bgmac-platform.c @@ -172,6 +172,7 @@ static int bgmac_probe(struct platform_d { struct device_node *np = pdev->dev.of_node; struct bgmac *bgmac; + struct resource *regs; int ret;
bgmac = bgmac_alloc(&pdev->dev); @@ -208,15 +209,23 @@ static int bgmac_probe(struct platform_d if (IS_ERR(bgmac->plat.base)) return PTR_ERR(bgmac->plat.base);
- bgmac->plat.idm_base = devm_platform_ioremap_resource_byname(pdev, "idm_base"); - if (IS_ERR(bgmac->plat.idm_base)) - return PTR_ERR(bgmac->plat.idm_base); - else + /* The idm_base resource is optional for some platforms */ + regs = platform_get_resource_byname(pdev, IORESOURCE_MEM, "idm_base"); + if (regs) { + bgmac->plat.idm_base = devm_ioremap_resource(&pdev->dev, regs); + if (IS_ERR(bgmac->plat.idm_base)) + return PTR_ERR(bgmac->plat.idm_base); bgmac->feature_flags &= ~BGMAC_FEAT_IDM_MASK; + }
- bgmac->plat.nicpm_base = devm_platform_ioremap_resource_byname(pdev, "nicpm_base"); - if (IS_ERR(bgmac->plat.nicpm_base)) - return PTR_ERR(bgmac->plat.nicpm_base); + /* The nicpm_base resource is optional for some platforms */ + regs = platform_get_resource_byname(pdev, IORESOURCE_MEM, "nicpm_base"); + if (regs) { + bgmac->plat.nicpm_base = devm_ioremap_resource(&pdev->dev, + regs); + if (IS_ERR(bgmac->plat.nicpm_base)) + return PTR_ERR(bgmac->plat.nicpm_base); + }
bgmac->read = platform_bgmac_read; bgmac->write = platform_bgmac_write;
From: Jiasheng Jiang jiasheng@iscas.ac.cn
commit a72c01a94f1d285a274219d36e2a17b4846c0615 upstream.
As the possible failure of the alloc, the ifmgd->assoc_req_ies might be NULL pointer returned from kmemdup(). Therefore it might be better to free the skb and return error in order to fail the association, like ieee80211_assoc_success(). Also, the caller, ieee80211_do_assoc(), needs to deal with the return value from ieee80211_send_assoc().
Fixes: 4d9ec73d2b78 ("cfg80211: Report Association Request frame IEs in association events") Signed-off-by: Jiasheng Jiang jiasheng@iscas.ac.cn Link: https://lore.kernel.org/r/20220105081559.2387083-1-jiasheng@iscas.ac.cn [fix some paths to be errors, not success] Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mac80211/mlme.c | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-)
--- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -664,7 +664,7 @@ static void ieee80211_add_he_ie(struct i ieee80211_ie_build_he_6ghz_cap(sdata, skb); }
-static void ieee80211_send_assoc(struct ieee80211_sub_if_data *sdata) +static int ieee80211_send_assoc(struct ieee80211_sub_if_data *sdata) { struct ieee80211_local *local = sdata->local; struct ieee80211_if_managed *ifmgd = &sdata->u.mgd; @@ -684,6 +684,7 @@ static void ieee80211_send_assoc(struct enum nl80211_iftype iftype = ieee80211_vif_type_p2p(&sdata->vif); const struct ieee80211_sband_iftype_data *iftd; struct ieee80211_prep_tx_info info = {}; + int ret;
/* we know it's writable, cast away the const */ if (assoc_data->ie_len) @@ -697,7 +698,7 @@ static void ieee80211_send_assoc(struct chanctx_conf = rcu_dereference(sdata->vif.chanctx_conf); if (WARN_ON(!chanctx_conf)) { rcu_read_unlock(); - return; + return -EINVAL; } chan = chanctx_conf->def.chan; rcu_read_unlock(); @@ -748,7 +749,7 @@ static void ieee80211_send_assoc(struct (iftd ? iftd->vendor_elems.len : 0), GFP_KERNEL); if (!skb) - return; + return -ENOMEM;
skb_reserve(skb, local->hw.extra_tx_headroom);
@@ -1029,15 +1030,22 @@ skip_rates: skb_put_data(skb, assoc_data->ie + offset, noffset - offset); }
- if (assoc_data->fils_kek_len && - fils_encrypt_assoc_req(skb, assoc_data) < 0) { - dev_kfree_skb(skb); - return; + if (assoc_data->fils_kek_len) { + ret = fils_encrypt_assoc_req(skb, assoc_data); + if (ret < 0) { + dev_kfree_skb(skb); + return ret; + } }
pos = skb_tail_pointer(skb); kfree(ifmgd->assoc_req_ies); ifmgd->assoc_req_ies = kmemdup(ie_start, pos - ie_start, GFP_ATOMIC); + if (!ifmgd->assoc_req_ies) { + dev_kfree_skb(skb); + return -ENOMEM; + } + ifmgd->assoc_req_ies_len = pos - ie_start;
drv_mgd_prepare_tx(local, sdata, &info); @@ -1047,6 +1055,8 @@ skip_rates: IEEE80211_SKB_CB(skb)->flags |= IEEE80211_TX_CTL_REQ_TX_STATUS | IEEE80211_TX_INTFL_MLME_CONN_TX; ieee80211_tx_skb(sdata, skb); + + return 0; }
void ieee80211_send_pspoll(struct ieee80211_local *local, @@ -4451,6 +4461,7 @@ static int ieee80211_do_assoc(struct iee { struct ieee80211_mgd_assoc_data *assoc_data = sdata->u.mgd.assoc_data; struct ieee80211_local *local = sdata->local; + int ret;
sdata_assert_lock(sdata);
@@ -4471,7 +4482,9 @@ static int ieee80211_do_assoc(struct iee sdata_info(sdata, "associate with %pM (try %d/%d)\n", assoc_data->bss->bssid, assoc_data->tries, IEEE80211_ASSOC_MAX_TRIES); - ieee80211_send_assoc(sdata); + ret = ieee80211_send_assoc(sdata); + if (ret) + return ret;
if (!ieee80211_hw_check(&local->hw, REPORTS_TX_ACK_STATUS)) { assoc_data->timeout = jiffies + IEEE80211_ASSOC_TIMEOUT;
From: Phil Elwell phil@raspberrypi.com
commit 665408f4c3a5c83e712871daa062721624b2b79e upstream.
The call to brcm_alt_fw_path in brcmf_fw_get_firmwares is not protected by a check to the validity of the fwctx->req->board_type pointer. This results in a crash in strlcat when, for example, the WLAN chip is found in a USB dongle.
Prevent the crash by adding the necessary check.
See: https://github.com/raspberrypi/linux/issues/4833
Fixes: 5ff013914c62 ("brcmfmac: firmware: Allow per-board firmware binaries") Signed-off-by: Phil Elwell phil@raspberrypi.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20220118154514.3245524-1-phil@raspberrypi.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/broadcom/brcm80211/brcmfmac/firmware.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/firmware.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/firmware.c @@ -693,7 +693,7 @@ int brcmf_fw_get_firmwares(struct device { struct brcmf_fw_item *first = &req->items[0]; struct brcmf_fw *fwctx; - char *alt_path; + char *alt_path = NULL; int ret;
brcmf_dbg(TRACE, "enter: dev=%s\n", dev_name(dev)); @@ -712,7 +712,9 @@ int brcmf_fw_get_firmwares(struct device fwctx->done = fw_cb;
/* First try alternative board-specific path if any */ - alt_path = brcm_alt_fw_path(first->path, fwctx->req->board_type); + if (fwctx->req->board_type) + alt_path = brcm_alt_fw_path(first->path, + fwctx->req->board_type); if (alt_path) { ret = request_firmware_nowait(THIS_MODULE, true, alt_path, fwctx->dev, GFP_KERNEL, fwctx,
From: Johannes Berg johannes.berg@intel.com
commit f0a6fd1527067da537e9c48390237488719948ed upstream.
My previous fix here to fix the deadlock left a race where the exact same deadlock (see the original commit referenced below) can still happen if cfg80211_destroy_ifaces() already runs while nl80211_netlink_notify() is still marking some interfaces as nl_owner_dead.
The race happens because we have two loops here - first we dev_close() all the netdevs, and then we destroy them. If we also have two netdevs (first one need only be a wdev though) then we can find one during the first iteration, close it, and go to the second iteration -- but then find two, and try to destroy also the one we didn't close yet.
Fix this by only iterating once.
Reported-by: Toke Høiland-Jørgensen toke@redhat.com Fixes: ea6b2098dd02 ("cfg80211: fix locking in netlink owner interface destruction") Tested-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://lore.kernel.org/r/20220201130951.22093-1-johannes@sipsolutions.net Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/wireless/core.c | 17 ++++------------- 1 file changed, 4 insertions(+), 13 deletions(-)
--- a/net/wireless/core.c +++ b/net/wireless/core.c @@ -5,7 +5,7 @@ * Copyright 2006-2010 Johannes Berg johannes@sipsolutions.net * Copyright 2013-2014 Intel Mobile Communications GmbH * Copyright 2015-2017 Intel Deutschland GmbH - * Copyright (C) 2018-2021 Intel Corporation + * Copyright (C) 2018-2022 Intel Corporation */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt @@ -332,29 +332,20 @@ static void cfg80211_event_work(struct w void cfg80211_destroy_ifaces(struct cfg80211_registered_device *rdev) { struct wireless_dev *wdev, *tmp; - bool found = false;
ASSERT_RTNL();
- list_for_each_entry(wdev, &rdev->wiphy.wdev_list, list) { + list_for_each_entry_safe(wdev, tmp, &rdev->wiphy.wdev_list, list) { if (wdev->nl_owner_dead) { if (wdev->netdev) dev_close(wdev->netdev); - found = true; - } - } - - if (!found) - return;
- wiphy_lock(&rdev->wiphy); - list_for_each_entry_safe(wdev, tmp, &rdev->wiphy.wdev_list, list) { - if (wdev->nl_owner_dead) { + wiphy_lock(&rdev->wiphy); cfg80211_leave(rdev, wdev); rdev_del_virtual_intf(rdev, wdev); + wiphy_unlock(&rdev->wiphy); } } - wiphy_unlock(&rdev->wiphy); }
static void cfg80211_destroy_iface_wk(struct work_struct *work)
From: Mans Rullgard mans@mansr.com
commit 6bb9681a43f34f2cab4aad6e2a02da4ce54d13c5 upstream.
The reset input to the LAN9303 chip is active low, and devicetree gpio handles reflect this. Therefore, the gpio should be requested with an initial state of high in order for the reset signal to be asserted. Other uses of the gpio already use the correct polarity.
Fixes: a1292595e006 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303") Signed-off-by: Mans Rullgard mans@mansr.com Reviewed-by: Andrew Lunn andrew@lunn.ch Reviewed-by: Florian Fianelil f.fainelli@gmail.com Link: https://lore.kernel.org/r/20220209145454.19749-1-mans@mansr.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/lan9303-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/dsa/lan9303-core.c +++ b/drivers/net/dsa/lan9303-core.c @@ -1309,7 +1309,7 @@ static int lan9303_probe_reset_gpio(stru struct device_node *np) { chip->reset_gpio = devm_gpiod_get_optional(chip->dev, "reset", - GPIOD_OUT_LOW); + GPIOD_OUT_HIGH); if (IS_ERR(chip->reset_gpio)) return PTR_ERR(chip->reset_gpio);
From: Vladimir Oltean vladimir.oltean@nxp.com
commit a2614140dc0f467a83aa3bb4b6ee2d6480a76202 upstream.
mv88e6xxx is special among DSA drivers in that it requires the VTU to contain the VID of the FDB entry it modifies in mv88e6xxx_port_db_load_purge(), otherwise it will return -EOPNOTSUPP.
Sometimes due to races this is not always satisfied even if external code does everything right (first deletes the FDB entries, then the VLAN), because DSA commits to hardware FDB entries asynchronously since commit c9eb3e0f8701 ("net: dsa: Add support for learning FDB through notification").
Therefore, the mv88e6xxx driver must close this race condition by itself, by asking DSA to flush the switchdev workqueue of any FDB deletions in progress, prior to exiting a VLAN.
Fixes: c9eb3e0f8701 ("net: dsa: Add support for learning FDB through notification") Reported-by: Rafael Richter rafael.richter@gin.de Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/mv88e6xxx/chip.c | 7 +++++++ include/net/dsa.h | 1 + net/dsa/dsa.c | 1 + net/dsa/dsa_priv.h | 1 - 4 files changed, 9 insertions(+), 1 deletion(-)
--- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -2291,6 +2291,13 @@ static int mv88e6xxx_port_vlan_del(struc if (!mv88e6xxx_max_vid(chip)) return -EOPNOTSUPP;
+ /* The ATU removal procedure needs the FID to be mapped in the VTU, + * but FDB deletion runs concurrently with VLAN deletion. Flush the DSA + * switchdev workqueue to ensure that all FDB entries are deleted + * before we remove the VLAN. + */ + dsa_flush_workqueue(); + mv88e6xxx_reg_lock(chip);
err = mv88e6xxx_port_get_pvid(chip, port, &pvid); --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -1056,6 +1056,7 @@ void dsa_unregister_switch(struct dsa_sw int dsa_register_switch(struct dsa_switch *ds); void dsa_switch_shutdown(struct dsa_switch *ds); struct dsa_switch *dsa_switch_find(int tree_index, int sw_index); +void dsa_flush_workqueue(void); #ifdef CONFIG_PM_SLEEP int dsa_switch_suspend(struct dsa_switch *ds); int dsa_switch_resume(struct dsa_switch *ds); --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -349,6 +349,7 @@ void dsa_flush_workqueue(void) { flush_workqueue(dsa_owq); } +EXPORT_SYMBOL_GPL(dsa_flush_workqueue);
int dsa_devlink_param_get(struct devlink *dl, u32 id, struct devlink_param_gset_ctx *ctx) --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -170,7 +170,6 @@ void dsa_tag_driver_put(const struct dsa const struct dsa_device_ops *dsa_find_tagger_by_name(const char *buf);
bool dsa_schedule_work(struct work_struct *work); -void dsa_flush_workqueue(void); const char *dsa_tag_protocol_to_str(const struct dsa_device_ops *ops);
static inline int dsa_tag_protocol_overhead(const struct dsa_device_ops *ops)
From: Alexey Khoroshilov khoroshilov@ispras.ru
commit 8c6ae46150a453f8ae9a6cd49b45f354f478587d upstream.
of_node_put(priv->ds->slave_mii_bus->dev.of_node) should be done before mdiobus_free(priv->ds->slave_mii_bus).
Signed-off-by: Alexey Khoroshilov khoroshilov@ispras.ru Fixes: 0d120dfb5d67 ("net: dsa: lantiq_gswip: don't use devres for mdiobus") Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/1644921768-26477-1-git-send-email-khoroshilov@ispr... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/lantiq_gswip.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/dsa/lantiq_gswip.c +++ b/drivers/net/dsa/lantiq_gswip.c @@ -2201,8 +2201,8 @@ static int gswip_remove(struct platform_
if (priv->ds->slave_mii_bus) { mdiobus_unregister(priv->ds->slave_mii_bus); - mdiobus_free(priv->ds->slave_mii_bus); of_node_put(priv->ds->slave_mii_bus->dev.of_node); + mdiobus_free(priv->ds->slave_mii_bus); }
for (i = 0; i < priv->num_gphy_fw; i++)
From: Mans Rullgard mans@mansr.com
commit 017b355bbdc6620fd8fe05fe297f553ce9d855ee upstream.
Check for a hwaccel VLAN tag on rx and use it if present. Otherwise, use __skb_vlan_pop() like the other tag parsers do. This fixes the case where the VLAN tag has already been consumed by the master.
Fixes: a1292595e006 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303") Signed-off-by: Mans Rullgard mans@mansr.com Reviewed-by: Vladimir Oltean olteanv@gmail.com Link: https://lore.kernel.org/r/20220216124634.23123-1-mans@mansr.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dsa/tag_lan9303.c | 21 +++++++-------------- 1 file changed, 7 insertions(+), 14 deletions(-)
--- a/net/dsa/tag_lan9303.c +++ b/net/dsa/tag_lan9303.c @@ -77,7 +77,6 @@ static struct sk_buff *lan9303_xmit(stru
static struct sk_buff *lan9303_rcv(struct sk_buff *skb, struct net_device *dev) { - __be16 *lan9303_tag; u16 lan9303_tag1; unsigned int source_port;
@@ -87,14 +86,15 @@ static struct sk_buff *lan9303_rcv(struc return NULL; }
- lan9303_tag = dsa_etype_header_pos_rx(skb); - - if (lan9303_tag[0] != htons(ETH_P_8021Q)) { - dev_warn_ratelimited(&dev->dev, "Dropping packet due to invalid VLAN marker\n"); - return NULL; + if (skb_vlan_tag_present(skb)) { + lan9303_tag1 = skb_vlan_tag_get(skb); + __vlan_hwaccel_clear_tag(skb); + } else { + skb_push_rcsum(skb, ETH_HLEN); + __skb_vlan_pop(skb, &lan9303_tag1); + skb_pull_rcsum(skb, ETH_HLEN); }
- lan9303_tag1 = ntohs(lan9303_tag[1]); source_port = lan9303_tag1 & 0x3;
skb->dev = dsa_master_find_slave(dev, 0, source_port); @@ -103,13 +103,6 @@ static struct sk_buff *lan9303_rcv(struc return NULL; }
- /* remove the special VLAN tag between the MAC addresses - * and the current ethertype field. - */ - skb_pull_rcsum(skb, 2 + 2); - - dsa_strip_etype_header(skb, LAN9303_TAG_LEN); - if (!(lan9303_tag1 & LAN9303_TAG_RX_TRAPPED_TO_CPU)) dsa_default_offload_fwd_mark(skb);
From: Mans Rullgard mans@mansr.com
commit 430065e2671905ac675f97b7af240cc255964e93 upstream.
If the master device does VLAN filtering, the IDs used by the switch must be added for any frames to be received. Do this in the port_enable() function, and remove them in port_disable().
Fixes: a1292595e006 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303") Signed-off-by: Mans Rullgard mans@mansr.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Reviewed-by: Vladimir Oltean olteanv@gmail.com Link: https://lore.kernel.org/r/20220216204818.28746-1-mans@mansr.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/Kconfig | 1 + drivers/net/dsa/lan9303-core.c | 11 +++++++++-- 2 files changed, 10 insertions(+), 2 deletions(-)
--- a/drivers/net/dsa/Kconfig +++ b/drivers/net/dsa/Kconfig @@ -81,6 +81,7 @@ config NET_DSA_REALTEK_SMI
config NET_DSA_SMSC_LAN9303 tristate + depends on VLAN_8021Q || VLAN_8021Q=n select NET_DSA_TAG_LAN9303 select REGMAP help --- a/drivers/net/dsa/lan9303-core.c +++ b/drivers/net/dsa/lan9303-core.c @@ -10,6 +10,7 @@ #include <linux/mii.h> #include <linux/phy.h> #include <linux/if_bridge.h> +#include <linux/if_vlan.h> #include <linux/etherdevice.h>
#include "lan9303.h" @@ -1083,21 +1084,27 @@ static void lan9303_adjust_link(struct d static int lan9303_port_enable(struct dsa_switch *ds, int port, struct phy_device *phy) { + struct dsa_port *dp = dsa_to_port(ds, port); struct lan9303 *chip = ds->priv;
- if (!dsa_is_user_port(ds, port)) + if (!dsa_port_is_user(dp)) return 0;
+ vlan_vid_add(dp->cpu_dp->master, htons(ETH_P_8021Q), port); + return lan9303_enable_processing_port(chip, port); }
static void lan9303_port_disable(struct dsa_switch *ds, int port) { + struct dsa_port *dp = dsa_to_port(ds, port); struct lan9303 *chip = ds->priv;
- if (!dsa_is_user_port(ds, port)) + if (!dsa_port_is_user(dp)) return;
+ vlan_vid_del(dp->cpu_dp->master, htons(ETH_P_8021Q), port); + lan9303_disable_processing_port(chip, port); lan9303_phy_write(ds, chip->phy_addr_base + port, MII_BMCR, BMCR_PDOWN); }
From: Miquel Raynal miquel.raynal@bootlin.com
commit bdc120a2bcd834e571ce4115aaddf71ab34495de upstream.
These periods are expressed in time units (microseconds) while 40 and 12 are the number of symbol durations these periods will last. We need to multiply them both with the symbol_duration in order to get these values in microseconds.
Fixes: ded845a781a5 ("ieee802154: Add CA8210 IEEE 802.15.4 device driver") Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/r/20220201180629.93410-2-miquel.raynal@bootlin.com Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ieee802154/ca8210.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ieee802154/ca8210.c +++ b/drivers/net/ieee802154/ca8210.c @@ -2977,8 +2977,8 @@ static void ca8210_hw_setup(struct ieee8 ca8210_hw->phy->cca.opt = NL802154_CCA_OPT_ENERGY_CARRIER_AND; ca8210_hw->phy->cca_ed_level = -9800; ca8210_hw->phy->symbol_duration = 16; - ca8210_hw->phy->lifs_period = 40; - ca8210_hw->phy->sifs_period = 12; + ca8210_hw->phy->lifs_period = 40 * ca8210_hw->phy->symbol_duration; + ca8210_hw->phy->sifs_period = 12 * ca8210_hw->phy->symbol_duration; ca8210_hw->flags = IEEE802154_HW_AFILT | IEEE802154_HW_OMIT_CKSUM |
From: Xin Long lucien.xin@gmail.com
commit 35a79e64de29e8d57a5989aac57611c0cd29e13e upstream.
When 'ping' changes to use PING socket instead of RAW socket by:
# sysctl -w net.ipv4.ping_group_range="0 100"
There is another regression caused when matching sk_bound_dev_if and dif, RAW socket is using inet_iif() while PING socket lookup is using skb->dev->ifindex, the cmd below fails due to this:
# ip link add dummy0 type dummy # ip link set dummy0 up # ip addr add 192.168.111.1/24 dev dummy0 # ping -I dummy0 192.168.111.1 -c1
The issue was also reported on:
https://github.com/iputils/iputils/issues/104
But fixed in iputils in a wrong way by not binding to device when destination IP is on device, and it will cause some of kselftests to fail, as Jianlin noticed.
This patch is to use inet(6)_iif and inet(6)_sdif to get dif and sdif for PING socket, and keep consistent with RAW socket.
Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Reported-by: Jianlin Shi jishi@redhat.com Signed-off-by: Xin Long lucien.xin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ping.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/net/ipv4/ping.c +++ b/net/ipv4/ping.c @@ -172,16 +172,23 @@ static struct sock *ping_lookup(struct n struct sock *sk = NULL; struct inet_sock *isk; struct hlist_nulls_node *hnode; - int dif = skb->dev->ifindex; + int dif, sdif;
if (skb->protocol == htons(ETH_P_IP)) { + dif = inet_iif(skb); + sdif = inet_sdif(skb); pr_debug("try to find: num = %d, daddr = %pI4, dif = %d\n", (int)ident, &ip_hdr(skb)->daddr, dif); #if IS_ENABLED(CONFIG_IPV6) } else if (skb->protocol == htons(ETH_P_IPV6)) { + dif = inet6_iif(skb); + sdif = inet6_sdif(skb); pr_debug("try to find: num = %d, daddr = %pI6c, dif = %d\n", (int)ident, &ipv6_hdr(skb)->daddr, dif); #endif + } else { + pr_err("ping: protocol(%x) is not supported\n", ntohs(skb->protocol)); + return NULL; }
read_lock_bh(&ping_table.lock); @@ -221,7 +228,7 @@ static struct sock *ping_lookup(struct n }
if (sk->sk_bound_dev_if && sk->sk_bound_dev_if != dif && - sk->sk_bound_dev_if != inet_sdif(skb)) + sk->sk_bound_dev_if != sdif) continue;
sock_hold(sk);
From: Zhang Changzhong zhangchangzhong@huawei.com
commit a6ab75cec1e461f8a35559054c146c21428430b8 upstream.
In __bond_release_one(), bond_set_carrier() is only called when bond device has no slave. Therefore, if we remove the up slave from a master with two slaves and keep the down slave, the master will remain up.
Fix this by moving bond_set_carrier() out of if (!bond_has_slaves(bond)) statement.
Reproducer: $ insmod bonding.ko mode=0 miimon=100 max_bonds=2 $ ifconfig bond0 up $ ifenslave bond0 eth0 eth1 $ ifconfig eth0 down $ ifenslave -d bond0 eth1 $ cat /proc/net/bonding/bond0
Fixes: ff59c4563a8d ("[PATCH] bonding: support carrier state for master") Signed-off-by: Zhang Changzhong zhangchangzhong@huawei.com Acked-by: Jay Vosburgh jay.vosburgh@canonical.com Link: https://lore.kernel.org/r/1645021088-38370-1-git-send-email-zhangchangzhong@... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bonding/bond_main.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -2377,10 +2377,9 @@ static int __bond_release_one(struct net bond_select_active_slave(bond); }
- if (!bond_has_slaves(bond)) { - bond_set_carrier(bond); + bond_set_carrier(bond); + if (!bond_has_slaves(bond)) eth_hw_addr_random(bond_dev); - }
unblock_netpoll_tx(); synchronize_rcu();
From: Eric Dumazet edumazet@google.com
commit dcd54265c8bc14bd023815e36e2d5f9d66ee1fee upstream.
trace_napi_poll_hit() is reading stat->dev while another thread can write on it from dropmon_net_event()
Use READ_ONCE()/WRITE_ONCE() here, RCU rules are properly enforced already, we only have to take care of load/store tearing.
BUG: KCSAN: data-race in dropmon_net_event / trace_napi_poll_hit
write to 0xffff88816f3ab9c0 of 8 bytes by task 20260 on cpu 1: dropmon_net_event+0xb8/0x2b0 net/core/drop_monitor.c:1579 notifier_call_chain kernel/notifier.c:84 [inline] raw_notifier_call_chain+0x53/0xb0 kernel/notifier.c:392 call_netdevice_notifiers_info net/core/dev.c:1919 [inline] call_netdevice_notifiers_extack net/core/dev.c:1931 [inline] call_netdevice_notifiers net/core/dev.c:1945 [inline] unregister_netdevice_many+0x867/0xfb0 net/core/dev.c:10415 ip_tunnel_delete_nets+0x24a/0x280 net/ipv4/ip_tunnel.c:1123 vti_exit_batch_net+0x2a/0x30 net/ipv4/ip_vti.c:515 ops_exit_list net/core/net_namespace.c:173 [inline] cleanup_net+0x4dc/0x8d0 net/core/net_namespace.c:597 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307 worker_thread+0x616/0xa70 kernel/workqueue.c:2454 kthread+0x1bf/0x1e0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30
read to 0xffff88816f3ab9c0 of 8 bytes by interrupt on cpu 0: trace_napi_poll_hit+0x89/0x1c0 net/core/drop_monitor.c:292 trace_napi_poll include/trace/events/napi.h:14 [inline] __napi_poll+0x36b/0x3f0 net/core/dev.c:6366 napi_poll net/core/dev.c:6432 [inline] net_rx_action+0x29e/0x650 net/core/dev.c:6519 __do_softirq+0x158/0x2de kernel/softirq.c:558 do_softirq+0xb1/0xf0 kernel/softirq.c:459 __local_bh_enable_ip+0x68/0x70 kernel/softirq.c:383 __raw_spin_unlock_bh include/linux/spinlock_api_smp.h:167 [inline] _raw_spin_unlock_bh+0x33/0x40 kernel/locking/spinlock.c:210 spin_unlock_bh include/linux/spinlock.h:394 [inline] ptr_ring_consume_bh include/linux/ptr_ring.h:367 [inline] wg_packet_decrypt_worker+0x73c/0x780 drivers/net/wireguard/receive.c:506 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307 worker_thread+0x616/0xa70 kernel/workqueue.c:2454 kthread+0x1bf/0x1e0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30
value changed: 0xffff88815883e000 -> 0x0000000000000000
Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 26435 Comm: kworker/0:1 Not tainted 5.17.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: wg-crypt-wg2 wg_packet_decrypt_worker
Fixes: 4ea7e38696c7 ("dropmon: add ability to detect when hardware dropsrxpackets") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Neil Horman nhorman@tuxdriver.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/drop_monitor.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/net/core/drop_monitor.c +++ b/net/core/drop_monitor.c @@ -280,13 +280,17 @@ static void trace_napi_poll_hit(void *ig
rcu_read_lock(); list_for_each_entry_rcu(new_stat, &hw_stats_list, list) { + struct net_device *dev; + /* * only add a note to our monitor buffer if: * 1) this is the dev we received on * 2) its after the last_rx delta * 3) our rx_dropped count has gone up */ - if ((new_stat->dev == napi->dev) && + /* Paired with WRITE_ONCE() in dropmon_net_event() */ + dev = READ_ONCE(new_stat->dev); + if ((dev == napi->dev) && (time_after(jiffies, new_stat->last_rx + dm_hw_check_delta)) && (napi->dev->stats.rx_dropped != new_stat->last_drop_val)) { trace_drop_common(NULL, NULL); @@ -1572,7 +1576,10 @@ static int dropmon_net_event(struct noti mutex_lock(&net_dm_mutex); list_for_each_entry_safe(new_stat, tmp, &hw_stats_list, list) { if (new_stat->dev == dev) { - new_stat->dev = NULL; + + /* Paired with READ_ONCE() in trace_napi_poll_hit() */ + WRITE_ONCE(new_stat->dev, NULL); + if (trace_state == TRACE_OFF) { list_del_rcu(&new_stat->list); kfree_rcu(new_stat, rcu);
From: Eric Dumazet edumazet@google.com
commit 5891cd5ec46c2c2eb6427cb54d214b149635dd0e upstream.
syzbot found a data-race [1] which lead me to add __rcu annotations to netdev->qdisc, and proper accessors to get LOCKDEP support.
[1] BUG: KCSAN: data-race in dev_activate / qdisc_lookup_rcu
write to 0xffff888168ad6410 of 8 bytes by task 13559 on cpu 1: attach_default_qdiscs net/sched/sch_generic.c:1167 [inline] dev_activate+0x2ed/0x8f0 net/sched/sch_generic.c:1221 __dev_open+0x2e9/0x3a0 net/core/dev.c:1416 __dev_change_flags+0x167/0x3f0 net/core/dev.c:8139 rtnl_configure_link+0xc2/0x150 net/core/rtnetlink.c:3150 __rtnl_newlink net/core/rtnetlink.c:3489 [inline] rtnl_newlink+0xf4d/0x13e0 net/core/rtnetlink.c:3529 rtnetlink_rcv_msg+0x745/0x7e0 net/core/rtnetlink.c:5594 netlink_rcv_skb+0x14e/0x250 net/netlink/af_netlink.c:2494 rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:5612 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x602/0x6d0 net/netlink/af_netlink.c:1343 netlink_sendmsg+0x728/0x850 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg net/socket.c:725 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2413 ___sys_sendmsg net/socket.c:2467 [inline] __sys_sendmsg+0x195/0x230 net/socket.c:2496 __do_sys_sendmsg net/socket.c:2505 [inline] __se_sys_sendmsg net/socket.c:2503 [inline] __x64_sys_sendmsg+0x42/0x50 net/socket.c:2503 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
read to 0xffff888168ad6410 of 8 bytes by task 13560 on cpu 0: qdisc_lookup_rcu+0x30/0x2e0 net/sched/sch_api.c:323 __tcf_qdisc_find+0x74/0x3a0 net/sched/cls_api.c:1050 tc_del_tfilter+0x1c7/0x1350 net/sched/cls_api.c:2211 rtnetlink_rcv_msg+0x5ba/0x7e0 net/core/rtnetlink.c:5585 netlink_rcv_skb+0x14e/0x250 net/netlink/af_netlink.c:2494 rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:5612 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x602/0x6d0 net/netlink/af_netlink.c:1343 netlink_sendmsg+0x728/0x850 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg net/socket.c:725 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2413 ___sys_sendmsg net/socket.c:2467 [inline] __sys_sendmsg+0x195/0x230 net/socket.c:2496 __do_sys_sendmsg net/socket.c:2505 [inline] __se_sys_sendmsg net/socket.c:2503 [inline] __x64_sys_sendmsg+0x42/0x50 net/socket.c:2503 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0xffffffff85dee080 -> 0xffff88815d96ec00
Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 13560 Comm: syz-executor.2 Not tainted 5.17.0-rc3-syzkaller-00116-gf1baf68e1383-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 470502de5bdb ("net: sched: unlock rules update API") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Vlad Buslov vladbu@mellanox.com Reported-by: syzbot syzkaller@googlegroups.com Cc: Jamal Hadi Salim jhs@mojatatu.com Cc: Cong Wang xiyou.wangcong@gmail.com Cc: Jiri Pirko jiri@resnulli.us Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/netdevice.h | 2 +- net/core/rtnetlink.c | 6 ++++-- net/sched/cls_api.c | 6 +++--- net/sched/sch_api.c | 22 ++++++++++++---------- net/sched/sch_generic.c | 29 ++++++++++++++++------------- 5 files changed, 36 insertions(+), 29 deletions(-)
--- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -2149,7 +2149,7 @@ struct net_device { struct netdev_queue *_tx ____cacheline_aligned_in_smp; unsigned int num_tx_queues; unsigned int real_num_tx_queues; - struct Qdisc *qdisc; + struct Qdisc __rcu *qdisc; unsigned int tx_queue_len; spinlock_t tx_global_lock;
--- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1698,6 +1698,7 @@ static int rtnl_fill_ifinfo(struct sk_bu { struct ifinfomsg *ifm; struct nlmsghdr *nlh; + struct Qdisc *qdisc;
ASSERT_RTNL(); nlh = nlmsg_put(skb, pid, seq, type, sizeof(*ifm), flags); @@ -1715,6 +1716,7 @@ static int rtnl_fill_ifinfo(struct sk_bu if (tgt_netnsid >= 0 && nla_put_s32(skb, IFLA_TARGET_NETNSID, tgt_netnsid)) goto nla_put_failure;
+ qdisc = rtnl_dereference(dev->qdisc); if (nla_put_string(skb, IFLA_IFNAME, dev->name) || nla_put_u32(skb, IFLA_TXQLEN, dev->tx_queue_len) || nla_put_u8(skb, IFLA_OPERSTATE, @@ -1733,8 +1735,8 @@ static int rtnl_fill_ifinfo(struct sk_bu #endif put_master_ifindex(skb, dev) || nla_put_u8(skb, IFLA_CARRIER, netif_carrier_ok(dev)) || - (dev->qdisc && - nla_put_string(skb, IFLA_QDISC, dev->qdisc->ops->id)) || + (qdisc && + nla_put_string(skb, IFLA_QDISC, qdisc->ops->id)) || nla_put_ifalias(skb, dev) || nla_put_u32(skb, IFLA_CARRIER_CHANGES, atomic_read(&dev->carrier_up_count) + --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -1044,7 +1044,7 @@ static int __tcf_qdisc_find(struct net *
/* Find qdisc */ if (!*parent) { - *q = dev->qdisc; + *q = rcu_dereference(dev->qdisc); *parent = (*q)->handle; } else { *q = qdisc_lookup_rcu(dev, TC_H_MAJ(*parent)); @@ -2587,7 +2587,7 @@ static int tc_dump_tfilter(struct sk_buf
parent = tcm->tcm_parent; if (!parent) - q = dev->qdisc; + q = rtnl_dereference(dev->qdisc); else q = qdisc_lookup(dev, TC_H_MAJ(tcm->tcm_parent)); if (!q) @@ -2962,7 +2962,7 @@ static int tc_dump_chain(struct sk_buff return skb->len;
if (!tcm->tcm_parent) - q = dev->qdisc; + q = rtnl_dereference(dev->qdisc); else q = qdisc_lookup(dev, TC_H_MAJ(tcm->tcm_parent));
--- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -301,7 +301,7 @@ struct Qdisc *qdisc_lookup(struct net_de
if (!handle) return NULL; - q = qdisc_match_from_root(dev->qdisc, handle); + q = qdisc_match_from_root(rtnl_dereference(dev->qdisc), handle); if (q) goto out;
@@ -320,7 +320,7 @@ struct Qdisc *qdisc_lookup_rcu(struct ne
if (!handle) return NULL; - q = qdisc_match_from_root(dev->qdisc, handle); + q = qdisc_match_from_root(rcu_dereference(dev->qdisc), handle); if (q) goto out;
@@ -1082,10 +1082,10 @@ static int qdisc_graft(struct net_device skip: if (!ingress) { notify_and_destroy(net, skb, n, classid, - dev->qdisc, new); + rtnl_dereference(dev->qdisc), new); if (new && !new->ops->attach) qdisc_refcount_inc(new); - dev->qdisc = new ? : &noop_qdisc; + rcu_assign_pointer(dev->qdisc, new ? : &noop_qdisc);
if (new && new->ops->attach) new->ops->attach(new); @@ -1460,7 +1460,7 @@ static int tc_get_qdisc(struct sk_buff * q = dev_ingress_queue(dev)->qdisc_sleeping; } } else { - q = dev->qdisc; + q = rtnl_dereference(dev->qdisc); } if (!q) { NL_SET_ERR_MSG(extack, "Cannot find specified qdisc on specified device"); @@ -1549,7 +1549,7 @@ replay: q = dev_ingress_queue(dev)->qdisc_sleeping; } } else { - q = dev->qdisc; + q = rtnl_dereference(dev->qdisc); }
/* It may be default qdisc, ignore it */ @@ -1771,7 +1771,8 @@ static int tc_dump_qdisc(struct sk_buff s_q_idx = 0; q_idx = 0;
- if (tc_dump_qdisc_root(dev->qdisc, skb, cb, &q_idx, s_q_idx, + if (tc_dump_qdisc_root(rtnl_dereference(dev->qdisc), + skb, cb, &q_idx, s_q_idx, true, tca[TCA_DUMP_INVISIBLE]) < 0) goto done;
@@ -2042,7 +2043,7 @@ static int tc_ctl_tclass(struct sk_buff } else if (qid1) { qid = qid1; } else if (qid == 0) - qid = dev->qdisc->handle; + qid = rtnl_dereference(dev->qdisc)->handle;
/* Now qid is genuine qdisc handle consistent * both with parent and child. @@ -2053,7 +2054,7 @@ static int tc_ctl_tclass(struct sk_buff portid = TC_H_MAKE(qid, portid); } else { if (qid == 0) - qid = dev->qdisc->handle; + qid = rtnl_dereference(dev->qdisc)->handle; }
/* OK. Locate qdisc */ @@ -2214,7 +2215,8 @@ static int tc_dump_tclass(struct sk_buff s_t = cb->args[0]; t = 0;
- if (tc_dump_tclass_root(dev->qdisc, skb, tcm, cb, &t, s_t, true) < 0) + if (tc_dump_tclass_root(rtnl_dereference(dev->qdisc), + skb, tcm, cb, &t, s_t, true) < 0) goto done;
dev_queue = dev_ingress_queue(dev); --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -1114,30 +1114,33 @@ static void attach_default_qdiscs(struct if (!netif_is_multiqueue(dev) || dev->priv_flags & IFF_NO_QUEUE) { netdev_for_each_tx_queue(dev, attach_one_default_qdisc, NULL); - dev->qdisc = txq->qdisc_sleeping; - qdisc_refcount_inc(dev->qdisc); + qdisc = txq->qdisc_sleeping; + rcu_assign_pointer(dev->qdisc, qdisc); + qdisc_refcount_inc(qdisc); } else { qdisc = qdisc_create_dflt(txq, &mq_qdisc_ops, TC_H_ROOT, NULL); if (qdisc) { - dev->qdisc = qdisc; + rcu_assign_pointer(dev->qdisc, qdisc); qdisc->ops->attach(qdisc); } } + qdisc = rtnl_dereference(dev->qdisc);
/* Detect default qdisc setup/init failed and fallback to "noqueue" */ - if (dev->qdisc == &noop_qdisc) { + if (qdisc == &noop_qdisc) { netdev_warn(dev, "default qdisc (%s) fail, fallback to %s\n", default_qdisc_ops->id, noqueue_qdisc_ops.id); dev->priv_flags |= IFF_NO_QUEUE; netdev_for_each_tx_queue(dev, attach_one_default_qdisc, NULL); - dev->qdisc = txq->qdisc_sleeping; - qdisc_refcount_inc(dev->qdisc); + qdisc = txq->qdisc_sleeping; + rcu_assign_pointer(dev->qdisc, qdisc); + qdisc_refcount_inc(qdisc); dev->priv_flags ^= IFF_NO_QUEUE; }
#ifdef CONFIG_NET_SCHED - if (dev->qdisc != &noop_qdisc) - qdisc_hash_add(dev->qdisc, false); + if (qdisc != &noop_qdisc) + qdisc_hash_add(qdisc, false); #endif }
@@ -1167,7 +1170,7 @@ void dev_activate(struct net_device *dev * and noqueue_qdisc for virtual interfaces */
- if (dev->qdisc == &noop_qdisc) + if (rtnl_dereference(dev->qdisc) == &noop_qdisc) attach_default_qdiscs(dev);
if (!netif_carrier_ok(dev)) @@ -1333,7 +1336,7 @@ static int qdisc_change_tx_queue_len(str void dev_qdisc_change_real_num_tx(struct net_device *dev, unsigned int new_real_tx) { - struct Qdisc *qdisc = dev->qdisc; + struct Qdisc *qdisc = rtnl_dereference(dev->qdisc);
if (qdisc->ops->change_real_num_tx) qdisc->ops->change_real_num_tx(qdisc, new_real_tx); @@ -1373,7 +1376,7 @@ static void dev_init_scheduler_queue(str
void dev_init_scheduler(struct net_device *dev) { - dev->qdisc = &noop_qdisc; + rcu_assign_pointer(dev->qdisc, &noop_qdisc); netdev_for_each_tx_queue(dev, dev_init_scheduler_queue, &noop_qdisc); if (dev_ingress_queue(dev)) dev_init_scheduler_queue(dev, dev_ingress_queue(dev), &noop_qdisc); @@ -1401,8 +1404,8 @@ void dev_shutdown(struct net_device *dev netdev_for_each_tx_queue(dev, shutdown_scheduler_queue, &noop_qdisc); if (dev_ingress_queue(dev)) shutdown_scheduler_queue(dev, dev_ingress_queue(dev), &noop_qdisc); - qdisc_put(dev->qdisc); - dev->qdisc = &noop_qdisc; + qdisc_put(rtnl_dereference(dev->qdisc)); + rcu_assign_pointer(dev->qdisc, &noop_qdisc);
WARN_ON(timer_pending(&dev->watchdog_timer)); }
From: Eric Dumazet edumazet@google.com
commit 9ceaf6f76b203682bb6100e14b3d7da4c0bedde8 upstream.
syzbot reported that two threads might write over agg_select_timer at the same time. Make agg_select_timer atomic to fix the races.
BUG: KCSAN: data-race in bond_3ad_initiate_agg_selection / bond_3ad_state_machine_handler
read to 0xffff8881242aea90 of 4 bytes by task 1846 on cpu 1: bond_3ad_state_machine_handler+0x99/0x2810 drivers/net/bonding/bond_3ad.c:2317 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307 worker_thread+0x616/0xa70 kernel/workqueue.c:2454 kthread+0x1bf/0x1e0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30
write to 0xffff8881242aea90 of 4 bytes by task 25910 on cpu 0: bond_3ad_initiate_agg_selection+0x18/0x30 drivers/net/bonding/bond_3ad.c:1998 bond_open+0x658/0x6f0 drivers/net/bonding/bond_main.c:3967 __dev_open+0x274/0x3a0 net/core/dev.c:1407 dev_open+0x54/0x190 net/core/dev.c:1443 bond_enslave+0xcef/0x3000 drivers/net/bonding/bond_main.c:1937 do_set_master net/core/rtnetlink.c:2532 [inline] do_setlink+0x94f/0x2500 net/core/rtnetlink.c:2736 __rtnl_newlink net/core/rtnetlink.c:3414 [inline] rtnl_newlink+0xfeb/0x13e0 net/core/rtnetlink.c:3529 rtnetlink_rcv_msg+0x745/0x7e0 net/core/rtnetlink.c:5594 netlink_rcv_skb+0x14e/0x250 net/netlink/af_netlink.c:2494 rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:5612 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x602/0x6d0 net/netlink/af_netlink.c:1343 netlink_sendmsg+0x728/0x850 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg net/socket.c:725 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2413 ___sys_sendmsg net/socket.c:2467 [inline] __sys_sendmsg+0x195/0x230 net/socket.c:2496 __do_sys_sendmsg net/socket.c:2505 [inline] __se_sys_sendmsg net/socket.c:2503 [inline] __x64_sys_sendmsg+0x42/0x50 net/socket.c:2503 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x00000050 -> 0x0000004f
Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 25910 Comm: syz-executor.1 Tainted: G W 5.17.0-rc4-syzkaller-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Cc: Jay Vosburgh j.vosburgh@gmail.com Cc: Veaceslav Falico vfalico@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bonding/bond_3ad.c | 30 +++++++++++++++++++++++++----- include/net/bond_3ad.h | 2 +- 2 files changed, 26 insertions(+), 6 deletions(-)
--- a/drivers/net/bonding/bond_3ad.c +++ b/drivers/net/bonding/bond_3ad.c @@ -225,7 +225,7 @@ static inline int __check_agg_selection_ if (bond == NULL) return 0;
- return BOND_AD_INFO(bond).agg_select_timer ? 1 : 0; + return atomic_read(&BOND_AD_INFO(bond).agg_select_timer) ? 1 : 0; }
/** @@ -1995,7 +1995,7 @@ static void ad_marker_response_received( */ void bond_3ad_initiate_agg_selection(struct bonding *bond, int timeout) { - BOND_AD_INFO(bond).agg_select_timer = timeout; + atomic_set(&BOND_AD_INFO(bond).agg_select_timer, timeout); }
/** @@ -2279,6 +2279,28 @@ void bond_3ad_update_ad_actor_settings(s }
/** + * bond_agg_timer_advance - advance agg_select_timer + * @bond: bonding structure + * + * Return true when agg_select_timer reaches 0. + */ +static bool bond_agg_timer_advance(struct bonding *bond) +{ + int val, nval; + + while (1) { + val = atomic_read(&BOND_AD_INFO(bond).agg_select_timer); + if (!val) + return false; + nval = val - 1; + if (atomic_cmpxchg(&BOND_AD_INFO(bond).agg_select_timer, + val, nval) == val) + break; + } + return nval == 0; +} + +/** * bond_3ad_state_machine_handler - handle state machines timeout * @work: work context to fetch bonding struct to work on from * @@ -2313,9 +2335,7 @@ void bond_3ad_state_machine_handler(stru if (!bond_has_slaves(bond)) goto re_arm;
- /* check if agg_select_timer timer after initialize is timed out */ - if (BOND_AD_INFO(bond).agg_select_timer && - !(--BOND_AD_INFO(bond).agg_select_timer)) { + if (bond_agg_timer_advance(bond)) { slave = bond_first_slave_rcu(bond); port = slave ? &(SLAVE_AD_INFO(slave)->port) : NULL;
--- a/include/net/bond_3ad.h +++ b/include/net/bond_3ad.h @@ -262,7 +262,7 @@ struct ad_system { struct ad_bond_info { struct ad_system system; /* 802.3ad system structure */ struct bond_3ad_stats stats; - u32 agg_select_timer; /* Timer to select aggregator after all adapter's hand shakes */ + atomic_t agg_select_timer; /* Timer to select aggregator after all adapter's hand shakes */ u16 aggregator_identifier; };
From: Kees Cook keescook@chromium.org
commit 52a9dab6d892763b2a8334a568bd4e2c1a6fde66 upstream.
GCC 12 correctly reports a potential use-after-free condition in the xrealloc helper. Fix the warning by avoiding an implicit "free(ptr)" when size == 0:
In file included from help.c:12: In function 'xrealloc', inlined from 'add_cmdname' at help.c:24:2: subcmd-util.h:56:23: error: pointer may be used after 'realloc' [-Werror=use-after-free] 56 | ret = realloc(ptr, size); | ^~~~~~~~~~~~~~~~~~ subcmd-util.h:52:21: note: call to 'realloc' here 52 | void *ret = realloc(ptr, size); | ^~~~~~~~~~~~~~~~~~ subcmd-util.h:58:31: error: pointer may be used after 'realloc' [-Werror=use-after-free] 58 | ret = realloc(ptr, 1); | ^~~~~~~~~~~~~~~ subcmd-util.h:52:21: note: call to 'realloc' here 52 | void *ret = realloc(ptr, size); | ^~~~~~~~~~~~~~~~~~
Fixes: 2f4ce5ec1d447beb ("perf tools: Finalize subcmd independence") Reported-by: Valdis Klētnieks valdis.kletnieks@vt.edu Signed-off-by: Kees Kook keescook@chromium.org Tested-by: Valdis Klētnieks valdis.kletnieks@vt.edu Tested-by: Justin M. Forbes jforbes@fedoraproject.org Acked-by: Josh Poimboeuf jpoimboe@redhat.com Cc: linux-hardening@vger.kernel.org Cc: Valdis Klētnieks valdis.kletnieks@vt.edu Link: http://lore.kernel.org/lkml/20220213182443.4037039-1-keescook@chromium.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/lib/subcmd/subcmd-util.h | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-)
--- a/tools/lib/subcmd/subcmd-util.h +++ b/tools/lib/subcmd/subcmd-util.h @@ -50,15 +50,8 @@ static NORETURN inline void die(const ch static inline void *xrealloc(void *ptr, size_t size) { void *ret = realloc(ptr, size); - if (!ret && !size) - ret = realloc(ptr, 1); - if (!ret) { - ret = realloc(ptr, size); - if (!ret && !size) - ret = realloc(ptr, 1); - if (!ret) - die("Out of memory, realloc failed"); - } + if (!ret) + die("Out of memory, realloc failed"); return ret; }
From: Wen Gu guwen@linux.alibaba.com
commit 1de9770d121ee9294794cca0e0be8fbfa0134ee8 upstream.
The callback functions of clcsock will be saved and replaced during the fallback. But if the fallback happens more than once, then the copies of these callback functions will be overwritten incorrectly, resulting in a loop call issue:
clcsk->sk_error_report |- smc_fback_error_report() <------------------------------| |- smc_fback_forward_wakeup() | (loop) |- clcsock_callback() (incorrectly overwritten) | |- smc->clcsk_error_report() ------------------|
So this patch fixes the issue by saving these function pointers only once in the fallback and avoiding overwriting.
Reported-by: syzbot+4de3c0e8a263e1e499bc@syzkaller.appspotmail.com Fixes: 341adeec9ada ("net/smc: Forward wakeup to smc socket waitqueue after fallback") Link: https://lore.kernel.org/r/0000000000006d045e05d78776f6@google.com Signed-off-by: Wen Gu guwen@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/smc/af_smc.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
--- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -649,14 +649,17 @@ static void smc_fback_error_report(struc static int smc_switch_to_fallback(struct smc_sock *smc, int reason_code) { struct sock *clcsk; + int rc = 0;
mutex_lock(&smc->clcsock_release_lock); if (!smc->clcsock) { - mutex_unlock(&smc->clcsock_release_lock); - return -EBADF; + rc = -EBADF; + goto out; } clcsk = smc->clcsock->sk;
+ if (smc->use_fallback) + goto out; smc->use_fallback = true; smc->fallback_rsn = reason_code; smc_stat_fallback(smc); @@ -683,8 +686,9 @@ static int smc_switch_to_fallback(struct smc->clcsock->sk->sk_user_data = (void *)((uintptr_t)smc | SK_USER_DATA_NOCOPY); } +out: mutex_unlock(&smc->clcsock_release_lock); - return 0; + return rc; }
/* fall back during connect */
From: DENG Qingfang dqfext@gmail.com
commit 525b108e6d95b643eccbd84fb10aa9aa101b18dd upstream.
The function mt7531_phy_mode_supported in the DSA driver set supported mode to PHY_INTERFACE_MODE_GMII instead of PHY_INTERFACE_MODE_INTERNAL for the internal PHY, so this check breaks the PHY initialization:
mt7530 mdio-bus:00 wan (uninitialized): failed to connect to PHY: -EINVAL
Remove the check to make it work again.
Reported-by: Hauke Mehrtens hauke@hauke-m.de Fixes: e40d2cca0189 ("net: phy: add MediaTek Gigabit Ethernet PHY driver") Signed-off-by: DENG Qingfang dqfext@gmail.com Acked-by: Arınç ÜNAL arinc.unal@arinc9.com Tested-by: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/mediatek-ge.c | 3 --- 1 file changed, 3 deletions(-)
--- a/drivers/net/phy/mediatek-ge.c +++ b/drivers/net/phy/mediatek-ge.c @@ -55,9 +55,6 @@ static int mt7530_phy_config_init(struct
static int mt7531_phy_config_init(struct phy_device *phydev) { - if (phydev->interface != PHY_INTERFACE_MODE_INTERNAL) - return -EINVAL; - mtk_gephy_config_init(phydev);
/* PHY link down power saving enable */
From: Gatis Peisenieks gatis@mikrotik.com
commit bf8e59fd315f304eb538546e35de6dc603e4709f upstream.
If NIC had packets in tx queue at the moment link down event happened, it could result in tx timeout when link got back up.
Since device has more than one tx queue we need to reset them accordingly.
Fixes: 057f4af2b171 ("atl1c: add 4 RX/TX queue support for Mikrotik 10/25G NIC") Signed-off-by: Gatis Peisenieks gatis@mikrotik.com Link: https://lore.kernel.org/r/20220211065123.4187615-1-gatis@mikrotik.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c @@ -900,7 +900,7 @@ static void atl1c_clean_tx_ring(struct a atl1c_clean_buffer(pdev, buffer_info); }
- netdev_reset_queue(adapter->netdev); + netdev_tx_reset_queue(netdev_get_tx_queue(adapter->netdev, queue));
/* Zero out Tx-buffers */ memset(tpd_ring->desc, 0, sizeof(struct atl1c_tpd_desc) *
From: Jon Maloy jmaloy@redhat.com
commit 032062f363b4bf02b1d547f329aa5d97b6a17410 upstream.
When a link comes up we add its presence to the name table to make it possible for users to subscribe for link up/down events. However, after a previous call signature change the binding is wrongly published with the peer node as publishing node, instead of the own node as it should be. This has the effect that the command 'tipc name table show' will list the link binding (service type 2) with node scope and a peer node as originator, something that obviously is impossible.
We correct this bug here.
Fixes: 50a3499ab853 ("tipc: simplify signature of tipc_namtbl_publish()") Signed-off-by: Jon Maloy jmaloy@redhat.com Link: https://lore.kernel.org/r/20220214013852.2803940-1-jmaloy@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/tipc/node.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -413,7 +413,7 @@ static void tipc_node_write_unlock(struc tipc_uaddr(&ua, TIPC_SERVICE_RANGE, TIPC_NODE_SCOPE, TIPC_LINK_STATE, n->addr, n->addr); sk.ref = n->link_id; - sk.node = n->addr; + sk.node = tipc_own_addr(net); bearer_id = n->link_id & 0xffff; publ_list = &n->publ_list;
From: Tom Rix trix@redhat.com
commit 2a36ed7c1cd55742503bed81d2cc0ea83bd0ad0c upstream.
Clang static analysis reports this representative problem dpaa2-switch-flower.c:616:24: warning: The right operand of '==' is a garbage value tmp->cfg.vlan_id == vlan) { ^ ~~~~ vlan is set in dpaa2_switch_flower_parse_mirror_key(). However this function can return success without setting vlan. So change the default return to -EOPNOTSUPP.
Fixes: 0f3faece5808 ("dpaa2-switch: add VLAN based mirroring") Signed-off-by: Tom Rix trix@redhat.com Reviewed-by: Ioana Ciornei ioana.ciornei@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/dpaa2/dpaa2-switch-flower.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch-flower.c +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch-flower.c @@ -532,6 +532,7 @@ static int dpaa2_switch_flower_parse_mir struct flow_rule *rule = flow_cls_offload_flow_rule(cls); struct flow_dissector *dissector = rule->match.dissector; struct netlink_ext_ack *extack = cls->common.extack; + int ret = -EOPNOTSUPP;
if (dissector->used_keys & ~(BIT(FLOW_DISSECTOR_KEY_BASIC) | @@ -561,9 +562,10 @@ static int dpaa2_switch_flower_parse_mir }
*vlan = (u16)match.key->vlan_id; + ret = 0; }
- return 0; + return ret; }
static int
From: Radu Bulie radu-andrei.bulie@nxp.com
commit 07dd44852be89386ab12210df90a2d78779f3bff upstream.
1588 Single Step Timestamping code path uses a mutex to enforce atomicity for two events: - update of ptp single step register - transmit ptp event packet
Before this patch the mutex was not initialized. This caused unexpected crashes in the Tx function.
Fixes: c55211892f463 ("dpaa2-eth: support PTP Sync packet one-step timestamping") Signed-off-by: Radu Bulie radu-andrei.bulie@nxp.com Reviewed-by: Ioana Ciornei ioana.ciornei@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c @@ -4329,7 +4329,7 @@ static int dpaa2_eth_probe(struct fsl_mc }
INIT_WORK(&priv->tx_onestep_tstamp, dpaa2_eth_tx_onestep_tstamp); - + mutex_init(&priv->onestep_tstamp_lock); skb_queue_head_init(&priv->tx_skbs);
priv->rx_copybreak = DPAA2_ETH_DEFAULT_COPYBREAK;
From: Oleksandr Mazur oleksandr.mazur@plvision.eu
commit c832962ac972082b3a1f89775c9d4274c8cb5670 upstream.
Whenever bridge driver hits the max capacity of MDBs, it disables the MC processing (by setting corresponding bridge option), but never notifies switchdev about such change (the notifiers are called only upon explicit setting of this option, through the registered netlink interface).
This could lead to situation when Software MDB processing gets disabled, but this event never gets offloaded to the underlying Hardware.
Fix this by adding a notify message in such case.
Fixes: 147c1e9b902c ("switchdev: bridge: Offload multicast disabled") Signed-off-by: Oleksandr Mazur oleksandr.mazur@plvision.eu Acked-by: Nikolay Aleksandrov nikolay@nvidia.com Link: https://lore.kernel.org/r/20220215165303.31908-1-oleksandr.mazur@plvision.eu Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bridge/br_multicast.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -82,6 +82,9 @@ static void br_multicast_find_del_pg(str struct net_bridge_port_group *pg); static void __br_multicast_stop(struct net_bridge_mcast *brmctx);
+static int br_mc_disabled_update(struct net_device *dev, bool value, + struct netlink_ext_ack *extack); + static struct net_bridge_port_group * br_sg_port_find(struct net_bridge *br, struct net_bridge_port_group_sg_key *sg_p) @@ -1156,6 +1159,7 @@ struct net_bridge_mdb_entry *br_multicas return mp;
if (atomic_read(&br->mdb_hash_tbl.nelems) >= br->hash_max) { + br_mc_disabled_update(br->dev, false, NULL); br_opt_toggle(br, BROPT_MULTICAST_ENABLED, false); return ERR_PTR(-E2BIG); }
From: Arnaldo Carvalho de Melo acme@redhat.com
commit 31ded1535e3182778a1d0e5c32711f55da3bc512 upstream.
This was detected by the gcc in Fedora Rawhide's gcc:
50 11.01 fedora:rawhide : FAIL gcc version 12.0.1 20220205 (Red Hat 12.0.1-0) (GCC) inlined from 'bpf__config_obj' at util/bpf-loader.c:1242:9: util/bpf-loader.c:1225:34: error: pointer 'map_opt' may be used after 'free' [-Werror=use-after-free] 1225 | *key_scan_pos += strlen(map_opt); | ^~~~~~~~~~~~~~~ util/bpf-loader.c:1223:9: note: call to 'free' here 1223 | free(map_name); | ^~~~~~~~~~~~~~ cc1: all warnings being treated as errors
So do the calculations on the pointer before freeing it.
Fixes: 04f9bf2bac72480c ("perf bpf-loader: Add missing '*' for key_scan_pos") Cc: Adrian Hunter adrian.hunter@intel.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Wang ShaoBo bobo.shaobowang@huawei.com Link: https://lore.kernel.org/lkml/Yg1VtQxKrPpS3uNA@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/util/bpf-loader.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/tools/perf/util/bpf-loader.c +++ b/tools/perf/util/bpf-loader.c @@ -1214,9 +1214,10 @@ bpf__obj_config_map(struct bpf_object *o pr_debug("ERROR: Invalid map config option '%s'\n", map_opt); err = -BPF_LOADER_ERRNO__OBJCONF_MAP_OPT; out: - free(map_name); if (!err) *key_scan_pos += strlen(map_opt); + + free(map_name); return err; }
From: Muhammad Usama Anjum usama.anjum@collabora.com
commit a7e793a867ae312cecdeb6f06cceff98263e75dd upstream.
non-regular file needs to be compiled and then copied to the output directory. Remove it from TEST_PROGS and add it to TEST_GEN_PROGS. This removes error thrown by rsync when non-regular object isn't found:
rsync: [sender] link_stat "/linux/tools/testing/selftests/exec/non-regular" failed: No such file or directory (2) rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
Fixes: 0f71241a8e32 ("selftests/exec: add file type errno tests") Reported-by: "kernelci.org bot" bot@kernelci.org Signed-off-by: Muhammad Usama Anjum usama.anjum@collabora.com Reviewed-by: Shuah Khan skhan@linuxfoundation.org Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/exec/Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/exec/Makefile +++ b/tools/testing/selftests/exec/Makefile @@ -3,8 +3,8 @@ CFLAGS = -Wall CFLAGS += -Wno-nonnull CFLAGS += -D_GNU_SOURCE
-TEST_PROGS := binfmt_script non-regular -TEST_GEN_PROGS := execveat load_address_4096 load_address_2097152 load_address_16777216 +TEST_PROGS := binfmt_script +TEST_GEN_PROGS := execveat load_address_4096 load_address_2097152 load_address_16777216 non-regular TEST_GEN_FILES := execveat.symlink execveat.denatured script subdir # Makefile is a run-time dependency, since it's accessed by the execveat test TEST_FILES := Makefile
From: Joakim Tjernlund joakim.tjernlund@infinera.com
commit 4f6de676d94ee8ddfc2e7e7cd935fc7cb2feff3a upstream.
In commit:
114945d84a30a5fe ("arm64: Fix labels in el2_setup macros")
We renamed a label from '1' to '.Lskip_gicv3_@', but failed to update a branch to it, which now targets a later label also called '1'.
The branch is taken rarely, when GICv3 is present but SRE is disabled at EL3, causing a boot-time crash.
Update the caller to the new label name.
Fixes: 114945d84a30 ("arm64: Fix labels in el2_setup macros") Cc: stable@vger.kernel.org # 5.12.x Signed-off-by: Joakim Tjernlund joakim.tjernlund@infinera.com Link: https://lore.kernel.org/r/20220214175643.21931-1-joakim.tjernlund@infinera.c... Reviewed-by: Mark Rutland mark.rutland@arm.com Reviewed-by: Marc Zyngier maz@kernel.org Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/include/asm/el2_setup.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/include/asm/el2_setup.h +++ b/arch/arm64/include/asm/el2_setup.h @@ -106,7 +106,7 @@ msr_s SYS_ICC_SRE_EL2, x0 isb // Make sure SRE is now set mrs_s x0, SYS_ICC_SRE_EL2 // Read SRE back, - tbz x0, #0, 1f // and check that it sticks + tbz x0, #0, .Lskip_gicv3_@ // and check that it sticks msr_s SYS_ICH_HCR_EL2, xzr // Reset ICC_HCR_EL2 to defaults .Lskip_gicv3_@: .endm
From: Matteo Martelli matteomartelli3@gmail.com
commit 19d20c7a29bf2e46ff1ab8e8c4fcd2da8a4f38e2 upstream.
Commit 83b7dcbc51c930fc2079ab6c6fc9d719768321f1 introduced a generic implicit feedback parser, which fails to execute for M-Audio FastTrack Ultra sound cards. The issue is with the ENDPOINT_SYNCTYPE check in add_generic_implicit_fb() where the SYNCTYPE is ADAPTIVE instead of ASYNC. The reason is that the sync type of the FastTrack output endpoints are set to adaptive in the quirks table since commit 65f04443c96dbda11b8fff21d6390e082846aa3c.
Fixes: 83b7dcbc51c9 ("ALSA: usb-audio: Add generic implicit fb parsing") Signed-off-by: Matteo Martelli matteomartelli3@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220211224913.20683-2-matteomartelli3@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/implicit.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/sound/usb/implicit.c +++ b/sound/usb/implicit.c @@ -47,13 +47,13 @@ struct snd_usb_implicit_fb_match { static const struct snd_usb_implicit_fb_match playback_implicit_fb_quirks[] = { /* Generic matching */ IMPLICIT_FB_GENERIC_DEV(0x0499, 0x1509), /* Steinberg UR22 */ - IMPLICIT_FB_GENERIC_DEV(0x0763, 0x2080), /* M-Audio FastTrack Ultra */ - IMPLICIT_FB_GENERIC_DEV(0x0763, 0x2081), /* M-Audio FastTrack Ultra */ IMPLICIT_FB_GENERIC_DEV(0x0763, 0x2030), /* M-Audio Fast Track C400 */ IMPLICIT_FB_GENERIC_DEV(0x0763, 0x2031), /* M-Audio Fast Track C600 */
/* Fixed EP */ /* FIXME: check the availability of generic matching */ + IMPLICIT_FB_FIXED_DEV(0x0763, 0x2080, 0x81, 2), /* M-Audio FastTrack Ultra */ + IMPLICIT_FB_FIXED_DEV(0x0763, 0x2081, 0x81, 2), /* M-Audio FastTrack Ultra */ IMPLICIT_FB_FIXED_DEV(0x2466, 0x8010, 0x81, 2), /* Fractal Audio Axe-Fx III */ IMPLICIT_FB_FIXED_DEV(0x31e9, 0x0001, 0x81, 2), /* Solid State Logic SSL2 */ IMPLICIT_FB_FIXED_DEV(0x31e9, 0x0002, 0x81, 2), /* Solid State Logic SSL2+ */
From: Yu Huang diwang90@gmail.com
commit c07f2c7b45413a9e50ba78630fda04ecfa17b4f2 upstream.
Legion Y9000X 2019 has the same speaker with Y9000X 2020, but with a different quirk address. Add one quirk entry to make the speaker work on Y9000X 2019 too.
Signed-off-by: Yu Huang diwang90@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220212160835.165065-1-diwang90@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9013,6 +9013,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x17aa, 0x3824, "Legion Y9000X 2020", ALC285_FIXUP_LEGION_Y9000X_SPEAKERS), SND_PCI_QUIRK(0x17aa, 0x3827, "Ideapad S740", ALC285_FIXUP_IDEAPAD_S740_COEF), SND_PCI_QUIRK(0x17aa, 0x3834, "Lenovo IdeaPad Slim 9i 14ITL5", ALC287_FIXUP_YOGA7_14ITL_SPEAKERS), + SND_PCI_QUIRK(0x17aa, 0x383d, "Legion Y9000X 2019", ALC285_FIXUP_LEGION_Y9000X_SPEAKERS), SND_PCI_QUIRK(0x17aa, 0x3843, "Yoga 9i", ALC287_FIXUP_IDEAPAD_BASS_SPK_AMP), SND_PCI_QUIRK(0x17aa, 0x384a, "Lenovo Yoga 7 15ITL5", ALC287_FIXUP_YOGA7_14ITL_SPEAKERS), SND_PCI_QUIRK(0x17aa, 0x3852, "Lenovo Yoga 7 14ITL5", ALC287_FIXUP_YOGA7_14ITL_SPEAKERS),
From: Takashi Iwai tiwai@suse.de
commit 2a845837e3d0ddaed493b4c5c4643d7f0542804d upstream.
The recently introduced coef_mutex for Realtek codec seems causing a deadlock when the relevant code is invoked from the power-off state; then the HD-audio core tries to power-up internally, and this kicks off the codec runtime PM code that tries to take the same coef_mutex.
In order to avoid the deadlock, do the temporary power up/down around the coef_mutex acquisition and release. This assures that the power-up sequence runs before the mutex, hence no re-entrance will happen.
Fixes: b837a9f5ab3b ("ALSA: hda: realtek: Fix race at concurrent COEF updates") Reported-and-tested-by: Julian Wollrath jwollrath@web.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220214132838.4db10fca@schienar Link: https://lore.kernel.org/r/20220214130410.21230-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 39 ++++++++++++++++++++++++--------------- 1 file changed, 24 insertions(+), 15 deletions(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -133,6 +133,22 @@ struct alc_spec { * COEF access helper functions */
+static void coef_mutex_lock(struct hda_codec *codec) +{ + struct alc_spec *spec = codec->spec; + + snd_hda_power_up_pm(codec); + mutex_lock(&spec->coef_mutex); +} + +static void coef_mutex_unlock(struct hda_codec *codec) +{ + struct alc_spec *spec = codec->spec; + + mutex_unlock(&spec->coef_mutex); + snd_hda_power_down_pm(codec); +} + static int __alc_read_coefex_idx(struct hda_codec *codec, hda_nid_t nid, unsigned int coef_idx) { @@ -146,12 +162,11 @@ static int __alc_read_coefex_idx(struct static int alc_read_coefex_idx(struct hda_codec *codec, hda_nid_t nid, unsigned int coef_idx) { - struct alc_spec *spec = codec->spec; unsigned int val;
- mutex_lock(&spec->coef_mutex); + coef_mutex_lock(codec); val = __alc_read_coefex_idx(codec, nid, coef_idx); - mutex_unlock(&spec->coef_mutex); + coef_mutex_unlock(codec); return val; }
@@ -168,11 +183,9 @@ static void __alc_write_coefex_idx(struc static void alc_write_coefex_idx(struct hda_codec *codec, hda_nid_t nid, unsigned int coef_idx, unsigned int coef_val) { - struct alc_spec *spec = codec->spec; - - mutex_lock(&spec->coef_mutex); + coef_mutex_lock(codec); __alc_write_coefex_idx(codec, nid, coef_idx, coef_val); - mutex_unlock(&spec->coef_mutex); + coef_mutex_unlock(codec); }
#define alc_write_coef_idx(codec, coef_idx, coef_val) \ @@ -193,11 +206,9 @@ static void alc_update_coefex_idx(struct unsigned int coef_idx, unsigned int mask, unsigned int bits_set) { - struct alc_spec *spec = codec->spec; - - mutex_lock(&spec->coef_mutex); + coef_mutex_lock(codec); __alc_update_coefex_idx(codec, nid, coef_idx, mask, bits_set); - mutex_unlock(&spec->coef_mutex); + coef_mutex_unlock(codec); }
#define alc_update_coef_idx(codec, coef_idx, mask, bits_set) \ @@ -230,9 +241,7 @@ struct coef_fw { static void alc_process_coef_fw(struct hda_codec *codec, const struct coef_fw *fw) { - struct alc_spec *spec = codec->spec; - - mutex_lock(&spec->coef_mutex); + coef_mutex_lock(codec); for (; fw->nid; fw++) { if (fw->mask == (unsigned short)-1) __alc_write_coefex_idx(codec, fw->nid, fw->idx, fw->val); @@ -240,7 +249,7 @@ static void alc_process_coef_fw(struct h __alc_update_coefex_idx(codec, fw->nid, fw->idx, fw->mask, fw->val); } - mutex_unlock(&spec->coef_mutex); + coef_mutex_unlock(codec); }
/*
From: Takashi Iwai tiwai@suse.de
commit 6317f7449348a897483a2b4841f7a9190745c81b upstream.
The forced probe mask via probe_mask 0x100 bit doesn't work any longer as expected since the bus init code was moved and it's clearing the codec_mask value that was set beforehand. This patch fixes the long-time regression by moving the check_probe_mask() call.
Fixes: a41d122449be ("ALSA: hda - Embed bus into controller object") Reported-by: dmummenschanz@web.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/trinity-f018660b-95c9-442b-a2a8-c92a56eb07ed-16443... Link: https://lore.kernel.org/r/20220214100020.8870-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/hda_intel.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1794,8 +1794,6 @@ static int azx_create(struct snd_card *c
assign_position_fix(chip, check_position_fix(chip, position_fix[dev]));
- check_probe_mask(chip, dev); - if (single_cmd < 0) /* allow fallback to single_cmd at errors */ chip->fallback_to_single_cmd = 1; else /* explicitly set to single_cmd or not */ @@ -1821,6 +1819,8 @@ static int azx_create(struct snd_card *c chip->bus.core.needs_damn_long_delay = 1; }
+ check_probe_mask(chip, dev); + err = snd_device_new(card, SNDRV_DEV_LOWLEVEL, chip, &ops); if (err < 0) { dev_err(card->dev, "Error creating device [card]!\n");
From: Takashi Iwai tiwai@suse.de
commit dd8e5b161d7fb9cefa1f1d6e35a39b9e1563c8d3 upstream.
By some unknown reason, BIOS on Shenker Dock 15 doesn't set up the codec mask properly for the onboard audio. Let's set the forced codec mask to enable the codec discovery.
Reported-by: dmummenschanz@web.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/trinity-f018660b-95c9-442b-a2a8-c92a56eb07ed-16443... Link: https://lore.kernel.org/r/20220214100020.8870-2-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/hda_intel.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1611,6 +1611,7 @@ static const struct snd_pci_quirk probe_ /* forced codec slots */ SND_PCI_QUIRK(0x1043, 0x1262, "ASUS W5Fm", 0x103), SND_PCI_QUIRK(0x1046, 0x1262, "ASUS W5F", 0x103), + SND_PCI_QUIRK(0x1558, 0x0351, "Schenker Dock 15", 0x105), /* WinFast VP200 H (Teradici) user reported broken communication */ SND_PCI_QUIRK(0x3a21, 0x040d, "WinFast VP200 H", 0x101), {}
From: Mark Brown broonie@kernel.org
commit 564778d7b1ea465f9487eedeece7527a033549c5 upstream.
When writing out a stereo control we discard the change notification from the first channel, meaning that events are only generated based on changes to the second channel. Ensure that we report a change if either channel has changed.
Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220201155629.120510-2-broonie@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/soc-ops.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
--- a/sound/soc/soc-ops.c +++ b/sound/soc/soc-ops.c @@ -308,7 +308,7 @@ int snd_soc_put_volsw(struct snd_kcontro unsigned int sign_bit = mc->sign_bit; unsigned int mask = (1 << fls(max)) - 1; unsigned int invert = mc->invert; - int err; + int err, ret; bool type_2r = false; unsigned int val2 = 0; unsigned int val, val_mask; @@ -350,12 +350,18 @@ int snd_soc_put_volsw(struct snd_kcontro err = snd_soc_component_update_bits(component, reg, val_mask, val); if (err < 0) return err; + ret = err;
- if (type_2r) + if (type_2r) { err = snd_soc_component_update_bits(component, reg2, val_mask, - val2); + val2); + /* Don't discard any error code or drop change flag */ + if (ret == 0 || err < 0) { + ret = err; + } + }
- return err; + return ret; } EXPORT_SYMBOL_GPL(snd_soc_put_volsw);
From: Mark Brown broonie@kernel.org
commit 650204ded3703b5817bd4b6a77fa47d333c4f902 upstream.
When writing out a stereo control we discard the change notification from the first channel, meaning that events are only generated based on changes to the second channel. Ensure that we report a change if either channel has changed.
Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220201155629.120510-4-broonie@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/soc-ops.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
--- a/sound/soc/soc-ops.c +++ b/sound/soc/soc-ops.c @@ -512,7 +512,7 @@ int snd_soc_put_volsw_range(struct snd_k unsigned int mask = (1 << fls(max)) - 1; unsigned int invert = mc->invert; unsigned int val, val_mask; - int ret; + int err, ret;
if (invert) val = (max - ucontrol->value.integer.value[0]) & mask; @@ -521,9 +521,10 @@ int snd_soc_put_volsw_range(struct snd_k val_mask = mask << shift; val = val << shift;
- ret = snd_soc_component_update_bits(component, reg, val_mask, val); - if (ret < 0) - return ret; + err = snd_soc_component_update_bits(component, reg, val_mask, val); + if (err < 0) + return err; + ret = err;
if (snd_soc_volsw_is_stereo(mc)) { if (invert) @@ -533,8 +534,12 @@ int snd_soc_put_volsw_range(struct snd_k val_mask = mask << shift; val = val << shift;
- ret = snd_soc_component_update_bits(component, rreg, val_mask, + err = snd_soc_component_update_bits(component, rreg, val_mask, val); + /* Don't discard any error code or drop change flag */ + if (ret == 0 || err < 0) { + ret = err; + } }
return ret;
From: Mark Brown broonie@kernel.org
commit 7f3d90a3519680dfa23e750f80bfdefc0f5eda4a upstream.
When writing out a stereo control we discard the change notification from the first channel, meaning that events are only generated based on changes to the second channel. Ensure that we report a change if either channel has changed.
Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220201155629.120510-3-broonie@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/soc-ops.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/sound/soc/soc-ops.c +++ b/sound/soc/soc-ops.c @@ -427,6 +427,7 @@ int snd_soc_put_volsw_sx(struct snd_kcon int min = mc->min; unsigned int mask = (1U << (fls(min + max) - 1)) - 1; int err = 0; + int ret; unsigned int val, val_mask;
val = ucontrol->value.integer.value[0]; @@ -443,6 +444,7 @@ int snd_soc_put_volsw_sx(struct snd_kcon err = snd_soc_component_update_bits(component, reg, val_mask, val); if (err < 0) return err; + ret = err;
if (snd_soc_volsw_is_stereo(mc)) { unsigned int val2; @@ -453,6 +455,11 @@ int snd_soc_put_volsw_sx(struct snd_kcon
err = snd_soc_component_update_bits(component, reg2, val_mask, val2); + + /* Don't discard any error code or drop change flag */ + if (ret == 0 || err < 0) { + ret = err; + } } return err; }
From: Mark Brown broonie@kernel.org
commit 2b7c46369f09c358164d31d17e5695185403185e upstream.
When writing out a stereo control we discard the change notification from the first channel, meaning that events are only generated based on changes to the second channel. Ensure that we report a change if either channel has changed.
Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220201155629.120510-5-broonie@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/soc-ops.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/sound/soc/soc-ops.c +++ b/sound/soc/soc-ops.c @@ -895,6 +895,7 @@ int snd_soc_put_xr_sx(struct snd_kcontro unsigned long mask = (1UL<<mc->nbits)-1; long max = mc->max; long val = ucontrol->value.integer.value[0]; + int ret = 0; unsigned int i;
if (val < mc->min || val > mc->max) @@ -909,9 +910,11 @@ int snd_soc_put_xr_sx(struct snd_kcontro regmask, regval); if (err < 0) return err; + if (err > 0) + ret = err; }
- return 0; + return ret; } EXPORT_SYMBOL_GPL(snd_soc_put_xr_sx);
From: Amir Goldstein amir73il@gmail.com
commit dd5a927e411836eaef44eb9b00fece615e82e242 upstream.
'setcifsacl -g <SID>' silently fails to set the group SID on server.
Actually, the bug existed since commit 438471b67963 ("CIFS: Add support for setting owner info, dos attributes, and create time"), but this fix will not apply cleanly to kernel versions <= v5.10.
Fixes: 3970acf7ddb9 ("SMB3: Add support for getting and setting SACLs") Cc: stable@vger.kernel.org # 5.11+ Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/xattr.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/cifs/xattr.c +++ b/fs/cifs/xattr.c @@ -175,11 +175,13 @@ static int cifs_xattr_set(const struct x switch (handler->flags) { case XATTR_CIFS_NTSD_FULL: aclflags = (CIFS_ACL_OWNER | + CIFS_ACL_GROUP | CIFS_ACL_DACL | CIFS_ACL_SACL); break; case XATTR_CIFS_NTSD: aclflags = (CIFS_ACL_OWNER | + CIFS_ACL_GROUP | CIFS_ACL_DACL); break; case XATTR_CIFS_ACL:
From: Christophe Leroy christophe.leroy@csgroup.eu
commit 9bb162fa26ed76031ed0e7dbc77ccea0bf977758 upstream.
Allthough kernel text is always mapped with BATs, we still have inittext mapped with pages, so TLB miss handling is required when CONFIG_DEBUG_PAGEALLOC or CONFIG_KFENCE is set.
The final solution should be to set a BAT that also maps inittext but that BAT then needs to be cleared at end of init, and it will require more changes to be able to do it properly.
As DEBUG_PAGEALLOC or KFENCE are debugging, performance is not a big deal so let's fix it simply for now to enable easy stable application.
Fixes: 035b19a15a98 ("powerpc/32s: Always map kernel text and rodata with BATs") Cc: stable@vger.kernel.org # v5.11+ Reported-by: Maxime Bizon mbizon@freebox.fr Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/aea33b4813a26bdb9378b5f273f00bd5d4abe240.163885736... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/kernel/head_book3s_32.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/powerpc/kernel/head_book3s_32.S +++ b/arch/powerpc/kernel/head_book3s_32.S @@ -421,14 +421,14 @@ InstructionTLBMiss: */ /* Get PTE (linux-style) and check access */ mfspr r3,SPRN_IMISS -#ifdef CONFIG_MODULES +#if defined(CONFIG_MODULES) || defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_KFENCE) lis r1, TASK_SIZE@h /* check if kernel address */ cmplw 0,r1,r3 #endif mfspr r2, SPRN_SDR1 li r1,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC | _PAGE_USER rlwinm r2, r2, 28, 0xfffff000 -#ifdef CONFIG_MODULES +#if defined(CONFIG_MODULES) || defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_KFENCE) bgt- 112f lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */ li r1,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC
From: Anders Roxell anders.roxell@linaro.org
commit fe663df7825811358531dc2e8a52d9eaa5e3515e upstream.
Building tinyconfig with gcc (Debian 11.2.0-16) and assembler (Debian 2.37.90.20220207) the following build error shows up:
{standard input}: Assembler messages: {standard input}:2088: Error: unrecognized opcode: `ptesync' make[3]: *** [/builds/linux/scripts/Makefile.build:287: arch/powerpc/lib/sstep.o] Error 1
Add the 'ifdef CONFIG_PPC64' around the 'ptesync' in function 'emulate_update_regs()' to like it is in 'analyse_instr()'. Since it looks like it got dropped inadvertently by commit 3cdfcbfd32b9 ("powerpc: Change analyse_instr so it doesn't modify *regs").
A key detail is that analyse_instr() will never recognise lwsync or ptesync on 32-bit (because of the existing ifdef), and as a result emulate_update_regs() should never be called with an op specifying either of those on 32-bit. So removing them from emulate_update_regs() should be a nop in terms of runtime behaviour.
Fixes: 3cdfcbfd32b9 ("powerpc: Change analyse_instr so it doesn't modify *regs") Cc: stable@vger.kernel.org # v4.14+ Suggested-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Anders Roxell anders.roxell@linaro.org [mpe: Add last paragraph of change log mentioning analyse_instr() details] Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20220211005113.1361436-1-anders.roxell@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/lib/sstep.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/powerpc/lib/sstep.c +++ b/arch/powerpc/lib/sstep.c @@ -3181,12 +3181,14 @@ void emulate_update_regs(struct pt_regs case BARRIER_EIEIO: eieio(); break; +#ifdef CONFIG_PPC64 case BARRIER_LWSYNC: asm volatile("lwsync" : : : "memory"); break; case BARRIER_PTESYNC: asm volatile("ptesync" : : : "memory"); break; +#endif } break;
From: Christian Eggers ceggers@arri.de
commit 9161f365c91614e5a3f5c6dcc44c3b1b33bc59c0 upstream.
If gpmi_nfc_apply_timings() fails, the PM runtime usage counter must be dropped.
Reported-by: Pavel Machek pavel@denx.de Fixes: f53d4c109a66 ("mtd: rawnand: gpmi: Add ERR007117 protection for nfc_apply_timings") Signed-off-by: Christian Eggers ceggers@arri.de Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20220125081619.6286-1-ceggers@arri.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c @@ -2293,7 +2293,7 @@ static int gpmi_nfc_exec_op(struct nand_ this->hw.must_apply_timings = false; ret = gpmi_nfc_apply_timings(this); if (ret) - return ret; + goto out_pm; }
dev_dbg(this->dev, "%s: %d instructions\n", __func__, op->ninstrs); @@ -2422,6 +2422,7 @@ unmap:
this->bch = false;
+out_pm: pm_runtime_mark_last_busy(this->dev); pm_runtime_put_autosuspend(this->dev);
From: Steve French stfrench@microsoft.com
commit 9405b5f8b20c2bfa6523a555279a0379640dc136 upstream.
The conversion to the new API broke the snapshot mount option due to 32 vs. 64 bit type mismatch
Fixes: 24e0a1eff9e2 ("cifs: switch to new mount api") Cc: stable@vger.kernel.org # 5.11+ Reported-by: ruckajan10@gmail.com Acked-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/fs_context.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/cifs/fs_context.c +++ b/fs/cifs/fs_context.c @@ -146,7 +146,7 @@ const struct fs_parameter_spec smb3_fs_p fsparam_u32("echo_interval", Opt_echo_interval), fsparam_u32("max_credits", Opt_max_credits), fsparam_u32("handletimeout", Opt_handletimeout), - fsparam_u32("snapshot", Opt_snapshot), + fsparam_u64("snapshot", Opt_snapshot), fsparam_u32("max_channels", Opt_max_channels),
/* Mount options which take string value */ @@ -1062,7 +1062,7 @@ static int smb3_fs_context_parse_param(s ctx->echo_interval = result.uint_32; break; case Opt_snapshot: - ctx->snapshot_time = result.uint_32; + ctx->snapshot_time = result.uint_64; break; case Opt_max_credits: if (result.uint_32 < 20 || result.uint_32 > 60000) {
From: Jon Maloy jmaloy@redhat.com
commit c08e58438d4a709fb451b6d7d33432cc9907a2a8 upstream.
The previous bug fix had an unfortunate side effect that broke distribution of binding table entries between nodes. The updated tipc_sock_addr struct is also used further down in the same function, and there the old value is still the correct one.
Fixes: 032062f363b4 ("tipc: fix wrong publisher node address in link publications") Signed-off-by: Jon Maloy jmaloy@redhat.com Link: https://lore.kernel.org/r/20220216020009.3404578-1-jmaloy@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/tipc/node.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
--- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -403,7 +403,7 @@ static void tipc_node_write_unlock(struc u32 flags = n->action_flags; struct list_head *publ_list; struct tipc_uaddr ua; - u32 bearer_id; + u32 bearer_id, node;
if (likely(!flags)) { write_unlock_bh(&n->lock); @@ -414,6 +414,7 @@ static void tipc_node_write_unlock(struc TIPC_LINK_STATE, n->addr, n->addr); sk.ref = n->link_id; sk.node = tipc_own_addr(net); + node = n->addr; bearer_id = n->link_id & 0xffff; publ_list = &n->publ_list;
@@ -423,17 +424,17 @@ static void tipc_node_write_unlock(struc write_unlock_bh(&n->lock);
if (flags & TIPC_NOTIFY_NODE_DOWN) - tipc_publ_notify(net, publ_list, sk.node, n->capabilities); + tipc_publ_notify(net, publ_list, node, n->capabilities);
if (flags & TIPC_NOTIFY_NODE_UP) - tipc_named_node_up(net, sk.node, n->capabilities); + tipc_named_node_up(net, node, n->capabilities);
if (flags & TIPC_NOTIFY_LINK_UP) { - tipc_mon_peer_up(net, sk.node, bearer_id); + tipc_mon_peer_up(net, node, bearer_id); tipc_nametbl_publish(net, &ua, &sk, sk.ref); } if (flags & TIPC_NOTIFY_LINK_DOWN) { - tipc_mon_peer_down(net, sk.node, bearer_id); + tipc_mon_peer_down(net, node, bearer_id); tipc_nametbl_withdraw(net, &ua, &sk, sk.ref); } }
From: Bart Van Assche bvanassche@acm.org
commit d77ea8226b3be23b0b45aa42851243b62a27bda1 upstream.
Commit 7252a3603015 ("scsi: ufs: Avoid busy-waiting by eliminating tag conflicts") guarantees that 'tag' is not in use by any SCSI command. Remove the check that returns early if a conflict occurs.
Link: https://lore.kernel.org/r/20211203231950.193369-6-bvanassche@acm.org Tested-by: Bean Huo beanhuo@micron.com Reviewed-by: Bean Huo beanhuo@micron.com Acked-by: Avri Altman avri.altman@wdc.com Signed-off-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/ufs/ufshcd.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-)
--- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -6658,11 +6658,6 @@ static int ufshcd_issue_devman_upiu_cmd( tag = req->tag; WARN_ONCE(tag < 0, "Invalid tag %d\n", tag);
- if (unlikely(test_bit(tag, &hba->outstanding_reqs))) { - err = -EBUSY; - goto out; - } - lrbp = &hba->lrb[tag]; WARN_ON(lrbp->cmd); lrbp->cmd = NULL; @@ -6730,8 +6725,8 @@ static int ufshcd_issue_devman_upiu_cmd( ufshcd_add_query_upiu_trace(hba, err ? UFS_QUERY_ERR : UFS_QUERY_COMP, (struct utp_upiu_req *)lrbp->ucd_rsp_ptr);
-out: blk_put_request(req); + out_unlock: up_read(&hba->clk_scaling_lock); return err;
From: Bart Van Assche bvanassche@acm.org
commit 945c3cca05d78351bba29fa65d93834cb7934c7b upstream.
The following deadlock has been observed on a test setup:
- All tags allocated
- The SCSI error handler calls ufshcd_eh_host_reset_handler()
- ufshcd_eh_host_reset_handler() queues work that calls ufshcd_err_handler()
- ufshcd_err_handler() locks up as follows:
Workqueue: ufs_eh_wq_0 ufshcd_err_handler.cfi_jt Call trace: __switch_to+0x298/0x5d8 __schedule+0x6cc/0xa94 schedule+0x12c/0x298 blk_mq_get_tag+0x210/0x480 __blk_mq_alloc_request+0x1c8/0x284 blk_get_request+0x74/0x134 ufshcd_exec_dev_cmd+0x68/0x640 ufshcd_verify_dev_init+0x68/0x35c ufshcd_probe_hba+0x12c/0x1cb8 ufshcd_host_reset_and_restore+0x88/0x254 ufshcd_reset_and_restore+0xd0/0x354 ufshcd_err_handler+0x408/0xc58 process_one_work+0x24c/0x66c worker_thread+0x3e8/0xa4c kthread+0x150/0x1b4 ret_from_fork+0x10/0x30
Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved request.
Link: https://lore.kernel.org/r/20211203231950.193369-10-bvanassche@acm.org Tested-by: Bean Huo beanhuo@micron.com Reviewed-by: Adrian Hunter adrian.hunter@intel.com Reviewed-by: Bean Huo beanhuo@micron.com Signed-off-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/ufs/ufshcd.c | 53 ++++++++++++---------------------------------- drivers/scsi/ufs/ufshcd.h | 2 + 2 files changed, 16 insertions(+), 39 deletions(-)
--- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -125,8 +125,9 @@ EXPORT_SYMBOL_GPL(ufshcd_dump_regs); enum { UFSHCD_MAX_CHANNEL = 0, UFSHCD_MAX_ID = 1, - UFSHCD_CMD_PER_LUN = 32, - UFSHCD_CAN_QUEUE = 32, + UFSHCD_NUM_RESERVED = 1, + UFSHCD_CMD_PER_LUN = 32 - UFSHCD_NUM_RESERVED, + UFSHCD_CAN_QUEUE = 32 - UFSHCD_NUM_RESERVED, };
/* UFSHCD error handling flags */ @@ -2185,6 +2186,7 @@ static inline int ufshcd_hba_capabilitie hba->nutrs = (hba->capabilities & MASK_TRANSFER_REQUESTS_SLOTS) + 1; hba->nutmrs = ((hba->capabilities & MASK_TASK_MANAGEMENT_REQUEST_SLOTS) >> 16) + 1; + hba->reserved_slot = hba->nutrs - 1;
/* Read crypto capabilities */ err = ufshcd_hba_init_crypto_capabilities(hba); @@ -2910,30 +2912,15 @@ static int ufshcd_wait_for_dev_cmd(struc static int ufshcd_exec_dev_cmd(struct ufs_hba *hba, enum dev_cmd_type cmd_type, int timeout) { - struct request_queue *q = hba->cmd_queue; DECLARE_COMPLETION_ONSTACK(wait); - struct request *req; + const u32 tag = hba->reserved_slot; struct ufshcd_lrb *lrbp; int err; - int tag;
- down_read(&hba->clk_scaling_lock); + /* Protects use of hba->reserved_slot. */ + lockdep_assert_held(&hba->dev_cmd.lock);
- /* - * Get free slot, sleep if slots are unavailable. - * Even though we use wait_event() which sleeps indefinitely, - * the maximum wait time is bounded by SCSI request timeout. - */ - req = blk_get_request(q, REQ_OP_DRV_OUT, 0); - if (IS_ERR(req)) { - err = PTR_ERR(req); - goto out_unlock; - } - tag = req->tag; - WARN_ONCE(tag < 0, "Invalid tag %d\n", tag); - /* Set the timeout such that the SCSI error handler is not activated. */ - req->timeout = msecs_to_jiffies(2 * timeout); - blk_mq_start_request(req); + down_read(&hba->clk_scaling_lock);
lrbp = &hba->lrb[tag]; WARN_ON(lrbp->cmd); @@ -2951,8 +2938,6 @@ static int ufshcd_exec_dev_cmd(struct uf (struct utp_upiu_req *)lrbp->ucd_rsp_ptr);
out: - blk_put_request(req); -out_unlock: up_read(&hba->clk_scaling_lock); return err; } @@ -6640,23 +6625,16 @@ static int ufshcd_issue_devman_upiu_cmd( enum dev_cmd_type cmd_type, enum query_opcode desc_op) { - struct request_queue *q = hba->cmd_queue; DECLARE_COMPLETION_ONSTACK(wait); - struct request *req; + const u32 tag = hba->reserved_slot; struct ufshcd_lrb *lrbp; int err = 0; - int tag; u8 upiu_flags;
- down_read(&hba->clk_scaling_lock); + /* Protects use of hba->reserved_slot. */ + lockdep_assert_held(&hba->dev_cmd.lock);
- req = blk_get_request(q, REQ_OP_DRV_OUT, 0); - if (IS_ERR(req)) { - err = PTR_ERR(req); - goto out_unlock; - } - tag = req->tag; - WARN_ONCE(tag < 0, "Invalid tag %d\n", tag); + down_read(&hba->clk_scaling_lock);
lrbp = &hba->lrb[tag]; WARN_ON(lrbp->cmd); @@ -6725,9 +6703,6 @@ static int ufshcd_issue_devman_upiu_cmd( ufshcd_add_query_upiu_trace(hba, err ? UFS_QUERY_ERR : UFS_QUERY_COMP, (struct utp_upiu_req *)lrbp->ucd_rsp_ptr);
- blk_put_request(req); - -out_unlock: up_read(&hba->clk_scaling_lock); return err; } @@ -9418,8 +9393,8 @@ int ufshcd_init(struct ufs_hba *hba, voi /* Configure LRB */ ufshcd_host_memory_configure(hba);
- host->can_queue = hba->nutrs; - host->cmd_per_lun = hba->nutrs; + host->can_queue = hba->nutrs - UFSHCD_NUM_RESERVED; + host->cmd_per_lun = hba->nutrs - UFSHCD_NUM_RESERVED; host->max_id = UFSHCD_MAX_ID; host->max_lun = UFS_MAX_LUNS; host->max_channel = UFSHCD_MAX_CHANNEL; --- a/drivers/scsi/ufs/ufshcd.h +++ b/drivers/scsi/ufs/ufshcd.h @@ -725,6 +725,7 @@ struct ufs_hba_monitor { * @capabilities: UFS Controller Capabilities * @nutrs: Transfer Request Queue depth supported by controller * @nutmrs: Task Management Queue depth supported by controller + * @reserved_slot: Used to submit device commands. Protected by @dev_cmd.lock. * @ufs_version: UFS Version to which controller complies * @vops: pointer to variant specific operations * @priv: pointer to variant specific private data @@ -813,6 +814,7 @@ struct ufs_hba { u32 capabilities; int nutrs; int nutmrs; + u32 reserved_slot; u32 ufs_version; const struct ufs_hba_variant_ops *vops; struct ufs_hba_variant_params *vps;
From: Martin Povišer povik+lin@cutebit.org
commit 307f31452078792aab94a729fce33200c6e42dc4 upstream.
Per TAS2770 datasheet there must be a 1 ms delay from reset to first command. So insert delays into the driver where appropriate.
Fixes: 1a476abc723e ("tas2770: add tas2770 smart PA kernel driver") Signed-off-by: Martin Povišer povik+lin@cutebit.org Link: https://lore.kernel.org/r/20220204095301.5554-1-povik+lin@cutebit.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/tas2770.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/sound/soc/codecs/tas2770.c +++ b/sound/soc/codecs/tas2770.c @@ -38,10 +38,12 @@ static void tas2770_reset(struct tas2770 gpiod_set_value_cansleep(tas2770->reset_gpio, 0); msleep(20); gpiod_set_value_cansleep(tas2770->reset_gpio, 1); + usleep_range(1000, 2000); }
snd_soc_component_write(tas2770->component, TAS2770_SW_RST, TAS2770_RST); + usleep_range(1000, 2000); }
static int tas2770_set_bias_level(struct snd_soc_component *component, @@ -110,6 +112,7 @@ static int tas2770_codec_resume(struct s
if (tas2770->sdz_gpio) { gpiod_set_value_cansleep(tas2770->sdz_gpio, 1); + usleep_range(1000, 2000); } else { ret = snd_soc_component_update_bits(component, TAS2770_PWR_CTRL, TAS2770_PWR_CTRL_MASK, @@ -510,8 +513,10 @@ static int tas2770_codec_probe(struct sn
tas2770->component = component;
- if (tas2770->sdz_gpio) + if (tas2770->sdz_gpio) { gpiod_set_value_cansleep(tas2770->sdz_gpio, 1); + usleep_range(1000, 2000); + }
tas2770_reset(tas2770);
From: Stephen Boyd swboyd@chromium.org
commit c8d251f51ee61df06ee0e419348d8c9160bbfb86 upstream.
In commit da0363f7bfd3 ("ASoC: qcom: Fix for DMA interrupt clear reg overwriting") we changed regmap_write() to regmap_update_bits() so that we can avoid overwriting bits that we didn't intend to modify. Unfortunately this change breaks the case where a register is writable but not readable, which is exactly how the HDMI irq clear register is designed (grep around LPASS_HDMITX_APP_IRQCLEAR_REG to see how it's write only). That's because regmap_update_bits() tries to read the register from the hardware and if it isn't readable it looks in the regmap cache to see what was written there last time to compare against what we want to write there. Eventually, we're unable to modify this register at all because the bits that we're trying to set are already set in the cache.
This is doubly bad for the irq clear register because you have to write the bit to clear an interrupt. Given the irq is level triggered, we see an interrupt storm upon plugging in an HDMI cable and starting audio playback. The irq storm is so great that performance degrades significantly, leading to CPU soft lockups.
Fix it by using regmap_write_bits() so that we really do write the bits in the clear register that we want to. This brings the number of irqs handled by lpass_dma_interrupt_handler() down from ~150k/sec to ~10/sec.
Fixes: da0363f7bfd3 ("ASoC: qcom: Fix for DMA interrupt clear reg overwriting") Cc: Srinivasa Rao Mandadapu srivasam@codeaurora.org Cc: Srinivas Kandagatla srinivas.kandagatla@linaro.org Signed-off-by: Stephen Boyd swboyd@chromium.org Link: https://lore.kernel.org/r/20220209232520.4017634-1-swboyd@chromium.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/qcom/lpass-platform.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/sound/soc/qcom/lpass-platform.c +++ b/sound/soc/qcom/lpass-platform.c @@ -524,7 +524,7 @@ static int lpass_platform_pcmops_trigger return -EINVAL; }
- ret = regmap_update_bits(map, reg_irqclr, val_irqclr, val_irqclr); + ret = regmap_write_bits(map, reg_irqclr, val_irqclr, val_irqclr); if (ret) { dev_err(soc_runtime->dev, "error writing to irqclear reg: %d\n", ret); return ret; @@ -665,7 +665,7 @@ static irqreturn_t lpass_dma_interrupt_h return -EINVAL; } if (interrupts & LPAIF_IRQ_PER(chan)) { - rv = regmap_update_bits(map, reg, mask, (LPAIF_IRQ_PER(chan) | val)); + rv = regmap_write_bits(map, reg, mask, (LPAIF_IRQ_PER(chan) | val)); if (rv) { dev_err(soc_runtime->dev, "error writing to irqclear reg: %d\n", rv); @@ -676,7 +676,7 @@ static irqreturn_t lpass_dma_interrupt_h }
if (interrupts & LPAIF_IRQ_XRUN(chan)) { - rv = regmap_update_bits(map, reg, mask, (LPAIF_IRQ_XRUN(chan) | val)); + rv = regmap_write_bits(map, reg, mask, (LPAIF_IRQ_XRUN(chan) | val)); if (rv) { dev_err(soc_runtime->dev, "error writing to irqclear reg: %d\n", rv); @@ -688,7 +688,7 @@ static irqreturn_t lpass_dma_interrupt_h }
if (interrupts & LPAIF_IRQ_ERR(chan)) { - rv = regmap_update_bits(map, reg, mask, (LPAIF_IRQ_ERR(chan) | val)); + rv = regmap_write_bits(map, reg, mask, (LPAIF_IRQ_ERR(chan) | val)); if (rv) { dev_err(soc_runtime->dev, "error writing to irqclear reg: %d\n", rv);
From: Laibin Qiu qiulaibin@huawei.com
commit e92bc4cd34de2ce454bdea8cd198b8067ee4e123 upstream.
Now that we disable wbt by set WBT_STATE_OFF_DEFAULT in wbt_disable_default() when switch elevator to bfq. And when we remove scsi device, wbt will be enabled by wbt_enable_default. If it become false positive between wbt_wait() and wbt_track() when submit write request.
The following is the scenario that triggered the problem.
T1 T2 T3 elevator_switch_mq bfq_init_queue wbt_disable_default <= Set rwb->enable_state (OFF) Submit_bio blk_mq_make_request rq_qos_throttle <= rwb->enable_state (OFF) scsi_remove_device sd_remove del_gendisk blk_unregister_queue elv_unregister_queue wbt_enable_default <= Set rwb->enable_state (ON) q_qos_track <= rwb->enable_state (ON) ^^^^^^ this request will mark WBT_TRACKED without inflight add and will lead to drop rqw->inflight to -1 in wbt_done() which will trigger IO hung.
Fix this by move wbt_enable_default() from elv_unregister to bfq_exit_queue(). Only re-enable wbt when bfq exit.
Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
Remove oneline stale comment, and kill one oneshot local variable.
Signed-off-by: Ming Lei ming.lei@rehdat.com Reviewed-by: Christoph Hellwig hch@lst.de Link: https://lore.kernel.org/linux-block/20211214133103.551813-1-qiulaibin@huawei... Signed-off-by: Laibin Qiu qiulaibin@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/bfq-iosched.c | 2 ++ block/elevator.c | 2 -- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -6878,6 +6878,8 @@ static void bfq_exit_queue(struct elevat spin_unlock_irq(&bfqd->lock); #endif
+ wbt_enable_default(bfqd->queue); + kfree(bfqd); }
--- a/block/elevator.c +++ b/block/elevator.c @@ -523,8 +523,6 @@ void elv_unregister_queue(struct request kobject_del(&e->kobj);
e->registered = 0; - /* Re-enable throttling in case elevator disabled it */ - wbt_enable_default(q); } }
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 9d047bf68fe8cdb4086deaf4edd119731a9481ed upstream.
In nfs4_update_changeattr_locked(), we don't need to set the NFS_INO_REVAL_PAGECACHE flag, because we already know the value of the change attribute, and we're already flagging the size. In fact, this forces us to revalidate the change attribute a second time for no good reason. This extra flag appears to have been introduced as part of the xattr feature, when update_changeattr_locked() was converted for use by the xattr code.
Fixes: 1b523ca972ed ("nfs: modify update_changeattr to deal with regular files") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfs/nfs4proc.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -1232,8 +1232,7 @@ nfs4_update_changeattr_locked(struct ino NFS_INO_INVALID_ACCESS | NFS_INO_INVALID_ACL | NFS_INO_INVALID_SIZE | NFS_INO_INVALID_OTHER | NFS_INO_INVALID_BLOCKS | NFS_INO_INVALID_NLINK | - NFS_INO_INVALID_MODE | NFS_INO_INVALID_XATTR | - NFS_INO_REVAL_PAGECACHE; + NFS_INO_INVALID_MODE | NFS_INO_INVALID_XATTR; nfsi->attrtimeo = NFS_MINATTRTIMEO(inode); } nfsi->attrtimeo_timestamp = jiffies;
From: Trond Myklebust trond.myklebust@hammerspace.com
commit e0caaf75d443e02e55e146fd75fe2efc8aed5540 upstream.
Commit ac795161c936 (NFSv4: Handle case where the lookup of a directory fails) [1], part of Linux since 5.17-rc2, introduced a regression, where a symbolic link on an NFS mount to a directory on another NFS does not resolve(?) the first time it is accessed:
Reported-by: Paul Menzel pmenzel@molgen.mpg.de Fixes: ac795161c936 ("NFSv4: Handle case where the lookup of a directory fails") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Tested-by: Donald Buczek buczek@molgen.mpg.de Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfs/dir.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1987,14 +1987,14 @@ no_open: if (!res) { inode = d_inode(dentry); if ((lookup_flags & LOOKUP_DIRECTORY) && inode && - !S_ISDIR(inode->i_mode)) + !(S_ISDIR(inode->i_mode) || S_ISLNK(inode->i_mode))) res = ERR_PTR(-ENOTDIR); else if (inode && S_ISREG(inode->i_mode)) res = ERR_PTR(-EOPENSTALE); } else if (!IS_ERR(res)) { inode = d_inode(res); if ((lookup_flags & LOOKUP_DIRECTORY) && inode && - !S_ISDIR(inode->i_mode)) { + !(S_ISDIR(inode->i_mode) || S_ISLNK(inode->i_mode))) { dput(res); res = ERR_PTR(-ENOTDIR); } else if (inode && S_ISREG(inode->i_mode)) {
From: Trond Myklebust trond.myklebust@hammerspace.com
commit d19e0183a88306acda07f4a01fedeeffe2a2a06b upstream.
The result of the writeback, whether it is an ENOSPC or an EIO, or anything else, does not inhibit the NFS client from reporting the correct file timestamps.
Fixes: 79566ef018f5 ("NFS: Getattr doesn't require data sync semantics") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfs/inode.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
--- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -840,12 +840,9 @@ int nfs_getattr(struct user_namespace *m }
/* Flush out writes to the server in order to update c/mtime. */ - if ((request_mask & (STATX_CTIME|STATX_MTIME)) && - S_ISREG(inode->i_mode)) { - err = filemap_write_and_wait(inode->i_mapping); - if (err) - goto out; - } + if ((request_mask & (STATX_CTIME | STATX_MTIME)) && + S_ISREG(inode->i_mode)) + filemap_write_and_wait(inode->i_mapping);
/* * We may force a getattr if the user cares about atime.
From: Linus Torvalds torvalds@linux-foundation.org
commit 3593030761630e09200072a4bd06468892c27be3 upstream.
Daniel Gibson reports that the n_tty code gets line termination wrong in very specific cases:
"If you feed a line with exactly 64 chars + terminating newline, and directly afterwards (without reading) another line into a pseudo terminal, the the first read() on the other side will return the 64 char line *without* terminating newline, and the next read() will return the missing terminating newline AND the complete next line (if it fits in the buffer)"
and bisected the behavior to commit 3b830a9c34d5 ("tty: convert tty_ldisc_ops 'read()' function to take a kernel pointer").
Now, digging deeper, it turns out that the behavior isn't exactly new: what changed in commit 3b830a9c34d5 was that the tty line discipline .read() function is now passed an intermediate kernel buffer rather than the final user space buffer.
And that intermediate kernel buffer is 64 bytes in size - thus that special case with exactly 64 bytes plus terminating newline.
The same problem did exist before, but historically the boundary was not the 64-byte chunk, but the user-supplied buffer size, which is obviously generally bigger (and potentially bigger than N_TTY_BUF_SIZE, which would hide the issue entirely).
The reason is that the n_tty canon_copy_from_read_buf() code would look ahead for the EOL character one byte further than it would actually copy. It would then decide that it had found the terminator, and unmark it as an EOL character - which in turn explains why the next read wouldn't then be terminated by it.
Now, the reason it did all this in the first place is related to some historical and pretty obscure EOF behavior, see commit ac8f3bf8832a ("n_tty: Fix poll() after buffer-limited eof push read") and commit 40d5e0905a03 ("n_tty: Fix EOF push handling").
And the reason for the EOL confusion is that we treat EOF as a special EOL condition, with the EOL character being NUL (aka "__DISABLED_CHAR" in the kernel sources).
So that EOF look-ahead also affects the normal EOL handling.
This patch just removes the look-ahead that causes problems, because EOL is much more critical than the historical "EOF in the middle of a line that coincides with the end of the buffer" handling ever was.
Now, it is possible that we should indeed re-introduce the "look at next character to see if it's a EOF" behavior, but if so, that should be done not at the kernel buffer chunk boundary in canon_copy_from_read_buf(), but at a higher level, when we run out of the user buffer.
In particular, the place to do that would be at the top of 'n_tty_read()', where we check if it's a continuation of a previously started read, and there is no more buffer space left, we could decide to just eat the __DISABLED_CHAR at that point.
But that would be a separate patch, because I suspect nobody actually cares, and I'd like to get a report about it before bothering.
Fixes: 3b830a9c34d5 ("tty: convert tty_ldisc_ops 'read()' function to take a kernel pointer") Fixes: ac8f3bf8832a ("n_tty: Fix poll() after buffer-limited eof push read") Fixes: 40d5e0905a03 ("n_tty: Fix EOF push handling") Link: https://bugzilla.kernel.org/show_bug.cgi?id=215611 Reported-and-tested-by: Daniel Gibson metalcaedes@gmail.com Cc: Peter Hurley peter@hurleysoftware.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Jiri Slaby jirislaby@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/n_tty.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/tty/n_tty.c +++ b/drivers/tty/n_tty.c @@ -1963,7 +1963,7 @@ static bool canon_copy_from_read_buf(str return false;
canon_head = smp_load_acquire(&ldata->canon_head); - n = min(*nr + 1, canon_head - ldata->read_tail); + n = min(*nr, canon_head - ldata->read_tail);
tail = ldata->read_tail & (N_TTY_BUF_SIZE - 1); size = min_t(size_t, tail + n, N_TTY_BUF_SIZE); @@ -1985,10 +1985,8 @@ static bool canon_copy_from_read_buf(str n += N_TTY_BUF_SIZE; c = n + found;
- if (!found || read_buf(ldata, eol) != __DISABLED_CHAR) { - c = min(*nr, c); + if (!found || read_buf(ldata, eol) != __DISABLED_CHAR) n = c; - }
n_tty_trace("%s: eol:%zu found:%d n:%zu c:%zu tail:%zu more:%zu\n", __func__, eol, found, n, c, tail, more);
From: Christoph Hellwig hch@lst.de
commit 7a5428dcb7902700b830e912feee4e845df7c019 upstream.
Various block drivers call blk_set_queue_dying to mark a disk as dead due to surprise removal events, but since commit 8e141f9eb803 that doesn't work given that the GD_DEAD flag needs to be set to stop I/O.
Replace the driver calls to blk_set_queue_dying with a new (and properly documented) blk_mark_disk_dead API, and fold blk_set_queue_dying into the only remaining caller.
Fixes: 8e141f9eb803 ("block: drain file system I/O on del_gendisk") Reported-by: Markus Blöchl markus.bloechl@ipetronik.com Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Sagi Grimberg sagi@grimberg.me Link: https://lore.kernel.org/r/20220217075231.1140-1-hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/blk-core.c | 10 ++-------- block/genhd.c | 14 ++++++++++++++ drivers/block/mtip32xx/mtip32xx.c | 2 +- drivers/block/rbd.c | 2 +- drivers/block/xen-blkfront.c | 2 +- drivers/md/dm.c | 2 +- drivers/nvme/host/core.c | 2 +- drivers/nvme/host/multipath.c | 2 +- include/linux/blkdev.h | 3 ++- 9 files changed, 24 insertions(+), 15 deletions(-)
--- a/block/blk-core.c +++ b/block/blk-core.c @@ -350,13 +350,6 @@ void blk_queue_start_drain(struct reques wake_up_all(&q->mq_freeze_wq); }
-void blk_set_queue_dying(struct request_queue *q) -{ - blk_queue_flag_set(QUEUE_FLAG_DYING, q); - blk_queue_start_drain(q); -} -EXPORT_SYMBOL_GPL(blk_set_queue_dying); - /** * blk_cleanup_queue - shutdown a request queue * @q: request queue to shutdown @@ -374,7 +367,8 @@ void blk_cleanup_queue(struct request_qu WARN_ON_ONCE(blk_queue_registered(q));
/* mark @q DYING, no new request or merges will be allowed afterwards */ - blk_set_queue_dying(q); + blk_queue_flag_set(QUEUE_FLAG_DYING, q); + blk_queue_start_drain(q);
blk_queue_flag_set(QUEUE_FLAG_NOMERGES, q); blk_queue_flag_set(QUEUE_FLAG_NOXMERGES, q); --- a/block/genhd.c +++ b/block/genhd.c @@ -545,6 +545,20 @@ out_free_ext_minor: EXPORT_SYMBOL(device_add_disk);
/** + * blk_mark_disk_dead - mark a disk as dead + * @disk: disk to mark as dead + * + * Mark as disk as dead (e.g. surprise removed) and don't accept any new I/O + * to this disk. + */ +void blk_mark_disk_dead(struct gendisk *disk) +{ + set_bit(GD_DEAD, &disk->state); + blk_queue_start_drain(disk->queue); +} +EXPORT_SYMBOL_GPL(blk_mark_disk_dead); + +/** * del_gendisk - remove the gendisk * @disk: the struct gendisk to remove * --- a/drivers/block/mtip32xx/mtip32xx.c +++ b/drivers/block/mtip32xx/mtip32xx.c @@ -4112,7 +4112,7 @@ static void mtip_pci_remove(struct pci_d "Completion workers still active!\n"); }
- blk_set_queue_dying(dd->queue); + blk_mark_disk_dead(dd->disk); set_bit(MTIP_DDF_REMOVE_PENDING_BIT, &dd->dd_flag);
/* Clean up the block layer. */ --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -7182,7 +7182,7 @@ static ssize_t do_rbd_remove(struct bus_ * IO to complete/fail. */ blk_mq_freeze_queue(rbd_dev->disk->queue); - blk_set_queue_dying(rbd_dev->disk->queue); + blk_mark_disk_dead(rbd_dev->disk); }
del_gendisk(rbd_dev->disk); --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -2128,7 +2128,7 @@ static void blkfront_closing(struct blkf
/* No more blkif_request(). */ blk_mq_stop_hw_queues(info->rq); - blk_set_queue_dying(info->rq); + blk_mark_disk_dead(info->gd); set_capacity(info->gd, 0);
for_each_rinfo(info, rinfo, i) { --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -2156,7 +2156,7 @@ static void __dm_destroy(struct mapped_d set_bit(DMF_FREEING, &md->flags); spin_unlock(&_minor_lock);
- blk_set_queue_dying(md->queue); + blk_mark_disk_dead(md->disk);
/* * Take suspend_lock so that presuspend and postsuspend methods --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -131,7 +131,7 @@ static void nvme_set_queue_dying(struct if (test_and_set_bit(NVME_NS_DEAD, &ns->flags)) return;
- blk_set_queue_dying(ns->queue); + blk_mark_disk_dead(ns->disk); blk_mq_unquiesce_queue(ns->queue);
set_capacity_and_notify(ns->disk, 0); --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -792,7 +792,7 @@ void nvme_mpath_remove_disk(struct nvme_ { if (!head->disk) return; - blk_set_queue_dying(head->disk->queue); + blk_mark_disk_dead(head->disk); /* make sure all pending bios are cleaned up */ kblockd_schedule_work(&head->requeue_work); flush_work(&head->requeue_work); --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1184,7 +1184,8 @@ extern void blk_dump_rq_flags(struct req
bool __must_check blk_get_queue(struct request_queue *); extern void blk_put_queue(struct request_queue *); -extern void blk_set_queue_dying(struct request_queue *); + +void blk_mark_disk_dead(struct gendisk *disk);
#ifdef CONFIG_BLOCK /*
From: Bryan O'Donoghue bryan.odonoghue@linaro.org
commit 5c23b3f965bc9ee696bf2ed4bdc54d339dd9a455 upstream.
Interacting with a NAND chip on an IPQ6018 I found that the qcomsmem NAND partition parser was returning -EPROBE_DEFER waiting for the main smem driver to load.
This caused the board to reset. Playing about with the probe() function shows that the problem lies in the core clock being switched off before the nandc_unalloc() routine has completed.
If we look at how qcom_nandc_remove() tears down allocated resources we see the expected order is
qcom_nandc_unalloc(nandc);
clk_disable_unprepare(nandc->aon_clk); clk_disable_unprepare(nandc->core_clk);
dma_unmap_resource(&pdev->dev, nandc->base_dma, resource_size(res), DMA_BIDIRECTIONAL, 0);
Tweaking probe() to both bring up and tear-down in that order removes the reset if we end up deferring elsewhere.
Fixes: c76b78d8ec05 ("mtd: nand: Qualcomm NAND controller driver") Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Reviewed-by: Manivannan Sadhasivam mani@kernel.org Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20220103030316.58301-2-bryan.odonoghue@lin... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/nand/raw/qcom_nandc.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-)
--- a/drivers/mtd/nand/raw/qcom_nandc.c +++ b/drivers/mtd/nand/raw/qcom_nandc.c @@ -2,7 +2,6 @@ /* * Copyright (c) 2016, The Linux Foundation. All rights reserved. */ - #include <linux/clk.h> #include <linux/slab.h> #include <linux/bitops.h> @@ -3063,10 +3062,6 @@ static int qcom_nandc_probe(struct platf if (dma_mapping_error(dev, nandc->base_dma)) return -ENXIO;
- ret = qcom_nandc_alloc(nandc); - if (ret) - goto err_nandc_alloc; - ret = clk_prepare_enable(nandc->core_clk); if (ret) goto err_core_clk; @@ -3075,6 +3070,10 @@ static int qcom_nandc_probe(struct platf if (ret) goto err_aon_clk;
+ ret = qcom_nandc_alloc(nandc); + if (ret) + goto err_nandc_alloc; + ret = qcom_nandc_setup(nandc); if (ret) goto err_setup; @@ -3086,15 +3085,14 @@ static int qcom_nandc_probe(struct platf return 0;
err_setup: + qcom_nandc_unalloc(nandc); +err_nandc_alloc: clk_disable_unprepare(nandc->aon_clk); err_aon_clk: clk_disable_unprepare(nandc->core_clk); err_core_clk: - qcom_nandc_unalloc(nandc); -err_nandc_alloc: dma_unmap_resource(dev, res->start, resource_size(res), DMA_BIDIRECTIONAL, 0); - return ret; }
From: Ansuel Smith ansuelsmth@gmail.com
commit 65d003cca335cabc0160d3cd7daa689eaa9dd3cd upstream.
In the event of a skipped partition (case when the entry name is empty) the kernel panics in the cleanup function as the name entry is NULL. Rework the parser logic by first checking the real partition number and then allocate the space and set the data for the valid partitions.
The logic was also fundamentally wrong as with a skipped partition, the parts number returned was incorrect by not decreasing it for the skipped partitions.
Fixes: 803eb124e1a6 ("mtd: parsers: Add Qcom SMEM parser") Signed-off-by: Ansuel Smith ansuelsmth@gmail.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20220116032211.9728-1-ansuelsmth@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/parsers/qcomsmempart.c | 31 +++++++++++++++++++------------ 1 file changed, 19 insertions(+), 12 deletions(-)
--- a/drivers/mtd/parsers/qcomsmempart.c +++ b/drivers/mtd/parsers/qcomsmempart.c @@ -58,11 +58,11 @@ static int parse_qcomsmem_part(struct mt const struct mtd_partition **pparts, struct mtd_part_parser_data *data) { + size_t len = SMEM_FLASH_PTABLE_HDR_LEN; + int ret, i, j, tmpparts, numparts = 0; struct smem_flash_pentry *pentry; struct smem_flash_ptable *ptable; - size_t len = SMEM_FLASH_PTABLE_HDR_LEN; struct mtd_partition *parts; - int ret, i, numparts; char *name, *c;
if (IS_ENABLED(CONFIG_MTD_SPI_NOR_USE_4K_SECTORS) @@ -87,8 +87,8 @@ static int parse_qcomsmem_part(struct mt }
/* Ensure that # of partitions is less than the max we have allocated */ - numparts = le32_to_cpu(ptable->numparts); - if (numparts > SMEM_FLASH_PTABLE_MAX_PARTS_V4) { + tmpparts = le32_to_cpu(ptable->numparts); + if (tmpparts > SMEM_FLASH_PTABLE_MAX_PARTS_V4) { pr_err("Partition numbers exceed the max limit\n"); return -EINVAL; } @@ -116,11 +116,17 @@ static int parse_qcomsmem_part(struct mt return PTR_ERR(ptable); }
+ for (i = 0; i < tmpparts; i++) { + pentry = &ptable->pentry[i]; + if (pentry->name[0] != '\0') + numparts++; + } + parts = kcalloc(numparts, sizeof(*parts), GFP_KERNEL); if (!parts) return -ENOMEM;
- for (i = 0; i < numparts; i++) { + for (i = 0, j = 0; i < tmpparts; i++) { pentry = &ptable->pentry[i]; if (pentry->name[0] == '\0') continue; @@ -135,24 +141,25 @@ static int parse_qcomsmem_part(struct mt for (c = name; *c != '\0'; c++) *c = tolower(*c);
- parts[i].name = name; - parts[i].offset = le32_to_cpu(pentry->offset) * mtd->erasesize; - parts[i].mask_flags = pentry->attr; - parts[i].size = le32_to_cpu(pentry->length) * mtd->erasesize; + parts[j].name = name; + parts[j].offset = le32_to_cpu(pentry->offset) * mtd->erasesize; + parts[j].mask_flags = pentry->attr; + parts[j].size = le32_to_cpu(pentry->length) * mtd->erasesize; pr_debug("%d: %s offs=0x%08x size=0x%08x attr:0x%08x\n", i, pentry->name, le32_to_cpu(pentry->offset), le32_to_cpu(pentry->length), pentry->attr); + j++; }
pr_debug("SMEM partition table found: ver: %d len: %d\n", - le32_to_cpu(ptable->version), numparts); + le32_to_cpu(ptable->version), tmpparts); *pparts = parts;
return numparts;
out_free_parts: - while (--i >= 0) - kfree(parts[i].name); + while (--j >= 0) + kfree(parts[j].name); kfree(parts); *pparts = NULL;
From: Ansuel Smith ansuelsmth@gmail.com
commit 3dd8ba961b9356c4113b96541c752c73d98fef70 upstream.
Mtdpart doesn't free pparts when a cleanup function is declared. Add missing free for pparts in cleanup function for smem to fix the leak.
Fixes: 10f3b4d79958 ("mtd: parsers: qcom: Fix leaking of partition name") Signed-off-by: Ansuel Smith ansuelsmth@gmail.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20220116032211.9728-2-ansuelsmth@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/parsers/qcomsmempart.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/mtd/parsers/qcomsmempart.c +++ b/drivers/mtd/parsers/qcomsmempart.c @@ -173,6 +173,8 @@ static void parse_qcomsmem_cleanup(const
for (i = 0; i < nr_parts; i++) kfree(pparts[i].name); + + kfree(pparts); }
static const struct of_device_id qcomsmem_of_match_table[] = {
From: Dan Carpenter dan.carpenter@oracle.com
commit 3e3765875b1b8864898603768fd5c93eeb552211 upstream.
The problem is that "erasesize" is a uint64_t type so it might be non-zero but the lower 32 bits are zero so when it's truncated, "(uint32_t)erasesize", then that value is zero. This leads to a divide by zero bug.
Avoid the bug by delaying the divide until after we have validated that "erasesize" is non-zero and within the uint32_t range.
Fixes: dc2b3e5cbc80 ("mtd: phram: use div_u64_rem to stop overwrite len in phram_setup") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20220121115505.GI1978@kadam Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/devices/phram.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
--- a/drivers/mtd/devices/phram.c +++ b/drivers/mtd/devices/phram.c @@ -264,15 +264,19 @@ static int phram_setup(const char *val) } }
- if (erasesize) - div_u64_rem(len, (uint32_t)erasesize, &rem); - if (len == 0 || erasesize == 0 || erasesize > len - || erasesize > UINT_MAX || rem) { + || erasesize > UINT_MAX) { parse_err("illegal erasesize or len\n"); ret = -EINVAL; goto error; } + + div_u64_rem(len, (uint32_t)erasesize, &rem); + if (rem) { + parse_err("len is not multiple of erasesize\n"); + ret = -EINVAL; + goto error; + }
ret = register_device(name, start, len, (uint32_t)erasesize); if (ret)
From: david regan dregan@mail.com
commit 36415a7964711822e63695ea67fede63979054d9 upstream.
The brcmnand driver contains a bug in which if a page (example 2k byte) is read from the parallel/ONFI NAND and within that page a subpage (512 byte) has correctable errors which is followed by a subpage with uncorrectable errors, the page read will return the wrong status of correctable (as opposed to the actual status of uncorrectable.)
The bug is in function brcmnand_read_by_pio where there is a check for uncorrectable bits which will be preempted if a previous status for correctable bits is detected.
The fix is to stop checking for bad bits only if we already have a bad bits status.
Fixes: 27c5b17cd1b1 ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller") Signed-off-by: david regan dregan@mail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/trinity-478e0c09-9134-40e8-8f8c-31c371225e... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/nand/raw/brcmnand/brcmnand.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c @@ -2106,7 +2106,7 @@ static int brcmnand_read_by_pio(struct m mtd->oobsize / trans, host->hwcfg.sector_size_1k);
- if (!ret) { + if (ret != -EBADMSG) { *err_addr = brcmnand_get_uncorrecc_addr(ctrl);
if (*err_addr)
From: Dongliang Mu mudongliangabcd@gmail.com
[ Upstream commit 817b8b9c5396d2b2d92311b46719aad5d3339dbe ]
When hid_parse() in elo_probe() fails, it forgets to call usb_put_dev to decrease the refcount.
Fix this by adding usb_put_dev() in the error handling code of elo_probe().
Fixes: fbf42729d0e9 ("HID: elo: update the reference count of the usb device structure") Reported-by: syzkaller syzkaller@googlegroups.com Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-elo.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/hid/hid-elo.c b/drivers/hid/hid-elo.c index 8e960d7b233b3..9b42b0cdeef06 100644 --- a/drivers/hid/hid-elo.c +++ b/drivers/hid/hid-elo.c @@ -262,6 +262,7 @@ static int elo_probe(struct hid_device *hdev, const struct hid_device_id *id)
return 0; err_free: + usb_put_dev(udev); kfree(priv); return ret; }
From: Miaoqian Lin linmq006@gmail.com
[ Upstream commit ba1b71b008e97fd747845ff3a818420b11bbe830 ]
If of_find_device_by_node() succeeds, ingenic_ecc_get() doesn't have a corresponding put_device(). Thus add put_device() to fix the exception handling.
Fixes: 15de8c6efd0e ("mtd: rawnand: ingenic: Separate top-level and SoC specific code") Signed-off-by: Miaoqian Lin linmq006@gmail.com Reviewed-by: Paul Cercueil paul@crapouillou.net Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20211230072751.21622-1-linmq006@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/nand/raw/ingenic/ingenic_ecc.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/raw/ingenic/ingenic_ecc.c b/drivers/mtd/nand/raw/ingenic/ingenic_ecc.c index efe0ffe4f1abc..9054559e52dda 100644 --- a/drivers/mtd/nand/raw/ingenic/ingenic_ecc.c +++ b/drivers/mtd/nand/raw/ingenic/ingenic_ecc.c @@ -68,9 +68,14 @@ static struct ingenic_ecc *ingenic_ecc_get(struct device_node *np) struct ingenic_ecc *ecc;
pdev = of_find_device_by_node(np); - if (!pdev || !platform_get_drvdata(pdev)) + if (!pdev) return ERR_PTR(-EPROBE_DEFER);
+ if (!platform_get_drvdata(pdev)) { + put_device(&pdev->dev); + return ERR_PTR(-EPROBE_DEFER); + } + ecc = platform_get_drvdata(pdev); clk_prepare_enable(ecc->clk);
From: Miaoqian Lin linmq006@gmail.com
[ Upstream commit 8bc69f86328e87a0ffa79438430cc82f3aa6a194 ]
kobject_init_and_add() takes reference even when it fails. According to the doc of kobject_init_and_add():
If this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object.
Fix memory leak by calling kobject_put().
Fixes: c2e5df616e1a ("vmbus: add per-channel sysfs info") Signed-off-by: Miaoqian Lin linmq006@gmail.com Reviewed-by: Juan Vazquez juvazq@linux.microsoft.com Link: https://lore.kernel.org/r/20220203173008.43480-1-linmq006@gmail.com Signed-off-by: Wei Liu wei.liu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hv/vmbus_drv.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 392c1ac4f8193..44bd0b6ff5059 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -2027,8 +2027,10 @@ int vmbus_add_channel_kobj(struct hv_device *dev, struct vmbus_channel *channel) kobj->kset = dev->channels_kset; ret = kobject_init_and_add(kobj, &vmbus_chan_ktype, NULL, "%u", relid); - if (ret) + if (ret) { + kobject_put(kobj); return ret; + }
ret = sysfs_create_group(kobj, &vmbus_chan_group);
@@ -2037,6 +2039,7 @@ int vmbus_add_channel_kobj(struct hv_device *dev, struct vmbus_channel *channel) * The calling functions' error handling paths will cleanup the * empty channel directory. */ + kobject_put(kobj); dev_err(device, "Unable to set up channel sysfs files\n"); return ret; }
From: Like Xu likexu@tencent.com
[ Upstream commit 7c174f305cbee6bdba5018aae02b84369e7ab995 ]
The find_arch_event() returns a "unsigned int" value, which is used by the pmc_reprogram_counter() to program a PERF_TYPE_HARDWARE type perf_event.
The returned value is actually the kernel defined generic perf_hw_id, let's rename it to pmc_perf_hw_id() with simpler incoming parameters for better self-explanation.
Signed-off-by: Like Xu likexu@tencent.com Message-Id: 20211130074221.93635-3-likexu@tencent.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/pmu.c | 8 +------- arch/x86/kvm/pmu.h | 3 +-- arch/x86/kvm/svm/pmu.c | 8 ++++---- arch/x86/kvm/vmx/pmu_intel.c | 9 +++++---- 4 files changed, 11 insertions(+), 17 deletions(-)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c index 0772bad9165c5..eec614de9af30 100644 --- a/arch/x86/kvm/pmu.c +++ b/arch/x86/kvm/pmu.c @@ -174,7 +174,6 @@ static bool pmc_resume_counter(struct kvm_pmc *pmc) void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel) { unsigned config, type = PERF_TYPE_RAW; - u8 event_select, unit_mask; struct kvm *kvm = pmc->vcpu->kvm; struct kvm_pmu_event_filter *filter; int i; @@ -206,17 +205,12 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel) if (!allow_event) return;
- event_select = eventsel & ARCH_PERFMON_EVENTSEL_EVENT; - unit_mask = (eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8; - if (!(eventsel & (ARCH_PERFMON_EVENTSEL_EDGE | ARCH_PERFMON_EVENTSEL_INV | ARCH_PERFMON_EVENTSEL_CMASK | HSW_IN_TX | HSW_IN_TX_CHECKPOINTED))) { - config = kvm_x86_ops.pmu_ops->find_arch_event(pmc_to_pmu(pmc), - event_select, - unit_mask); + config = kvm_x86_ops.pmu_ops->pmc_perf_hw_id(pmc); if (config != PERF_COUNT_HW_MAX) type = PERF_TYPE_HARDWARE; } diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h index 0e4f2b1fa9fbd..a06d95165ac7c 100644 --- a/arch/x86/kvm/pmu.h +++ b/arch/x86/kvm/pmu.h @@ -24,8 +24,7 @@ struct kvm_event_hw_type_mapping { };
struct kvm_pmu_ops { - unsigned (*find_arch_event)(struct kvm_pmu *pmu, u8 event_select, - u8 unit_mask); + unsigned int (*pmc_perf_hw_id)(struct kvm_pmc *pmc); unsigned (*find_fixed_event)(int idx); bool (*pmc_is_enabled)(struct kvm_pmc *pmc); struct kvm_pmc *(*pmc_idx_to_pmc)(struct kvm_pmu *pmu, int pmc_idx); diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c index e152241d1d709..06f8034f62e4f 100644 --- a/arch/x86/kvm/svm/pmu.c +++ b/arch/x86/kvm/svm/pmu.c @@ -134,10 +134,10 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr, return &pmu->gp_counters[msr_to_index(msr)]; }
-static unsigned amd_find_arch_event(struct kvm_pmu *pmu, - u8 event_select, - u8 unit_mask) +static unsigned int amd_pmc_perf_hw_id(struct kvm_pmc *pmc) { + u8 event_select = pmc->eventsel & ARCH_PERFMON_EVENTSEL_EVENT; + u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8; int i;
for (i = 0; i < ARRAY_SIZE(amd_event_mapping); i++) @@ -320,7 +320,7 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu) }
struct kvm_pmu_ops amd_pmu_ops = { - .find_arch_event = amd_find_arch_event, + .pmc_perf_hw_id = amd_pmc_perf_hw_id, .find_fixed_event = amd_find_fixed_event, .pmc_is_enabled = amd_pmc_is_enabled, .pmc_idx_to_pmc = amd_pmc_idx_to_pmc, diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c index 10cc4f65c4efd..6427d95de01cf 100644 --- a/arch/x86/kvm/vmx/pmu_intel.c +++ b/arch/x86/kvm/vmx/pmu_intel.c @@ -68,10 +68,11 @@ static void global_ctrl_changed(struct kvm_pmu *pmu, u64 data) reprogram_counter(pmu, bit); }
-static unsigned intel_find_arch_event(struct kvm_pmu *pmu, - u8 event_select, - u8 unit_mask) +static unsigned int intel_pmc_perf_hw_id(struct kvm_pmc *pmc) { + struct kvm_pmu *pmu = pmc_to_pmu(pmc); + u8 event_select = pmc->eventsel & ARCH_PERFMON_EVENTSEL_EVENT; + u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8; int i;
for (i = 0; i < ARRAY_SIZE(intel_arch_events); i++) @@ -706,7 +707,7 @@ static void intel_pmu_cleanup(struct kvm_vcpu *vcpu) }
struct kvm_pmu_ops intel_pmu_ops = { - .find_arch_event = intel_find_arch_event, + .pmc_perf_hw_id = intel_pmc_perf_hw_id, .find_fixed_event = intel_find_fixed_event, .pmc_is_enabled = intel_pmc_is_enabled, .pmc_idx_to_pmc = intel_pmc_idx_to_pmc,
From: Jim Mattson jmattson@google.com
[ Upstream commit b8bfee85f1307426e0242d654f3a14c06ef639c5 ]
AMD's event select is 3 nybbles, with the high nybble in bits 35:32 of a PerfEvtSeln MSR. Don't drop the high nybble when setting up the config field of a perf_event_attr structure for a call to perf_event_create_kernel_counter().
Fixes: ca724305a2b0 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM") Reported-by: Stephane Eranian eranian@google.com Signed-off-by: Jim Mattson jmattson@google.com Message-Id: 20220203014813.2130559-1-jmattson@google.com Reviewed-by: David Dunn daviddunn@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/pmu.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c index eec614de9af30..bfef0c658730e 100644 --- a/arch/x86/kvm/pmu.c +++ b/arch/x86/kvm/pmu.c @@ -95,7 +95,7 @@ static void kvm_perf_overflow_intr(struct perf_event *perf_event, }
static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, - unsigned config, bool exclude_user, + u64 config, bool exclude_user, bool exclude_kernel, bool intr, bool in_tx, bool in_tx_cp) { @@ -173,7 +173,8 @@ static bool pmc_resume_counter(struct kvm_pmc *pmc)
void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel) { - unsigned config, type = PERF_TYPE_RAW; + u64 config; + u32 type = PERF_TYPE_RAW; struct kvm *kvm = pmc->vcpu->kvm; struct kvm_pmu_event_filter *filter; int i;
From: Jim Mattson jmattson@google.com
[ Upstream commit 710c476514313c74045c41c0571bb5178fd16e3d ]
AMD's event select is 3 nybbles, with the high nybble in bits 35:32 of a PerfEvtSeln MSR. Don't mask off the high nybble when configuring a RAW perf event.
Fixes: ca724305a2b0 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM") Signed-off-by: Jim Mattson jmattson@google.com Message-Id: 20220203014813.2130559-2-jmattson@google.com Reviewed-by: David Dunn daviddunn@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/pmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c index bfef0c658730e..f256f01056bdb 100644 --- a/arch/x86/kvm/pmu.c +++ b/arch/x86/kvm/pmu.c @@ -217,7 +217,7 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel) }
if (type == PERF_TYPE_RAW) - config = eventsel & X86_RAW_EVENT_MASK; + config = eventsel & AMD64_RAW_EVENT_MASK;
if (pmc->current_config == eventsel && pmc_resume_counter(pmc)) return;
From: Wan Jiabing wanjiabing@vivo.com
[ Upstream commit 80c469a0a03763f814715f3d12b6f3964c7423e8 ]
Fix following coccicheck warning: ./arch/arm/mach-omap2/omap_hwmod.c:753:1-23: WARNING: Function for_each_matching_node should have of_node_put() before break
Early exits from for_each_matching_node should decrement the node reference counter.
Signed-off-by: Wan Jiabing wanjiabing@vivo.com Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-omap2/omap_hwmod.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mach-omap2/omap_hwmod.c b/arch/arm/mach-omap2/omap_hwmod.c index 0c2936c7a3799..a5e9cffcac10c 100644 --- a/arch/arm/mach-omap2/omap_hwmod.c +++ b/arch/arm/mach-omap2/omap_hwmod.c @@ -752,8 +752,10 @@ static int __init _init_clkctrl_providers(void)
for_each_matching_node(np, ti_clkctrl_match_table) { ret = _setup_clkctrl_provider(np); - if (ret) + if (ret) { + of_node_put(np); break; + } }
return ret;
From: Ye Guojin ye.guojin@zte.com.cn
[ Upstream commit 34596ba380b03d181e24efd50e2f21045bde3696 ]
This was found by coccicheck: ./arch/arm/mach-omap2/display.c, 272, 1-7, ERROR missing put_device; call of_find_device_by_node on line 258, but without a corresponding object release within this function.
Move the put_device() call before the if judgment.
Reported-by: Zeal Robot zealci@zte.com.cn Signed-off-by: Ye Guojin ye.guojin@zte.com.cn Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-omap2/display.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/mach-omap2/display.c b/arch/arm/mach-omap2/display.c index 6daaa645ae5d9..21413a9b7b6c6 100644 --- a/arch/arm/mach-omap2/display.c +++ b/arch/arm/mach-omap2/display.c @@ -263,9 +263,9 @@ static int __init omapdss_init_of(void) }
r = of_platform_populate(node, NULL, NULL, &pdev->dev); + put_device(&pdev->dev); if (r) { pr_err("Unable to populate DSS submodule devices\n"); - put_device(&pdev->dev); return r; }
From: Al Cooper alcooperx@gmail.com
[ Upstream commit 42fed57046fc74586d7058bd51a1c10ac9c690cb ]
The PHY client driver does a phy_exit() call on suspend or rmmod and the PHY driver needs to know the difference because some clocks need to be kept running for suspend but can be shutdown on unbind/rmmod (or if there are no PHY clients at all).
The fix is to use a PM notifier so the driver can tell if a PHY client is calling exit() because of a system suspend or a driver unbind/rmmod.
Signed-off-by: Al Cooper alcooperx@gmail.com Acked-by: Florian Fainelli f.fainelli@gmail.com Link: https://lore.kernel.org/r/20211201180653.35097-2-alcooperx@gmail.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/phy/broadcom/phy-brcm-usb.c | 38 +++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+)
diff --git a/drivers/phy/broadcom/phy-brcm-usb.c b/drivers/phy/broadcom/phy-brcm-usb.c index 116fb23aebd99..0f1deb6e0eabf 100644 --- a/drivers/phy/broadcom/phy-brcm-usb.c +++ b/drivers/phy/broadcom/phy-brcm-usb.c @@ -18,6 +18,7 @@ #include <linux/soc/brcmstb/brcmstb.h> #include <dt-bindings/phy/phy.h> #include <linux/mfd/syscon.h> +#include <linux/suspend.h>
#include "phy-brcm-usb-init.h"
@@ -70,12 +71,35 @@ struct brcm_usb_phy_data { int init_count; int wake_irq; struct brcm_usb_phy phys[BRCM_USB_PHY_ID_MAX]; + struct notifier_block pm_notifier; + bool pm_active; };
static s8 *node_reg_names[BRCM_REGS_MAX] = { "crtl", "xhci_ec", "xhci_gbl", "usb_phy", "usb_mdio", "bdc_ec" };
+static int brcm_pm_notifier(struct notifier_block *notifier, + unsigned long pm_event, + void *unused) +{ + struct brcm_usb_phy_data *priv = + container_of(notifier, struct brcm_usb_phy_data, pm_notifier); + + switch (pm_event) { + case PM_HIBERNATION_PREPARE: + case PM_SUSPEND_PREPARE: + priv->pm_active = true; + break; + case PM_POST_RESTORE: + case PM_POST_HIBERNATION: + case PM_POST_SUSPEND: + priv->pm_active = false; + break; + } + return NOTIFY_DONE; +} + static irqreturn_t brcm_usb_phy_wake_isr(int irq, void *dev_id) { struct phy *gphy = dev_id; @@ -91,6 +115,9 @@ static int brcm_usb_phy_init(struct phy *gphy) struct brcm_usb_phy_data *priv = container_of(phy, struct brcm_usb_phy_data, phys[phy->id]);
+ if (priv->pm_active) + return 0; + /* * Use a lock to make sure a second caller waits until * the base phy is inited before using it. @@ -120,6 +147,9 @@ static int brcm_usb_phy_exit(struct phy *gphy) struct brcm_usb_phy_data *priv = container_of(phy, struct brcm_usb_phy_data, phys[phy->id]);
+ if (priv->pm_active) + return 0; + dev_dbg(&gphy->dev, "EXIT\n"); if (phy->id == BRCM_USB_PHY_2_0) brcm_usb_uninit_eohci(&priv->ini); @@ -488,6 +518,9 @@ static int brcm_usb_phy_probe(struct platform_device *pdev) if (err) return err;
+ priv->pm_notifier.notifier_call = brcm_pm_notifier; + register_pm_notifier(&priv->pm_notifier); + mutex_init(&priv->mutex);
/* make sure invert settings are correct */ @@ -528,7 +561,10 @@ static int brcm_usb_phy_probe(struct platform_device *pdev)
static int brcm_usb_phy_remove(struct platform_device *pdev) { + struct brcm_usb_phy_data *priv = dev_get_drvdata(&pdev->dev); + sysfs_remove_group(&pdev->dev.kobj, &brcm_usb_phy_group); + unregister_pm_notifier(&priv->pm_notifier);
return 0; } @@ -539,6 +575,7 @@ static int brcm_usb_phy_suspend(struct device *dev) struct brcm_usb_phy_data *priv = dev_get_drvdata(dev);
if (priv->init_count) { + dev_dbg(dev, "SUSPEND\n"); priv->ini.wake_enabled = device_may_wakeup(dev); if (priv->phys[BRCM_USB_PHY_3_0].inited) brcm_usb_uninit_xhci(&priv->ini); @@ -578,6 +615,7 @@ static int brcm_usb_phy_resume(struct device *dev) * Uninitialize anything that wasn't previously initialized. */ if (priv->init_count) { + dev_dbg(dev, "RESUME\n"); if (priv->wake_irq >= 0) disable_irq_wake(priv->wake_irq); brcm_usb_init_common(&priv->ini);
From: Padmanabha Srinivasaiah treasure4paddy@gmail.com
[ Upstream commit 0cea730cac824edf78ffd3302938ed5fe2b9d50d ]
In service_callback path RCU dereferenced pointer struct vchiq_service need to be accessed inside rcu read-critical section.
Also userdata/user_service part of vchiq_service is accessed around different synchronization mechanism, getting an extra reference to a pointer keeps sematics simpler and avoids prolonged graceperiod.
Accessing vchiq_service with rcu_read_[lock/unlock] fixes below issue.
[ 32.201659] ============================= [ 32.201664] WARNING: suspicious RCU usage [ 32.201670] 5.15.11-rt24-v8+ #3 Not tainted [ 32.201680] ----------------------------- [ 32.201685] drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.h:529 suspicious rcu_dereference_check() usage! [ 32.201695] [ 32.201695] other info that might help us debug this: [ 32.201695] [ 32.201700] [ 32.201700] rcu_scheduler_active = 2, debug_locks = 1 [ 32.201708] no locks held by vchiq-slot/0/98. [ 32.201715] [ 32.201715] stack backtrace: [ 32.201723] CPU: 1 PID: 98 Comm: vchiq-slot/0 Not tainted 5.15.11-rt24-v8+ #3 [ 32.201733] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT) [ 32.201739] Call trace: [ 32.201742] dump_backtrace+0x0/0x1b8 [ 32.201772] show_stack+0x20/0x30 [ 32.201784] dump_stack_lvl+0x8c/0xb8 [ 32.201799] dump_stack+0x18/0x34 [ 32.201808] lockdep_rcu_suspicious+0xe4/0xf8 [ 32.201817] service_callback+0x124/0x400 [ 32.201830] slot_handler_func+0xf60/0x1e20 [ 32.201839] kthread+0x19c/0x1a8 [ 32.201849] ret_from_fork+0x10/0x20
Tested-by: Stefan Wahren stefan.wahren@i2se.com Signed-off-by: Padmanabha Srinivasaiah treasure4paddy@gmail.com Link: https://lore.kernel.org/r/20211231195406.5479-1-treasure4paddy@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../interface/vchiq_arm/vchiq_arm.c | 20 +++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c index 967f10b9582a8..ea9a53bdb4174 100644 --- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c @@ -1033,15 +1033,27 @@ service_callback(enum vchiq_reason reason, struct vchiq_header *header,
DEBUG_TRACE(SERVICE_CALLBACK_LINE);
+ rcu_read_lock(); service = handle_to_service(handle); - if (WARN_ON(!service)) + if (WARN_ON(!service)) { + rcu_read_unlock(); return VCHIQ_SUCCESS; + }
user_service = (struct user_service *)service->base.userdata; instance = user_service->instance;
- if (!instance || instance->closing) + if (!instance || instance->closing) { + rcu_read_unlock(); return VCHIQ_SUCCESS; + } + + /* + * As hopping around different synchronization mechanism, + * taking an extra reference results in simpler implementation. + */ + vchiq_service_get(service); + rcu_read_unlock();
vchiq_log_trace(vchiq_arm_log_level, "%s - service %lx(%d,%p), reason %d, header %lx, instance %lx, bulk_userdata %lx", @@ -1074,6 +1086,7 @@ service_callback(enum vchiq_reason reason, struct vchiq_header *header, NULL, user_service, bulk_userdata); if (status != VCHIQ_SUCCESS) { DEBUG_TRACE(SERVICE_CALLBACK_LINE); + vchiq_service_put(service); return status; } } @@ -1084,11 +1097,13 @@ service_callback(enum vchiq_reason reason, struct vchiq_header *header, vchiq_log_info(vchiq_arm_log_level, "%s interrupted", __func__); DEBUG_TRACE(SERVICE_CALLBACK_LINE); + vchiq_service_put(service); return VCHIQ_RETRY; } else if (instance->closing) { vchiq_log_info(vchiq_arm_log_level, "%s closing", __func__); DEBUG_TRACE(SERVICE_CALLBACK_LINE); + vchiq_service_put(service); return VCHIQ_ERROR; } DEBUG_TRACE(SERVICE_CALLBACK_LINE); @@ -1117,6 +1132,7 @@ service_callback(enum vchiq_reason reason, struct vchiq_header *header, header = NULL; } DEBUG_TRACE(SERVICE_CALLBACK_LINE); + vchiq_service_put(service);
if (skip_completion) return VCHIQ_SUCCESS;
From: Wan Jiabing wanjiabing@vivo.com
[ Upstream commit 46e994717807f4b935c44d81dde9dd8bcd9a4f5d ]
Fix following coccicheck warning: ./drivers/phy/mediatek/phy-mtk-tphy.c:994:6-29: duplicated argument to && or ||
The efuse_rx_imp is duplicate. Here should be efuse_tx_imp.
Signed-off-by: Wan Jiabing wanjiabing@vivo.com Acked-by: Chunfeng Yun chunfeng.yun@mediatek.com Link: https://lore.kernel.org/r/20220107025050.787720-1-wanjiabing@vivo.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/phy/mediatek/phy-mtk-tphy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/phy/mediatek/phy-mtk-tphy.c b/drivers/phy/mediatek/phy-mtk-tphy.c index 98a942c607a67..db39b0c4649a2 100644 --- a/drivers/phy/mediatek/phy-mtk-tphy.c +++ b/drivers/phy/mediatek/phy-mtk-tphy.c @@ -1125,7 +1125,7 @@ static int phy_efuse_get(struct mtk_tphy *tphy, struct mtk_phy_instance *instanc /* no efuse, ignore it */ if (!instance->efuse_intr && !instance->efuse_rx_imp && - !instance->efuse_rx_imp) { + !instance->efuse_tx_imp) { dev_warn(dev, "no u3 intr efuse, but dts enable it\n"); instance->efuse_sw_en = 0; break;
From: Guo Ren guoren@linux.alibaba.com
[ Upstream commit 1d4df649cbb4b26d19bea38ecff4b65b10a1bbca ]
The thead,c900-plic has been used in opensbi to distinguish PLIC [1]. Although PLICs have the same behaviors in Linux, they are different hardware with some custom initializing in firmware(opensbi).
Qute opensbi patch commit-msg by Samuel:
The T-HEAD PLIC implementation requires setting a delegation bit to allow access from S-mode. Now that the T-HEAD PLIC has its own compatible string, set this bit automatically from the PLIC driver, instead of reaching into the PLIC's MMIO space from another driver.
[1]: https://github.com/riscv-software-src/opensbi/commit/78c2b19218bd62653b9fb31...
Signed-off-by: Guo Ren guoren@linux.alibaba.com Cc: Anup Patel anup@brainfault.org Cc: Marc Zyngier maz@kernel.org Cc: Palmer Dabbelt palmer@dabbelt.com Cc: Samuel Holland samuel@sholland.org Cc: Thomas Gleixner tglx@linutronix.de Tested-by: Samuel Holland samuel@sholland.org Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20220130135634.1213301-3-guoren@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-sifive-plic.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c index 259065d271ef0..09cc98266d30f 100644 --- a/drivers/irqchip/irq-sifive-plic.c +++ b/drivers/irqchip/irq-sifive-plic.c @@ -398,3 +398,4 @@ static int __init plic_init(struct device_node *node,
IRQCHIP_DECLARE(sifive_plic, "sifive,plic-1.0.0", plic_init); IRQCHIP_DECLARE(riscv_plic0, "riscv,plic0", plic_init); /* for legacy systems */ +IRQCHIP_DECLARE(thead_c900_plic, "thead,c900-plic", plic_init); /* for firmware driver */
From: Nick Desaulniers ndesaulniers@google.com
[ Upstream commit bfb1a7c91fb7758273b4a8d735313d9cc388b502 ]
In __WARN_FLAGS(), we had two asm statements (abbreviated):
asm volatile("ud2"); asm volatile(".pushsection .discard.reachable");
These pair of statements are used to trigger an exception, but then help objtool understand that for warnings, control flow will be restored immediately afterwards.
The problem is that volatile is not a compiler barrier. GCC explicitly documents this:
Note that the compiler can move even volatile asm instructions relative to other code, including across jump instructions.
Also, no clobbers are specified to prevent instructions from subsequent statements from being scheduled by compiler before the second asm statement. This can lead to instructions from subsequent statements being emitted by the compiler before the second asm statement.
Providing a scheduling model such as via -march= options enables the compiler to better schedule instructions with known latencies to hide latencies from data hazards compared to inline asm statements in which latencies are not estimated.
If an instruction gets scheduled by the compiler between the two asm statements, then objtool will think that it is not reachable, producing a warning.
To prevent instructions from being scheduled in between the two asm statements, merge them.
Also remove an unnecessary unreachable() asm annotation from BUG() in favor of __builtin_unreachable(). objtool is able to track that the ud2 from BUG() terminates control flow within the function.
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile Link: https://github.com/ClangBuiltLinux/linux/issues/1483 Signed-off-by: Nick Desaulniers ndesaulniers@google.com Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Link: https://lore.kernel.org/r/20220202205557.2260694-1-ndesaulniers@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/bug.h | 20 +++++++++++--------- include/linux/compiler.h | 21 +++++---------------- 2 files changed, 16 insertions(+), 25 deletions(-)
diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h index 84b87538a15de..bab883c0b6fee 100644 --- a/arch/x86/include/asm/bug.h +++ b/arch/x86/include/asm/bug.h @@ -22,7 +22,7 @@
#ifdef CONFIG_DEBUG_BUGVERBOSE
-#define _BUG_FLAGS(ins, flags) \ +#define _BUG_FLAGS(ins, flags, extra) \ do { \ asm_inline volatile("1:\t" ins "\n" \ ".pushsection __bug_table,"aw"\n" \ @@ -31,7 +31,8 @@ do { \ "\t.word %c1" "\t# bug_entry::line\n" \ "\t.word %c2" "\t# bug_entry::flags\n" \ "\t.org 2b+%c3\n" \ - ".popsection" \ + ".popsection\n" \ + extra \ : : "i" (__FILE__), "i" (__LINE__), \ "i" (flags), \ "i" (sizeof(struct bug_entry))); \ @@ -39,14 +40,15 @@ do { \
#else /* !CONFIG_DEBUG_BUGVERBOSE */
-#define _BUG_FLAGS(ins, flags) \ +#define _BUG_FLAGS(ins, flags, extra) \ do { \ asm_inline volatile("1:\t" ins "\n" \ ".pushsection __bug_table,"aw"\n" \ "2:\t" __BUG_REL(1b) "\t# bug_entry::bug_addr\n" \ "\t.word %c0" "\t# bug_entry::flags\n" \ "\t.org 2b+%c1\n" \ - ".popsection" \ + ".popsection\n" \ + extra \ : : "i" (flags), \ "i" (sizeof(struct bug_entry))); \ } while (0) @@ -55,7 +57,7 @@ do { \
#else
-#define _BUG_FLAGS(ins, flags) asm volatile(ins) +#define _BUG_FLAGS(ins, flags, extra) asm volatile(ins)
#endif /* CONFIG_GENERIC_BUG */
@@ -63,8 +65,8 @@ do { \ #define BUG() \ do { \ instrumentation_begin(); \ - _BUG_FLAGS(ASM_UD2, 0); \ - unreachable(); \ + _BUG_FLAGS(ASM_UD2, 0, ""); \ + __builtin_unreachable(); \ } while (0)
/* @@ -75,9 +77,9 @@ do { \ */ #define __WARN_FLAGS(flags) \ do { \ + __auto_type f = BUGFLAG_WARNING|(flags); \ instrumentation_begin(); \ - _BUG_FLAGS(ASM_UD2, BUGFLAG_WARNING|(flags)); \ - annotate_reachable(); \ + _BUG_FLAGS(ASM_UD2, f, ASM_REACHABLE); \ instrumentation_end(); \ } while (0)
diff --git a/include/linux/compiler.h b/include/linux/compiler.h index 429dcebe2b992..0f7fd205ab7ea 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -117,14 +117,6 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val, */ #define __stringify_label(n) #n
-#define __annotate_reachable(c) ({ \ - asm volatile(__stringify_label(c) ":\n\t" \ - ".pushsection .discard.reachable\n\t" \ - ".long " __stringify_label(c) "b - .\n\t" \ - ".popsection\n\t" : : "i" (c)); \ -}) -#define annotate_reachable() __annotate_reachable(__COUNTER__) - #define __annotate_unreachable(c) ({ \ asm volatile(__stringify_label(c) ":\n\t" \ ".pushsection .discard.unreachable\n\t" \ @@ -133,24 +125,21 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val, }) #define annotate_unreachable() __annotate_unreachable(__COUNTER__)
-#define ASM_UNREACHABLE \ - "999:\n\t" \ - ".pushsection .discard.unreachable\n\t" \ - ".long 999b - .\n\t" \ +#define ASM_REACHABLE \ + "998:\n\t" \ + ".pushsection .discard.reachable\n\t" \ + ".long 998b - .\n\t" \ ".popsection\n\t"
/* Annotate a C jump table to allow objtool to follow the code flow */ #define __annotate_jump_table __section(".rodata..c_jump_table")
#else -#define annotate_reachable() #define annotate_unreachable() +# define ASM_REACHABLE #define __annotate_jump_table #endif
-#ifndef ASM_UNREACHABLE -# define ASM_UNREACHABLE -#endif #ifndef unreachable # define unreachable() do { \ annotate_unreachable(); \
From: Florian Westphal fw@strlen.de
[ Upstream commit 77b337196a9d87f3d6bb9b07c0436ecafbffda1e ]
Vivek Thrivikraman reported: An SCTP server application which is accessed continuously by client application. When the session disconnects the client retries to establish a connection. After restart of SCTP server application the session is not established because of stale conntrack entry with connection state CLOSED as below.
(removing this entry manually established new connection):
sctp 9 CLOSED src=10.141.189.233 [..] [ASSURED]
Just skip timeout update of closed entries, we don't want them to stay around forever.
Reported-and-tested-by: Vivek Thrivikraman vivek.thrivikraman@est.tech Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1579 Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_conntrack_proto_sctp.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/net/netfilter/nf_conntrack_proto_sctp.c b/net/netfilter/nf_conntrack_proto_sctp.c index 2394238d01c91..5a936334b517a 100644 --- a/net/netfilter/nf_conntrack_proto_sctp.c +++ b/net/netfilter/nf_conntrack_proto_sctp.c @@ -489,6 +489,15 @@ int nf_conntrack_sctp_packet(struct nf_conn *ct, pr_debug("Setting vtag %x for dir %d\n", ih->init_tag, !dir); ct->proto.sctp.vtag[!dir] = ih->init_tag; + + /* don't renew timeout on init retransmit so + * port reuse by client or NAT middlebox cannot + * keep entry alive indefinitely (incl. nat info). + */ + if (new_state == SCTP_CONNTRACK_CLOSED && + old_state == SCTP_CONNTRACK_CLOSED && + nf_ct_is_confirmed(ct)) + ignore = true; }
ct->proto.sctp.state = new_state;
From: Namjae Jeon linkinjeon@kernel.org
[ Upstream commit 97550c7478a2da93e348d8c3075d92cddd473a78 ]
ksmbd sets the inode number to UniqueId. However, the same UniqueId for dot and dotdot entry is set to the inode number of the parent inode. This patch set them using the current inode and parent inode.
Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ksmbd/smb_common.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/ksmbd/smb_common.c b/fs/ksmbd/smb_common.c index 707490ab1f4c4..f2e7e3a654b34 100644 --- a/fs/ksmbd/smb_common.c +++ b/fs/ksmbd/smb_common.c @@ -308,14 +308,17 @@ int ksmbd_populate_dot_dotdot_entries(struct ksmbd_work *work, int info_level, for (i = 0; i < 2; i++) { struct kstat kstat; struct ksmbd_kstat ksmbd_kstat; + struct dentry *dentry;
if (!dir->dot_dotdot[i]) { /* fill dot entry info */ if (i == 0) { d_info->name = "."; d_info->name_len = 1; + dentry = dir->filp->f_path.dentry; } else { d_info->name = ".."; d_info->name_len = 2; + dentry = dir->filp->f_path.dentry->d_parent; }
if (!match_pattern(d_info->name, d_info->name_len, @@ -327,7 +330,7 @@ int ksmbd_populate_dot_dotdot_entries(struct ksmbd_work *work, int info_level, ksmbd_kstat.kstat = &kstat; ksmbd_vfs_fill_dentry_attrs(work, user_ns, - dir->filp->f_path.dentry->d_parent, + dentry, &ksmbd_kstat); rc = fn(conn, info_level, d_info, &ksmbd_kstat); if (rc)
From: Namjae Jeon linkinjeon@kernel.org
[ Upstream commit 04e260948a160d3b7d622bf4c8a96fa4577c09bd ]
When checking smb2 query directory packets from other servers, OutputBufferLength is different with ksmbd. Other servers add an unaligned next offset to OutputBufferLength for the last entry.
Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ksmbd/smb2pdu.c | 7 ++++--- fs/ksmbd/vfs.h | 1 + 2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/ksmbd/smb2pdu.c b/fs/ksmbd/smb2pdu.c index 70685cbbec8c0..192d8308afc27 100644 --- a/fs/ksmbd/smb2pdu.c +++ b/fs/ksmbd/smb2pdu.c @@ -3422,9 +3422,9 @@ static int smb2_populate_readdir_entry(struct ksmbd_conn *conn, int info_level, goto free_conv_name; }
- struct_sz = readdir_info_level_struct_sz(info_level); - next_entry_offset = ALIGN(struct_sz - 1 + conv_len, - KSMBD_DIR_INFO_ALIGNMENT); + struct_sz = readdir_info_level_struct_sz(info_level) - 1 + conv_len; + next_entry_offset = ALIGN(struct_sz, KSMBD_DIR_INFO_ALIGNMENT); + d_info->last_entry_off_align = next_entry_offset - struct_sz;
if (next_entry_offset > d_info->out_buf_len) { d_info->out_buf_len = 0; @@ -3976,6 +3976,7 @@ int smb2_query_dir(struct ksmbd_work *work) ((struct file_directory_info *) ((char *)rsp->Buffer + d_info.last_entry_offset)) ->NextEntryOffset = 0; + d_info.data_count -= d_info.last_entry_off_align;
rsp->StructureSize = cpu_to_le16(9); rsp->OutputBufferOffset = cpu_to_le16(72); diff --git a/fs/ksmbd/vfs.h b/fs/ksmbd/vfs.h index b0d5b8feb4a36..432c947731779 100644 --- a/fs/ksmbd/vfs.h +++ b/fs/ksmbd/vfs.h @@ -86,6 +86,7 @@ struct ksmbd_dir_info { int last_entry_offset; bool hide_dot_file; int flags; + int last_entry_off_align; };
struct ksmbd_readdir_data {
From: Christian Hewitt christianshewitt@gmail.com
[ Upstream commit 76577c9137456febb05b0e17d244113196a98968 ]
Add an additional reserved memory region for the BL32 trusted firmware present in many devices that boot from Amlogic vendor u-boot.
Suggested-by: Mateusz Krzak kszaquitto@gmail.com Signed-off-by: Christian Hewitt christianshewitt@gmail.com Reviewed-by: Neil Armstrong narmstrong@baylibre.com Reviewed-by: Kevin Hilman khilman@baylibre.com Signed-off-by: Neil Armstrong narmstrong@baylibre.com Link: https://lore.kernel.org/r/20220126044954.19069-2-christianshewitt@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/amlogic/meson-gx.dtsi | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/arm64/boot/dts/amlogic/meson-gx.dtsi b/arch/arm64/boot/dts/amlogic/meson-gx.dtsi index 6b457b2c30a4b..aa14ea017a613 100644 --- a/arch/arm64/boot/dts/amlogic/meson-gx.dtsi +++ b/arch/arm64/boot/dts/amlogic/meson-gx.dtsi @@ -49,6 +49,12 @@ no-map; };
+ /* 32 MiB reserved for ARM Trusted Firmware (BL32) */ + secmon_reserved_bl32: secmon@5300000 { + reg = <0x0 0x05300000 0x0 0x2000000>; + no-map; + }; + linux,cma { compatible = "shared-dma-pool"; reusable;
From: Christian Hewitt christianshewitt@gmail.com
[ Upstream commit 08982a1b3aa2611c9c711d24825c9002d28536f4 ]
Add an additional reserved memory region for the BL32 trusted firmware present in many devices that boot from Amlogic vendor u-boot.
Signed-off-by: Christian Hewitt christianshewitt@gmail.com Reviewed-by: Neil Armstrong narmstrong@baylibre.com Reviewed-by: Kevin Hilman khilman@baylibre.com Signed-off-by: Neil Armstrong narmstrong@baylibre.com Link: https://lore.kernel.org/r/20220126044954.19069-3-christianshewitt@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi b/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi index 428449d98c0ae..a3a1ea0f21340 100644 --- a/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi +++ b/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi @@ -107,6 +107,12 @@ no-map; };
+ /* 32 MiB reserved for ARM Trusted Firmware (BL32) */ + secmon_reserved_bl32: secmon@5300000 { + reg = <0x0 0x05300000 0x0 0x2000000>; + no-map; + }; + linux,cma { compatible = "shared-dma-pool"; reusable;
From: Christian Hewitt christianshewitt@gmail.com
[ Upstream commit f26573e2bc9dfd551a0d5c6971f18cc546543312 ]
The BL32/TEE reserved-memory region is now inherited from the common family dtsi (meson-g12-common) so we can drop it from board files.
Signed-off-by: Christian Hewitt christianshewitt@gmail.com Reviewed-by: Neil Armstrong narmstrong@baylibre.com Reviewed-by: Kevin Hilman khilman@baylibre.com Signed-off-by: Neil Armstrong narmstrong@baylibre.com Link: https://lore.kernel.org/r/20220126044954.19069-4-christianshewitt@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/amlogic/meson-g12a-sei510.dts | 8 -------- arch/arm64/boot/dts/amlogic/meson-sm1-sei610.dts | 8 -------- 2 files changed, 16 deletions(-)
diff --git a/arch/arm64/boot/dts/amlogic/meson-g12a-sei510.dts b/arch/arm64/boot/dts/amlogic/meson-g12a-sei510.dts index d8838dde0f0f4..4fb31c2ba31c4 100644 --- a/arch/arm64/boot/dts/amlogic/meson-g12a-sei510.dts +++ b/arch/arm64/boot/dts/amlogic/meson-g12a-sei510.dts @@ -157,14 +157,6 @@ regulator-always-on; };
- reserved-memory { - /* TEE Reserved Memory */ - bl32_reserved: bl32@5000000 { - reg = <0x0 0x05300000 0x0 0x2000000>; - no-map; - }; - }; - sdio_pwrseq: sdio-pwrseq { compatible = "mmc-pwrseq-simple"; reset-gpios = <&gpio GPIOX_6 GPIO_ACTIVE_LOW>; diff --git a/arch/arm64/boot/dts/amlogic/meson-sm1-sei610.dts b/arch/arm64/boot/dts/amlogic/meson-sm1-sei610.dts index 427475846fc70..a5d79f2f7c196 100644 --- a/arch/arm64/boot/dts/amlogic/meson-sm1-sei610.dts +++ b/arch/arm64/boot/dts/amlogic/meson-sm1-sei610.dts @@ -203,14 +203,6 @@ regulator-always-on; };
- reserved-memory { - /* TEE Reserved Memory */ - bl32_reserved: bl32@5000000 { - reg = <0x0 0x05300000 0x0 0x2000000>; - no-map; - }; - }; - sdio_pwrseq: sdio-pwrseq { compatible = "mmc-pwrseq-simple"; reset-gpios = <&gpio GPIOX_6 GPIO_ACTIVE_LOW>;
From: Axel Rasmussen axelrasmussen@google.com
[ Upstream commit 4cbd93c3c110447adc66cb67c08af21f939ae2d7 ]
When running the pidfd_fdinfo_test on arm64, it fails for me. After some digging, the reason is that the child exits due to SIGBUS, because it overflows the 1024 byte stack we've reserved for it.
To fix the issue, increase the stack size to 8192 bytes (this number is somewhat arbitrary, and was arrived at through experimentation -- I kept doubling until the failure no longer occurred).
Also, let's make the issue easier to debug. wait_for_pid() returns an ambiguous value: it may return -1 in all of these cases:
1. waitpid() itself returned -1 2. waitpid() returned success, but we found !WIFEXITED(status). 3. The child process exited, but it did so with a -1 exit code.
There's no way for the caller to tell the difference. So, at least log which occurred, so the test runner can debug things.
While debugging this, I found that we had !WIFEXITED(), because the child exited due to a signal. This seems like a reasonably common case, so also print out whether or not we have WIFSIGNALED(), and the associated WTERMSIG() (if any). This lets us see the SIGBUS I'm fixing clearly when it occurs.
Finally, I'm suspicious of allocating the child's stack on our stack. man clone(2) suggests that the correct way to do this is with mmap(), and in particular by setting MAP_STACK. So, switch to doing it that way instead.
Signed-off-by: Axel Rasmussen axelrasmussen@google.com Acked-by: Christian Brauner brauner@kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/pidfd/pidfd.h | 13 ++++++++--- .../selftests/pidfd/pidfd_fdinfo_test.c | 22 +++++++++++++++---- 2 files changed, 28 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/pidfd/pidfd.h b/tools/testing/selftests/pidfd/pidfd.h index 01f8d3c0cf2cb..6922d6417e1cf 100644 --- a/tools/testing/selftests/pidfd/pidfd.h +++ b/tools/testing/selftests/pidfd/pidfd.h @@ -68,7 +68,7 @@ #define PIDFD_SKIP 3 #define PIDFD_XFAIL 4
-int wait_for_pid(pid_t pid) +static inline int wait_for_pid(pid_t pid) { int status, ret;
@@ -78,13 +78,20 @@ int wait_for_pid(pid_t pid) if (errno == EINTR) goto again;
+ ksft_print_msg("waitpid returned -1, errno=%d\n", errno); return -1; }
- if (!WIFEXITED(status)) + if (!WIFEXITED(status)) { + ksft_print_msg( + "waitpid !WIFEXITED, WIFSIGNALED=%d, WTERMSIG=%d\n", + WIFSIGNALED(status), WTERMSIG(status)); return -1; + }
- return WEXITSTATUS(status); + ret = WEXITSTATUS(status); + ksft_print_msg("waitpid WEXITSTATUS=%d\n", ret); + return ret; }
static inline int sys_pidfd_open(pid_t pid, unsigned int flags) diff --git a/tools/testing/selftests/pidfd/pidfd_fdinfo_test.c b/tools/testing/selftests/pidfd/pidfd_fdinfo_test.c index 22558524f71c3..3fd8e903118f5 100644 --- a/tools/testing/selftests/pidfd/pidfd_fdinfo_test.c +++ b/tools/testing/selftests/pidfd/pidfd_fdinfo_test.c @@ -12,6 +12,7 @@ #include <string.h> #include <syscall.h> #include <sys/wait.h> +#include <sys/mman.h>
#include "pidfd.h" #include "../kselftest.h" @@ -80,7 +81,10 @@ static inline int error_check(struct error *err, const char *test_name) return err->code; }
+#define CHILD_STACK_SIZE 8192 + struct child { + char *stack; pid_t pid; int fd; }; @@ -89,17 +93,22 @@ static struct child clone_newns(int (*fn)(void *), void *args, struct error *err) { static int flags = CLONE_PIDFD | CLONE_NEWPID | CLONE_NEWNS | SIGCHLD; - size_t stack_size = 1024; - char *stack[1024] = { 0 }; struct child ret;
if (!(flags & CLONE_NEWUSER) && geteuid() != 0) flags |= CLONE_NEWUSER;
+ ret.stack = mmap(NULL, CHILD_STACK_SIZE, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0); + if (ret.stack == MAP_FAILED) { + error_set(err, -1, "mmap of stack failed (errno %d)", errno); + return ret; + } + #ifdef __ia64__ - ret.pid = __clone2(fn, stack, stack_size, flags, args, &ret.fd); + ret.pid = __clone2(fn, ret.stack, CHILD_STACK_SIZE, flags, args, &ret.fd); #else - ret.pid = clone(fn, stack + stack_size, flags, args, &ret.fd); + ret.pid = clone(fn, ret.stack + CHILD_STACK_SIZE, flags, args, &ret.fd); #endif
if (ret.pid < 0) { @@ -129,6 +138,11 @@ static inline int child_join(struct child *child, struct error *err) else if (r > 0) error_set(err, r, "child %d reported: %d", child->pid, r);
+ if (munmap(child->stack, CHILD_STACK_SIZE)) { + error_set(err, -1, "munmap of child stack failed (errno %d)", errno); + r = -1; + } + return r; }
From: Axel Rasmussen axelrasmussen@google.com
[ Upstream commit e2aa5e650b07693477dff554053605976789fd68 ]
These are some trivial fixups, which were needed to build the tests with clang and -Werror. The following issues are fixed:
- Remove various unused variables. - In child_poll_leader_exit_test, clang isn't smart enough to realize syscall(SYS_exit, 0) won't return, so it complains we never return from a non-void function. Add an extra exit(0) to appease it. - In test_pidfd_poll_leader_exit, ret may be branched on despite being uninitialized, if we have !use_waitpid. Initialize it to zero to get the right behavior in that case.
Signed-off-by: Axel Rasmussen axelrasmussen@google.com Acked-by: Christian Brauner brauner@kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/clone3/clone3.c | 2 -- tools/testing/selftests/pidfd/pidfd_test.c | 6 +++--- tools/testing/selftests/pidfd/pidfd_wait.c | 5 ++--- 3 files changed, 5 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/clone3/clone3.c b/tools/testing/selftests/clone3/clone3.c index 076cf4325f783..cd4582129c7d6 100644 --- a/tools/testing/selftests/clone3/clone3.c +++ b/tools/testing/selftests/clone3/clone3.c @@ -126,8 +126,6 @@ static void test_clone3(uint64_t flags, size_t size, int expected,
int main(int argc, char *argv[]) { - pid_t pid; - uid_t uid = getuid();
ksft_print_header(); diff --git a/tools/testing/selftests/pidfd/pidfd_test.c b/tools/testing/selftests/pidfd/pidfd_test.c index 529eb700ac26a..9a2d64901d591 100644 --- a/tools/testing/selftests/pidfd/pidfd_test.c +++ b/tools/testing/selftests/pidfd/pidfd_test.c @@ -441,7 +441,6 @@ static void test_pidfd_poll_exec(int use_waitpid) { int pid, pidfd = 0; int status, ret; - pthread_t t1; time_t prog_start = time(NULL); const char *test_name = "pidfd_poll check for premature notification on child thread exec";
@@ -500,13 +499,14 @@ static int child_poll_leader_exit_test(void *args) */ *child_exit_secs = time(NULL); syscall(SYS_exit, 0); + /* Never reached, but appeases compiler thinking we should return. */ + exit(0); }
static void test_pidfd_poll_leader_exit(int use_waitpid) { int pid, pidfd = 0; - int status, ret; - time_t prog_start = time(NULL); + int status, ret = 0; const char *test_name = "pidfd_poll check for premature notification on non-empty" "group leader exit";
diff --git a/tools/testing/selftests/pidfd/pidfd_wait.c b/tools/testing/selftests/pidfd/pidfd_wait.c index be2943f072f60..17999e082aa71 100644 --- a/tools/testing/selftests/pidfd/pidfd_wait.c +++ b/tools/testing/selftests/pidfd/pidfd_wait.c @@ -39,7 +39,7 @@ static int sys_waitid(int which, pid_t pid, siginfo_t *info, int options,
TEST(wait_simple) { - int pidfd = -1, status = 0; + int pidfd = -1; pid_t parent_tid = -1; struct clone_args args = { .parent_tid = ptr_to_u64(&parent_tid), @@ -47,7 +47,6 @@ TEST(wait_simple) .flags = CLONE_PIDFD | CLONE_PARENT_SETTID, .exit_signal = SIGCHLD, }; - int ret; pid_t pid; siginfo_t info = { .si_signo = 0, @@ -88,7 +87,7 @@ TEST(wait_simple)
TEST(wait_states) { - int pidfd = -1, status = 0; + int pidfd = -1; pid_t parent_tid = -1; struct clone_args args = { .parent_tid = ptr_to_u64(&parent_tid),
From: Shakeel Butt shakeelb@google.com
[ Upstream commit 0a3f1e0beacf6cc8ae5f846b0641c1df476e83d6 ]
On an overcommitted system which is running multiple workloads of varying priorities, it is preferred to trigger an oom-killer to kill a low priority workload than to let the high priority workload receiving ENOMEMs. On our memory overcommitted systems, we are seeing a lot of ENOMEMs instead of oom-kills because io_uring_setup callchain is using __GFP_NORETRY gfp flag which avoids the oom-killer. Let's remove it and allow the oom-killer to kill a lower priority job.
Signed-off-by: Shakeel Butt shakeelb@google.com Link: https://lore.kernel.org/r/20220125051736.2981459-1-shakeelb@google.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 993913c585fbf..21fc8ce9405d3 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -8820,10 +8820,9 @@ static void io_mem_free(void *ptr)
static void *io_mem_alloc(size_t size) { - gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN | __GFP_COMP | - __GFP_NORETRY | __GFP_ACCOUNT; + gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN | __GFP_COMP;
- return (void *) __get_free_pages(gfp_flags, get_order(size)); + return (void *) __get_free_pages(gfp, get_order(size)); }
static unsigned long rings_size(unsigned sq_entries, unsigned cq_entries,
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit d6ebb17ccc7b37872a32bc25b4a21f1e5af8c7e3 ]
Testing on various upcoming OEM systems shows commit 7b167c4cb48e ("ACPI: PM: Only mark EC GPE for wakeup on Intel systems") was short sighted and the symptoms were indicative of other problems. Some OEMs do have the dedicated GPIOs for the power button but also rely upon an interrupt to the EC SCI to let the lid work.
The original commit showed spurious activity on Lenovo systems: * On both Lenovo T14 and P14s the keyboard wakeup doesn't work, and sometimes the power button event doesn't work.
This was confirmed on my end at that time.
However further development in the kernel showed that the issue was actually the IRQ for the GPIO controller was also shared with the EC SCI. This was actually fixed by commit 2d54067fcd23 ("pinctrl: amd: Fix wakeups when IRQ is shared with SCI").
The original commit also showed problems with AC adapter: * On HP 635 G7 detaching or attaching AC during suspend will cause the system not to wakeup * On Asus vivobook to prevent detaching AC causing resume problems * On Lenovo 14ARE05 to prevent detaching AC causing resume problems * On HP ENVY x360 to prevent detaching AC causing resume problems
Detaching AC adapter causing problems appears to have been a problem because the EC SCI went off to notify the OS of the power adapter change but the SCI was ignored and there was no other way to wake up this system since GPIO controller wasn't properly enabled. The wakeups were fixed by enabling the GPIO controller in commit acd47b9f28e5 ("pinctrl: amd: Handle wake-up interrupt").
I've confirmed on a variety of OEM notebooks with the following test
1) echo 1 | sudo tee /sys/power/pm_debug_messages 2) sudo systemctl suspend 3) unplug AC adapter, make sure system is still asleep 4) wake system from lid (which is provided by ACPI SCI on some of them) 5) dmesg a) see the EC GPE dispatched, timekeeping for X seconds (matching ~time until AC adapter plug out) b) see timekeeping for Y seconds until woke (matching ~time from AC adapter until lid event) 6) Look at /sys/kernel/debug/amd_pmc/s0ix_stats "Time (in us) in S0i3" = X + Y - firmware processing time
Signed-off-by: Mario Limonciello mario.limonciello@amd.com Tested-by: Kai-Heng Feng kai.heng.feng@canonical.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/x86/s2idle.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c index 1c48358b43ba3..e0185e841b2a3 100644 --- a/drivers/acpi/x86/s2idle.c +++ b/drivers/acpi/x86/s2idle.c @@ -424,15 +424,11 @@ static int lps0_device_attach(struct acpi_device *adev, mem_sleep_current = PM_SUSPEND_TO_IDLE;
/* - * Some Intel based LPS0 systems, like ASUS Zenbook UX430UNR/i7-8550U don't - * use intel-hid or intel-vbtn but require the EC GPE to be enabled while - * suspended for certain wakeup devices to work, so mark it as wakeup-capable. - * - * Only enable on !AMD as enabling this universally causes problems for a number - * of AMD based systems. + * Some LPS0 systems, like ASUS Zenbook UX430UNR/i7-8550U, require the + * EC GPE to be enabled while suspended for certain wakeup devices to + * work, so mark it as wakeup-capable. */ - if (!acpi_s2idle_vendor_amd()) - acpi_ec_mark_gpe_for_wake(); + acpi_ec_mark_gpe_for_wake();
return 0; }
From: Brenda Streiff brenda.streiff@ni.com
[ Upstream commit 8a4c5b2a6d8ea079fa36034e8167de87ab6f8880 ]
The 'shell' built-in only returns the first 256 bytes of the command's output. In some cases, 'shell' is used to return a path; by bumping up the buffer size to 4096 this lets us capture up to PATH_MAX.
The specific case where I ran into this was due to commit 1e860048c53e ("gcc-plugins: simplify GCC plugin-dev capability test"). After this change, we now use `$(shell,$(CC) -print-file-name=plugin)` to return a path; if the gcc path is particularly long, then the path ends up truncated at the 256 byte mark, which makes the HAVE_GCC_PLUGINS depends test always fail.
Signed-off-by: Brenda Streiff brenda.streiff@ni.com Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/kconfig/preprocess.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/kconfig/preprocess.c b/scripts/kconfig/preprocess.c index 0590f86df6e40..748da578b418c 100644 --- a/scripts/kconfig/preprocess.c +++ b/scripts/kconfig/preprocess.c @@ -141,7 +141,7 @@ static char *do_lineno(int argc, char *argv[]) static char *do_shell(int argc, char *argv[]) { FILE *p; - char buf[256]; + char buf[4096]; char *cmd; size_t nread; int i;
From: Zoltán Böszörményi zboszor@gmail.com
[ Upstream commit c8ea23d5fa59f28302d4e3370c75d9c308e64410 ]
This device is a CF card, or possibly an SSD in CF form factor. It supports NCQ and high speed DMA.
While it also advertises TRIM support, I/O errors are reported when the discard mount option fstrim is used. TRIM also fails when disabling NCQ and not just as an NCQ command.
TRIM must be disabled for this device.
Signed-off-by: Zoltán Böszörményi zboszor@gmail.com Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/libata-core.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index 4d848cfc406fe..24b67d78cb83d 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -4014,6 +4014,7 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
/* devices that don't properly handle TRIM commands */ { "SuperSSpeed S238*", NULL, ATA_HORKAGE_NOTRIM, }, + { "M88V29*", NULL, ATA_HORKAGE_NOTRIM, },
/* * As defined, the DRAT (Deterministic Read After Trim) and RZAT
From: Jae Hyun Yoo jae.hyun.yoo@linux.intel.com
[ Upstream commit 301a5d3ad2432d7829f59432ca0a93a6defbb9a1 ]
Add a checking code when it gets -EPROBE_DEFER while getting a clock resource. In this case, it doesn't need to print out an error message because the probing will be re-visited.
Signed-off-by: Jae Hyun Yoo jae.hyun.yoo@linux.intel.com Signed-off-by: Joel Stanley joel@jms.id.au Reviewed-by: Andrew Jeffery andrew@aj.id.au Reviewed-by: Iwona Winiarska iwona.winiarska@intel.com Link: https://lore.kernel.org/r/20211104173709.222912-1-jae.hyun.yoo@intel.com Link: https://lore.kernel.org/r/20220201070118.196372-1-joel@jms.id.au' Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/aspeed/aspeed-lpc-ctrl.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/soc/aspeed/aspeed-lpc-ctrl.c b/drivers/soc/aspeed/aspeed-lpc-ctrl.c index 72771e018c42e..258894ed234b3 100644 --- a/drivers/soc/aspeed/aspeed-lpc-ctrl.c +++ b/drivers/soc/aspeed/aspeed-lpc-ctrl.c @@ -306,10 +306,9 @@ static int aspeed_lpc_ctrl_probe(struct platform_device *pdev) }
lpc_ctrl->clk = devm_clk_get(dev, NULL); - if (IS_ERR(lpc_ctrl->clk)) { - dev_err(dev, "couldn't get clock\n"); - return PTR_ERR(lpc_ctrl->clk); - } + if (IS_ERR(lpc_ctrl->clk)) + return dev_err_probe(dev, PTR_ERR(lpc_ctrl->clk), + "couldn't get clock\n"); rc = clk_prepare_enable(lpc_ctrl->clk); if (rc) { dev_err(dev, "couldn't enable clock\n");
From: Dan Aloni dan.aloni@vastdata.com
[ Upstream commit a9c10b5b3b67b3750a10c8b089b2e05f5e176e33 ]
If there are failures then we must not leave the non-NULL pointers with the error value, otherwise `rpcrdma_ep_destroy` gets confused and tries free them, resulting in an Oops.
Signed-off-by: Dan Aloni dan.aloni@vastdata.com Acked-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sunrpc/xprtrdma/verbs.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c index aaec3c9be8db6..1295f9ab839fd 100644 --- a/net/sunrpc/xprtrdma/verbs.c +++ b/net/sunrpc/xprtrdma/verbs.c @@ -438,6 +438,7 @@ static int rpcrdma_ep_create(struct rpcrdma_xprt *r_xprt) IB_POLL_WORKQUEUE); if (IS_ERR(ep->re_attr.send_cq)) { rc = PTR_ERR(ep->re_attr.send_cq); + ep->re_attr.send_cq = NULL; goto out_destroy; }
@@ -446,6 +447,7 @@ static int rpcrdma_ep_create(struct rpcrdma_xprt *r_xprt) IB_POLL_WORKQUEUE); if (IS_ERR(ep->re_attr.recv_cq)) { rc = PTR_ERR(ep->re_attr.recv_cq); + ep->re_attr.recv_cq = NULL; goto out_destroy; } ep->re_receive_count = 0; @@ -484,6 +486,7 @@ static int rpcrdma_ep_create(struct rpcrdma_xprt *r_xprt) ep->re_pd = ib_alloc_pd(device, 0); if (IS_ERR(ep->re_pd)) { rc = PTR_ERR(ep->re_pd); + ep->re_pd = NULL; goto out_destroy; }
From: Sascha Hauer s.hauer@pengutronix.de
[ Upstream commit c0cfbb122275da1b726481de5a8cffeb24e6322b ]
The driver returns an error when devm_phy_optional_get() fails leaving the previously enabled clock turned on. Change order and enable the clock only after the phy has been acquired.
Signed-off-by: Sascha Hauer s.hauer@pengutronix.de Signed-off-by: Heiko Stuebner heiko@sntech.de Link: https://patchwork.freedesktop.org/patch/msgid/20220126145549.617165-3-s.haue... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c b/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c index 830bdd5e9b7ce..8677c82716784 100644 --- a/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c +++ b/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c @@ -529,13 +529,6 @@ static int dw_hdmi_rockchip_bind(struct device *dev, struct device *master, return ret; }
- ret = clk_prepare_enable(hdmi->vpll_clk); - if (ret) { - DRM_DEV_ERROR(hdmi->dev, "Failed to enable HDMI vpll: %d\n", - ret); - return ret; - } - hdmi->phy = devm_phy_optional_get(dev, "hdmi"); if (IS_ERR(hdmi->phy)) { ret = PTR_ERR(hdmi->phy); @@ -544,6 +537,13 @@ static int dw_hdmi_rockchip_bind(struct device *dev, struct device *master, return ret; }
+ ret = clk_prepare_enable(hdmi->vpll_clk); + if (ret) { + DRM_DEV_ERROR(hdmi->dev, "Failed to enable HDMI vpll: %d\n", + ret); + return ret; + } + drm_encoder_helper_add(encoder, &dw_hdmi_rockchip_encoder_helper_funcs); drm_simple_encoder_init(drm, encoder, DRM_MODE_ENCODER_TMDS);
From: JaeSang Yoo js.yoo.5b@gmail.com
[ Upstream commit 3203ce39ac0b2a57a84382ec184c7d4a0bede175 ]
The kernel parameter "tp_printk_stop_on_boot" starts with "tp_printk" which is the same as another kernel parameter "tp_printk". If "tp_printk" setup is called before the "tp_printk_stop_on_boot", it will override the latter and keep it from being set.
This is similar to other kernel parameter issues, such as: Commit 745a600cf1a6 ("um: console: Ignore console= option") or init/do_mounts.c:45 (setup function of "ro" kernel param)
Fix it by checking for a "_" right after the "tp_printk" and if that exists do not process the parameter.
Link: https://lkml.kernel.org/r/20220208195421.969326-1-jsyoo5b@gmail.com
Signed-off-by: JaeSang Yoo jsyoo5b@gmail.com [ Fixed up change log and added space after if condition ] Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 51a87a67e2abe..618c20ce2479d 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -252,6 +252,10 @@ __setup("trace_clock=", set_trace_boot_clock);
static int __init set_tracepoint_printk(char *str) { + /* Ignore the "tp_printk_stop_on_boot" param */ + if (*str == '_') + return 0; + if ((strcmp(str, "=0") != 0 && strcmp(str, "=off") != 0)) tracepoint_printk = 1; return 1;
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 03ad3093c7c069d6ab4403730009ebafeea9ee37 ]
A number of BIOS versions have a problem with the watermarks table not being configured properly. This manifests as a very scary looking warning during resume from s0i3. This should be harmless in most cases and is well understood, so decrease the assertion to a clearer warning about the problem.
Reviewed-by: Harry Wentland harry.wentland@amd.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c index 162ae71861247..21d2cbc3cbb20 100644 --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_smu.c @@ -120,7 +120,11 @@ int dcn31_smu_send_msg_with_param( result = dcn31_smu_wait_for_response(clk_mgr, 10, 200000);
if (result == VBIOSSMC_Result_Failed) { - ASSERT(0); + if (msg_id == VBIOSSMC_MSG_TransferTableDram2Smu && + param == TABLE_WATERMARKS) + DC_LOG_WARNING("Watermarks table not configured properly by SMU"); + else + ASSERT(0); REG_WRITE(MP1_SMN_C2PMSG_91, VBIOSSMC_Result_OK); return -1; }
From: Roman Li Roman.Li@amd.com
[ Upstream commit 328e34a5ad227399391891d454043e5d73e598d2 ]
[Why] pflip interrupt order are mapped 1 to 1 to otg id. e.g. if irq_src=26 corresponds to otg0 then 27->otg1, 28->otg2...
Linux DM registers pflip interrupts per number of crtcs. In fused pipe case crtc numbers can be less than otg id.
e.g. if one pipe out of 3(otg#0-2) is fused adev->mode_info.num_crtc=2 so DM only registers irq_src 26,27. This is a bug since if pipe#2 remains unfused DM never gets otg2 pflip interrupt (irq_src=28) That may results in gfx failure due to pflip timeout.
[How] Register pflip interrupts per max num of otg instead of num_crtc
Signed-off-by: Roman Li Roman.Li@amd.com Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- drivers/gpu/drm/amd/display/dc/core/dc.c | 2 ++ drivers/gpu/drm/amd/display/dc/dc.h | 1 + 3 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 16556ae892d4a..5ae9b8133d6da 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -3230,7 +3230,7 @@ static int dcn10_register_irq_handlers(struct amdgpu_device *adev)
/* Use GRPH_PFLIP interrupt */ for (i = DCN_1_0__SRCID__HUBP0_FLIP_INTERRUPT; - i <= DCN_1_0__SRCID__HUBP0_FLIP_INTERRUPT + adev->mode_info.num_crtc - 1; + i <= DCN_1_0__SRCID__HUBP0_FLIP_INTERRUPT + dc->caps.max_otg_num - 1; i++) { r = amdgpu_irq_add_id(adev, SOC15_IH_CLIENTID_DCE, i, &adev->pageflip_irq); if (r) { diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index 1860ccc3f4f2c..4fae73478840c 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -1118,6 +1118,8 @@ struct dc *dc_create(const struct dc_init_data *init_params)
dc->caps.max_dp_protocol_version = DP_VERSION_1_4;
+ dc->caps.max_otg_num = dc->res_pool->res_cap->num_timing_generator; + if (dc->res_pool->dmcu != NULL) dc->versions.dmcu_version = dc->res_pool->dmcu->dmcu_version; } diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index 3ab52d9a82cf6..e0f58fab5e8ed 100644 --- a/drivers/gpu/drm/amd/display/dc/dc.h +++ b/drivers/gpu/drm/amd/display/dc/dc.h @@ -185,6 +185,7 @@ struct dc_caps { struct dc_color_caps color; bool vbios_lttpr_aware; bool vbios_lttpr_enable; + uint32_t max_otg_num; };
struct dc_bug_wa {
From: Dmytro Laktyushkin Dmytro.Laktyushkin@amd.com
[ Upstream commit 60fdf98a774eee244a4e00c34a9e7729b61d0f44 ]
Fix clamping to match register field size
Reviewed-by: Charlene Liu Charlene.Liu@amd.com Acked-by: Jasdeep Dhillon jdhillon@amd.com Signed-off-by: Dmytro Laktyushkin Dmytro.Laktyushkin@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../drm/amd/display/dc/dcn31/dcn31_hubbub.c | 61 ++++++++++--------- 1 file changed, 32 insertions(+), 29 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c index 90c73a1cb9861..5e3bcaf12cac4 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c @@ -138,8 +138,11 @@ static uint32_t convert_and_clamp( ret_val = wm_ns * refclk_mhz; ret_val /= 1000;
- if (ret_val > clamp_value) + if (ret_val > clamp_value) { + /* clamping WMs is abnormal, unexpected and may lead to underflow*/ + ASSERT(0); ret_val = clamp_value; + }
return ret_val; } @@ -159,7 +162,7 @@ static bool hubbub31_program_urgent_watermarks( if (safe_to_lower || watermarks->a.urgent_ns > hubbub2->watermarks.a.urgent_ns) { hubbub2->watermarks.a.urgent_ns = watermarks->a.urgent_ns; prog_wm_value = convert_and_clamp(watermarks->a.urgent_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0x3fff); REG_SET(DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_A, 0, DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_A, prog_wm_value);
@@ -193,7 +196,7 @@ static bool hubbub31_program_urgent_watermarks( if (safe_to_lower || watermarks->a.urgent_latency_ns > hubbub2->watermarks.a.urgent_latency_ns) { hubbub2->watermarks.a.urgent_latency_ns = watermarks->a.urgent_latency_ns; prog_wm_value = convert_and_clamp(watermarks->a.urgent_latency_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0x3fff); REG_SET(DCHUBBUB_ARB_REFCYC_PER_TRIP_TO_MEMORY_A, 0, DCHUBBUB_ARB_REFCYC_PER_TRIP_TO_MEMORY_A, prog_wm_value); } else if (watermarks->a.urgent_latency_ns < hubbub2->watermarks.a.urgent_latency_ns) @@ -203,7 +206,7 @@ static bool hubbub31_program_urgent_watermarks( if (safe_to_lower || watermarks->b.urgent_ns > hubbub2->watermarks.b.urgent_ns) { hubbub2->watermarks.b.urgent_ns = watermarks->b.urgent_ns; prog_wm_value = convert_and_clamp(watermarks->b.urgent_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0x3fff); REG_SET(DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_B, 0, DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_B, prog_wm_value);
@@ -237,7 +240,7 @@ static bool hubbub31_program_urgent_watermarks( if (safe_to_lower || watermarks->b.urgent_latency_ns > hubbub2->watermarks.b.urgent_latency_ns) { hubbub2->watermarks.b.urgent_latency_ns = watermarks->b.urgent_latency_ns; prog_wm_value = convert_and_clamp(watermarks->b.urgent_latency_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0x3fff); REG_SET(DCHUBBUB_ARB_REFCYC_PER_TRIP_TO_MEMORY_B, 0, DCHUBBUB_ARB_REFCYC_PER_TRIP_TO_MEMORY_B, prog_wm_value); } else if (watermarks->b.urgent_latency_ns < hubbub2->watermarks.b.urgent_latency_ns) @@ -247,7 +250,7 @@ static bool hubbub31_program_urgent_watermarks( if (safe_to_lower || watermarks->c.urgent_ns > hubbub2->watermarks.c.urgent_ns) { hubbub2->watermarks.c.urgent_ns = watermarks->c.urgent_ns; prog_wm_value = convert_and_clamp(watermarks->c.urgent_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0x3fff); REG_SET(DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_C, 0, DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_C, prog_wm_value);
@@ -281,7 +284,7 @@ static bool hubbub31_program_urgent_watermarks( if (safe_to_lower || watermarks->c.urgent_latency_ns > hubbub2->watermarks.c.urgent_latency_ns) { hubbub2->watermarks.c.urgent_latency_ns = watermarks->c.urgent_latency_ns; prog_wm_value = convert_and_clamp(watermarks->c.urgent_latency_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0x3fff); REG_SET(DCHUBBUB_ARB_REFCYC_PER_TRIP_TO_MEMORY_C, 0, DCHUBBUB_ARB_REFCYC_PER_TRIP_TO_MEMORY_C, prog_wm_value); } else if (watermarks->c.urgent_latency_ns < hubbub2->watermarks.c.urgent_latency_ns) @@ -291,7 +294,7 @@ static bool hubbub31_program_urgent_watermarks( if (safe_to_lower || watermarks->d.urgent_ns > hubbub2->watermarks.d.urgent_ns) { hubbub2->watermarks.d.urgent_ns = watermarks->d.urgent_ns; prog_wm_value = convert_and_clamp(watermarks->d.urgent_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0x3fff); REG_SET(DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_D, 0, DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_D, prog_wm_value);
@@ -325,7 +328,7 @@ static bool hubbub31_program_urgent_watermarks( if (safe_to_lower || watermarks->d.urgent_latency_ns > hubbub2->watermarks.d.urgent_latency_ns) { hubbub2->watermarks.d.urgent_latency_ns = watermarks->d.urgent_latency_ns; prog_wm_value = convert_and_clamp(watermarks->d.urgent_latency_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0x3fff); REG_SET(DCHUBBUB_ARB_REFCYC_PER_TRIP_TO_MEMORY_D, 0, DCHUBBUB_ARB_REFCYC_PER_TRIP_TO_MEMORY_D, prog_wm_value); } else if (watermarks->d.urgent_latency_ns < hubbub2->watermarks.d.urgent_latency_ns) @@ -351,7 +354,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->a.cstate_pstate.cstate_enter_plus_exit_ns; prog_wm_value = convert_and_clamp( watermarks->a.cstate_pstate.cstate_enter_plus_exit_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_A, 0, DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_A, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_ENTER_EXIT_WATERMARK_A calculated =%d\n" @@ -367,7 +370,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->a.cstate_pstate.cstate_exit_ns; prog_wm_value = convert_and_clamp( watermarks->a.cstate_pstate.cstate_exit_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_A, 0, DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_A, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_EXIT_WATERMARK_A calculated =%d\n" @@ -383,7 +386,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->a.cstate_pstate.cstate_enter_plus_exit_z8_ns; prog_wm_value = convert_and_clamp( watermarks->a.cstate_pstate.cstate_enter_plus_exit_z8_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_Z8_A, 0, DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_Z8_A, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_ENTER_WATERMARK_Z8_A calculated =%d\n" @@ -399,7 +402,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->a.cstate_pstate.cstate_exit_z8_ns; prog_wm_value = convert_and_clamp( watermarks->a.cstate_pstate.cstate_exit_z8_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_Z8_A, 0, DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_Z8_A, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_EXIT_WATERMARK_Z8_A calculated =%d\n" @@ -416,7 +419,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->b.cstate_pstate.cstate_enter_plus_exit_ns; prog_wm_value = convert_and_clamp( watermarks->b.cstate_pstate.cstate_enter_plus_exit_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_B, 0, DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_B, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_ENTER_EXIT_WATERMARK_B calculated =%d\n" @@ -432,7 +435,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->b.cstate_pstate.cstate_exit_ns; prog_wm_value = convert_and_clamp( watermarks->b.cstate_pstate.cstate_exit_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_B, 0, DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_B, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_EXIT_WATERMARK_B calculated =%d\n" @@ -448,7 +451,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->b.cstate_pstate.cstate_enter_plus_exit_z8_ns; prog_wm_value = convert_and_clamp( watermarks->b.cstate_pstate.cstate_enter_plus_exit_z8_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_Z8_B, 0, DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_Z8_B, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_ENTER_WATERMARK_Z8_B calculated =%d\n" @@ -464,7 +467,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->b.cstate_pstate.cstate_exit_z8_ns; prog_wm_value = convert_and_clamp( watermarks->b.cstate_pstate.cstate_exit_z8_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_Z8_B, 0, DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_Z8_B, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_EXIT_WATERMARK_Z8_B calculated =%d\n" @@ -481,7 +484,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->c.cstate_pstate.cstate_enter_plus_exit_ns; prog_wm_value = convert_and_clamp( watermarks->c.cstate_pstate.cstate_enter_plus_exit_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_C, 0, DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_C, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_ENTER_EXIT_WATERMARK_C calculated =%d\n" @@ -497,7 +500,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->c.cstate_pstate.cstate_exit_ns; prog_wm_value = convert_and_clamp( watermarks->c.cstate_pstate.cstate_exit_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_C, 0, DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_C, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_EXIT_WATERMARK_C calculated =%d\n" @@ -513,7 +516,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->c.cstate_pstate.cstate_enter_plus_exit_z8_ns; prog_wm_value = convert_and_clamp( watermarks->c.cstate_pstate.cstate_enter_plus_exit_z8_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_Z8_C, 0, DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_Z8_C, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_ENTER_WATERMARK_Z8_C calculated =%d\n" @@ -529,7 +532,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->c.cstate_pstate.cstate_exit_z8_ns; prog_wm_value = convert_and_clamp( watermarks->c.cstate_pstate.cstate_exit_z8_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_Z8_C, 0, DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_Z8_C, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_EXIT_WATERMARK_Z8_C calculated =%d\n" @@ -546,7 +549,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->d.cstate_pstate.cstate_enter_plus_exit_ns; prog_wm_value = convert_and_clamp( watermarks->d.cstate_pstate.cstate_enter_plus_exit_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_D, 0, DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_D, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_ENTER_EXIT_WATERMARK_D calculated =%d\n" @@ -562,7 +565,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->d.cstate_pstate.cstate_exit_ns; prog_wm_value = convert_and_clamp( watermarks->d.cstate_pstate.cstate_exit_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_D, 0, DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_D, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_EXIT_WATERMARK_D calculated =%d\n" @@ -578,7 +581,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->d.cstate_pstate.cstate_enter_plus_exit_z8_ns; prog_wm_value = convert_and_clamp( watermarks->d.cstate_pstate.cstate_enter_plus_exit_z8_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_Z8_D, 0, DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_Z8_D, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_ENTER_WATERMARK_Z8_D calculated =%d\n" @@ -594,7 +597,7 @@ static bool hubbub31_program_stutter_watermarks( watermarks->d.cstate_pstate.cstate_exit_z8_ns; prog_wm_value = convert_and_clamp( watermarks->d.cstate_pstate.cstate_exit_z8_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_Z8_D, 0, DCHUBBUB_ARB_ALLOW_SR_EXIT_WATERMARK_Z8_D, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("SR_EXIT_WATERMARK_Z8_D calculated =%d\n" @@ -625,7 +628,7 @@ static bool hubbub31_program_pstate_watermarks( watermarks->a.cstate_pstate.pstate_change_ns; prog_wm_value = convert_and_clamp( watermarks->a.cstate_pstate.pstate_change_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_DRAM_CLK_CHANGE_WATERMARK_A, 0, DCHUBBUB_ARB_ALLOW_DRAM_CLK_CHANGE_WATERMARK_A, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("DRAM_CLK_CHANGE_WATERMARK_A calculated =%d\n" @@ -642,7 +645,7 @@ static bool hubbub31_program_pstate_watermarks( watermarks->b.cstate_pstate.pstate_change_ns; prog_wm_value = convert_and_clamp( watermarks->b.cstate_pstate.pstate_change_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_DRAM_CLK_CHANGE_WATERMARK_B, 0, DCHUBBUB_ARB_ALLOW_DRAM_CLK_CHANGE_WATERMARK_B, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("DRAM_CLK_CHANGE_WATERMARK_B calculated =%d\n" @@ -659,7 +662,7 @@ static bool hubbub31_program_pstate_watermarks( watermarks->c.cstate_pstate.pstate_change_ns; prog_wm_value = convert_and_clamp( watermarks->c.cstate_pstate.pstate_change_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_DRAM_CLK_CHANGE_WATERMARK_C, 0, DCHUBBUB_ARB_ALLOW_DRAM_CLK_CHANGE_WATERMARK_C, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("DRAM_CLK_CHANGE_WATERMARK_C calculated =%d\n" @@ -676,7 +679,7 @@ static bool hubbub31_program_pstate_watermarks( watermarks->d.cstate_pstate.pstate_change_ns; prog_wm_value = convert_and_clamp( watermarks->d.cstate_pstate.pstate_change_ns, - refclk_mhz, 0x1fffff); + refclk_mhz, 0xffff); REG_SET(DCHUBBUB_ARB_ALLOW_DRAM_CLK_CHANGE_WATERMARK_D, 0, DCHUBBUB_ARB_ALLOW_DRAM_CLK_CHANGE_WATERMARK_D, prog_wm_value); DC_LOG_BANDWIDTH_CALCS("DRAM_CLK_CHANGE_WATERMARK_D calculated =%d\n"
From: Slark Xiao slark_xiao@163.com
[ Upstream commit 8ecbb179286cbc91810c16caeb3396e06305cd0c ]
Dell DW5829e same as DW5821e except the CAT level. DW5821e supports CAT16 but DW5829e supports CAT9. Also, DW5829e includes normal and eSIM type. Please see below test evidence:
T: Bus=04 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 5 Spd=5000 MxCh= 0 D: Ver= 3.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1 P: Vendor=413c ProdID=81e6 Rev=03.18 S: Manufacturer=Dell Inc. S: Product=DW5829e Snapdragon X20 LTE S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=896mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
T: Bus=04 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 7 Spd=5000 MxCh= 0 D: Ver= 3.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1 P: Vendor=413c ProdID=81e4 Rev=03.18 S: Manufacturer=Dell Inc. S: Product=DW5829e-eSIM Snapdragon X20 LTE S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=896mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
Signed-off-by: Slark Xiao slark_xiao@163.com Acked-by: Bjørn Mork bjorn@mork.no Link: https://lore.kernel.org/r/20220209024717.8564-1-slark_xiao@163.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/qmi_wwan.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index 33ada2c59952e..0c7f02ca6822b 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -1395,6 +1395,8 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x413c, 0x81d7, 0)}, /* Dell Wireless 5821e */ {QMI_FIXED_INTF(0x413c, 0x81d7, 1)}, /* Dell Wireless 5821e preproduction config */ {QMI_FIXED_INTF(0x413c, 0x81e0, 0)}, /* Dell Wireless 5821e with eSIM support*/ + {QMI_FIXED_INTF(0x413c, 0x81e4, 0)}, /* Dell Wireless 5829e with eSIM support*/ + {QMI_FIXED_INTF(0x413c, 0x81e6, 0)}, /* Dell Wireless 5829e */ {QMI_FIXED_INTF(0x03f0, 0x4e1d, 8)}, /* HP lt4111 LTE/EV-DO/HSPA+ Gobi 4G Module */ {QMI_FIXED_INTF(0x03f0, 0x9d1d, 1)}, /* HP lt4120 Snapdragon X5 LTE */ {QMI_FIXED_INTF(0x22de, 0x9061, 3)}, /* WeTelecom WPD-600N */
From: Marc St-Amand mstamand@ciena.com
[ Upstream commit 37f7860602b5b2d99fc7465f6407f403f5941988 ]
Single page and coherent memory blocks can use different DMA masks when the macb accesses physical memory directly. The kernel is clever enough to allocate pages that fit into the requested address width.
When using the ARM SMMU, the DMA mask must be the same for single pages and big coherent memory blocks. Otherwise the translation tables turn into one big mess.
[ 74.959909] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK [ 74.959989] arm-smmu fd800000.smmu: Unhandled context fault: fsr=0x402, iova=0x3165687460, fsynr=0x20001, cbfrsynra=0x877, cb=1 [ 75.173939] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK [ 75.173955] arm-smmu fd800000.smmu: Unhandled context fault: fsr=0x402, iova=0x3165687460, fsynr=0x20001, cbfrsynra=0x877, cb=1
Since using the same DMA mask does not hurt direct 1:1 physical memory mappings, this commit always aligns DMA and coherent masks.
Signed-off-by: Marc St-Amand mstamand@ciena.com Signed-off-by: Harini Katakam harini.katakam@xilinx.com Acked-by: Nicolas Ferre nicolas.ferre@microchip.com Tested-by: Conor Dooley conor.dooley@microchip.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cadence/macb_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index d13fb1d318215..d71c11a6282ec 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -4739,7 +4739,7 @@ static int macb_probe(struct platform_device *pdev)
#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT if (GEM_BFEXT(DAW64, gem_readl(bp, DCFG6))) { - dma_set_mask(&pdev->dev, DMA_BIT_MASK(44)); + dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(44)); bp->hw_dma_cap |= HW_DMA_CAP_64B; } #endif
From: Jing Leng jleng@ambarella.com
[ Upstream commit 1b9e740a81f91ae338b29ed70455719804957b80 ]
When the KCONFIG_AUTOCONFIG is specified (e.g. export \ KCONFIG_AUTOCONFIG=output/config/auto.conf), the directory of include/config/ will not be created, so kconfig can't create deps files in it and auto.conf can't be generated.
Signed-off-by: Jing Leng jleng@ambarella.com Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/kconfig/confdata.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/scripts/kconfig/confdata.c b/scripts/kconfig/confdata.c index cf72680cd7692..4a828bca071e8 100644 --- a/scripts/kconfig/confdata.c +++ b/scripts/kconfig/confdata.c @@ -983,14 +983,19 @@ static int conf_write_dep(const char *name)
static int conf_touch_deps(void) { - const char *name; + const char *name, *tmp; struct symbol *sym; int res, i;
- strcpy(depfile_path, "include/config/"); - depfile_prefix_len = strlen(depfile_path); - name = conf_get_autoconfig_name(); + tmp = strrchr(name, '/'); + depfile_prefix_len = tmp ? tmp - name + 1 : 0; + if (depfile_prefix_len + 1 > sizeof(depfile_path)) + return -1; + + strncpy(depfile_path, name, depfile_prefix_len); + depfile_path[depfile_prefix_len] = 0; + conf_read_simple(name, S_DEF_AUTO); sym_calc_value(modules_sym);
From: James Smart jsmart2021@gmail.com
commit 7f4c5a26f735dea4bbc0eb8eb9da99cda95a8563 upstream.
When connected point to point, the driver does not know the FC4's supported by the other end. In Fabrics, it can query the nameserver. Thus the driver must send PRLIs for the FC4s it supports and enable support based on the acc(ept) or rej(ect) of the respective FC4 PRLI. Currently the driver supports SCSI and NVMe PRLIs.
Unfortunately, although the behavior is per standard, many devices have come to expect only SCSI PRLIs. In this particular example, the NVMe PRLI is properly RJT'd but the target decided that it must LOGO after seeing the unexpected NVMe PRLI. The LOGO causes the sequence to restart and login is now in an infinite failure loop.
Fix the problem by having the driver, on a pt2pt link, remember NVMe PRLI accept or reject status across logout as long as the link stays "up". When retrying login, if the prior NVMe PRLI was rejected, it will not be sent on the next login.
Link: https://lore.kernel.org/r/20220212163120.15385-1-jsmart2021@gmail.com Cc: stable@vger.kernel.org # v5.4+ Reviewed-by: Ewan D. Milne emilne@redhat.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/lpfc/lpfc.h | 1 + drivers/scsi/lpfc/lpfc_attr.c | 3 +++ drivers/scsi/lpfc/lpfc_els.c | 20 +++++++++++++++++++- drivers/scsi/lpfc/lpfc_nportdisc.c | 5 +++-- 4 files changed, 26 insertions(+), 3 deletions(-)
--- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -593,6 +593,7 @@ struct lpfc_vport { #define FC_VPORT_LOGO_RCVD 0x200 /* LOGO received on vport */ #define FC_RSCN_DISCOVERY 0x400 /* Auth all devices after RSCN */ #define FC_LOGO_RCVD_DID_CHNG 0x800 /* FDISC on phys port detect DID chng*/ +#define FC_PT2PT_NO_NVME 0x1000 /* Don't send NVME PRLI */ #define FC_SCSI_SCAN_TMO 0x4000 /* scsi scan timer running */ #define FC_ABORT_DISCOVERY 0x8000 /* we want to abort discovery */ #define FC_NDISC_ACTIVE 0x10000 /* NPort discovery active */ --- a/drivers/scsi/lpfc/lpfc_attr.c +++ b/drivers/scsi/lpfc/lpfc_attr.c @@ -1315,6 +1315,9 @@ lpfc_issue_lip(struct Scsi_Host *shost) pmboxq->u.mb.mbxCommand = MBX_DOWN_LINK; pmboxq->u.mb.mbxOwner = OWN_HOST;
+ if ((vport->fc_flag & FC_PT2PT) && (vport->fc_flag & FC_PT2PT_NO_NVME)) + vport->fc_flag &= ~FC_PT2PT_NO_NVME; + mbxstatus = lpfc_sli_issue_mbox_wait(phba, pmboxq, LPFC_MBOX_TMO * 2);
if ((mbxstatus == MBX_SUCCESS) && --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -1072,7 +1072,8 @@ stop_rr_fcf_flogi:
/* FLOGI failed, so there is no fabric */ spin_lock_irq(shost->host_lock); - vport->fc_flag &= ~(FC_FABRIC | FC_PUBLIC_LOOP); + vport->fc_flag &= ~(FC_FABRIC | FC_PUBLIC_LOOP | + FC_PT2PT_NO_NVME); spin_unlock_irq(shost->host_lock);
/* If private loop, then allow max outstanding els to be @@ -4587,6 +4588,23 @@ lpfc_els_retry(struct lpfc_hba *phba, st /* Added for Vendor specifc support * Just keep retrying for these Rsn / Exp codes */ + if ((vport->fc_flag & FC_PT2PT) && + cmd == ELS_CMD_NVMEPRLI) { + switch (stat.un.b.lsRjtRsnCode) { + case LSRJT_UNABLE_TPC: + case LSRJT_INVALID_CMD: + case LSRJT_LOGICAL_ERR: + case LSRJT_CMD_UNSUPPORTED: + lpfc_printf_vlog(vport, KERN_WARNING, LOG_ELS, + "0168 NVME PRLI LS_RJT " + "reason %x port doesn't " + "support NVME, disabling NVME\n", + stat.un.b.lsRjtRsnCode); + retry = 0; + vport->fc_flag |= FC_PT2PT_NO_NVME; + goto out_retry; + } + } switch (stat.un.b.lsRjtRsnCode) { case LSRJT_UNABLE_TPC: /* The driver has a VALID PLOGI but the rport has --- a/drivers/scsi/lpfc/lpfc_nportdisc.c +++ b/drivers/scsi/lpfc/lpfc_nportdisc.c @@ -1961,8 +1961,9 @@ lpfc_cmpl_reglogin_reglogin_issue(struct * is configured try it. */ ndlp->nlp_fc4_type |= NLP_FC4_FCP; - if ((vport->cfg_enable_fc4_type == LPFC_ENABLE_BOTH) || - (vport->cfg_enable_fc4_type == LPFC_ENABLE_NVME)) { + if ((!(vport->fc_flag & FC_PT2PT_NO_NVME)) && + (vport->cfg_enable_fc4_type == LPFC_ENABLE_BOTH || + vport->cfg_enable_fc4_type == LPFC_ENABLE_NVME)) { ndlp->nlp_fc4_type |= NLP_FC4_NVME; /* We need to update the localport also */ lpfc_nvme_update_localport(vport);
From: Eliav Farber farbere@amazon.com
commit f8efca92ae509c25e0a4bd5d0a86decea4f0c41e upstream.
Do alignment logic properly and use the "ptr" local variable for calculating the remainder of the alignment.
This became an issue because struct edac_mc_layer has a size that is not zero modulo eight, and the next offset that was prepared for the private data was unaligned, causing an alignment exception.
The patch in Fixes: which broke this actually wanted to "what we actually care about is the alignment of the actual pointer that's about to be returned." But it didn't check that alignment.
Use the correct variable "ptr" for that.
[ bp: Massage commit message. ]
Fixes: 8447c4d15e35 ("edac: Do alignment logic properly in edac_align_ptr()") Signed-off-by: Eliav Farber farbere@amazon.com Signed-off-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220113100622.12783-2-farbere@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/edac/edac_mc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/edac/edac_mc.c +++ b/drivers/edac/edac_mc.c @@ -215,7 +215,7 @@ void *edac_align_ptr(void **p, unsigned else return (char *)ptr;
- r = (unsigned long)p % align; + r = (unsigned long)ptr % align;
if (r == 0) return (char *)ptr;
From: Eric W. Biederman ebiederm@xmission.com
commit 0cbae9e24fa7d6c6e9f828562f084da82217a0c5 upstream.
While examining is_ucounts_overlimit and reading the various messages I realized that is_ucounts_overlimit fails to deal with counts that may have wrapped.
Being wrapped should be a transitory state for counts and they should never be wrapped for long, but it can happen so handle it.
Cc: stable@vger.kernel.org Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") Link: https://lkml.kernel.org/r/20220216155832.680775-5-ebiederm@xmission.com Reviewed-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/ucount.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/kernel/ucount.c +++ b/kernel/ucount.c @@ -344,7 +344,8 @@ bool is_ucounts_overlimit(struct ucounts if (rlimit > LONG_MAX) max = LONG_MAX; for (iter = ucounts; iter; iter = iter->ns->ucounts) { - if (get_ucounts_value(iter, type) > max) + long val = get_ucounts_value(iter, type); + if (val < 0 || val > max) return true; max = READ_ONCE(iter->ns->ucount_max[type]); }
From: Eric W. Biederman ebiederm@xmission.com
commit 99c31f9feda41d0f10d030dc04ba106c93295aa2 upstream.
Any cred that is destined for use by commit_creds must have a non-NULL cred->ucounts field. Only curing credential construction is a NULL cred->ucounts valid. Only abort_creds, put_cred, and put_cred_rcu needs to deal with a cred with a NULL ucount. As set_cred_ucounts is non of those case don't confuse people by handling something that can not happen.
Link: https://lkml.kernel.org/r/871r4irzds.fsf_-_@disp2133 Tested-by: Yu Zhao yuzhao@google.com Reviewed-by: Alexey Gladkov legion@kernel.org Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/cred.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/kernel/cred.c +++ b/kernel/cred.c @@ -676,15 +676,14 @@ int set_cred_ucounts(struct cred *new) * This optimization is needed because alloc_ucounts() uses locks * for table lookups. */ - if (old_ucounts && old_ucounts->ns == new->user_ns && uid_eq(old_ucounts->uid, new->euid)) + if (old_ucounts->ns == new->user_ns && uid_eq(old_ucounts->uid, new->euid)) return 0;
if (!(new_ucounts = alloc_ucounts(new->user_ns, new->euid))) return -EAGAIN;
new->ucounts = new_ucounts; - if (old_ucounts) - put_ucounts(old_ucounts); + put_ucounts(old_ucounts);
return 0; }
From: Eric W. Biederman ebiederm@xmission.com
commit a55d07294f1e9b576093bdfa95422f8119941e83 upstream.
Michal Koutný mkoutny@suse.com wrote:
Tasks are associated to multiple users at once. Historically and as per setrlimit(2) RLIMIT_NPROC is enforce based on real user ID.
The commit 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") made the accounting structure "indexed" by euid and hence potentially account tasks differently.
The effective user ID may be different e.g. for setuid programs but those are exec'd into already existing task (i.e. below limit), so different accounting is moot.
Some special setresuid(2) users may notice the difference, justifying this fix.
I looked at cred->ucount and it is only used for rlimit operations that were previously stored in cred->user. Making the fact cred->ucount can refer to a different user from cred->user a bug, affecting all uses of cred->ulimit not just RLIMIT_NPROC.
Fix set_cred_ucounts to always use the real uid not the effective uid.
Further simplify set_cred_ucounts by noticing that set_cred_ucounts somehow retained a draft version of the check to see if alloc_ucounts was needed that checks the new->user and new->user_ns against the current_real_cred(). Remove that draft version of the check.
All that matters for setting the cred->ucounts are the user_ns and uid fields in the cred.
Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20220207121800.5079-4-mkoutny@suse.com Link: https://lkml.kernel.org/r/20220216155832.680775-3-ebiederm@xmission.com Reported-by: Michal Koutný mkoutny@suse.com Reviewed-by: Michal Koutný mkoutny@suse.com Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/cred.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
--- a/kernel/cred.c +++ b/kernel/cred.c @@ -665,21 +665,16 @@ EXPORT_SYMBOL(cred_fscmp);
int set_cred_ucounts(struct cred *new) { - struct task_struct *task = current; - const struct cred *old = task->real_cred; struct ucounts *new_ucounts, *old_ucounts = new->ucounts;
- if (new->user == old->user && new->user_ns == old->user_ns) - return 0; - /* * This optimization is needed because alloc_ucounts() uses locks * for table lookups. */ - if (old_ucounts->ns == new->user_ns && uid_eq(old_ucounts->uid, new->euid)) + if (old_ucounts->ns == new->user_ns && uid_eq(old_ucounts->uid, new->uid)) return 0;
- if (!(new_ucounts = alloc_ucounts(new->user_ns, new->euid))) + if (!(new_ucounts = alloc_ucounts(new->user_ns, new->uid))) return -EAGAIN;
new->ucounts = new_ucounts;
From: Eric W. Biederman ebiederm@xmission.com
commit 8f2f9c4d82f24f172ae439e5035fc1e0e4c229dd upstream.
Michal Koutný mkoutny@suse.com wrote:
It was reported that v5.14 behaves differently when enforcing RLIMIT_NPROC limit, namely, it allows one more task than previously. This is consequence of the commit 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") that missed the sharpness of equality in the forking path.
This can be fixed either by fixing the test or by moving the increment to be before the test. Fix it my moving copy_creds which contains the increment before is_ucounts_overlimit.
In the case of CLONE_NEWUSER the ucounts in the task_cred changes. The function is_ucounts_overlimit needs to use the final version of the ucounts for the new process. Which means moving the is_ucounts_overlimit test after copy_creds is necessary.
Both the test in fork and the test in set_user were semantically changed when the code moved to ucounts. The change of the test in fork was bad because it was before the increment. The test in set_user was wrong and the change to ucounts fixed it. So this fix only restores the old behavior in one lcation not two.
Link: https://lkml.kernel.org/r/20220204181144.24462-1-mkoutny@suse.com Link: https://lkml.kernel.org/r/20220216155832.680775-2-ebiederm@xmission.com Cc: stable@vger.kernel.org Reported-by: Michal Koutný mkoutny@suse.com Reviewed-by: Michal Koutný mkoutny@suse.com Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/fork.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/kernel/fork.c +++ b/kernel/fork.c @@ -2055,18 +2055,18 @@ static __latent_entropy struct task_stru #ifdef CONFIG_PROVE_LOCKING DEBUG_LOCKS_WARN_ON(!p->softirqs_enabled); #endif + retval = copy_creds(p, clone_flags); + if (retval < 0) + goto bad_fork_free; + retval = -EAGAIN; if (is_ucounts_overlimit(task_ucounts(p), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) { if (p->real_cred->user != INIT_USER && !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) - goto bad_fork_free; + goto bad_fork_cleanup_count; } current->flags &= ~PF_NPROC_EXCEEDED;
- retval = copy_creds(p, clone_flags); - if (retval < 0) - goto bad_fork_free; - /* * If multiple threads are within copy_process(), then this check * triggers too late. This doesn't hurt, the check is only there
From: Eric W. Biederman ebiederm@xmission.com
commit c16bdeb5a39ffa3f32b32f812831a2092d2a3061 upstream.
Solar Designer solar@openwall.com wrote:
I'm not aware of anyone actually running into this issue and reporting it. The systems that I personally know use suexec along with rlimits still run older/distro kernels, so would not yet be affected.
So my mention was based on my understanding of how suexec works, and code review. Specifically, Apache httpd has the setting RLimitNPROC, which makes it set RLIMIT_NPROC:
https://httpd.apache.org/docs/2.4/mod/core.html#rlimitnproc
The above documentation for it includes:
"This applies to processes forked from Apache httpd children servicing requests, not the Apache httpd children themselves. This includes CGI scripts and SSI exec commands, but not any processes forked from the Apache httpd parent, such as piped logs."
In code, there are:
./modules/generators/mod_cgid.c: ( (cgid_req.limits.limit_nproc_set) && ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, ./modules/generators/mod_cgi.c: ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, ./modules/filters/mod_ext_filter.c: rv = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, conf->limit_nproc);
For example, in mod_cgi.c this is in run_cgi_child().
I think this means an httpd child sets RLIMIT_NPROC shortly before it execs suexec, which is a SUID root program. suexec then switches to the target user and execs the CGI script.
Before 2863643fb8b9, the setuid() in suexec would set the flag, and the target user's process count would be checked against RLIMIT_NPROC on execve(). After 2863643fb8b9, the setuid() in suexec wouldn't set the flag because setuid() is (naturally) called when the process is still running as root (thus, has those limits bypass capabilities), and accordingly execve() would not check the target user's process count against RLIMIT_NPROC.
In commit 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds") capable calls were added to set_user to make it more consistent with fork. Unfortunately because of call site differences those capable calls were checking the credentials of the user before set*id() instead of after set*id().
This breaks enforcement of RLIMIT_NPROC for applications that set the rlimit and then call set*id() while holding a full set of capabilities. The capabilities are only changed in the new credential in security_task_fix_setuid().
The code in apache suexec appears to follow this pattern.
Commit 909cc4ae86f3 ("[PATCH] Fix two bugs with process limits (RLIMIT_NPROC)") where this check was added describes the targes of this capability check as:
2/ When a root-owned process (e.g. cgiwrap) sets up process limits and then calls setuid, the setuid should fail if the user would then be running more than rlim_cur[RLIMIT_NPROC] processes, but it doesn't. This patch adds an appropriate test. With this patch, and per-user process limit imposed in cgiwrap really works.
So the original use case of this check also appears to match the broken pattern.
Restore the enforcement of RLIMIT_NPROC by removing the bad capable checks added in set_user. This unfortunately restores the inconsistent state the code has been in for the last 11 years, but dealing with the inconsistencies looks like a larger problem.
Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20210907213042.GA22626@openwall.com/ Link: https://lkml.kernel.org/r/20220212221412.GA29214@openwall.com Link: https://lkml.kernel.org/r/20220216155832.680775-1-ebiederm@xmission.com Fixes: 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds") History-Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Reviewed-by: Solar Designer solar@openwall.com Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sys.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/kernel/sys.c +++ b/kernel/sys.c @@ -480,8 +480,7 @@ static int set_user(struct cred *new) * failure to the execve() stage. */ if (is_ucounts_overlimit(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) && - new_user != INIT_USER && - !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) + new_user != INIT_USER) current->flags |= PF_NPROC_EXCEEDED; else current->flags &= ~PF_NPROC_EXCEEDED;
From: Eric W. Biederman ebiederm@xmission.com
commit c923a8e7edb010da67424077cbf1a6f1396ebd2e upstream.
During set*id() which cred->ucounts to charge the the current process to is not known until after set_cred_ucounts. So move the RLIMIT_NPROC checking into a new helper flag_nproc_exceeded and call flag_nproc_exceeded after set_cred_ucounts.
This is very much an arbitrary subset of the places where we currently change the RLIMIT_NPROC accounting, designed to preserve the existing logic.
Fixing the existing logic will be the subject of another series of changes.
Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20220216155832.680775-4-ebiederm@xmission.com Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sys.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
--- a/kernel/sys.c +++ b/kernel/sys.c @@ -472,6 +472,16 @@ static int set_user(struct cred *new) if (!new_user) return -EAGAIN;
+ free_uid(new->user); + new->user = new_user; + return 0; +} + +static void flag_nproc_exceeded(struct cred *new) +{ + if (new->ucounts == current_ucounts()) + return; + /* * We don't fail in case of NPROC limit excess here because too many * poorly written programs don't check set*uid() return code, assuming @@ -480,14 +490,10 @@ static int set_user(struct cred *new) * failure to the execve() stage. */ if (is_ucounts_overlimit(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) && - new_user != INIT_USER) + new->user != INIT_USER) current->flags |= PF_NPROC_EXCEEDED; else current->flags &= ~PF_NPROC_EXCEEDED; - - free_uid(new->user); - new->user = new_user; - return 0; }
/* @@ -562,6 +568,7 @@ long __sys_setreuid(uid_t ruid, uid_t eu if (retval < 0) goto error;
+ flag_nproc_exceeded(new); return commit_creds(new);
error: @@ -624,6 +631,7 @@ long __sys_setuid(uid_t uid) if (retval < 0) goto error;
+ flag_nproc_exceeded(new); return commit_creds(new);
error: @@ -703,6 +711,7 @@ long __sys_setresuid(uid_t ruid, uid_t e if (retval < 0) goto error;
+ flag_nproc_exceeded(new); return commit_creds(new);
error:
From: Eric Dumazet edumazet@google.com
commit 5740d068909676d4bdb5c9c00c37a83df7728909 upstream.
We have been living dangerously, at the mercy of malicious users, abusing TC_ACT_REPEAT, as shown by this syzpot report [1].
Add an arbitrary limit (32) to the number of times an action can return TC_ACT_REPEAT.
v2: switch the limit to 32 instead of 10. Use net_warn_ratelimited() instead of pr_err_once().
[1] (C repro available on demand)
rcu: INFO: rcu_preempt self-detected stall on CPU rcu: 1-...!: (10500 ticks this GP) idle=021/1/0x4000000000000000 softirq=5592/5592 fqs=0 (t=10502 jiffies g=5305 q=190) rcu: rcu_preempt kthread timer wakeup didn't happen for 10502 jiffies! g5305 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 rcu: Possible timer handling issue on cpu=0 timer-softirq=3527 rcu: rcu_preempt kthread starved for 10505 jiffies! g5305 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0 rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. rcu: RCU grace-period kthread stack dump: task:rcu_preempt state:I stack:29344 pid: 14 ppid: 2 flags:0x00004000 Call Trace: <TASK> context_switch kernel/sched/core.c:4986 [inline] __schedule+0xab2/0x4db0 kernel/sched/core.c:6295 schedule+0xd2/0x260 kernel/sched/core.c:6368 schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1881 rcu_gp_fqs_loop+0x186/0x810 kernel/rcu/tree.c:1963 rcu_gp_kthread+0x1de/0x320 kernel/rcu/tree.c:2136 kthread+0x2e9/0x3a0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 </TASK> rcu: Stack dump where RCU GP kthread last ran: Sending NMI from CPU 1 to CPUs 0: NMI backtrace for cpu 0 CPU: 0 PID: 3646 Comm: syz-executor358 Not tainted 5.17.0-rc3-syzkaller-00149-gbf8e59fd315f #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:rep_nop arch/x86/include/asm/vdso/processor.h:13 [inline] RIP: 0010:cpu_relax arch/x86/include/asm/vdso/processor.h:18 [inline] RIP: 0010:pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:437 [inline] RIP: 0010:__pv_queued_spin_lock_slowpath+0x3b8/0xb40 kernel/locking/qspinlock.c:508 Code: 48 89 eb c6 45 01 01 41 bc 00 80 00 00 48 c1 e9 03 83 e3 07 41 be 01 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8d 2c 01 eb 0c <f3> 90 41 83 ec 01 0f 84 72 04 00 00 41 0f b6 45 00 38 d8 7f 08 84 RSP: 0018:ffffc9000283f1b0 EFLAGS: 00000206 RAX: 0000000000000003 RBX: 0000000000000000 RCX: 1ffff1100fc0071e RDX: 0000000000000001 RSI: 0000000000000201 RDI: 0000000000000000 RBP: ffff88807e0038f0 R08: 0000000000000001 R09: ffffffff8ffbf9ff R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000004c1e R13: ffffed100fc0071e R14: 0000000000000001 R15: ffff8880b9c3aa80 FS: 00005555562bf300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffdbfef12b8 CR3: 00000000723c2000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:591 [inline] queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline] queued_spin_lock include/asm-generic/qspinlock.h:85 [inline] do_raw_spin_lock+0x200/0x2b0 kernel/locking/spinlock_debug.c:115 spin_lock_bh include/linux/spinlock.h:354 [inline] sch_tree_lock include/net/sch_generic.h:610 [inline] sch_tree_lock include/net/sch_generic.h:605 [inline] prio_tune+0x3b9/0xb50 net/sched/sch_prio.c:211 prio_init+0x5c/0x80 net/sched/sch_prio.c:244 qdisc_create.constprop.0+0x44a/0x10f0 net/sched/sch_api.c:1253 tc_modify_qdisc+0x4c5/0x1980 net/sched/sch_api.c:1660 rtnetlink_rcv_msg+0x413/0xb80 net/core/rtnetlink.c:5594 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x539/0x7e0 net/netlink/af_netlink.c:1343 netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:725 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2413 ___sys_sendmsg+0xf3/0x170 net/socket.c:2467 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2496 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f7ee98aae99 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffdbfef12d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007ffdbfef1300 RCX: 00007f7ee98aae99 RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000003 RBP: 0000000000000000 R08: 000000000000000d R09: 000000000000000d R10: 000000000000000d R11: 0000000000000246 R12: 00007ffdbfef12f0 R13: 00000000000f4240 R14: 000000000004ca47 R15: 00007ffdbfef12e4 </TASK> INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 2.293 msecs NMI backtrace for cpu 1 CPU: 1 PID: 3260 Comm: kworker/1:3 Not tainted 5.17.0-rc3-syzkaller-00149-gbf8e59fd315f #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: mld mld_ifc_work Call Trace: <IRQ> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:111 nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline] rcu_dump_cpu_stacks+0x25e/0x3f0 kernel/rcu/tree_stall.h:343 print_cpu_stall kernel/rcu/tree_stall.h:604 [inline] check_cpu_stall kernel/rcu/tree_stall.h:688 [inline] rcu_pending kernel/rcu/tree.c:3919 [inline] rcu_sched_clock_irq.cold+0x5c/0x759 kernel/rcu/tree.c:2617 update_process_times+0x16d/0x200 kernel/time/timer.c:1785 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226 tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1428 __run_hrtimer kernel/time/hrtimer.c:1685 [inline] __hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1749 hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline] __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638 RIP: 0010:__sanitizer_cov_trace_const_cmp4+0xc/0x70 kernel/kcov.c:286 Code: 00 00 00 48 89 7c 30 e8 48 89 4c 30 f0 4c 89 54 d8 20 48 89 10 5b c3 0f 1f 80 00 00 00 00 41 89 f8 bf 03 00 00 00 4c 8b 14 24 <89> f1 65 48 8b 34 25 00 70 02 00 e8 14 f9 ff ff 84 c0 74 4b 48 8b RSP: 0018:ffffc90002c5eea8 EFLAGS: 00000246 RAX: 0000000000000007 RBX: ffff88801c625800 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: ffff8880137d3100 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff874fcd88 R11: 0000000000000000 R12: ffff88801d692dc0 R13: ffff8880137d3104 R14: 0000000000000000 R15: ffff88801d692de8 tcf_police_act+0x358/0x11d0 net/sched/act_police.c:256 tcf_action_exec net/sched/act_api.c:1049 [inline] tcf_action_exec+0x1a6/0x530 net/sched/act_api.c:1026 tcf_exts_exec include/net/pkt_cls.h:326 [inline] route4_classify+0xef0/0x1400 net/sched/cls_route.c:179 __tcf_classify net/sched/cls_api.c:1549 [inline] tcf_classify+0x3e8/0x9d0 net/sched/cls_api.c:1615 prio_classify net/sched/sch_prio.c:42 [inline] prio_enqueue+0x3a7/0x790 net/sched/sch_prio.c:75 dev_qdisc_enqueue+0x40/0x300 net/core/dev.c:3668 __dev_xmit_skb net/core/dev.c:3756 [inline] __dev_queue_xmit+0x1f61/0x3660 net/core/dev.c:4081 neigh_hh_output include/net/neighbour.h:533 [inline] neigh_output include/net/neighbour.h:547 [inline] ip_finish_output2+0x14dc/0x2170 net/ipv4/ip_output.c:228 __ip_finish_output net/ipv4/ip_output.c:306 [inline] __ip_finish_output+0x396/0x650 net/ipv4/ip_output.c:288 ip_finish_output+0x32/0x200 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip_output+0x196/0x310 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:451 [inline] ip_local_out+0xaf/0x1a0 net/ipv4/ip_output.c:126 iptunnel_xmit+0x628/0xa50 net/ipv4/ip_tunnel_core.c:82 geneve_xmit_skb drivers/net/geneve.c:966 [inline] geneve_xmit+0x10c8/0x3530 drivers/net/geneve.c:1077 __netdev_start_xmit include/linux/netdevice.h:4683 [inline] netdev_start_xmit include/linux/netdevice.h:4697 [inline] xmit_one net/core/dev.c:3473 [inline] dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3489 __dev_queue_xmit+0x2985/0x3660 net/core/dev.c:4116 neigh_hh_output include/net/neighbour.h:533 [inline] neigh_output include/net/neighbour.h:547 [inline] ip6_finish_output2+0xf7a/0x14f0 net/ipv6/ip6_output.c:126 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline] __ip6_finish_output+0x61e/0xe90 net/ipv6/ip6_output.c:170 ip6_finish_output+0x32/0x200 net/ipv6/ip6_output.c:201 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:224 dst_output include/net/dst.h:451 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] mld_sendpack+0x9a3/0xe40 net/ipv6/mcast.c:1826 mld_send_cr net/ipv6/mcast.c:2127 [inline] mld_ifc_work+0x71c/0xdc0 net/ipv6/mcast.c:2659 process_one_work+0x9ac/0x1650 kernel/workqueue.c:2307 worker_thread+0x657/0x1110 kernel/workqueue.c:2454 kthread+0x2e9/0x3a0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 </TASK> ---------------- Code disassembly (best guess): 0: 48 89 eb mov %rbp,%rbx 3: c6 45 01 01 movb $0x1,0x1(%rbp) 7: 41 bc 00 80 00 00 mov $0x8000,%r12d d: 48 c1 e9 03 shr $0x3,%rcx 11: 83 e3 07 and $0x7,%ebx 14: 41 be 01 00 00 00 mov $0x1,%r14d 1a: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 21: fc ff df 24: 4c 8d 2c 01 lea (%rcx,%rax,1),%r13 28: eb 0c jmp 0x36 * 2a: f3 90 pause <-- trapping instruction 2c: 41 83 ec 01 sub $0x1,%r12d 30: 0f 84 72 04 00 00 je 0x4a8 36: 41 0f b6 45 00 movzbl 0x0(%r13),%eax 3b: 38 d8 cmp %bl,%al 3d: 7f 08 jg 0x47 3f: 84 .byte 0x84
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Acked-by: Jamal Hadi Salim jhs@mojatatu.com Cc: Cong Wang xiyou.wangcong@gmail.com Cc: Jiri Pirko jiri@resnulli.us Reported-by: syzbot syzkaller@googlegroups.com Link: https://lore.kernel.org/r/20220215235305.3272331-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/act_api.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
--- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -728,15 +728,24 @@ int tcf_action_exec(struct sk_buff *skb, restart_act_graph: for (i = 0; i < nr_actions; i++) { const struct tc_action *a = actions[i]; + int repeat_ttl;
if (jmp_prgcnt > 0) { jmp_prgcnt -= 1; continue; } + + repeat_ttl = 32; repeat: ret = a->ops->act(skb, a, res); - if (ret == TC_ACT_REPEAT) - goto repeat; /* we need a ttl - JHS */ + + if (unlikely(ret == TC_ACT_REPEAT)) { + if (--repeat_ttl != 0) + goto repeat; + /* suspicious opcode, stop pipeline */ + net_warn_ratelimited("TC_ACT_REPEAT abuse ?\n"); + return TC_ACT_OK; + }
if (TC_ACT_EXT_CMP(ret, TC_ACT_JUMP)) { jmp_prgcnt = ret & TCA_ACT_MAX_PRIO_MASK;
From: Jiasheng Jiang jiasheng@iscas.ac.cn
commit 2d21543efe332cd8c8f212fb7d365bc8b0690bfa upstream.
Because of the possible failure of the dma_supported(), the dma_set_mask_and_coherent() may return error num. Therefore, it should be better to check it and return the error if fails.
Fixes: dc312349e875 ("dmaengine: rcar-dmac: Widen DMA mask to 40 bits") Signed-off-by: Jiasheng Jiang jiasheng@iscas.ac.cn Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Link: https://lore.kernel.org/r/20220106030939.2644320-1-jiasheng@iscas.ac.cn Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/sh/rcar-dmac.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/dma/sh/rcar-dmac.c +++ b/drivers/dma/sh/rcar-dmac.c @@ -1869,7 +1869,9 @@ static int rcar_dmac_probe(struct platfo dmac->dev = &pdev->dev; platform_set_drvdata(pdev, dmac); dma_set_max_seg_size(dmac->dev, RCAR_DMATCR_MASK); - dma_set_mask_and_coherent(dmac->dev, DMA_BIT_MASK(40)); + ret = dma_set_mask_and_coherent(dmac->dev, DMA_BIT_MASK(40)); + if (ret) + return ret;
ret = rcar_dmac_parse_of(&pdev->dev, dmac); if (ret < 0)
From: Miaoqian Lin linmq006@gmail.com
commit e831c7aba950f3ae94002b10321279654525e5ec upstream.
The pm_runtime_enable will increase power disable depth. If the probe fails, we should use pm_runtime_disable() to balance pm_runtime_enable().
Fixes: 4f3ceca254e0 ("dmaengine: stm32-dmamux: Add PM Runtime support") Signed-off-by: Miaoqian Lin linmq006@gmail.com Reviewed-by: Amelie Delaunay amelie.delaunay@foss.st.com Link: https://lore.kernel.org/r/20220108085336.11992-1-linmq006@gmail.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/stm32-dmamux.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/dma/stm32-dmamux.c +++ b/drivers/dma/stm32-dmamux.c @@ -292,10 +292,12 @@ static int stm32_dmamux_probe(struct pla ret = of_dma_router_register(node, stm32_dmamux_route_allocate, &stm32_dmamux->dmarouter); if (ret) - goto err_clk; + goto pm_disable;
return 0;
+pm_disable: + pm_runtime_disable(&pdev->dev); err_clk: clk_disable_unprepare(stm32_dmamux->clk);
From: Jiasheng Jiang jiasheng@iscas.ac.cn
commit da2ad87fba0891576aadda9161b8505fde81a84d upstream.
As the possible failure of the dma_set_max_seg_size(), it should be better to check the return value of the dma_set_max_seg_size().
Fixes: 97d49c59e219 ("dmaengine: rcar-dmac: set scatter/gather max segment size") Reported-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Jiasheng Jiang jiasheng@iscas.ac.cn Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Link: https://lore.kernel.org/r/20220111011239.452837-1-jiasheng@iscas.ac.cn Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/sh/rcar-dmac.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/dma/sh/rcar-dmac.c +++ b/drivers/dma/sh/rcar-dmac.c @@ -1868,7 +1868,10 @@ static int rcar_dmac_probe(struct platfo
dmac->dev = &pdev->dev; platform_set_drvdata(pdev, dmac); - dma_set_max_seg_size(dmac->dev, RCAR_DMATCR_MASK); + ret = dma_set_max_seg_size(dmac->dev, RCAR_DMATCR_MASK); + if (ret) + return ret; + ret = dma_set_mask_and_coherent(dmac->dev, DMA_BIT_MASK(40)); if (ret) return ret;
From: Christian Brauner brauner@kernel.org
commit d1c56bfdaca465bd1d0e913053a9c5cafe8b6a6c upstream.
The test treated zero as a successful run when it really should treat non-zero as a successful run. A mount's idmapping can't change once it has been attached to the filesystem.
Link: https://lore.kernel.org/r/20220203131411.3093040-2-brauner@kernel.org Fixes: 01eadc8dd96d ("tests: add mount_setattr() selftests") Cc: Seth Forshee seth.forshee@digitalocean.com Cc: Christoph Hellwig hch@lst.de Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/mount_setattr/mount_setattr_test.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/mount_setattr/mount_setattr_test.c +++ b/tools/testing/selftests/mount_setattr/mount_setattr_test.c @@ -1236,7 +1236,7 @@ static int get_userns_fd(unsigned long n }
/** - * Validate that an attached mount in our mount namespace can be idmapped. + * Validate that an attached mount in our mount namespace cannot be idmapped. * (The kernel enforces that the mount's mount namespace and the caller's mount * namespace match.) */ @@ -1259,7 +1259,7 @@ TEST_F(mount_setattr_idmapped, attached_
attr.userns_fd = get_userns_fd(0, 10000, 10000); ASSERT_GE(attr.userns_fd, 0); - ASSERT_EQ(sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr)), 0); + ASSERT_NE(sys_mount_setattr(open_tree_fd, "", AT_EMPTY_PATH, &attr, sizeof(attr)), 0); ASSERT_EQ(close(attr.userns_fd), 0); ASSERT_EQ(close(open_tree_fd), 0); }
From: Vladimir Zapolskiy vladimir.zapolskiy@linaro.org
commit a0d48505a1d68e27220369e2dd1e3573a2f362d2 upstream.
If i2c_add_adapter() fails to add an I2C adapter found on QCOM CCI controller, on error path i2c_del_adapter() is still called.
Fortunately there is a sanity check in the I2C core, so the only visible implication is a printed debug level message:
i2c-core: attempting to delete unregistered adapter [Qualcomm-CCI]
Nevertheless it would be reasonable to correct the probe error path.
Fixes: e517526195de ("i2c: Add Qualcomm CCI I2C driver") Signed-off-by: Vladimir Zapolskiy vladimir.zapolskiy@linaro.org Reviewed-by: Robert Foss robert.foss@linaro.org Reviewed-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-qcom-cci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/i2c/busses/i2c-qcom-cci.c +++ b/drivers/i2c/busses/i2c-qcom-cci.c @@ -655,7 +655,7 @@ static int cci_probe(struct platform_dev return 0;
error_i2c: - for (; i >= 0; i--) { + for (--i ; i >= 0; i--) { if (cci->master[i].cci) i2c_del_adapter(&cci->master[i].adap); }
From: Vladimir Zapolskiy vladimir.zapolskiy@linaro.org
commit 02a4a69667a2ad32f3b52ca906f19628fbdd8a01 upstream.
There is a minor chance for a race, if a pointer to an i2c-bus subnode is stored and then reused after releasing its reference, and it would be sufficient to get one more reference under a loop over children subnodes.
Fixes: e517526195de ("i2c: Add Qualcomm CCI I2C driver") Signed-off-by: Vladimir Zapolskiy vladimir.zapolskiy@linaro.org Reviewed-by: Robert Foss robert.foss@linaro.org Reviewed-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-qcom-cci.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
--- a/drivers/i2c/busses/i2c-qcom-cci.c +++ b/drivers/i2c/busses/i2c-qcom-cci.c @@ -558,7 +558,7 @@ static int cci_probe(struct platform_dev cci->master[idx].adap.quirks = &cci->data->quirks; cci->master[idx].adap.algo = &cci_algo; cci->master[idx].adap.dev.parent = dev; - cci->master[idx].adap.dev.of_node = child; + cci->master[idx].adap.dev.of_node = of_node_get(child); cci->master[idx].master = idx; cci->master[idx].cci = cci;
@@ -643,8 +643,10 @@ static int cci_probe(struct platform_dev continue;
ret = i2c_add_adapter(&cci->master[i].adap); - if (ret < 0) + if (ret < 0) { + of_node_put(cci->master[i].adap.dev.of_node); goto error_i2c; + } }
pm_runtime_set_autosuspend_delay(dev, MSEC_PER_SEC); @@ -656,8 +658,10 @@ static int cci_probe(struct platform_dev
error_i2c: for (--i ; i >= 0; i--) { - if (cci->master[i].cci) + if (cci->master[i].cci) { i2c_del_adapter(&cci->master[i].adap); + of_node_put(cci->master[i].adap.dev.of_node); + } } error: disable_irq(cci->irq); @@ -673,8 +677,10 @@ static int cci_remove(struct platform_de int i;
for (i = 0; i < cci->data->num_masters; i++) { - if (cci->master[i].cci) + if (cci->master[i].cci) { i2c_del_adapter(&cci->master[i].adap); + of_node_put(cci->master[i].adap.dev.of_node); + } cci_halt(cci, i); }
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
commit 3c62fd3406e0b2277c76a6984d3979c7f3f1d129 upstream.
In order to free resources correctly in the error handling path of pt_core_init(), 2 goto's have to be switched. Otherwise, some resources will leak and we will try to release things that have not been allocated yet.
Also move a dev_err() to a place where it is more meaningful.
Fixes: fa5d823b16a9 ("dmaengine: ptdma: Initial driver for the AMD PTDMA") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Acked-by: Sanjay R Mehta sanju.mehta@amd.com Reviewed-by: Dan Carpenter dan.carpenter@oracle.com Link: https://lore.kernel.org/r/41a963a35173f89c874f5c44df5530dc09fea8da.164404424... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/ptdma/ptdma-dev.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-)
--- a/drivers/dma/ptdma/ptdma-dev.c +++ b/drivers/dma/ptdma/ptdma-dev.c @@ -207,7 +207,7 @@ int pt_core_init(struct pt_device *pt) if (!cmd_q->qbase) { dev_err(dev, "unable to allocate command queue\n"); ret = -ENOMEM; - goto e_dma_alloc; + goto e_destroy_pool; }
cmd_q->qidx = 0; @@ -229,8 +229,10 @@ int pt_core_init(struct pt_device *pt)
/* Request an irq */ ret = request_irq(pt->pt_irq, pt_core_irq_handler, 0, dev_name(pt->dev), pt); - if (ret) - goto e_pool; + if (ret) { + dev_err(dev, "unable to allocate an IRQ\n"); + goto e_free_dma; + }
/* Update the device registers with queue information. */ cmd_q->qcontrol &= ~CMD_Q_SIZE; @@ -250,21 +252,20 @@ int pt_core_init(struct pt_device *pt) /* Register the DMA engine support */ ret = pt_dmaengine_register(pt); if (ret) - goto e_dmaengine; + goto e_free_irq;
/* Set up debugfs entries */ ptdma_debugfs_setup(pt);
return 0;
-e_dmaengine: +e_free_irq: free_irq(pt->pt_irq, pt);
-e_dma_alloc: +e_free_dma: dma_free_coherent(dev, cmd_q->qsize, cmd_q->qbase, cmd_q->qbase_dma);
-e_pool: - dev_err(dev, "unable to allocate an IRQ\n"); +e_destroy_pool: dma_pool_destroy(pt->cmd_q.dma_pool);
return ret;
From: Waiman Long longman@redhat.com
commit ddc204b517e60ae64db34f9832dc41dafa77c751 upstream.
I was made aware of the following lockdep splat:
[ 2516.308763] ===================================================== [ 2516.309085] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected [ 2516.309433] 5.14.0-51.el9.aarch64+debug #1 Not tainted [ 2516.309703] ----------------------------------------------------- [ 2516.310149] stress-ng/153663 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: [ 2516.310512] ffff0000e422b198 (&newf->file_lock){+.+.}-{2:2}, at: fd_install+0x368/0x4f0 [ 2516.310944] and this task is already holding: [ 2516.311248] ffff0000c08140d8 (&sighand->siglock){-.-.}-{2:2}, at: copy_process+0x1e2c/0x3e80 [ 2516.311804] which would create a new lock dependency: [ 2516.312066] (&sighand->siglock){-.-.}-{2:2} -> (&newf->file_lock){+.+.}-{2:2} [ 2516.312446] but this new dependency connects a HARDIRQ-irq-safe lock: [ 2516.312983] (&sighand->siglock){-.-.}-{2:2} : [ 2516.330700] Possible interrupt unsafe locking scenario:
[ 2516.331075] CPU0 CPU1 [ 2516.331328] ---- ---- [ 2516.331580] lock(&newf->file_lock); [ 2516.331790] local_irq_disable(); [ 2516.332231] lock(&sighand->siglock); [ 2516.332579] lock(&newf->file_lock); [ 2516.332922] <Interrupt> [ 2516.333069] lock(&sighand->siglock); [ 2516.333291] *** DEADLOCK *** [ 2516.389845] stack backtrace: [ 2516.390101] CPU: 3 PID: 153663 Comm: stress-ng Kdump: loaded Not tainted 5.14.0-51.el9.aarch64+debug #1 [ 2516.390756] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 [ 2516.391155] Call trace: [ 2516.391302] dump_backtrace+0x0/0x3e0 [ 2516.391518] show_stack+0x24/0x30 [ 2516.391717] dump_stack_lvl+0x9c/0xd8 [ 2516.391938] dump_stack+0x1c/0x38 [ 2516.392247] print_bad_irq_dependency+0x620/0x710 [ 2516.392525] check_irq_usage+0x4fc/0x86c [ 2516.392756] check_prev_add+0x180/0x1d90 [ 2516.392988] validate_chain+0x8e0/0xee0 [ 2516.393215] __lock_acquire+0x97c/0x1e40 [ 2516.393449] lock_acquire.part.0+0x240/0x570 [ 2516.393814] lock_acquire+0x90/0xb4 [ 2516.394021] _raw_spin_lock+0xe8/0x154 [ 2516.394244] fd_install+0x368/0x4f0 [ 2516.394451] copy_process+0x1f5c/0x3e80 [ 2516.394678] kernel_clone+0x134/0x660 [ 2516.394895] __do_sys_clone3+0x130/0x1f4 [ 2516.395128] __arm64_sys_clone3+0x5c/0x7c [ 2516.395478] invoke_syscall.constprop.0+0x78/0x1f0 [ 2516.395762] el0_svc_common.constprop.0+0x22c/0x2c4 [ 2516.396050] do_el0_svc+0xb0/0x10c [ 2516.396252] el0_svc+0x24/0x34 [ 2516.396436] el0t_64_sync_handler+0xa4/0x12c [ 2516.396688] el0t_64_sync+0x198/0x19c [ 2517.491197] NET: Registered PF_ATMPVC protocol family [ 2517.491524] NET: Registered PF_ATMSVC protocol family [ 2591.991877] sched: RT throttling activated
One way to solve this problem is to move the fd_install() call out of the sighand->siglock critical section.
Before commit 6fd2fe494b17 ("copy_process(): don't use ksys_close() on cleanups"), the pidfd installation was done without holding both the task_list lock and the sighand->siglock. Obviously, holding these two locks are not really needed to protect the fd_install() call. So move the fd_install() call down to after the releases of both locks.
Link: https://lore.kernel.org/r/20220208163912.1084752-1-longman@redhat.com Fixes: 6fd2fe494b17 ("copy_process(): don't use ksys_close() on cleanups") Reviewed-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Waiman Long longman@redhat.com Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/fork.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/kernel/fork.c +++ b/kernel/fork.c @@ -2353,10 +2353,6 @@ static __latent_entropy struct task_stru goto bad_fork_cancel_cgroup; }
- /* past the last point of failure */ - if (pidfile) - fd_install(pidfd, pidfile); - init_task_pid_links(p); if (likely(p->pid)) { ptrace_init_task(p, (clone_flags & CLONE_PTRACE) || trace); @@ -2405,6 +2401,9 @@ static __latent_entropy struct task_stru syscall_tracepoint_update(p); write_unlock_irq(&tasklist_lock);
+ if (pidfile) + fd_install(pidfd, pidfile); + proc_fork_connector(p); sched_post_fork(p, args); cgroup_post_fork(p, args);
From: Mike Christie michael.christie@oracle.com
commit f10f582d28220f50099d3f561116256267821429 upstream.
This fixes a deadlock added with commit b40f3894e39e ("scsi: qedi: Complete TMF works before disconnect")
Bug description from Jia-Ju Bai:
qedi_process_tmf_resp() spin_lock(&session->back_lock); --> Line 201 (Lock A) spin_lock(&qedi_conn->tmf_work_lock); --> Line 230 (Lock B)
qedi_process_cmd_cleanup_resp() spin_lock_bh(&qedi_conn->tmf_work_lock); --> Line 752 (Lock B) spin_lock_bh(&conn->session->back_lock); --> Line 784 (Lock A)
When qedi_process_tmf_resp() and qedi_process_cmd_cleanup_resp() are concurrently executed, the deadlock can occur.
This patch fixes the deadlock by not holding the tmf_work_lock in qedi_process_cmd_cleanup_resp while holding the back_lock. The tmf_work_lock is only needed while we remove the tmf_work from the work_list.
Link: https://lore.kernel.org/r/20220208185448.6206-1-michael.christie@oracle.com Fixes: b40f3894e39e ("scsi: qedi: Complete TMF works before disconnect") Cc: Manish Rangankar mrangankar@marvell.com Cc: Nilesh Javali njavali@marvell.com Reported-by: TOTE Robot oslab@tsinghua.edu.cn Reported-by: Jia-Ju Bai baijiaju1990@gmail.com Signed-off-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/qedi/qedi_fw.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/scsi/qedi/qedi_fw.c +++ b/drivers/scsi/qedi/qedi_fw.c @@ -772,11 +772,10 @@ static void qedi_process_cmd_cleanup_res qedi_cmd->list_tmf_work = NULL; } } + spin_unlock_bh(&qedi_conn->tmf_work_lock);
- if (!found) { - spin_unlock_bh(&qedi_conn->tmf_work_lock); + if (!found) goto check_cleanup_reqs; - }
QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_SCSI_TM, "TMF work, cqe->tid=0x%x, tmf flags=0x%x, cid=0x%x\n", @@ -807,7 +806,6 @@ static void qedi_process_cmd_cleanup_res qedi_cmd->state = CLEANUP_RECV; unlock: spin_unlock_bh(&conn->session->back_lock); - spin_unlock_bh(&qedi_conn->tmf_work_lock); wake_up_interruptible(&qedi_conn->wait_queue); return;
From: Jesse Brandeburg jesse.brandeburg@intel.com
commit 86006f996346e8a5a1ea80637ec949ceeea4ecbc upstream.
The COMMS package can enable the hardware parser to recognize IPSEC frames with ESP header and SPI identifier. If this package is available and configured for loading in /lib/firmware, then the driver will succeed in enabling this protocol type for RSS.
This in turn allows the hardware to hash over the SPI and use it to pick a consistent receive queue for the same secure flow. Without this all traffic is steered to the same queue for multiple traffic threads from the same IP address. For that reason this is marked as a fix, as the driver supports the model, but it wasn't enabled.
If the package is not available, adding this type will fail, but the failure is ignored on purpose as it has no negative affect.
Fixes: c90ed40cefe1 ("ice: Enable writing hardware filtering tables") Signed-off-by: Jesse Brandeburg jesse.brandeburg@intel.com Tested-by: Gurucharan G gurucharanx.g@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ice/ice_lib.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -1521,6 +1521,12 @@ static void ice_vsi_set_rss_flow_fld(str if (status) dev_dbg(dev, "ice_add_rss_cfg failed for sctp6 flow, vsi = %d, error = %s\n", vsi_num, ice_stat_str(status)); + + status = ice_add_rss_cfg(hw, vsi_handle, ICE_FLOW_HASH_ESP_SPI, + ICE_FLOW_SEG_HDR_ESP); + if (status) + dev_dbg(dev, "ice_add_rss_cfg failed for esp/spi flow, vsi = %d, error = %d\n", + vsi_num, status); }
/**
From: Rafał Miłecki rafal@milecki.pl
commit 834cea3a252ed4847db076a769ad9efe06afe2d5 upstream.
DSL and CM (Cable Modem) support 8 B max transfer size and have a custom DT binding for that reason. This driver was checking for a wrong "compatible" however which resulted in an incorrect setup.
Fixes: e2e5a2c61837 ("i2c: brcmstb: Adding support for CM and DSL SoCs") Signed-off-by: Rafał Miłecki rafal@milecki.pl Acked-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-brcmstb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/i2c/busses/i2c-brcmstb.c +++ b/drivers/i2c/busses/i2c-brcmstb.c @@ -673,7 +673,7 @@ static int brcmstb_i2c_probe(struct plat
/* set the data in/out register size for compatible SoCs */ if (of_device_is_compatible(dev->device->of_node, - "brcmstb,brcmper-i2c")) + "brcm,brcmper-i2c")) dev->data_regsz = sizeof(u8); else dev->data_regsz = sizeof(u32);
From: Cheng Jui Wang cheng-jui.wang@mediatek.com
commit 28df029d53a2fd80c1b8674d47895648ad26dcfb upstream.
A kernel exception was hit when trying to dump /proc/lockdep_chains after lockdep report "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!":
Unable to handle kernel paging request at virtual address 00054005450e05c3 ... 00054005450e05c3] address between user and kernel address ranges ... pc : [0xffffffece769b3a8] string+0x50/0x10c lr : [0xffffffece769ac88] vsnprintf+0x468/0x69c ... Call trace: string+0x50/0x10c vsnprintf+0x468/0x69c seq_printf+0x8c/0xd8 print_name+0x64/0xf4 lc_show+0xb8/0x128 seq_read_iter+0x3cc/0x5fc proc_reg_read_iter+0xdc/0x1d4
The cause of the problem is the function lock_chain_get_class() will shift lock_classes index by 1, but the index don't need to be shifted anymore since commit 01bb6f0af992 ("locking/lockdep: Change the range of class_idx in held_lock struct") already change the index to start from 0.
The lock_classes[-1] located at chain_hlocks array. When printing lock_classes[-1] after the chain_hlocks entries are modified, the exception happened.
The output of lockdep_chains are incorrect due to this problem too.
Fixes: f611e8cf98ec ("lockdep: Take read/write status in consideration when generate chainkey") Signed-off-by: Cheng Jui Wang cheng-jui.wang@mediatek.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Boqun Feng boqun.feng@gmail.com Link: https://lore.kernel.org/r/20220210105011.21712-1-cheng-jui.wang@mediatek.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/locking/lockdep.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -3450,7 +3450,7 @@ struct lock_class *lock_chain_get_class( u16 chain_hlock = chain_hlocks[chain->base + i]; unsigned int class_idx = chain_hlock_class_idx(chain_hlock);
- return lock_classes + class_idx - 1; + return lock_classes + class_idx; }
/* @@ -3518,7 +3518,7 @@ static void print_chain_keys_chain(struc hlock_id = chain_hlocks[chain->base + i]; chain_key = print_chain_key_iteration(hlock_id, chain_key);
- print_lock_name(lock_classes + chain_hlock_class_idx(hlock_id) - 1); + print_lock_name(lock_classes + chain_hlock_class_idx(hlock_id)); printk("\n"); } }
On 2/21/22 00:47, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 23 Feb 2022 08:48:58 +0000. Anything received after that time might be too late.
Building arm:allmodconfig ... failed Building arm64:allmodconfig ... failed -------------- Error log: drivers/tee/optee/core.c: In function 'optee_probe': drivers/tee/optee/core.c:726:20: error: operation on 'rc' may be undefined
Building mips:nlm_xlp_defconfig ... failed -------------- Error log: net/netfilter/xt_socket.c: In function 'socket_mt_destroy': net/netfilter/xt_socket.c:224:17: error: implicit declaration of function 'nf_defrag_ipv6_disable'; did you mean 'nf_defrag_ipv4_disable'?
On Mon, Feb 21, 2022 at 09:15:52AM -0800, Guenter Roeck wrote:
On 2/21/22 00:47, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 23 Feb 2022 08:48:58 +0000. Anything received after that time might be too late.
Building arm:allmodconfig ... failed Building arm64:allmodconfig ... failed
Error log: drivers/tee/optee/core.c: In function 'optee_probe': drivers/tee/optee/core.c:726:20: error: operation on 'rc' may be undefined
Jens, did you build the 5.15 patch you sent?
This line looks a bit odd:
+ rc = rc = PTR_ERR(ctx);
And given that rc is an int, that is not guaranteed to hold a pointer here :(
thanks,
greg k-h
On Mon, Feb 21, 2022 at 7:00 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Mon, Feb 21, 2022 at 09:15:52AM -0800, Guenter Roeck wrote:
On 2/21/22 00:47, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 23 Feb 2022 08:48:58 +0000. Anything received after that time might be too late.
Building arm:allmodconfig ... failed Building arm64:allmodconfig ... failed
Error log: drivers/tee/optee/core.c: In function 'optee_probe': drivers/tee/optee/core.c:726:20: error: operation on 'rc' may be undefined
Jens, did you build the 5.15 patch you sent?
I did, but it seems I wasn't paying close enough attention to warnings.
This line looks a bit odd:
rc = rc = PTR_ERR(ctx);
And given that rc is an int, that is not guaranteed to hold a pointer here :(
I'm sorry it looks like I've slipped on the keyboard. There is one "rc = " too many. I believe the assignment "rc = PTR_ERR(ctx)" is correct when IS_ERR(ctx) is true. If I'm wrong I have a bigger problem.
I'll send a v2 to fix this. I see that I've copied the error to backports for 5.10, 5.4 too so I'll send v2s for those too.
Thanks, Jens
thanks,
greg k-h
On Mon, Feb 21, 2022 at 09:15:52AM -0800, Guenter Roeck wrote:
On 2/21/22 00:47, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 23 Feb 2022 08:48:58 +0000. Anything received after that time might be too late.
Building arm:allmodconfig ... failed Building arm64:allmodconfig ... failed
Error log: drivers/tee/optee/core.c: In function 'optee_probe': drivers/tee/optee/core.c:726:20: error: operation on 'rc' may be undefined
I'm going to drop the tee patches in 5.10 and 5.15 for now.
thanks,
Building mips:nlm_xlp_defconfig ... failed
Error log: net/netfilter/xt_socket.c: In function 'socket_mt_destroy': net/netfilter/xt_socket.c:224:17: error: implicit declaration of function 'nf_defrag_ipv6_disable'; did you mean 'nf_defrag_ipv4_disable'?
I'll go drop this one from everywhere as well, until it gets straightened out upstream.
thanks,
greg k-h
On Mon, 21 Feb 2022 at 14:37, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 23 Feb 2022 08:48:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.25-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. Regressions on arm64, arm and mips for following build errors /warnings.
arm and arm64 build log: drivers/tee/optee/core.c: In function 'optee_probe': drivers/tee/optee/core.c:726:20: warning: operation on 'rc' may be undefined [-Wsequence-point] 726 | rc = rc = PTR_ERR(ctx); | ~~~^~~~~~~~~~~~~~~~~~~
mips build log: - gcc-10-malta_defconfig - gcc-8-malta_defconfig net/netfilter/xt_socket.c: In function 'socket_mt_destroy': net/netfilter/xt_socket.c:224:3: error: implicit declaration of function 'nf_defrag_ipv6_disable'; did you mean 'nf_defrag_ipv4_disable'? [-Werror=implicit-function-declaration] nf_defrag_ipv6_disable(par->net); ^~~~~~~~~~~~~~~~~~~~~~ nf_defrag_ipv4_disable cc1: some warnings being treated as errors
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
- regressions - arm64 - build - gcc-8-defconfig-warnings - gcc-9-defconfig-warnings - gcc-10-defconfig-warnings - gcc-11-defconfig-warnings - clang-nightly-defconfig-warnings - clang-13-defconfig-warnings - clang-12-defconfig-warnings - clang-11-defconfig-warnings - gcc-11-defconfig-fac1da4b-warnings - gcc-11-defconfig-5e73d44a-warnings - gcc-11-defconfig-389abf09-warnings - gcc-11-defconfig-eec653ad-warnings - clang-13-defconfig-5b09568e-warnings - gcc-11-defconfig-59041e85-warnings - gcc-11-defconfig-904271f2-warnings - gcc-11-defconfig-5b09568e-warnings - gcc-11-defconfig-1ee88247-warnings - gcc-11-defconfig-0eba0eda-warnings - clang-nightly-defconfig-5b09568e-warnings - gcc-11-defconfig-bcbd88e2-warnings - clang-12-defconfig-eb1b7d61-warnings - clang-14-defconfig-warnings - clang-14-defconfig-5b09568e-warnings -
- arm - build - gcc-8-imx_v6_v7_defconfig-warnings - gcc-9-imx_v6_v7_defconfig-warnings - gcc-10-imx_v6_v7_defconfig-warnings - gcc-11-imx_v6_v7_defconfig-warnings - clang-nightly-imx_v6_v7_defconfig-warnings - clang-13-imx_v6_v7_defconfig-warnings - clang-13-mini2440_defconfig-warnings - clang-13-omap1_defconfig-warnings - clang-12-footbridge_defconfig-warnings - clang-12-imx_v6_v7_defconfig-warnings - clang-11-footbridge_defconfig-warnings - clang-11-imx_v6_v7_defconfig-warnings - clang-14-imx_v6_v7_defconfig-warnings -
- mips - gcc-10-malta_defconfig - gcc-8-malta_defconfig
-- Linaro LKFT https://lkft.linaro.org/
On Mon, Feb 21, 2022 at 09:47:12AM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 23 Feb 2022 08:48:58 +0000. Anything received after that time might be too late.
Build results: total: 156 pass: 152 fail: 4 Failed builds: arm:allmodconfig arm64:allmodconfig mips:nlm_xlp_defconfig mips:malta_defconfig Qemu test results: total: 488 pass: 428 fail: 60 Failed tests: <pretty much all MIPS boot tests, due to the build error>
Building arm:allmodconfig ... failed Building arm64:allmodconfig ... failed -------------- Error log: drivers/tee/optee/core.c: In function 'optee_probe': drivers/tee/optee/core.c:726:20: error: operation on 'rc' may be undefined
Building mips:nlm_xlp_defconfig ... failed Building mips:malta_defconfig ... failed -------------- Error log: net/netfilter/xt_socket.c: In function 'socket_mt_destroy': net/netfilter/xt_socket.c:224:17: error: implicit declaration of function 'nf_defrag_ipv6_disable'
Guenter
On 2/21/22 1:47 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 23 Feb 2022 08:48:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.25-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On 2/21/2022 12:47 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 23 Feb 2022 08:48:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.25-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels:
Tested-by: Florian Fainelli f.fainelli@gmail.com
On 21/02/22 15.47, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Successfully cross-compiled for arm64 (bcm2711_defconfig, gcc 10.2.0) and powerpc (ps3_defconfig, gcc 11.2.0).
Tested-by: Bagas Sanjaya bagasdotme@gmail.com
On 2/21/22 00:47, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 23 Feb 2022 08:48:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.25-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Mon, 21 Feb 2022 09:47:12 +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 23 Feb 2022 08:48:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.25-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v5.15: 10 builds: 10 pass, 0 fail 28 boots: 28 pass, 0 fail 114 tests: 114 pass, 0 fail
Linux version: 5.15.25-rc1-gcfc9ab013281 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
Hi Greg,
On Mon, Feb 21, 2022 at 09:47:12AM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 23 Feb 2022 08:48:58 +0000. Anything received after that time might be too late.
Build test: mips (gcc version 11.2.1 20220213): 62 configs -> no new failure arm (gcc version 11.2.1 20220213): 100 configs -> 1 new failure arm64 (gcc version 11.2.1 20220213): 3 configs -> 1 new failure x86_64 (gcc version 11.2.1 20220213): 4 configs -> no failure
Both arm and arm64 failed with the error: drivers/tee/optee/core.c: In function 'optee_probe': drivers/tee/optee/core.c:726:20: error: operation on 'rc' may be undefined [-Werror=sequence-point] 726 | rc = rc = PTR_ERR(ctx);
Caused by: d7151f4ed49b ("optee: use driver internal tee_context for some rpc")
Boot test: x86_64: Booted on my test laptop. No regression. x86_64: Booted on qemu. No regression. [1] arm64: Booted on rpi4b (4GB model). No regression. [2] mips: Booted on ci20 board. No regression. [3]
[1]. https://openqa.qa.codethink.co.uk/tests/794 [2]. https://openqa.qa.codethink.co.uk/tests/800 [3]. https://openqa.qa.codethink.co.uk/tests/802
Tested-by: Sudip Mukherjee sudip.mukherjee@codethink.co.uk
-- Regards Sudip
On Mon, Feb 21, 2022, at 3:47 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.25 release. There are 196 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 23 Feb 2022 08:48:58 +0000. Anything received after that time might be too late.
5.15.25-rc1 compiled and booted with no errors or regressions on my x86_64 test system.
Tested-by: Slade Watkins slade@sladewatkins.com
Cheers, Slade
linux-stable-mirror@lists.linaro.org