This is the start of the stable review cycle for the 5.10.215 release. There are 294 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 13 Apr 2024 09:53:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.215-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.10.215-rc1
Michael Roth michael.roth@amd.com x86/head/64: Re-enable stack protection
Borislav Petkov (AMD) bp@alien8.de x86/retpoline: Add NOENDBR annotation to the SRSO dummy return thunk
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com scsi: sd: Fix wrong zone_write_granularity value during revalidate
Michal Kubecek mkubecek@suse.cz kbuild: dummy-tools: adjust to stricter stackprotector check
Vasiliy Kovalev kovalev@altlinux.org VMCI: Fix possible memcpy() run-time warning in vmci_datagram_invoke_guest_handler()
Luiz Augusto von Dentz luiz.von.dentz@intel.com Bluetooth: btintel: Fixe build regression
Chris Wilson chris@chris-wilson.co.uk drm/i915/gt: Reset queue_priority_hint on parking
David Hildenbrand david@redhat.com x86/mm/pat: fix VM_PAT handling in COW mappings
David Hildenbrand david@redhat.com virtio: reenable config if freezing device failed
Thadeu Lima de Souza Cascardo cascardo@canonical.com tty: n_gsm: require CAP_NET_ADMIN to attach N_GSM0710 ldisc
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: discard table flag update with pending basechain deletion
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: release batch on table validation from abort path
Roman Smirnov r.smirnov@omp.ru fbmon: prevent division by zero in fb_videomode_from_videomode()
Jiawei Fu (iBug) i@ibugone.com drivers/nvme: Add quirks for device 126f:2262
Aleksandr Burakov a.burakov@rosalinux.ru fbdev: viafb: fix typo in hw_bitblt_1 and hw_bitblt_2
Colin Ian King colin.i.king@gmail.com usb: sl811-hcd: only defined function checkdone if QUIRK2 is defined
Marco Felsch m.felsch@pengutronix.de usb: typec: tcpci: add generic tcpci fallback compatible
Petre Rodan petre.rodan@subdimension.ro tools: iio: replace seekdir() in iio_generic_buffer
linke li lilinke99@qq.com ring-buffer: use READ_ONCE() to read cpu_buffer->commit_page in concurrent environment
Ricardo B. Marliere ricardo@marliere.net ktest: force $buildonly = 1 for 'make_warnings_file' test type
Alban Boyé alban.boye@protonmail.com platform/x86: touchscreen_dmi: Add an extra entry for a variant of the Chuwi Vi8 tablet
Gergo Koteles soyer@irl.hu Input: allocate keycode for Display refresh rate toggle
Manjunath Patil manjunath.b.patil@oracle.com RDMA/cm: add timeout to cm_destroy_id wait
Roman Smirnov r.smirnov@omp.ru block: prevent division by zero in blk_rq_stat_sum()
Ian Rogers irogers@google.com libperf evlist: Avoid out-of-bounds access
Daniel Drake drake@endlessos.org Revert "ACPI: PM: Block ASUS B1400CEAE from suspend to idle by default"
Dai Ngo dai.ngo@oracle.com SUNRPC: increase size of rpc_wait_queue.qlen from unsigned short to unsigned int
Aric Cyr aric.cyr@amd.com drm/amd/display: Fix nanosec stat overflow
Ye Bin yebin10@huawei.com ext4: forbid commit inconsistent quota data when errors=remount-ro
Zhang Yi yi.zhang@huawei.com ext4: add a hint for block bitmap corrupt state in mb_groups
Arnd Bergmann arnd@arndb.de media: sta2x11: fix irq handler cast
Alex Henrie alexhenrie24@gmail.com isofs: handle CDs with bad root inode but good Joliet root directory
Justin Tee justin.tee@broadcom.com scsi: lpfc: Fix possible memory leak in lpfc_rcv_padisc()
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp sysv: don't call sb_bread() with pointers_lock held
Geert Uytterhoeven geert+renesas@glider.be pinctrl: renesas: checker: Limit cfg reg enum checks to provided IDs
Kunwu Chan chentao@kylinos.cn Input: synaptics-rmi4 - fail probing if memory allocation for "phys" fails
Edward Adam Davis eadavis@qq.com Bluetooth: btintel: Fix null ptr deref in btintel_read_version
Eric Dumazet edumazet@google.com net/smc: reduce rtnl pressure in smc_pnet_create_pnetids_list()
David Sterba dsterba@suse.com btrfs: send: handle path ref underflow in header iterate_inode_ref()
David Sterba dsterba@suse.com btrfs: export: handle invalid inode or root reference in btrfs_get_parent()
David Sterba dsterba@suse.com btrfs: handle chunk tree lookup error in btrfs_relocate_sys_chunks()
Samasth Norway Ananda samasth.norway.ananda@oracle.com tools/power x86_energy_perf_policy: Fix file leak in get_pkg_num()
Kunwu Chan chentao@kylinos.cn pstore/zone: Add a null pointer check to the psz_kmsg_read
Shannon Nelson shannon.nelson@amd.com ionic: set adminq irq affinity
Johan Jonker jbx6244@gmail.com arm64: dts: rockchip: fix rk3399 hdmi ports node
Johan Jonker jbx6244@gmail.com arm64: dts: rockchip: fix rk3328 hdmi ports node
John Ogness john.ogness@linutronix.de panic: Flush kernel log buffer at the end
Harshit Mogalapalli harshit.m.mogalapalli@oracle.com VMCI: Fix memcpy() run-time warning in dg_dispatch_as_host()
Markus Elfring elfring@users.sourceforge.net batman-adv: Improve exception handling in batadv_throw_uevent()
Markus Elfring elfring@users.sourceforge.net batman-adv: Return directly after a failed batadv_dat_select_candidates() in batadv_dat_forward_data()
Dmitry Antipov dmantipov@yandex.ru wifi: ath9k: fix LNA selection in ath_ant_try_scan()
Josh Poimboeuf jpoimboe@redhat.com objtool: Add asm version of STACK_FRAME_NON_STANDARD
Sean Christopherson seanjc@google.com x86/cpufeatures: Add CPUID_LNX_5 to track recently added Linux-defined word
Davide Caratti dcaratti@redhat.com mptcp: don't account accept() of non-MPC client as fallback to TCP
Borislav Petkov (AMD) bp@alien8.de x86/retpoline: Do the necessary fixup to the Zen3/4 srso return thunk for !SRSO
Borislav Petkov (AMD) bp@alien8.de x86/bugs: Fix the SRSO mitigation on Zen3/4
Samuel Holland samuel.holland@sifive.com riscv: Fix spurious errors from __get/put_kernel_nofault
Sumanth Korikkar sumanthk@linux.ibm.com s390/entry: align system call table on 8 bytes
Borislav Petkov (AMD) bp@alien8.de x86/mce: Make sure to grab mce_sysfs_mutex in set_bank()
Herve Codina herve.codina@bootlin.com of: dynamic: Synchronize of_changeset_destroy() with the devlink removals
Herve Codina herve.codina@bootlin.com driver core: Introduce device_link_wait_removal()
I Gede Agastya Darma Laksana gedeagas22@gmail.com ALSA: hda/realtek: Update Panasonic CF-SZ6 quirk to support headset with microphone
Arnd Bergmann arnd@arndb.de ata: sata_mv: Fix PCI device ID table declaration compilation warning
Arnd Bergmann arnd@arndb.de scsi: mylex: Fix sysfs buffer lengths
Arnd Bergmann arnd@arndb.de ata: sata_sx4: fix pdc20621_get_from_dimm() on 64-bit
Stephen Lee slee08177@gmail.com ASoC: ops: Fix wraparound for mask in snd_soc_get_volsw
Johan Hovold johan+linaro@kernel.org arm64: dts: qcom: sc7180-trogdor: mark bluetooth address as broken
Venkata Lakshmi Narayana Gubba gubbaven@codeaurora.org arm64: dts: qcom: sc7180: Remove clock for bluetooth on Trogdor
Paul Barker paul.barker.ct@bp.renesas.com net: ravb: Always process TX descriptor ring
Antoine Tenart atenart@kernel.org udp: do not accept non-tunnel GSO skbs landing in a tunnel
Alexander Stein alexander.stein@ew.tq-group.com Revert "usb: phy: generic: Get the vbus supply"
Bikash Hazarika bhazarika@marvell.com scsi: qla2xxx: Update manufacturer detail
Bikash Hazarika bhazarika@marvell.com scsi: qla2xxx: Update manufacturer details
Aleksandr Loktionov aleksandr.loktionov@intel.com i40e: fix vf may be used uninitialized in this function warning
Aleksandr Loktionov aleksandr.loktionov@intel.com i40e: fix i40e_count_filters() to count only active/new filters
Su Hui suhui@nfschina.com octeontx2-pf: check negative error code in otx2_open()
Antoine Tenart atenart@kernel.org udp: do not transition UDP GRO fraglist partial checksums to unnecessary
Kuniyuki Iwashima kuniyu@amazon.com ipv6: Fix infinite recursion in fib6_dump_done().
Jakub Kicinski kuba@kernel.org selftests: reuseaddr_conflict: add missing new line at the end of the output
Eric Dumazet edumazet@google.com erspan: make sure erspan_base_hdr is present in skb->head
Piotr Wejman piotrwejman90@gmail.com net: stmmac: fix rx queue priority assignment
Eric Dumazet edumazet@google.com net/sched: act_skbmod: prevent kernel-infoleak
Jakub Sitnicki jakub@cloudflare.com bpf, sockmap: Prevent lock inversion deadlock in map delete elem
Christophe JAILLET christophe.jaillet@wanadoo.fr vboxsf: Avoid an spurious warning if load_nls_xxx() fails
Eric Dumazet edumazet@google.com netfilter: validate user input for expected length
Ziyang Xuan william.xuanziyang@huawei.com netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get()
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: flush pending destroy work before exit_net release
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: reject new basechain after table flag update
Min Li min15.li@samsung.com block: add check that partition length needs to be aligned with block size
Pu Wen puwen@hygon.cn x86/srso: Add SRSO mitigation for Hygon processors
Vlastimil Babka vbabka@suse.cz mm, vmscan: prevent infinite loop for costly GFP_NOIO | __GFP_RETRY_MAYFAIL allocations
Ingo Molnar mingo@kernel.org Revert "x86/mm/ident_map: Use gbpages only where full GB page should be mapped."
Jens Axboe axboe@kernel.dk io_uring: ensure '0' is returned on file registration success
Alex Williamson alex.williamson@redhat.com vfio/fsl-mc: Block calling interrupt handler without trigger
Alex Williamson alex.williamson@redhat.com vfio/platform: Create persistent IRQ handlers
Alex Williamson alex.williamson@redhat.com vfio/pci: Create persistent INTx handler
Alex Williamson alex.williamson@redhat.com vfio: Introduce interface to flush virqfd inject workqueue
Alex Williamson alex.williamson@redhat.com vfio/pci: Lock external INTx masking ops
Alex Williamson alex.williamson@redhat.com vfio/pci: Disable auto-enable of exclusive INTx IRQ
Mahmoud Adam mngyadam@amazon.com net/rds: fix possible cp null dereference
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: disallow timeout for anonymous sets
Bastien Nocera hadess@hadess.net Bluetooth: Fix TOCTOU in HCI debugfs implementation
Hui Wang hui.wang@canonical.com Bluetooth: hci_event: set the conn encrypted before conn establishes
Sandipan Das sandipan.das@amd.com x86/cpufeatures: Add new word for scattered features
Heiner Kallweit hkallweit1@gmail.com r8169: fix issue caused by buggy BIOS on certain boards with RTL8168d
Arnd Bergmann arnd@arndb.de dm integrity: fix out-of-range warning
Hariprasad Kelam hkelam@marvell.com Octeontx2-af: fix pause frame configuration in GMP mode
Andrei Matei andreimatei1@gmail.com bpf: Protect against int overflow for stack access size
Nikita Kiryushin kiryushin@ancud.ru ACPICA: debugger: check status of acpi_evaluate_object() in acpi_db_walk_for_fields()
Eric Dumazet edumazet@google.com tcp: properly terminate timers for kernel sockets
Przemek Kitszel przemyslaw.kitszel@intel.com ixgbe: avoid sleeping allocation in ixgbe_ipsec_vf_add_sa()
Ryosuke Yasuoka ryasuoka@redhat.com nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet
Alan Stern stern@rowland.harvard.edu USB: core: Fix deadlock in usb_deauthorize_interface()
Muhammad Usama Anjum usama.anjum@collabora.com scsi: lpfc: Correct size for wqe for memset()
Mika Westerberg mika.westerberg@linux.intel.com PCI/DPC: Quirk PIO log size for Intel Ice Lake Root Ports
Kim Phillips kim.phillips@amd.com x86/cpu: Enable STIBP on AMD if Automatic IBRS is enabled
Quinn Tran qutran@marvell.com scsi: qla2xxx: Delay I/O Abort on PCI error
Quinn Tran qutran@marvell.com scsi: qla2xxx: Fix command flush on cable pull
Quinn Tran qutran@marvell.com scsi: qla2xxx: Split FCE|EFT trace control
Christian A. Ehrhardt lk@c--e.de usb: typec: ucsi: Clear UCSI_CCI_RESET_COMPLETE before reset
Christian A. Ehrhardt lk@c--e.de usb: typec: ucsi: Ack unsupported commands
yuan linyu yuanlinyu@hihonor.com usb: udc: remove warning when queue disabled ep
Minas Harutyunyan Minas.Harutyunyan@synopsys.com usb: dwc2: gadget: LPM flow fix
Minas Harutyunyan Minas.Harutyunyan@synopsys.com usb: dwc2: host: Fix ISOC flow in DDMA mode
Minas Harutyunyan Minas.Harutyunyan@synopsys.com usb: dwc2: host: Fix hibernation flow
Minas Harutyunyan Minas.Harutyunyan@synopsys.com usb: dwc2: host: Fix remote wakeup from hibernation
Alan Stern stern@rowland.harvard.edu USB: core: Add hub_get() and hub_put() routines
Dan Carpenter dan.carpenter@linaro.org staging: vc04_services: fix information leak in create_component()
Arnd Bergmann arnd@arndb.de staging: vc04_services: changen strncpy() to strscpy_pad()
Guilherme G. Piccoli gpiccoli@igalia.com scsi: core: Fix unremoved procfs host directory regression
Duoming Zhou duoming@zju.edu.cn ALSA: sh: aica: reorder cleanup operations to avoid UAF bugs
Oliver Neukum oneukum@suse.com usb: cdc-wdm: close race between read and workqueue
Claus Hansen Ries chr@terma.com net: ll_temac: platform_get_resource replaced by wrong function
Mikko Rapeli mikko.rapeli@linaro.org mmc: core: Avoid negative index with array access
Mikko Rapeli mikko.rapeli@linaro.org mmc: core: Initialize mmc_blk_ioc_data
Nathan Chancellor nathan@kernel.org hexagon: vmlinux.lds.S: handle attributes section
Max Filippov jcmvbkbc@gmail.com exec: Fix NOMMU linux_binprm::exec in transfer_args_to_stack()
Felix Fietkau nbd@nbd.name wifi: mac80211: check/clear fast rx for non-4addr sta VLAN changes
John Sperbeck jsperbeck@google.com init: open /initrd.image with O_LARGEFILE
Zi Yan ziy@nvidia.com mm/migrate: set swap entry values of THP tail pages properly.
Liu Shixin liushixin2@huawei.com mm/memory-failure: fix an incorrect use of tail pages
Hugo Villeneuve hvilleneuve@dimonoff.com serial: sc16is7xx: convert from _raw_ to _noinc_ regmap functions for FIFO
Nathan Chancellor nathan@kernel.org powerpc: xor_vmx: Add '-mhard-float' to CFLAGS
Tim Schumacher timschumi@gmx.de efivarfs: Request at most 512 bytes for variable names
Yang Jihong yangjihong1@huawei.com perf/core: Fix reentry problem in perf_output_read_group()
Pawan Gupta pawan.kumar.gupta@linux.intel.com KVM/x86: Export RFDS_NO and RFDS_CLEAR to guests
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/rfds: Mitigate Register File Data Sampling (RFDS)
Pawan Gupta pawan.kumar.gupta@linux.intel.com Documentation/hw-vuln: Add documentation for RFDS
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/mmio: Disable KVM mitigation when X86_FEATURE_CLEAR_CPU_BUF is set
Pawan Gupta pawan.kumar.gupta@linux.intel.com KVM/VMX: Move VERW closer to VMentry for MDS mitigation
Pawan Gupta pawan.kumar.gupta@linux.intel.com KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/entry_32: Add VERW just before userspace transition
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/entry_64: Add VERW just before userspace transition
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/bugs: Add asm helpers for executing VERW
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/asm: Add _ASM_RIP() macro for x86-64 (%rip) suffix
Goldwyn Rodrigues rgoldwyn@suse.com btrfs: allocate btrfs_ioctl_defrag_range_args on stack
John Ogness john.ogness@linutronix.de printk: Update @console_may_schedule in console_trylock_spinning()
Maximilian Heyne mheyne@amazon.de xen/events: close evtchn after mapping cleanup
Sumit Garg sumit.garg@linaro.org tee: optee: Fix kernel panic caused by incorrect error handling
Bart Van Assche bvanassche@acm.org fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversion
Nicolas Pitre nico@fluxnic.net vt: fix unicode buffer corruption when deleting characters
Alexander Usyskin alexander.usyskin@intel.com mei: me: add arrow lake point H DID
Alexander Usyskin alexander.usyskin@intel.com mei: me: add arrow lake point S DID
Sherry Sun sherry.sun@nxp.com tty: serial: fsl_lpuart: avoid idle preamble pending if CTS is enabled
Mathias Nyman mathias.nyman@linux.intel.com usb: port: Don't try to peer unused USB ports based on location
Krishna Kurapati quic_kriskura@quicinc.com usb: gadget: ncm: Fix handling of zero block length packets
Alan Stern stern@rowland.harvard.edu USB: usb-storage: Prevent divide-by-0 error in isd200_ata_command
Kailang Yang kailang@realtek.com ALSA: hda/realtek - Fix headset Mic no show at resume back for Lenovo ALC897 platform
Sean Christopherson seanjc@google.com KVM: SVM: Flush pages under kvm->lock to fix UAF in svm_register_enc_region()
Nathan Chancellor nathan@kernel.org xfrm: Avoid clang fortify warning in copy_to_user_tmpl()
Michael Kelley mhklinux@outlook.com Drivers: hv: vmbus: Calculate ring buffer size for more efficient use of memory
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: reject constant set with timeout
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: disallow anonymous set with timeout flag
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: mark set as dead when unbinding anonymous set with timeout
Greg Kroah-Hartman gregkh@linuxfoundation.org cpufreq: brcmstb-avs-cpufreq: fix up "add check for cpufreq_cpu_get's return value"
Ian Abbott abbotti@mev.co.uk comedi: comedi_test: Prevent timers rescheduling during deletion
Salvatore Bonaccorso carnil@debian.org scripts: kernel-doc: Fix syntax error due to undeclared args variable
Anton Altaparmakov anton@tuxera.com x86/pm: Work around false positive kmemleak report in msr_build_context()
Andy Lutomirski luto@kernel.org x86/stackprotector/32: Make the canary into a regular percpu variable
Xu Wang vulab@iscas.ac.cn vxge: remove unnecessary cast in kfree()
Mikulas Patocka mpatocka@redhat.com dm snapshot: fix lockup in dm_exception_table_exit
Leo Ma hanghong.ma@amd.com drm/amd/display: Fix noise issue on HDMI AV mute
Rodrigo Siqueira Rodrigo.Siqueira@amd.com drm/amd/display: Return the correct HDCP error code
Conrad Kostecki conikost@gentoo.org ahci: asm1064: asm1166: don't limit reported ports
Andrey Jr. Melnikov temnota.am@gmail.com ahci: asm1064: correct count of reported ports
Jason A. Donenfeld Jason@zx2c4.com wireguard: netlink: access device through ctx instead of peer
Jason A. Donenfeld Jason@zx2c4.com wireguard: netlink: check for dangling peer via is_dead instead of empty list
Steven Rostedt (Google) rostedt@goodmis.org net: hns3: tracing: fix hclgevf trace event strings
Borislav Petkov (AMD) bp@alien8.de x86/CPU/AMD: Update the Zenbleed microcode revisions
Marek Szyprowski m.szyprowski@samsung.com cpufreq: dt: always allocate zeroed cpumask
Ryusuke Konishi konishi.ryusuke@gmail.com nilfs2: prevent kernel bug at submit_bh_wbc()
Ryusuke Konishi konishi.ryusuke@gmail.com nilfs2: fix failure to detect DAT corruption in btree and direct mappings
Qiang Zhang qiang4.zhang@intel.com memtest: use {READ,WRITE}_ONCE in memory scanning
Jani Nikula jani.nikula@intel.com drm/vc4: hdmi: do not return negative values from .get_modes()
Jani Nikula jani.nikula@intel.com drm/imx/ipuv3: do not return negative values from .get_modes()
Jani Nikula jani.nikula@intel.com drm/exynos: do not return negative values from .get_modes()
Jani Nikula jani.nikula@intel.com drm/panel: do not return negative error codes from drm_panel_get_modes()
Harald Freudenberger freude@linux.ibm.com s390/zcrypt: fix reference counting on zcrypt card objects
Sean Anderson sean.anderson@linux.dev soc: fsl: qbman: Use raw spinlock for cgr_lock
Sean Anderson sean.anderson@seco.com soc: fsl: qbman: Add CGR update function
Sean Anderson sean.anderson@seco.com soc: fsl: qbman: Add helper for sanity checking cgr ops
Sean Anderson sean.anderson@linux.dev soc: fsl: qbman: Always disable interrupts when taking cgr_lock
Steven Rostedt (Google) rostedt@goodmis.org ring-buffer: Fix full_waiters_pending in poll
Steven Rostedt (Google) rostedt@goodmis.org ring-buffer: Fix resetting of shortest_full
Steven Rostedt (Google) rostedt@goodmis.org ring-buffer: Do not set shortest_full when full target is hit
Steven Rostedt (Google) rostedt@goodmis.org ring-buffer: Fix waking up ring buffer readers
Alex Williamson alex.williamson@redhat.com vfio/platform: Disable virqfds on cleanup
Niklas Cassel cassel@kernel.org PCI: dwc: endpoint: Fix advertised resizable BAR size
Nathan Chancellor nathan@kernel.org kbuild: Move -Wenum-{compare-conditional,enum-conversion} into W=1
Josef Bacik josef@toxicpanda.com nfs: fix UAF in direct writes
Stanislaw Gruszka stanislaw.gruszka@linux.intel.com PCI/AER: Block runtime suspend when handling errors
Sean V Kelley sean.v.kelley@intel.com PCI/ERR: Clear AER status only when we control AER
Samuel Thibault samuel.thibault@ens-lyon.org speakup: Fix 8bit characters from direct synth
Wayne Chang waynec@nvidia.com usb: gadget: tegra-xudc: Fix USB3 PHY retrieval logic
Jon Hunter jonathanh@nvidia.com usb: gadget: tegra-xudc: Use dev_err_probe()
Wayne Chang waynec@nvidia.com phy: tegra: xusb: Add API to retrieve the port number of phy
Christophe JAILLET christophe.jaillet@wanadoo.fr slimbus: core: Remove usage of the deprecated ida_simple_xx() API
Jerome Brunet jbrunet@baylibre.com nvmem: meson-efuse: fix function pointer type mismatch
Maximilian Heyne mheyne@amazon.de ext4: fix corruption during on-line resize
Josua Mayer josua@solid-run.com hwmon: (amc6821) add of_match table
Christian Gmeiner cgmeiner@igalia.com drm/etnaviv: Restore some id values
Dominique Martinet dominique.martinet@atmark-techno.com mmc: core: Fix switch on gp3 partition
Ryan Roberts ryan.roberts@arm.com mm: swap: fix race between free_swap_and_cache() and swapoff()
Fedor Pchelkin pchelkin@ispras.ru mac802154: fix llsec key resources release in mac802154_llsec_key_del
Yu Kuai yukuai3@huawei.com dm-raid: fix lockdep waring in "pers->hot_add_disk"
Song Liu song@kernel.org Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""
Paul Menzel pmenzel@molgen.mpg.de PCI/DPC: Quirk PIO log size for Intel Raptor Lake Root Ports
Mika Westerberg mika.westerberg@linux.intel.com PCI/DPC: Quirk PIO log size for certain Intel Root Ports
Mika Westerberg mika.westerberg@linux.intel.com PCI/ASPM: Make Intel DG2 L1 acceptable latency unlimited
Bjorn Helgaas bhelgaas@google.com PCI: Work around Intel I210 ROM BAR overlap defect
Amey Narkhede ameynarkhede03@gmail.com PCI: Cache PCIe Device Capabilities register
Sean V Kelley sean.v.kelley@intel.com PCI/ERR: Cache RCEC EA Capability offset in pci_init_capabilities()
Rafael J. Wysocki rafael.j.wysocki@intel.com PCI/PM: Drain runtime-idle callbacks before driver removal
Uwe Kleine-König u.kleine-koenig@pengutronix.de PCI: Drop pci_device_remove() test of pci_dev->driver
Filipe Manana fdmanana@suse.com btrfs: fix off-by-one chunk length calculation at contains_pending_extent()
Peter Collingbourne pcc@google.com serial: Lock console when calling into driver before registration
Petr Mladek pmladek@suse.com printk/console: Split out code that enables default console
Jameson Thies jthies@google.com usb: typec: ucsi: Clean up UCSI_CABLE_PROP macros
Miklos Szeredi mszeredi@redhat.com fuse: don't unhash root
Miklos Szeredi mszeredi@redhat.com fuse: fix root lookup with nonzero generation
Wolfram Sang wsa+renesas@sang-engineering.com mmc: tmio: avoid concurrent runs of mmc_request_done()
Qingliang Li qingliang.li@mediatek.com PM: sleep: wakeirq: fix wake irq warning in system suspend
Toru Katagiri Toru.Katagiri@tdk.com USB: serial: cp210x: add pid/vid for TDK NC0110013M and MM0110113M
Aurélien Jacobs aurel@gnuage.org USB: serial: option: add MeiG Smart SLM320 product
Christian Häggström christian.haggstrom@orexplore.com USB: serial: cp210x: add ID for MGP Instruments PDS100
Cameron Williams cang1@live.co.uk USB: serial: add device ID for VeriFone adapter
Daniel Vogelbacher daniel@chaospixel.com USB: serial: ftdi_sio: add support for GMC Z216C Adapter IR-USB
Michael Ellerman mpe@ellerman.id.au powerpc/fsl: Fix mfpmr build errors with newer binutils
Gabor Juhos j4g8y7@gmail.com clk: qcom: mmcc-msm8974: fix terminating of frequency table arrays
Gabor Juhos j4g8y7@gmail.com clk: qcom: mmcc-apq8084: fix terminating of frequency table arrays
Gabor Juhos j4g8y7@gmail.com clk: qcom: gcc-ipq8074: fix terminating of frequency table arrays
Gabor Juhos j4g8y7@gmail.com clk: qcom: gcc-ipq6018: fix terminating of frequency table arrays
Maulik Shah quic_mkshah@quicinc.com PM: suspend: Set mem_sleep_current during kernel command line setup
Guenter Roeck linux@roeck-us.net parisc: Strip upper 32 bit of sum in csum_ipv6_magic for 64-bit builds
Guenter Roeck linux@roeck-us.net parisc: Fix csum_ipv6_magic on 64-bit systems
Guenter Roeck linux@roeck-us.net parisc: Fix csum_ipv6_magic on 32-bit systems
Guenter Roeck linux@roeck-us.net parisc: Fix ip_fast_csum
John David Anglin dave.anglin@bell.net parisc: Avoid clobbering the C/B bits in the PSW with tophys and tovirt macros
Arseniy Krasnov avkrasnov@salutedevices.com mtd: rawnand: meson: fix scrambling mode value in command macro
Zhang Yi yi.zhang@huawei.com ubi: correct the calculation of fastmap size
Richard Weinberger richard@nod.at ubi: Check for too small LEB size in VTBL code
Matthew Wilcox (Oracle) willy@infradead.org ubifs: Set page uptodate in the correct place
Jan Kara jack@suse.cz fat: fix uninitialized field in nostale filehandles
Matthew Wilcox (Oracle) willy@infradead.org bounds: support non-power-of-two CONFIG_NR_CPUS
Damien Le Moal dlemoal@kernel.org block: Clear zone limits for a non-zoned stacked queue
Damien Le Moal damien.lemoal@wdc.com block: introduce zone_write_granularity limit
Baokun Li libaokun1@huawei.com ext4: correct best extent lstart adjustment logic
SeongJae Park sj@kernel.org selftests/mqueue: Set timeout to 180 seconds
Damian Muszynski damian.muszynski@intel.com crypto: qat - resolve race condition during AER recovery
Svyatoslav Pankratov svyatoslav.pankratov@intel.com crypto: qat - fix double free during reset
Randy Dunlap rdunlap@infradead.org sparc: vDSO: fix return value of __setup handler
Randy Dunlap rdunlap@infradead.org sparc64: NMI watchdog: fix return value of __setup handler
Sean Christopherson seanjc@google.com KVM: Always flush async #PF workqueue when vCPU is being destroyed
Gui-Dong Han 2045gemini@gmail.com media: xc4000: Fix atomicity violation in xc4000_get_frequency
Hugo Villeneuve hvilleneuve@dimonoff.com serial: max310x: fix NULL pointer dereference in I2C instantiation
Zack Rusin zack.rusin@broadcom.com drm/vmwgfx: Fix possible null pointer derefence with invalid contexts
Zack Rusin zackr@vmware.com drm/vmwgfx: Fix some static checker warnings
Lee Jones lee.jones@linaro.org drm/vmwgfx/vmwgfx_cmdbuf_res: Remove unused variable 'ret'
Christian König christian.koenig@amd.com drm/vmwgfx: switch over to the new pin interface v2
Christian König christian.koenig@amd.com drm/vmwgfx: stop using ttm_bo_create v2
Duje Mihanović duje.mihanovic@skole.hr arm: dts: marvell: Fix maxium->maxim typo in brownstone dts
Roberto Sassu roberto.sassu@huawei.com smack: Handle SMACK64TRANSMUTE in smack_inode_setsecurity()
Roberto Sassu roberto.sassu@huawei.com smack: Set SMACK64TRANSMUTE only for dirs in smack_inode_setxattr()
Amit Pundir amit.pundir@linaro.org clk: qcom: gcc-sdm845: Add soft dependency on rpmhpd
Hidenori Kobayashi hidenorik@chromium.org media: staging: ipu3-imgu: Set fields before media_entity_pads_init()
Zheng Wang zyytlz.wz@163.com wifi: brcmfmac: Fix use-after-free bug in brcmf_cfg80211_detach
Thomas Gleixner tglx@linutronix.de timers: Rename del_timer_sync() to timer_delete_sync()
Thomas Gleixner tglx@linutronix.de timers: Use del_timer_sync() even on UP
Thomas Gleixner tglx@linutronix.de timers: Update kernel-doc for various functions
Borislav Petkov bp@suse.de x86/bugs: Use sysfs_emit()
Kim Phillips kim.phillips@amd.com x86/cpu: Support AMD Automatic IBRS
Lin Yujun linyujun809@huawei.com Documentation/hw-vuln: Update spectre doc
-------------
Diffstat:
Documentation/ABI/testing/sysfs-devices-system-cpu | 1 + Documentation/admin-guide/hw-vuln/index.rst | 1 + .../admin-guide/hw-vuln/reg-file-data-sampling.rst | 104 +++++++++ Documentation/admin-guide/hw-vuln/spectre.rst | 18 +- Documentation/admin-guide/kernel-parameters.txt | 27 ++- Documentation/block/queue-sysfs.rst | 7 + Documentation/x86/mds.rst | 34 ++- Makefile | 4 +- arch/arm/boot/dts/mmp2-brownstone.dts | 2 +- arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi | 3 +- arch/arm64/boot/dts/rockchip/rk3328.dtsi | 11 +- arch/arm64/boot/dts/rockchip/rk3399.dtsi | 12 +- arch/hexagon/kernel/vmlinux.lds.S | 1 + arch/parisc/include/asm/assembly.h | 18 +- arch/parisc/include/asm/checksum.h | 10 +- arch/powerpc/include/asm/reg_fsl_emb.h | 11 +- arch/powerpc/lib/Makefile | 2 +- arch/riscv/include/asm/uaccess.h | 4 +- arch/s390/kernel/entry.S | 1 + arch/sparc/kernel/nmi.c | 2 +- arch/sparc/vdso/vma.c | 7 +- arch/x86/Kconfig | 18 +- arch/x86/Makefile | 8 + arch/x86/entry/entry.S | 23 ++ arch/x86/entry/entry_32.S | 59 +---- arch/x86/entry/entry_64.S | 10 + arch/x86/entry/entry_64_compat.S | 1 + arch/x86/include/asm/asm-prototypes.h | 1 + arch/x86/include/asm/asm.h | 5 + arch/x86/include/asm/cpufeature.h | 8 +- arch/x86/include/asm/cpufeatures.h | 5 +- arch/x86/include/asm/disabled-features.h | 3 +- arch/x86/include/asm/entry-common.h | 1 - arch/x86/include/asm/irqflags.h | 1 + arch/x86/include/asm/msr-index.h | 10 + arch/x86/include/asm/nospec-branch.h | 47 ++-- arch/x86/include/asm/processor.h | 15 +- arch/x86/include/asm/ptrace.h | 5 +- arch/x86/include/asm/required-features.h | 3 +- arch/x86/include/asm/segment.h | 30 +-- arch/x86/include/asm/setup.h | 1 - arch/x86/include/asm/stackprotector.h | 79 ++----- arch/x86/include/asm/suspend_32.h | 12 +- arch/x86/kernel/Makefile | 1 - arch/x86/kernel/asm-offsets_32.c | 5 - arch/x86/kernel/cpu/amd.c | 10 +- arch/x86/kernel/cpu/bugs.c | 245 ++++++++++++++------- arch/x86/kernel/cpu/common.c | 64 ++++-- arch/x86/kernel/cpu/mce/core.c | 4 +- arch/x86/kernel/doublefault_32.c | 4 +- arch/x86/kernel/head64.c | 9 - arch/x86/kernel/head_32.S | 18 +- arch/x86/kernel/head_64.S | 24 +- arch/x86/kernel/nmi.c | 3 - arch/x86/kernel/setup_percpu.c | 1 - arch/x86/kernel/tls.c | 8 +- arch/x86/kvm/cpuid.h | 2 + arch/x86/kvm/svm/sev.c | 18 +- arch/x86/kvm/vmx/run_flags.h | 7 +- arch/x86/kvm/vmx/vmenter.S | 9 +- arch/x86/kvm/vmx/vmx.c | 12 +- arch/x86/kvm/x86.c | 5 +- arch/x86/lib/insn-eval.c | 4 - arch/x86/lib/retpoline.S | 6 +- arch/x86/mm/ident_map.c | 23 +- arch/x86/mm/pat/memtype.c | 50 +++-- arch/x86/platform/pvh/head.S | 14 -- arch/x86/power/cpu.c | 6 +- arch/x86/xen/enlighten_pv.c | 1 - block/blk-settings.c | 41 +++- block/blk-stat.c | 2 +- block/blk-sysfs.c | 8 + block/ioctl.c | 11 +- drivers/accessibility/speakup/synth.c | 4 +- drivers/acpi/acpica/dbnames.c | 8 +- drivers/acpi/sleep.c | 12 - drivers/ata/ahci.c | 5 - drivers/ata/sata_mv.c | 63 +++--- drivers/ata/sata_sx4.c | 6 +- drivers/base/core.c | 26 ++- drivers/base/cpu.c | 8 + drivers/base/power/wakeirq.c | 4 +- drivers/bluetooth/btintel.c | 2 +- drivers/clk/qcom/gcc-ipq6018.c | 2 + drivers/clk/qcom/gcc-ipq8074.c | 2 + drivers/clk/qcom/gcc-sdm845.c | 1 + drivers/clk/qcom/mmcc-apq8084.c | 2 + drivers/clk/qcom/mmcc-msm8974.c | 2 + drivers/cpufreq/brcmstb-avs-cpufreq.c | 5 +- drivers/cpufreq/cpufreq-dt.c | 2 +- drivers/crypto/qat/qat_common/adf_aer.c | 23 +- drivers/firmware/efi/vars.c | 17 +- drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c | 12 +- .../gpu/drm/amd/display/modules/hdcp/hdcp_psp.c | 3 + .../gpu/drm/amd/display/modules/inc/mod_stats.h | 4 +- drivers/gpu/drm/drm_panel.c | 17 +- drivers/gpu/drm/etnaviv/etnaviv_drv.c | 2 +- drivers/gpu/drm/etnaviv/etnaviv_hwdb.c | 9 + drivers/gpu/drm/exynos/exynos_drm_vidi.c | 4 +- drivers/gpu/drm/exynos/exynos_hdmi.c | 4 +- drivers/gpu/drm/i915/gt/intel_engine_pm.c | 3 - drivers/gpu/drm/i915/gt/intel_lrc.c | 3 + drivers/gpu/drm/imx/parallel-display.c | 4 +- drivers/gpu/drm/ttm/ttm_memory.c | 2 + drivers/gpu/drm/vc4/vc4_hdmi.c | 2 +- drivers/gpu/drm/vmwgfx/vmwgfx_binding.c | 20 +- drivers/gpu/drm/vmwgfx/vmwgfx_blit.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_bo.c | 92 ++++++-- drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c | 8 +- drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_cotable.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 2 +- drivers/gpu/drm/vmwgfx/vmwgfx_drv.h | 11 +- drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 12 +- drivers/gpu/drm/vmwgfx/vmwgfx_fb.c | 2 +- drivers/gpu/drm/vmwgfx/vmwgfx_mob.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 6 +- drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 12 +- drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_shader.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_so.c | 3 +- drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 50 +---- drivers/gpu/drm/vmwgfx/vmwgfx_validation.c | 6 +- drivers/hwmon/amc6821.c | 11 + drivers/infiniband/core/cm.c | 20 +- drivers/input/rmi4/rmi_driver.c | 6 +- drivers/md/dm-integrity.c | 2 +- drivers/md/dm-raid.c | 2 + drivers/md/dm-snap.c | 4 +- drivers/md/raid5.c | 12 + drivers/media/pci/sta2x11/sta2x11_vip.c | 9 +- drivers/media/tuners/xc4000.c | 4 +- drivers/misc/mei/hw-me-regs.h | 2 + drivers/misc/mei/pci-me.c | 2 + drivers/misc/vmw_vmci/vmci_datagram.c | 6 +- drivers/mmc/core/block.c | 14 +- drivers/mmc/host/tmio_mmc_core.c | 2 + drivers/mtd/nand/raw/meson_nand.c | 2 +- drivers/mtd/ubi/fastmap.c | 7 +- drivers/mtd/ubi/vtbl.c | 6 + .../ethernet/hisilicon/hns3/hns3pf/hclge_trace.h | 8 +- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_trace.h | 8 +- drivers/net/ethernet/intel/i40e/i40e_main.c | 7 +- drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 34 ++- drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 16 +- drivers/net/ethernet/marvell/octeontx2/af/cgx.c | 5 + .../net/ethernet/marvell/octeontx2/nic/otx2_pf.c | 2 +- drivers/net/ethernet/neterion/vxge/vxge-config.c | 2 +- drivers/net/ethernet/pensando/ionic/ionic_lif.c | 5 +- drivers/net/ethernet/realtek/r8169_main.c | 9 + drivers/net/ethernet/renesas/ravb_main.c | 7 +- drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 40 +++- .../net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 38 +++- drivers/net/ethernet/xilinx/ll_temac_main.c | 2 +- drivers/net/wireguard/netlink.c | 10 +- drivers/net/wireless/ath/ath9k/antenna.c | 2 +- .../broadcom/brcm80211/brcmfmac/cfg80211.c | 4 +- drivers/nvme/host/pci.c | 3 + drivers/nvmem/meson-efuse.c | 25 +-- drivers/of/dynamic.c | 12 + drivers/pci/controller/dwc/pcie-designware-ep.c | 7 +- drivers/pci/pci-driver.c | 23 +- drivers/pci/pci.c | 6 +- drivers/pci/pci.h | 17 ++ drivers/pci/pcie/Makefile | 2 +- drivers/pci/pcie/dpc.c | 15 +- drivers/pci/pcie/err.c | 33 ++- drivers/pci/pcie/rcec.c | 59 +++++ drivers/pci/probe.c | 7 +- drivers/pci/quirks.c | 100 +++++++++ drivers/pci/setup-res.c | 8 +- drivers/phy/tegra/xusb.c | 13 ++ drivers/pinctrl/renesas/core.c | 4 +- drivers/platform/x86/touchscreen_dmi.c | 9 + drivers/s390/crypto/zcrypt_api.c | 2 + drivers/scsi/hosts.c | 7 +- drivers/scsi/lpfc/lpfc_nportdisc.c | 6 +- drivers/scsi/lpfc/lpfc_nvmet.c | 2 +- drivers/scsi/myrb.c | 20 +- drivers/scsi/myrs.c | 24 +- drivers/scsi/qla2xxx/qla_attr.c | 14 +- drivers/scsi/qla2xxx/qla_def.h | 2 +- drivers/scsi/qla2xxx/qla_gs.c | 2 +- drivers/scsi/qla2xxx/qla_init.c | 102 ++++----- drivers/scsi/qla2xxx/qla_target.c | 10 + drivers/scsi/sd.c | 7 +- drivers/slimbus/core.c | 4 +- drivers/soc/fsl/qbman/qman.c | 98 +++++++-- drivers/staging/comedi/drivers/comedi_test.c | 30 ++- drivers/staging/media/ipu3/ipu3-v4l2.c | 16 +- .../staging/vc04_services/vchiq-mmal/mmal-vchiq.c | 5 +- drivers/tee/optee/device.c | 3 +- drivers/tty/n_gsm.c | 3 + drivers/tty/serial/8250/8250_port.c | 6 - drivers/tty/serial/fsl_lpuart.c | 7 +- drivers/tty/serial/max310x.c | 7 +- drivers/tty/serial/sc16is7xx.c | 15 +- drivers/tty/serial/serial_core.c | 12 + drivers/tty/vt/vt.c | 2 +- drivers/usb/class/cdc-wdm.c | 6 +- drivers/usb/core/hub.c | 23 +- drivers/usb/core/hub.h | 2 + drivers/usb/core/port.c | 5 +- drivers/usb/core/sysfs.c | 16 +- drivers/usb/dwc2/core.h | 14 ++ drivers/usb/dwc2/core_intr.c | 63 ++++-- drivers/usb/dwc2/gadget.c | 4 + drivers/usb/dwc2/hcd.c | 47 +++- drivers/usb/dwc2/hcd_ddma.c | 17 +- drivers/usb/dwc2/hw.h | 2 +- drivers/usb/gadget/function/f_ncm.c | 2 +- drivers/usb/gadget/udc/core.c | 4 +- drivers/usb/gadget/udc/tegra-xudc.c | 53 ++--- drivers/usb/host/sl811-hcd.c | 2 + drivers/usb/phy/phy-generic.c | 7 - drivers/usb/serial/cp210x.c | 4 + drivers/usb/serial/ftdi_sio.c | 2 + drivers/usb/serial/ftdi_sio_ids.h | 6 + drivers/usb/serial/option.c | 6 + drivers/usb/storage/isd200.c | 23 +- drivers/usb/typec/tcpm/tcpci.c | 1 + drivers/usb/typec/ucsi/ucsi.c | 42 +++- drivers/usb/typec/ucsi/ucsi.h | 4 +- drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c | 7 +- drivers/vfio/pci/vfio_pci_intrs.c | 188 +++++++++------- drivers/vfio/platform/vfio_platform_irq.c | 106 ++++++--- drivers/vfio/virqfd.c | 21 ++ drivers/video/fbdev/core/fbmon.c | 7 +- drivers/video/fbdev/via/accel.c | 4 +- drivers/virtio/virtio.c | 10 +- drivers/xen/events/events_base.c | 5 +- fs/aio.c | 8 +- fs/btrfs/export.c | 9 +- fs/btrfs/ioctl.c | 25 +-- fs/btrfs/send.c | 10 +- fs/btrfs/volumes.c | 14 +- fs/exec.c | 1 + fs/ext4/mballoc.c | 22 +- fs/ext4/resize.c | 3 +- fs/ext4/super.c | 12 + fs/fat/nfs.c | 6 + fs/fuse/dir.c | 4 + fs/fuse/fuse_i.h | 1 - fs/fuse/inode.c | 7 +- fs/isofs/inode.c | 18 +- fs/nfs/direct.c | 11 +- fs/nfs/write.c | 2 +- fs/nilfs2/btree.c | 9 +- fs/nilfs2/direct.c | 9 +- fs/nilfs2/inode.c | 2 +- fs/pstore/zone.c | 2 + fs/sysv/itree.c | 10 +- fs/ubifs/file.c | 13 +- fs/vboxsf/super.c | 3 +- include/linux/blkdev.h | 15 ++ include/linux/cpu.h | 2 + include/linux/device.h | 1 + include/linux/gfp.h | 9 + include/linux/hyperv.h | 22 +- include/linux/nfs_fs.h | 1 + include/linux/objtool.h | 8 + include/linux/pci.h | 6 + include/linux/phy/tegra/xusb.h | 2 + include/linux/sunrpc/sched.h | 2 +- include/linux/timer.h | 18 +- include/linux/udp.h | 28 +++ include/linux/vfio.h | 2 + include/net/cfg802154.h | 1 + include/net/inet_connection_sock.h | 1 + include/net/sock.h | 7 + include/soc/fsl/qman.h | 9 + include/uapi/linux/input-event-codes.h | 1 + init/initramfs.c | 2 +- io_uring/io_uring.c | 2 +- kernel/bounds.c | 2 +- kernel/bpf/verifier.c | 5 + kernel/events/core.c | 9 + kernel/panic.c | 8 + kernel/power/suspend.c | 1 + kernel/printk/printk.c | 63 ++++-- kernel/time/timer.c | 160 ++++++++------ kernel/trace/ring_buffer.c | 193 +++++++++------- mm/compaction.c | 7 +- mm/memory-failure.c | 2 +- mm/memory.c | 4 + mm/memtest.c | 4 +- mm/migrate.c | 6 +- mm/page_alloc.c | 10 +- mm/swapfile.c | 13 +- mm/vmscan.c | 5 +- net/batman-adv/distributed-arp-table.c | 3 +- net/batman-adv/main.c | 14 +- net/bluetooth/hci_debugfs.c | 64 ++++-- net/bluetooth/hci_event.c | 25 +++ net/bridge/netfilter/ebtables.c | 6 + net/core/sock_map.c | 6 + net/ipv4/inet_connection_sock.c | 14 ++ net/ipv4/ip_gre.c | 5 + net/ipv4/netfilter/arp_tables.c | 4 + net/ipv4/netfilter/ip_tables.c | 4 + net/ipv4/tcp.c | 2 + net/ipv4/udp.c | 7 + net/ipv4/udp_offload.c | 13 +- net/ipv6/ip6_fib.c | 14 +- net/ipv6/ip6_gre.c | 3 + net/ipv6/netfilter/ip6_tables.c | 4 + net/ipv6/udp.c | 2 +- net/ipv6/udp_offload.c | 8 +- net/mac80211/cfg.c | 5 +- net/mac802154/llsec.c | 18 +- net/mptcp/protocol.c | 3 - net/mptcp/subflow.c | 3 + net/netfilter/nf_tables_api.c | 74 ++++++- net/nfc/nci/core.c | 5 + net/rds/rdma.c | 2 +- net/sched/act_skbmod.c | 10 +- net/smc/smc_pnet.c | 10 + net/xfrm/xfrm_user.c | 3 + scripts/Makefile.extrawarn | 2 + scripts/dummy-tools/gcc | 6 +- scripts/gcc-x86_32-has-stack-protector.sh | 6 +- scripts/kernel-doc | 2 +- security/smack/smack_lsm.c | 12 +- sound/pci/hda/patch_realtek.c | 9 +- sound/sh/aica.c | 17 +- sound/soc/soc-ops.c | 2 +- tools/iio/iio_utils.c | 2 +- tools/include/linux/objtool.h | 8 + tools/lib/perf/evlist.c | 18 +- tools/lib/perf/include/internal/evlist.h | 4 +- .../x86_energy_perf_policy.c | 1 + tools/testing/ktest/ktest.pl | 1 + tools/testing/selftests/mqueue/setting | 1 + tools/testing/selftests/net/reuseaddr_conflict.c | 2 +- virt/kvm/async_pf.c | 31 ++- 335 files changed, 3261 insertions(+), 1539 deletions(-)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lin Yujun linyujun809@huawei.com
commit 06cb31cc761823ef444ba4e1df11347342a6e745 upstream.
commit 7c693f54c873691 ("x86/speculation: Add spectre_v2=ibrs option to support Kernel IBRS")
adds the "ibrs " option in Documentation/admin-guide/kernel-parameters.txt but omits it to Documentation/admin-guide/hw-vuln/spectre.rst, add it.
Signed-off-by: Lin Yujun linyujun809@huawei.com Link: https://lore.kernel.org/r/20220830123614.23007-1-linyujun809@huawei.com Signed-off-by: Jonathan Corbet corbet@lwn.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/admin-guide/hw-vuln/spectre.rst | 1 + 1 file changed, 1 insertion(+)
--- a/Documentation/admin-guide/hw-vuln/spectre.rst +++ b/Documentation/admin-guide/hw-vuln/spectre.rst @@ -625,6 +625,7 @@ kernel command line. eibrs enhanced IBRS eibrs,retpoline enhanced IBRS + Retpolines eibrs,lfence enhanced IBRS + LFENCE + ibrs use IBRS to protect kernel
Not specifying this option is equivalent to spectre_v2=auto.
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kim Phillips kim.phillips@amd.com
commit e7862eda309ecfccc36bb5558d937ed3ace07f3f upstream.
The AMD Zen4 core supports a new feature called Automatic IBRS.
It is a "set-and-forget" feature that means that, like Intel's Enhanced IBRS, h/w manages its IBRS mitigation resources automatically across CPL transitions.
The feature is advertised by CPUID_Fn80000021_EAX bit 8 and is enabled by setting MSR C000_0080 (EFER) bit 21.
Enable Automatic IBRS by default if the CPU feature is present. It typically provides greater performance over the incumbent generic retpolines mitigation.
Reuse the SPECTRE_V2_EIBRS spectre_v2_mitigation enum. AMD Automatic IBRS and Intel Enhanced IBRS have similar enablement. Add NO_EIBRS_PBRSB to cpu_vuln_whitelist, since AMD Automatic IBRS isn't affected by PBRSB-eIBRS.
The kernel command line option spectre_v2=eibrs is used to select AMD Automatic IBRS, if available.
Signed-off-by: Kim Phillips kim.phillips@amd.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Acked-by: Sean Christopherson seanjc@google.com Acked-by: Dave Hansen dave.hansen@linux.intel.com Link: https://lore.kernel.org/r/20230124163319.2277355-8-kim.phillips@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/admin-guide/hw-vuln/spectre.rst | 6 +++--- Documentation/admin-guide/kernel-parameters.txt | 6 +++--- arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/msr-index.h | 2 ++ arch/x86/kernel/cpu/bugs.c | 20 ++++++++++++-------- arch/x86/kernel/cpu/common.c | 19 +++++++++++-------- 6 files changed, 32 insertions(+), 22 deletions(-)
--- a/Documentation/admin-guide/hw-vuln/spectre.rst +++ b/Documentation/admin-guide/hw-vuln/spectre.rst @@ -622,9 +622,9 @@ kernel command line. retpoline,generic Retpolines retpoline,lfence LFENCE; indirect branch retpoline,amd alias for retpoline,lfence - eibrs enhanced IBRS - eibrs,retpoline enhanced IBRS + Retpolines - eibrs,lfence enhanced IBRS + LFENCE + eibrs Enhanced/Auto IBRS + eibrs,retpoline Enhanced/Auto IBRS + Retpolines + eibrs,lfence Enhanced/Auto IBRS + LFENCE ibrs use IBRS to protect kernel
Not specifying this option is equivalent to --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5091,9 +5091,9 @@ retpoline,generic - Retpolines retpoline,lfence - LFENCE; indirect branch retpoline,amd - alias for retpoline,lfence - eibrs - enhanced IBRS - eibrs,retpoline - enhanced IBRS + Retpolines - eibrs,lfence - enhanced IBRS + LFENCE + eibrs - Enhanced/Auto IBRS + eibrs,retpoline - Enhanced/Auto IBRS + Retpolines + eibrs,lfence - Enhanced/Auto IBRS + LFENCE ibrs - use IBRS to protect kernel
Not specifying this option is equivalent to --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -403,6 +403,7 @@ #define X86_FEATURE_SEV_ES (19*32+ 3) /* AMD Secure Encrypted Virtualization - Encrypted State */ #define X86_FEATURE_SME_COHERENT (19*32+10) /* "" AMD hardware-enforced cache coherency */
+#define X86_FEATURE_AUTOIBRS (20*32+ 8) /* "" Automatic IBRS */ #define X86_FEATURE_SBPB (20*32+27) /* "" Selective Branch Prediction Barrier */ #define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* "" MSR_PRED_CMD[IBPB] flushes all branch type predictions */ #define X86_FEATURE_SRSO_NO (20*32+29) /* "" CPU is not affected by SRSO */ --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -30,6 +30,7 @@ #define _EFER_SVME 12 /* Enable virtualization */ #define _EFER_LMSLE 13 /* Long Mode Segment Limit Enable */ #define _EFER_FFXSR 14 /* Enable Fast FXSAVE/FXRSTOR */ +#define _EFER_AUTOIBRS 21 /* Enable Automatic IBRS */
#define EFER_SCE (1<<_EFER_SCE) #define EFER_LME (1<<_EFER_LME) @@ -38,6 +39,7 @@ #define EFER_SVME (1<<_EFER_SVME) #define EFER_LMSLE (1<<_EFER_LMSLE) #define EFER_FFXSR (1<<_EFER_FFXSR) +#define EFER_AUTOIBRS (1<<_EFER_AUTOIBRS)
/* Intel MSRs. Some also available on other CPUs */
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -1293,9 +1293,9 @@ static const char * const spectre_v2_str [SPECTRE_V2_NONE] = "Vulnerable", [SPECTRE_V2_RETPOLINE] = "Mitigation: Retpolines", [SPECTRE_V2_LFENCE] = "Mitigation: LFENCE", - [SPECTRE_V2_EIBRS] = "Mitigation: Enhanced IBRS", - [SPECTRE_V2_EIBRS_LFENCE] = "Mitigation: Enhanced IBRS + LFENCE", - [SPECTRE_V2_EIBRS_RETPOLINE] = "Mitigation: Enhanced IBRS + Retpolines", + [SPECTRE_V2_EIBRS] = "Mitigation: Enhanced / Automatic IBRS", + [SPECTRE_V2_EIBRS_LFENCE] = "Mitigation: Enhanced / Automatic IBRS + LFENCE", + [SPECTRE_V2_EIBRS_RETPOLINE] = "Mitigation: Enhanced / Automatic IBRS + Retpolines", [SPECTRE_V2_IBRS] = "Mitigation: IBRS", };
@@ -1364,7 +1364,7 @@ static enum spectre_v2_mitigation_cmd __ cmd == SPECTRE_V2_CMD_EIBRS_LFENCE || cmd == SPECTRE_V2_CMD_EIBRS_RETPOLINE) && !boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) { - pr_err("%s selected but CPU doesn't have eIBRS. Switching to AUTO select\n", + pr_err("%s selected but CPU doesn't have Enhanced or Automatic IBRS. Switching to AUTO select\n", mitigation_options[i].option); return SPECTRE_V2_CMD_AUTO; } @@ -1549,8 +1549,12 @@ static void __init spectre_v2_select_mit pr_err(SPECTRE_V2_EIBRS_EBPF_MSG);
if (spectre_v2_in_ibrs_mode(mode)) { - x86_spec_ctrl_base |= SPEC_CTRL_IBRS; - update_spec_ctrl(x86_spec_ctrl_base); + if (boot_cpu_has(X86_FEATURE_AUTOIBRS)) { + msr_set_bit(MSR_EFER, _EFER_AUTOIBRS); + } else { + x86_spec_ctrl_base |= SPEC_CTRL_IBRS; + update_spec_ctrl(x86_spec_ctrl_base); + } }
switch (mode) { @@ -1634,8 +1638,8 @@ static void __init spectre_v2_select_mit /* * Retpoline protects the kernel, but doesn't protect firmware. IBRS * and Enhanced IBRS protect firmware too, so enable IBRS around - * firmware calls only when IBRS / Enhanced IBRS aren't otherwise - * enabled. + * firmware calls only when IBRS / Enhanced / Automatic IBRS aren't + * otherwise enabled. * * Use "mode" to check Enhanced IBRS instead of boot_cpu_has(), because * the user might select retpoline on the kernel command line and if --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1098,8 +1098,8 @@ static const __initconst struct x86_cpu_ VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO),
/* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */ - VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO), - VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO), + VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB), + VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_MMIO | NO_EIBRS_PBRSB),
/* Zhaoxin Family 7 */ VULNWL(CENTAUR, 7, X86_MODEL_ANY, NO_SPECTRE_V2 | NO_SWAPGS | NO_MMIO), @@ -1219,8 +1219,16 @@ static void __init cpu_set_bug_bits(stru !cpu_has(c, X86_FEATURE_AMD_SSB_NO)) setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
- if (ia32_cap & ARCH_CAP_IBRS_ALL) + /* + * AMD's AutoIBRS is equivalent to Intel's eIBRS - use the Intel feature + * flag and protect from vendor-specific bugs via the whitelist. + */ + if ((ia32_cap & ARCH_CAP_IBRS_ALL) || cpu_has(c, X86_FEATURE_AUTOIBRS)) { setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED); + if (!cpu_matches(cpu_vuln_whitelist, NO_EIBRS_PBRSB) && + !(ia32_cap & ARCH_CAP_PBRSB_NO)) + setup_force_cpu_bug(X86_BUG_EIBRS_PBRSB); + }
if (!cpu_matches(cpu_vuln_whitelist, NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO)) { @@ -1282,11 +1290,6 @@ static void __init cpu_set_bug_bits(stru setup_force_cpu_bug(X86_BUG_RETBLEED); }
- if (cpu_has(c, X86_FEATURE_IBRS_ENHANCED) && - !cpu_matches(cpu_vuln_whitelist, NO_EIBRS_PBRSB) && - !(ia32_cap & ARCH_CAP_PBRSB_NO)) - setup_force_cpu_bug(X86_BUG_EIBRS_PBRSB); - /* * Check if CPU is vulnerable to GDS. If running in a virtual machine on * an affected processor, the VMM may have disabled the use of GATHER by
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov bp@suse.de
commit 1d30800c0c0ae1d086ffad2bdf0ba4403370f132 upstream.
Those mitigations are very talkative; use the printing helper which pays attention to the buffer size.
Signed-off-by: Borislav Petkov bp@suse.de Link: https://lore.kernel.org/r/20220809153419.10182-1-bp@alien8.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/bugs.c | 105 ++++++++++++++++++++++----------------------- 1 file changed, 52 insertions(+), 53 deletions(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -2436,74 +2436,74 @@ static const char * const l1tf_vmx_state static ssize_t l1tf_show_state(char *buf) { if (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_AUTO) - return sprintf(buf, "%s\n", L1TF_DEFAULT_MSG); + return sysfs_emit(buf, "%s\n", L1TF_DEFAULT_MSG);
if (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_EPT_DISABLED || (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_NEVER && sched_smt_active())) { - return sprintf(buf, "%s; VMX: %s\n", L1TF_DEFAULT_MSG, - l1tf_vmx_states[l1tf_vmx_mitigation]); + return sysfs_emit(buf, "%s; VMX: %s\n", L1TF_DEFAULT_MSG, + l1tf_vmx_states[l1tf_vmx_mitigation]); }
- return sprintf(buf, "%s; VMX: %s, SMT %s\n", L1TF_DEFAULT_MSG, - l1tf_vmx_states[l1tf_vmx_mitigation], - sched_smt_active() ? "vulnerable" : "disabled"); + return sysfs_emit(buf, "%s; VMX: %s, SMT %s\n", L1TF_DEFAULT_MSG, + l1tf_vmx_states[l1tf_vmx_mitigation], + sched_smt_active() ? "vulnerable" : "disabled"); }
static ssize_t itlb_multihit_show_state(char *buf) { if (!boot_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) || !boot_cpu_has(X86_FEATURE_VMX)) - return sprintf(buf, "KVM: Mitigation: VMX unsupported\n"); + return sysfs_emit(buf, "KVM: Mitigation: VMX unsupported\n"); else if (!(cr4_read_shadow() & X86_CR4_VMXE)) - return sprintf(buf, "KVM: Mitigation: VMX disabled\n"); + return sysfs_emit(buf, "KVM: Mitigation: VMX disabled\n"); else if (itlb_multihit_kvm_mitigation) - return sprintf(buf, "KVM: Mitigation: Split huge pages\n"); + return sysfs_emit(buf, "KVM: Mitigation: Split huge pages\n"); else - return sprintf(buf, "KVM: Vulnerable\n"); + return sysfs_emit(buf, "KVM: Vulnerable\n"); } #else static ssize_t l1tf_show_state(char *buf) { - return sprintf(buf, "%s\n", L1TF_DEFAULT_MSG); + return sysfs_emit(buf, "%s\n", L1TF_DEFAULT_MSG); }
static ssize_t itlb_multihit_show_state(char *buf) { - return sprintf(buf, "Processor vulnerable\n"); + return sysfs_emit(buf, "Processor vulnerable\n"); } #endif
static ssize_t mds_show_state(char *buf) { if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { - return sprintf(buf, "%s; SMT Host state unknown\n", - mds_strings[mds_mitigation]); + return sysfs_emit(buf, "%s; SMT Host state unknown\n", + mds_strings[mds_mitigation]); }
if (boot_cpu_has(X86_BUG_MSBDS_ONLY)) { - return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation], - (mds_mitigation == MDS_MITIGATION_OFF ? "vulnerable" : - sched_smt_active() ? "mitigated" : "disabled")); + return sysfs_emit(buf, "%s; SMT %s\n", mds_strings[mds_mitigation], + (mds_mitigation == MDS_MITIGATION_OFF ? "vulnerable" : + sched_smt_active() ? "mitigated" : "disabled")); }
- return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation], - sched_smt_active() ? "vulnerable" : "disabled"); + return sysfs_emit(buf, "%s; SMT %s\n", mds_strings[mds_mitigation], + sched_smt_active() ? "vulnerable" : "disabled"); }
static ssize_t tsx_async_abort_show_state(char *buf) { if ((taa_mitigation == TAA_MITIGATION_TSX_DISABLED) || (taa_mitigation == TAA_MITIGATION_OFF)) - return sprintf(buf, "%s\n", taa_strings[taa_mitigation]); + return sysfs_emit(buf, "%s\n", taa_strings[taa_mitigation]);
if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { - return sprintf(buf, "%s; SMT Host state unknown\n", - taa_strings[taa_mitigation]); + return sysfs_emit(buf, "%s; SMT Host state unknown\n", + taa_strings[taa_mitigation]); }
- return sprintf(buf, "%s; SMT %s\n", taa_strings[taa_mitigation], - sched_smt_active() ? "vulnerable" : "disabled"); + return sysfs_emit(buf, "%s; SMT %s\n", taa_strings[taa_mitigation], + sched_smt_active() ? "vulnerable" : "disabled"); }
static ssize_t mmio_stale_data_show_state(char *buf) @@ -2571,47 +2571,46 @@ static char *pbrsb_eibrs_state(void) static ssize_t spectre_v2_show_state(char *buf) { if (spectre_v2_enabled == SPECTRE_V2_LFENCE) - return sprintf(buf, "Vulnerable: LFENCE\n"); + return sysfs_emit(buf, "Vulnerable: LFENCE\n");
if (spectre_v2_enabled == SPECTRE_V2_EIBRS && unprivileged_ebpf_enabled()) - return sprintf(buf, "Vulnerable: eIBRS with unprivileged eBPF\n"); + return sysfs_emit(buf, "Vulnerable: eIBRS with unprivileged eBPF\n");
if (sched_smt_active() && unprivileged_ebpf_enabled() && spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE) - return sprintf(buf, "Vulnerable: eIBRS+LFENCE with unprivileged eBPF and SMT\n"); + return sysfs_emit(buf, "Vulnerable: eIBRS+LFENCE with unprivileged eBPF and SMT\n");
- return sprintf(buf, "%s%s%s%s%s%s%s\n", - spectre_v2_strings[spectre_v2_enabled], - ibpb_state(), - boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "", - stibp_state(), - boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "", - pbrsb_eibrs_state(), - spectre_v2_module_string()); + return sysfs_emit(buf, "%s%s%s%s%s%s%s\n", + spectre_v2_strings[spectre_v2_enabled], + ibpb_state(), + boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "", + stibp_state(), + boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "", + pbrsb_eibrs_state(), + spectre_v2_module_string()); }
static ssize_t srbds_show_state(char *buf) { - return sprintf(buf, "%s\n", srbds_strings[srbds_mitigation]); + return sysfs_emit(buf, "%s\n", srbds_strings[srbds_mitigation]); }
static ssize_t retbleed_show_state(char *buf) { if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET || retbleed_mitigation == RETBLEED_MITIGATION_IBPB) { - if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD && - boot_cpu_data.x86_vendor != X86_VENDOR_HYGON) - return sprintf(buf, "Vulnerable: untrained return thunk / IBPB on non-AMD based uarch\n"); - - return sprintf(buf, "%s; SMT %s\n", - retbleed_strings[retbleed_mitigation], - !sched_smt_active() ? "disabled" : - spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT || - spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED ? - "enabled with STIBP protection" : "vulnerable"); + if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD && + boot_cpu_data.x86_vendor != X86_VENDOR_HYGON) + return sysfs_emit(buf, "Vulnerable: untrained return thunk / IBPB on non-AMD based uarch\n"); + + return sysfs_emit(buf, "%s; SMT %s\n", retbleed_strings[retbleed_mitigation], + !sched_smt_active() ? "disabled" : + spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT || + spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED ? + "enabled with STIBP protection" : "vulnerable"); }
- return sprintf(buf, "%s\n", retbleed_strings[retbleed_mitigation]); + return sysfs_emit(buf, "%s\n", retbleed_strings[retbleed_mitigation]); }
static ssize_t gds_show_state(char *buf) @@ -2633,26 +2632,26 @@ static ssize_t cpu_show_common(struct de char *buf, unsigned int bug) { if (!boot_cpu_has_bug(bug)) - return sprintf(buf, "Not affected\n"); + return sysfs_emit(buf, "Not affected\n");
switch (bug) { case X86_BUG_CPU_MELTDOWN: if (boot_cpu_has(X86_FEATURE_PTI)) - return sprintf(buf, "Mitigation: PTI\n"); + return sysfs_emit(buf, "Mitigation: PTI\n");
if (hypervisor_is_type(X86_HYPER_XEN_PV)) - return sprintf(buf, "Unknown (XEN PV detected, hypervisor mitigation required)\n"); + return sysfs_emit(buf, "Unknown (XEN PV detected, hypervisor mitigation required)\n");
break;
case X86_BUG_SPECTRE_V1: - return sprintf(buf, "%s\n", spectre_v1_strings[spectre_v1_mitigation]); + return sysfs_emit(buf, "%s\n", spectre_v1_strings[spectre_v1_mitigation]);
case X86_BUG_SPECTRE_V2: return spectre_v2_show_state(buf);
case X86_BUG_SPEC_STORE_BYPASS: - return sprintf(buf, "%s\n", ssb_strings[ssb_mode]); + return sysfs_emit(buf, "%s\n", ssb_strings[ssb_mode]);
case X86_BUG_L1TF: if (boot_cpu_has(X86_FEATURE_L1TF_PTEINV)) @@ -2688,7 +2687,7 @@ static ssize_t cpu_show_common(struct de break; }
- return sprintf(buf, "Vulnerable\n"); + return sysfs_emit(buf, "Vulnerable\n"); }
ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit 14f043f1340bf30bc60af127bff39f55889fef26 ]
The kernel-doc of timer related functions is partially uncomprehensible word salad. Rewrite it to make it useful.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Link: https://lore.kernel.org/r/20221123201624.828703870@linutronix.de Stable-dep-of: 0f7352557a35 ("wifi: brcmfmac: Fix use-after-free bug in brcmf_cfg80211_detach") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/time/timer.c | 148 +++++++++++++++++++++++++++----------------- 1 file changed, 90 insertions(+), 58 deletions(-)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c index e87e638c31bdf..3157a6434a615 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1068,14 +1068,16 @@ __mod_timer(struct timer_list *timer, unsigned long expires, unsigned int option }
/** - * mod_timer_pending - modify a pending timer's timeout - * @timer: the pending timer to be modified - * @expires: new timeout in jiffies + * mod_timer_pending - Modify a pending timer's timeout + * @timer: The pending timer to be modified + * @expires: New absolute timeout in jiffies * - * mod_timer_pending() is the same for pending timers as mod_timer(), - * but will not re-activate and modify already deleted timers. + * mod_timer_pending() is the same for pending timers as mod_timer(), but + * will not activate inactive timers. * - * It is useful for unserialized use of timers. + * Return: + * * %0 - The timer was inactive and not modified + * * %1 - The timer was active and requeued to expire at @expires */ int mod_timer_pending(struct timer_list *timer, unsigned long expires) { @@ -1084,24 +1086,27 @@ int mod_timer_pending(struct timer_list *timer, unsigned long expires) EXPORT_SYMBOL(mod_timer_pending);
/** - * mod_timer - modify a timer's timeout - * @timer: the timer to be modified - * @expires: new timeout in jiffies - * - * mod_timer() is a more efficient way to update the expire field of an - * active timer (if the timer is inactive it will be activated) + * mod_timer - Modify a timer's timeout + * @timer: The timer to be modified + * @expires: New absolute timeout in jiffies * * mod_timer(timer, expires) is equivalent to: * * del_timer(timer); timer->expires = expires; add_timer(timer); * + * mod_timer() is more efficient than the above open coded sequence. In + * case that the timer is inactive, the del_timer() part is a NOP. The + * timer is in any case activated with the new expiry time @expires. + * * Note that if there are multiple unserialized concurrent users of the * same timer, then mod_timer() is the only safe way to modify the timeout, * since add_timer() cannot modify an already running timer. * - * The function returns whether it has modified a pending timer or not. - * (ie. mod_timer() of an inactive timer returns 0, mod_timer() of an - * active timer returns 1.) + * Return: + * * %0 - The timer was inactive and started + * * %1 - The timer was active and requeued to expire at @expires or + * the timer was active and not modified because @expires did + * not change the effective expiry time */ int mod_timer(struct timer_list *timer, unsigned long expires) { @@ -1112,11 +1117,18 @@ EXPORT_SYMBOL(mod_timer); /** * timer_reduce - Modify a timer's timeout if it would reduce the timeout * @timer: The timer to be modified - * @expires: New timeout in jiffies + * @expires: New absolute timeout in jiffies * * timer_reduce() is very similar to mod_timer(), except that it will only - * modify a running timer if that would reduce the expiration time (it will - * start a timer that isn't running). + * modify an enqueued timer if that would reduce the expiration time. If + * @timer is not enqueued it starts the timer. + * + * Return: + * * %0 - The timer was inactive and started + * * %1 - The timer was active and requeued to expire at @expires or + * the timer was active and not modified because @expires + * did not change the effective expiry time such that the + * timer would expire earlier than already scheduled */ int timer_reduce(struct timer_list *timer, unsigned long expires) { @@ -1125,18 +1137,21 @@ int timer_reduce(struct timer_list *timer, unsigned long expires) EXPORT_SYMBOL(timer_reduce);
/** - * add_timer - start a timer - * @timer: the timer to be added + * add_timer - Start a timer + * @timer: The timer to be started * - * The kernel will do a ->function(@timer) callback from the - * timer interrupt at the ->expires point in the future. The - * current time is 'jiffies'. + * Start @timer to expire at @timer->expires in the future. @timer->expires + * is the absolute expiry time measured in 'jiffies'. When the timer expires + * timer->function(timer) will be invoked from soft interrupt context. * - * The timer's ->expires, ->function fields must be set prior calling this - * function. + * The @timer->expires and @timer->function fields must be set prior + * to calling this function. + * + * If @timer->expires is already in the past @timer will be queued to + * expire at the next timer tick. * - * Timers with an ->expires field in the past will be executed in the next - * timer tick. + * This can only operate on an inactive timer. Attempts to invoke this on + * an active timer are rejected with a warning. */ void add_timer(struct timer_list *timer) { @@ -1146,11 +1161,13 @@ void add_timer(struct timer_list *timer) EXPORT_SYMBOL(add_timer);
/** - * add_timer_on - start a timer on a particular CPU - * @timer: the timer to be added - * @cpu: the CPU to start it on + * add_timer_on - Start a timer on a particular CPU + * @timer: The timer to be started + * @cpu: The CPU to start it on + * + * Same as add_timer() except that it starts the timer on the given CPU. * - * This is not very scalable on SMP. Double adds are not possible. + * See add_timer() for further details. */ void add_timer_on(struct timer_list *timer, int cpu) { @@ -1185,15 +1202,18 @@ void add_timer_on(struct timer_list *timer, int cpu) EXPORT_SYMBOL_GPL(add_timer_on);
/** - * del_timer - deactivate a timer. - * @timer: the timer to be deactivated - * - * del_timer() deactivates a timer - this works on both active and inactive - * timers. - * - * The function returns whether it has deactivated a pending timer or not. - * (ie. del_timer() of an inactive timer returns 0, del_timer() of an - * active timer returns 1.) + * del_timer - Deactivate a timer. + * @timer: The timer to be deactivated + * + * The function only deactivates a pending timer, but contrary to + * del_timer_sync() it does not take into account whether the timer's + * callback function is concurrently executed on a different CPU or not. + * It neither prevents rearming of the timer. If @timer can be rearmed + * concurrently then the return value of this function is meaningless. + * + * Return: + * * %0 - The timer was not pending + * * %1 - The timer was pending and deactivated */ int del_timer(struct timer_list *timer) { @@ -1215,10 +1235,19 @@ EXPORT_SYMBOL(del_timer);
/** * try_to_del_timer_sync - Try to deactivate a timer - * @timer: timer to delete + * @timer: Timer to deactivate + * + * This function tries to deactivate a timer. On success the timer is not + * queued and the timer callback function is not running on any CPU. * - * This function tries to deactivate a timer. Upon successful (ret >= 0) - * exit the timer is not queued and the handler is not running on any CPU. + * This function does not guarantee that the timer cannot be rearmed right + * after dropping the base lock. That needs to be prevented by the calling + * code if necessary. + * + * Return: + * * %0 - The timer was not pending + * * %1 - The timer was pending and deactivated + * * %-1 - The timer callback function is running on a different CPU */ int try_to_del_timer_sync(struct timer_list *timer) { @@ -1314,23 +1343,19 @@ static inline void del_timer_wait_running(struct timer_list *timer) { }
#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT) /** - * del_timer_sync - deactivate a timer and wait for the handler to finish. - * @timer: the timer to be deactivated - * - * This function only differs from del_timer() on SMP: besides deactivating - * the timer it also makes sure the handler has finished executing on other - * CPUs. + * del_timer_sync - Deactivate a timer and wait for the handler to finish. + * @timer: The timer to be deactivated * * Synchronization rules: Callers must prevent restarting of the timer, * otherwise this function is meaningless. It must not be called from * interrupt contexts unless the timer is an irqsafe one. The caller must - * not hold locks which would prevent completion of the timer's - * handler. The timer's handler must not call add_timer_on(). Upon exit the - * timer is not queued and the handler is not running on any CPU. + * not hold locks which would prevent completion of the timer's callback + * function. The timer's handler must not call add_timer_on(). Upon exit + * the timer is not queued and the handler is not running on any CPU. * - * Note: For !irqsafe timers, you must not hold locks that are held in - * interrupt context while calling this function. Even if the lock has - * nothing to do with the timer in question. Here's why:: + * For !irqsafe timers, the caller must not hold locks that are held in + * interrupt context. Even if the lock has nothing to do with the timer in + * question. Here's why:: * * CPU0 CPU1 * ---- ---- @@ -1344,10 +1369,17 @@ static inline void del_timer_wait_running(struct timer_list *timer) { } * while (base->running_timer == mytimer); * * Now del_timer_sync() will never return and never release somelock. - * The interrupt on the other CPU is waiting to grab somelock but - * it has interrupted the softirq that CPU0 is waiting to finish. + * The interrupt on the other CPU is waiting to grab somelock but it has + * interrupted the softirq that CPU0 is waiting to finish. + * + * This function cannot guarantee that the timer is not rearmed again by + * some concurrent or preempting code, right after it dropped the base + * lock. If there is the possibility of a concurrent rearm then the return + * value of the function is meaningless. * - * The function returns whether it has deactivated a pending timer or not. + * Return: + * * %0 - The timer was not pending + * * %1 - The timer was pending and deactivated */ int del_timer_sync(struct timer_list *timer) {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit 168f6b6ffbeec0b9333f3582e4cf637300858db5 ]
del_timer_sync() is assumed to be pointless on uniprocessor systems and can be mapped to del_timer() because in theory del_timer() can never be invoked while the timer callback function is executed.
This is not entirely true because del_timer() can be invoked from interrupt context and therefore hit in the middle of a running timer callback.
Contrary to that del_timer_sync() is not allowed to be invoked from interrupt context unless the affected timer is marked with TIMER_IRQSAFE. del_timer_sync() has proper checks in place to detect such a situation.
Give up on the UP optimization and make del_timer_sync() unconditionally available.
Co-developed-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Link: https://lore.kernel.org/all/20220407161745.7d6754b3@gandalf.local.home Link: https://lore.kernel.org/all/20221110064101.429013735@goodmis.org Link: https://lore.kernel.org/r/20221123201624.888306160@linutronix.de Stable-dep-of: 0f7352557a35 ("wifi: brcmfmac: Fix use-after-free bug in brcmf_cfg80211_detach") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/timer.h | 7 +------ kernel/time/timer.c | 2 -- 2 files changed, 1 insertion(+), 8 deletions(-)
diff --git a/include/linux/timer.h b/include/linux/timer.h index d10bc7e73b41e..a3125373139e1 100644 --- a/include/linux/timer.h +++ b/include/linux/timer.h @@ -183,12 +183,7 @@ extern int timer_reduce(struct timer_list *timer, unsigned long expires); extern void add_timer(struct timer_list *timer);
extern int try_to_del_timer_sync(struct timer_list *timer); - -#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT) - extern int del_timer_sync(struct timer_list *timer); -#else -# define del_timer_sync(t) del_timer(t) -#endif +extern int del_timer_sync(struct timer_list *timer);
#define del_singleshot_timer_sync(t) del_timer_sync(t)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 3157a6434a615..9edce790b3aa5 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1341,7 +1341,6 @@ static inline void timer_sync_wait_running(struct timer_base *base) { } static inline void del_timer_wait_running(struct timer_list *timer) { } #endif
-#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT) /** * del_timer_sync - Deactivate a timer and wait for the handler to finish. * @timer: The timer to be deactivated @@ -1415,7 +1414,6 @@ int del_timer_sync(struct timer_list *timer) return ret; } EXPORT_SYMBOL(del_timer_sync); -#endif
static void call_timer_fn(struct timer_list *timer, void (*fn)(struct timer_list *),
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit 9b13df3fb64ee95e2397585404e442afee2c7d4f ]
The timer related functions do not have a strict timer_ prefixed namespace which is really annoying.
Rename del_timer_sync() to timer_delete_sync() and provide del_timer_sync() as a wrapper. Document that del_timer_sync() is not for new code.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Link: https://lore.kernel.org/r/20221123201624.954785441@linutronix.de Stable-dep-of: 0f7352557a35 ("wifi: brcmfmac: Fix use-after-free bug in brcmf_cfg80211_detach") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/timer.h | 15 ++++++++++++++- kernel/time/timer.c | 18 +++++++++--------- 2 files changed, 23 insertions(+), 10 deletions(-)
diff --git a/include/linux/timer.h b/include/linux/timer.h index a3125373139e1..a3d04c4f1263a 100644 --- a/include/linux/timer.h +++ b/include/linux/timer.h @@ -183,7 +183,20 @@ extern int timer_reduce(struct timer_list *timer, unsigned long expires); extern void add_timer(struct timer_list *timer);
extern int try_to_del_timer_sync(struct timer_list *timer); -extern int del_timer_sync(struct timer_list *timer); +extern int timer_delete_sync(struct timer_list *timer); + +/** + * del_timer_sync - Delete a pending timer and wait for a running callback + * @timer: The timer to be deleted + * + * See timer_delete_sync() for detailed explanation. + * + * Do not use in new code. Use timer_delete_sync() instead. + */ +static inline int del_timer_sync(struct timer_list *timer) +{ + return timer_delete_sync(timer); +}
#define del_singleshot_timer_sync(t) del_timer_sync(t)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 9edce790b3aa5..c135cefa44ac0 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1030,7 +1030,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, unsigned int option /* * We are trying to schedule the timer on the new base. * However we can't change timer's base while it is running, - * otherwise del_timer_sync() can't detect that the timer's + * otherwise timer_delete_sync() can't detect that the timer's * handler yet has not finished. This also guarantees that the * timer is serialized wrt itself. */ @@ -1206,7 +1206,7 @@ EXPORT_SYMBOL_GPL(add_timer_on); * @timer: The timer to be deactivated * * The function only deactivates a pending timer, but contrary to - * del_timer_sync() it does not take into account whether the timer's + * timer_delete_sync() it does not take into account whether the timer's * callback function is concurrently executed on a different CPU or not. * It neither prevents rearming of the timer. If @timer can be rearmed * concurrently then the return value of this function is meaningless. @@ -1342,7 +1342,7 @@ static inline void del_timer_wait_running(struct timer_list *timer) { } #endif
/** - * del_timer_sync - Deactivate a timer and wait for the handler to finish. + * timer_delete_sync - Deactivate a timer and wait for the handler to finish. * @timer: The timer to be deactivated * * Synchronization rules: Callers must prevent restarting of the timer, @@ -1364,10 +1364,10 @@ static inline void del_timer_wait_running(struct timer_list *timer) { } * spin_lock_irq(somelock); * <IRQ> * spin_lock(somelock); - * del_timer_sync(mytimer); + * timer_delete_sync(mytimer); * while (base->running_timer == mytimer); * - * Now del_timer_sync() will never return and never release somelock. + * Now timer_delete_sync() will never return and never release somelock. * The interrupt on the other CPU is waiting to grab somelock but it has * interrupted the softirq that CPU0 is waiting to finish. * @@ -1380,7 +1380,7 @@ static inline void del_timer_wait_running(struct timer_list *timer) { } * * %0 - The timer was not pending * * %1 - The timer was pending and deactivated */ -int del_timer_sync(struct timer_list *timer) +int timer_delete_sync(struct timer_list *timer) { int ret;
@@ -1413,7 +1413,7 @@ int del_timer_sync(struct timer_list *timer)
return ret; } -EXPORT_SYMBOL(del_timer_sync); +EXPORT_SYMBOL(timer_delete_sync);
static void call_timer_fn(struct timer_list *timer, void (*fn)(struct timer_list *), @@ -1435,8 +1435,8 @@ static void call_timer_fn(struct timer_list *timer, #endif /* * Couple the lock chain with the lock chain at - * del_timer_sync() by acquiring the lock_map around the fn() - * call here and in del_timer_sync(). + * timer_delete_sync() by acquiring the lock_map around the fn() + * call here and in timer_delete_sync(). */ lock_map_acquire(&lockdep_map);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zheng Wang zyytlz.wz@163.com
[ Upstream commit 0f7352557a35ab7888bc7831411ec8a3cbe20d78 ]
This is the candidate patch of CVE-2023-47233 : https://nvd.nist.gov/vuln/detail/CVE-2023-47233
In brcm80211 driver,it starts with the following invoking chain to start init a timeout worker:
->brcmf_usb_probe ->brcmf_usb_probe_cb ->brcmf_attach ->brcmf_bus_started ->brcmf_cfg80211_attach ->wl_init_priv ->brcmf_init_escan ->INIT_WORK(&cfg->escan_timeout_work, brcmf_cfg80211_escan_timeout_worker);
If we disconnect the USB by hotplug, it will call brcmf_usb_disconnect to make cleanup. The invoking chain is :
brcmf_usb_disconnect ->brcmf_usb_disconnect_cb ->brcmf_detach ->brcmf_cfg80211_detach ->kfree(cfg);
While the timeout woker may still be running. This will cause a use-after-free bug on cfg in brcmf_cfg80211_escan_timeout_worker.
Fix it by deleting the timer and canceling the worker in brcmf_cfg80211_detach.
Fixes: e756af5b30b0 ("brcmfmac: add e-scan support.") Signed-off-by: Zheng Wang zyytlz.wz@163.com Cc: stable@vger.kernel.org [arend.vanspriel@broadcom.com: keep timer delete as is and cancel work just before free] Signed-off-by: Arend van Spriel arend.vanspriel@broadcom.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://msgid.link/20240107072504.392713-1-arend.vanspriel@broadcom.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c index baf5f0afe802e..fbb5e29530e3d 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c @@ -790,8 +790,7 @@ s32 brcmf_notify_escan_complete(struct brcmf_cfg80211_info *cfg, scan_request = cfg->scan_request; cfg->scan_request = NULL;
- if (timer_pending(&cfg->escan_timeout)) - del_timer_sync(&cfg->escan_timeout); + timer_delete_sync(&cfg->escan_timeout);
if (fw_abort) { /* Do a scan abort to stop the driver's scan engine */ @@ -7674,6 +7673,7 @@ void brcmf_cfg80211_detach(struct brcmf_cfg80211_info *cfg) brcmf_btcoex_detach(cfg); wiphy_unregister(cfg->wiphy); wl_deinit_priv(cfg); + cancel_work_sync(&cfg->escan_timeout_work); brcmf_free_wiphy(cfg->wiphy); kfree(cfg); }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hidenori Kobayashi hidenorik@chromium.org
[ Upstream commit 87318b7092670d4086bfec115a0280a60c51c2dd ]
The imgu driver fails to probe with the following message because it does not set the pad's flags before calling media_entity_pads_init().
[ 14.596315] ipu3-imgu 0000:00:05.0: failed initialize subdev media entity (-22) [ 14.596322] ipu3-imgu 0000:00:05.0: failed to register subdev0 ret (-22) [ 14.596327] ipu3-imgu 0000:00:05.0: failed to register pipes (-22) [ 14.596331] ipu3-imgu 0000:00:05.0: failed to create V4L2 devices (-22)
Fix the initialization order so that the driver probe succeeds. The ops initialization is also moved together for readability.
Fixes: a0ca1627b450 ("media: staging/intel-ipu3: Add v4l2 driver based on media framework") Cc: stable@vger.kernel.org # 6.7 Cc: Dan Carpenter dan.carpenter@linaro.org Signed-off-by: Hidenori Kobayashi hidenorik@chromium.org Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/media/ipu3/ipu3-v4l2.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/staging/media/ipu3/ipu3-v4l2.c b/drivers/staging/media/ipu3/ipu3-v4l2.c index 103f84466f6fc..371117b511e21 100644 --- a/drivers/staging/media/ipu3/ipu3-v4l2.c +++ b/drivers/staging/media/ipu3/ipu3-v4l2.c @@ -1063,6 +1063,11 @@ static int imgu_v4l2_subdev_register(struct imgu_device *imgu, struct imgu_media_pipe *imgu_pipe = &imgu->imgu_pipe[pipe];
/* Initialize subdev media entity */ + imgu_sd->subdev.entity.ops = &imgu_media_ops; + for (i = 0; i < IMGU_NODE_NUM; i++) { + imgu_sd->subdev_pads[i].flags = imgu_pipe->nodes[i].output ? + MEDIA_PAD_FL_SINK : MEDIA_PAD_FL_SOURCE; + } r = media_entity_pads_init(&imgu_sd->subdev.entity, IMGU_NODE_NUM, imgu_sd->subdev_pads); if (r) { @@ -1070,11 +1075,6 @@ static int imgu_v4l2_subdev_register(struct imgu_device *imgu, "failed initialize subdev media entity (%d)\n", r); return r; } - imgu_sd->subdev.entity.ops = &imgu_media_ops; - for (i = 0; i < IMGU_NODE_NUM; i++) { - imgu_sd->subdev_pads[i].flags = imgu_pipe->nodes[i].output ? - MEDIA_PAD_FL_SINK : MEDIA_PAD_FL_SOURCE; - }
/* Initialize subdev */ v4l2_subdev_init(&imgu_sd->subdev, &imgu_subdev_ops); @@ -1169,15 +1169,15 @@ static int imgu_v4l2_node_setup(struct imgu_device *imgu, unsigned int pipe, }
/* Initialize media entities */ + node->vdev_pad.flags = node->output ? + MEDIA_PAD_FL_SOURCE : MEDIA_PAD_FL_SINK; + vdev->entity.ops = NULL; r = media_entity_pads_init(&vdev->entity, 1, &node->vdev_pad); if (r) { dev_err(dev, "failed initialize media entity (%d)\n", r); mutex_destroy(&node->lock); return r; } - node->vdev_pad.flags = node->output ? - MEDIA_PAD_FL_SOURCE : MEDIA_PAD_FL_SINK; - vdev->entity.ops = NULL;
/* Initialize vbq */ vbq->type = node->vdev_fmt.type;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amit Pundir amit.pundir@linaro.org
[ Upstream commit 1d9054e3a4fd36e2949e616f7360bdb81bcc1921 ]
With the addition of RPMh power domain to the GCC node in device tree, we noticed a significant delay in getting the UFS driver probed on AOSP which futher led to mount failures because Android do not support rootwait. So adding a soft dependency on RPMh power domain which informs modprobe to load rpmhpd module before gcc-sdm845.
Cc: stable@vger.kernel.org # v5.4+ Fixes: 4b6ea15c0a11 ("arm64: dts: qcom: sdm845: Add missing RPMh power domain to GCC") Suggested-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Signed-off-by: Amit Pundir amit.pundir@linaro.org Reviewed-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Link: https://lore.kernel.org/r/20240123062814.2555649-1-amit.pundir@linaro.org Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/gcc-sdm845.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/clk/qcom/gcc-sdm845.c b/drivers/clk/qcom/gcc-sdm845.c index 90f7febaf5288..a6b07aa8eb650 100644 --- a/drivers/clk/qcom/gcc-sdm845.c +++ b/drivers/clk/qcom/gcc-sdm845.c @@ -3646,3 +3646,4 @@ module_exit(gcc_sdm845_exit); MODULE_DESCRIPTION("QTI GCC SDM845 Driver"); MODULE_LICENSE("GPL v2"); MODULE_ALIAS("platform:gcc-sdm845"); +MODULE_SOFTDEP("pre: rpmhpd");
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roberto Sassu roberto.sassu@huawei.com
[ Upstream commit 9c82169208dde516510aaba6bbd8b13976690c5d ]
Since the SMACK64TRANSMUTE xattr makes sense only for directories, enforce this restriction in smack_inode_setxattr().
Cc: stable@vger.kernel.org Fixes: 5c6d1125f8db ("Smack: Transmute labels on specified directories") # v2.6.38.x Signed-off-by: Roberto Sassu roberto.sassu@huawei.com Signed-off-by: Casey Schaufler casey@schaufler-ca.com Signed-off-by: Sasha Levin sashal@kernel.org --- security/smack/smack_lsm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c index 5388f143eecd8..0be25e05c1a03 100644 --- a/security/smack/smack_lsm.c +++ b/security/smack/smack_lsm.c @@ -1280,7 +1280,8 @@ static int smack_inode_setxattr(struct dentry *dentry, const char *name, check_star = 1; } else if (strcmp(name, XATTR_NAME_SMACKTRANSMUTE) == 0) { check_priv = 1; - if (size != TRANS_TRUE_SIZE || + if (!S_ISDIR(d_backing_inode(dentry)->i_mode) || + size != TRANS_TRUE_SIZE || strncmp(value, TRANS_TRUE, TRANS_TRUE_SIZE) != 0) rc = -EINVAL; } else
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roberto Sassu roberto.sassu@huawei.com
[ Upstream commit ac02f007d64eb2769d0bde742aac4d7a5fc6e8a5 ]
If the SMACK64TRANSMUTE xattr is provided, and the inode is a directory, update the in-memory inode flags by setting SMK_INODE_TRANSMUTE.
Cc: stable@vger.kernel.org Fixes: 5c6d1125f8db ("Smack: Transmute labels on specified directories") # v2.6.38.x Signed-off-by: Roberto Sassu roberto.sassu@huawei.com Signed-off-by: Casey Schaufler casey@schaufler-ca.com Signed-off-by: Sasha Levin sashal@kernel.org --- security/smack/smack_lsm.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c index 0be25e05c1a03..750f6007bbbb0 100644 --- a/security/smack/smack_lsm.c +++ b/security/smack/smack_lsm.c @@ -2722,6 +2722,15 @@ static int smack_inode_setsecurity(struct inode *inode, const char *name, if (value == NULL || size > SMK_LONGLABEL || size == 0) return -EINVAL;
+ if (strcmp(name, XATTR_SMACK_TRANSMUTE) == 0) { + if (!S_ISDIR(inode->i_mode) || size != TRANS_TRUE_SIZE || + strncmp(value, TRANS_TRUE, TRANS_TRUE_SIZE) != 0) + return -EINVAL; + + nsp->smk_flags |= SMK_INODE_TRANSMUTE; + return 0; + } + skp = smk_import_entry(value, size); if (IS_ERR(skp)) return PTR_ERR(skp);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Duje Mihanović duje.mihanovic@skole.hr
[ Upstream commit 831e0cd4f9ee15a4f02ae10b67e7fdc10eb2b4fc ]
Fix an obvious spelling error in the PMIC compatible in the MMP2 Brownstone DTS file.
Fixes: 58f1193e6210 ("mfd: max8925: Add dts") Cc: stable@vger.kernel.org Signed-off-by: Duje Mihanović duje.mihanovic@skole.hr Reported-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Closes: https://lore.kernel.org/linux-devicetree/1410884282-18041-1-git-send-email-k... Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20240125-brownstone-typo-fix-v2-1-45bc48a0c81c@sko... [krzysztof: Just 10 years to take a patch, not bad! Rephrased commit msg] Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/mmp2-brownstone.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/mmp2-brownstone.dts b/arch/arm/boot/dts/mmp2-brownstone.dts index 04f1ae1382e7a..bc64348b82185 100644 --- a/arch/arm/boot/dts/mmp2-brownstone.dts +++ b/arch/arm/boot/dts/mmp2-brownstone.dts @@ -28,7 +28,7 @@ &uart3 { &twsi1 { status = "okay"; pmic: max8925@3c { - compatible = "maxium,max8925"; + compatible = "maxim,max8925"; reg = <0x3c>; interrupts = <1>; interrupt-parent = <&intcmux4>;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian König christian.koenig@amd.com
[ Upstream commit b254557cb244e2c18e59ee1cc2293128c52d2473 ]
Implement in the driver instead since it is the only user of that function.
v2: fix usage of ttm_bo_init_reserved
Signed-off-by: Christian König christian.koenig@amd.com Reviewed-by: Dave Airlie airlied@redhat.com Reviewed-by: Huang Rui ray.huang@amd.com Link: https://patchwork.freedesktop.org/patch/391614/?series=81973&rev=1 Stable-dep-of: 517621b70600 ("drm/vmwgfx: Fix possible null pointer derefence with invalid contexts") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vmwgfx/vmwgfx_bo.c | 43 ++++++++++++++++++++++ drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c | 6 +-- drivers/gpu/drm/vmwgfx/vmwgfx_drv.h | 4 ++ drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 8 ++-- 4 files changed, 53 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c index 813f1b1480941..c8ca09f0e6274 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c @@ -487,6 +487,49 @@ static void vmw_user_bo_destroy(struct ttm_buffer_object *bo) ttm_prime_object_kfree(vmw_user_bo, prime); }
+/** + * vmw_bo_create_kernel - Create a pinned BO for internal kernel use. + * + * @dev_priv: Pointer to the device private struct + * @size: size of the BO we need + * @placement: where to put it + * @p_bo: resulting BO + * + * Creates and pin a simple BO for in kernel use. + */ +int vmw_bo_create_kernel(struct vmw_private *dev_priv, unsigned long size, + struct ttm_placement *placement, + struct ttm_buffer_object **p_bo) +{ + unsigned npages = PAGE_ALIGN(size) >> PAGE_SHIFT; + struct ttm_operation_ctx ctx = { false, false }; + struct ttm_buffer_object *bo; + size_t acc_size; + int ret; + + bo = kzalloc(sizeof(*bo), GFP_KERNEL); + if (unlikely(!bo)) + return -ENOMEM; + + acc_size = ttm_round_pot(sizeof(*bo)); + acc_size += ttm_round_pot(npages * sizeof(void *)); + acc_size += ttm_round_pot(sizeof(struct ttm_tt)); + ret = ttm_bo_init_reserved(&dev_priv->bdev, bo, size, + ttm_bo_type_device, placement, 0, + &ctx, acc_size, NULL, NULL, NULL); + if (unlikely(ret)) + goto error_free; + + ttm_bo_pin(bo); + ttm_bo_unreserve(bo); + *p_bo = bo; + + return 0; + +error_free: + kfree(bo); + return ret; +}
/** * vmw_bo_init - Initialize a vmw buffer object diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c index 3b41cf63110ad..9a9fe10d829b8 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c @@ -1245,9 +1245,9 @@ int vmw_cmdbuf_set_pool_size(struct vmw_cmdbuf_man *man, !dev_priv->has_mob) return -ENOMEM;
- ret = ttm_bo_create(&dev_priv->bdev, size, ttm_bo_type_device, - &vmw_mob_ne_placement, 0, false, - &man->cmd_space); + ret = vmw_bo_create_kernel(dev_priv, size, + &vmw_mob_placement, + &man->cmd_space); if (ret) return ret;
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h index 0a79c57c7db64..e6af950c40370 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h @@ -845,6 +845,10 @@ extern void vmw_bo_get_guest_ptr(const struct ttm_buffer_object *buf, SVGAGuestPtr *ptr); extern void vmw_bo_pin_reserved(struct vmw_buffer_object *bo, bool pin); extern void vmw_bo_bo_free(struct ttm_buffer_object *bo); +extern int vmw_bo_create_kernel(struct vmw_private *dev_priv, + unsigned long size, + struct ttm_placement *placement, + struct ttm_buffer_object **p_bo); extern int vmw_bo_init(struct vmw_private *dev_priv, struct vmw_buffer_object *vmw_bo, size_t size, struct ttm_placement *placement, diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c index 73116ec70ba59..8abeef691ad29 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c @@ -817,11 +817,9 @@ int vmw_bo_create_and_populate(struct vmw_private *dev_priv, struct ttm_buffer_object *bo; int ret;
- ret = ttm_bo_create(&dev_priv->bdev, bo_size, - ttm_bo_type_device, - &vmw_sys_ne_placement, - 0, false, &bo); - + ret = vmw_bo_create_kernel(dev_priv, bo_size, + &vmw_sys_placement, + &bo); if (unlikely(ret != 0)) return ret;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian König christian.koenig@amd.com
[ Upstream commit fbe86ca567919b22bbba1220ce55020b1868879f ]
Stop using TTM_PL_FLAG_NO_EVICT.
v2: fix unconditional pinning
Signed-off-by: Christian König christian.koenig@amd.com Reviewed-by: Dave Airlie airlied@redhat.com Reviewed-by: Huang Rui ray.huang@amd.com Link: https://patchwork.freedesktop.org/patch/391601/?series=81973&rev=1 Stable-dep-of: 517621b70600 ("drm/vmwgfx: Fix possible null pointer derefence with invalid contexts") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vmwgfx/vmwgfx_blit.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_bo.c | 49 +++++++++++----------- drivers/gpu/drm/vmwgfx/vmwgfx_cotable.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 2 +- drivers/gpu/drm/vmwgfx/vmwgfx_drv.h | 7 +--- drivers/gpu/drm/vmwgfx/vmwgfx_fb.c | 2 +- drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_shader.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 42 ------------------- drivers/gpu/drm/vmwgfx/vmwgfx_validation.c | 2 +- 11 files changed, 39 insertions(+), 85 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c index e8d66182cd7b5..ea2f2f937eb30 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c @@ -459,9 +459,9 @@ int vmw_bo_cpu_blit(struct ttm_buffer_object *dst, int ret = 0;
/* Buffer objects need to be either pinned or reserved: */ - if (!(dst->mem.placement & TTM_PL_FLAG_NO_EVICT)) + if (!(dst->pin_count)) dma_resv_assert_held(dst->base.resv); - if (!(src->mem.placement & TTM_PL_FLAG_NO_EVICT)) + if (!(src->pin_count)) dma_resv_assert_held(src->base.resv);
if (!ttm_tt_is_populated(dst->ttm)) { diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c index c8ca09f0e6274..9a66ba2543263 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c @@ -106,7 +106,7 @@ int vmw_bo_pin_in_placement(struct vmw_private *dev_priv, if (unlikely(ret != 0)) goto err;
- if (buf->pin_count > 0) + if (buf->base.pin_count > 0) ret = ttm_bo_mem_compat(placement, &bo->mem, &new_flags) == true ? 0 : -EINVAL; else @@ -155,7 +155,7 @@ int vmw_bo_pin_in_vram_or_gmr(struct vmw_private *dev_priv, if (unlikely(ret != 0)) goto err;
- if (buf->pin_count > 0) { + if (buf->base.pin_count > 0) { ret = ttm_bo_mem_compat(&vmw_vram_gmr_placement, &bo->mem, &new_flags) == true ? 0 : -EINVAL; goto out_unreserve; @@ -246,12 +246,12 @@ int vmw_bo_pin_in_start_of_vram(struct vmw_private *dev_priv, if (bo->mem.mem_type == TTM_PL_VRAM && bo->mem.start < bo->num_pages && bo->mem.start > 0 && - buf->pin_count == 0) { + buf->base.pin_count == 0) { ctx.interruptible = false; (void) ttm_bo_validate(bo, &vmw_sys_placement, &ctx); }
- if (buf->pin_count > 0) + if (buf->base.pin_count > 0) ret = ttm_bo_mem_compat(&placement, &bo->mem, &new_flags) == true ? 0 : -EINVAL; else @@ -343,23 +343,13 @@ void vmw_bo_pin_reserved(struct vmw_buffer_object *vbo, bool pin)
dma_resv_assert_held(bo->base.resv);
- if (pin) { - if (vbo->pin_count++ > 0) - return; - } else { - WARN_ON(vbo->pin_count <= 0); - if (--vbo->pin_count > 0) - return; - } + if (pin == !!bo->pin_count) + return;
pl.fpfn = 0; pl.lpfn = 0; pl.mem_type = bo->mem.mem_type; pl.flags = bo->mem.placement; - if (pin) - pl.flags |= TTM_PL_FLAG_NO_EVICT; - else - pl.flags &= ~TTM_PL_FLAG_NO_EVICT;
memset(&placement, 0, sizeof(placement)); placement.num_placement = 1; @@ -368,8 +358,12 @@ void vmw_bo_pin_reserved(struct vmw_buffer_object *vbo, bool pin) ret = ttm_bo_validate(bo, &placement, &ctx);
BUG_ON(ret != 0 || bo->mem.mem_type != old_mem_type); -}
+ if (pin) + ttm_bo_pin(bo); + else + ttm_bo_unpin(bo); +}
/** * vmw_bo_map_and_cache - Map a buffer object and cache the map @@ -539,6 +533,7 @@ int vmw_bo_create_kernel(struct vmw_private *dev_priv, unsigned long size, * @size: Buffer object size in bytes. * @placement: Initial placement. * @interruptible: Whether waits should be performed interruptible. + * @pin: If the BO should be created pinned at a fixed location. * @bo_free: The buffer object destructor. * Returns: Zero on success, negative error code on error. * @@ -547,9 +542,10 @@ int vmw_bo_create_kernel(struct vmw_private *dev_priv, unsigned long size, int vmw_bo_init(struct vmw_private *dev_priv, struct vmw_buffer_object *vmw_bo, size_t size, struct ttm_placement *placement, - bool interruptible, + bool interruptible, bool pin, void (*bo_free)(struct ttm_buffer_object *bo)) { + struct ttm_operation_ctx ctx = { interruptible, false }; struct ttm_bo_device *bdev = &dev_priv->bdev; size_t acc_size; int ret; @@ -563,11 +559,16 @@ int vmw_bo_init(struct vmw_private *dev_priv, vmw_bo->base.priority = 3; vmw_bo->res_tree = RB_ROOT;
- ret = ttm_bo_init(bdev, &vmw_bo->base, size, - ttm_bo_type_device, placement, - 0, interruptible, acc_size, - NULL, NULL, bo_free); - return ret; + ret = ttm_bo_init_reserved(bdev, &vmw_bo->base, size, + ttm_bo_type_device, placement, + 0, &ctx, acc_size, NULL, NULL, bo_free); + if (unlikely(ret)) + return ret; + + if (pin) + ttm_bo_pin(&vmw_bo->base); + ttm_bo_unreserve(&vmw_bo->base); + return 0; }
@@ -656,7 +657,7 @@ int vmw_user_bo_alloc(struct vmw_private *dev_priv, ret = vmw_bo_init(dev_priv, &user_bo->vbo, size, (dev_priv->has_mob) ? &vmw_sys_placement : - &vmw_vram_sys_placement, true, + &vmw_vram_sys_placement, true, false, &vmw_user_bo_destroy); if (unlikely(ret != 0)) return ret; diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_cotable.c b/drivers/gpu/drm/vmwgfx/vmwgfx_cotable.c index 65e8e7a977246..984d8884357d9 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_cotable.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_cotable.c @@ -410,8 +410,8 @@ static int vmw_cotable_resize(struct vmw_resource *res, size_t new_size) if (!buf) return -ENOMEM;
- ret = vmw_bo_init(dev_priv, buf, new_size, &vmw_mob_ne_placement, - true, vmw_bo_bo_free); + ret = vmw_bo_init(dev_priv, buf, new_size, &vmw_mob_placement, + true, true, vmw_bo_bo_free); if (ret) { DRM_ERROR("Failed initializing new cotable MOB.\n"); return ret; diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c index 31e3e5c9f3622..bdb7a5e965601 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c @@ -372,7 +372,7 @@ static int vmw_dummy_query_bo_create(struct vmw_private *dev_priv) return -ENOMEM;
ret = vmw_bo_init(dev_priv, vbo, PAGE_SIZE, - &vmw_sys_ne_placement, false, + &vmw_sys_placement, false, true, &vmw_bo_bo_free); if (unlikely(ret != 0)) return ret; diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h index e6af950c40370..fa285c20d6dac 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h @@ -99,7 +99,6 @@ struct vmw_fpriv { * struct vmw_buffer_object - TTM buffer object with vmwgfx additions * @base: The TTM buffer object * @res_tree: RB tree of resources using this buffer object as a backing MOB - * @pin_count: pin depth * @cpu_writers: Number of synccpu write grabs. Protected by reservation when * increased. May be decreased without reservation. * @dx_query_ctx: DX context if this buffer object is used as a DX query MOB @@ -110,7 +109,6 @@ struct vmw_fpriv { struct vmw_buffer_object { struct ttm_buffer_object base; struct rb_root res_tree; - s32 pin_count; atomic_t cpu_writers; /* Not ref-counted. Protected by binding_mutex */ struct vmw_resource *dx_query_ctx; @@ -852,7 +850,7 @@ extern int vmw_bo_create_kernel(struct vmw_private *dev_priv, extern int vmw_bo_init(struct vmw_private *dev_priv, struct vmw_buffer_object *vmw_bo, size_t size, struct ttm_placement *placement, - bool interruptible, + bool interruptible, bool pin, void (*bo_free)(struct ttm_buffer_object *bo)); extern int vmw_user_bo_verify_access(struct ttm_buffer_object *bo, struct ttm_object_file *tfile); @@ -1009,16 +1007,13 @@ extern void vmw_validation_mem_init_ttm(struct vmw_private *dev_priv,
extern const size_t vmw_tt_size; extern struct ttm_placement vmw_vram_placement; -extern struct ttm_placement vmw_vram_ne_placement; extern struct ttm_placement vmw_vram_sys_placement; extern struct ttm_placement vmw_vram_gmr_placement; extern struct ttm_placement vmw_vram_gmr_ne_placement; extern struct ttm_placement vmw_sys_placement; -extern struct ttm_placement vmw_sys_ne_placement; extern struct ttm_placement vmw_evictable_placement; extern struct ttm_placement vmw_srf_placement; extern struct ttm_placement vmw_mob_placement; -extern struct ttm_placement vmw_mob_ne_placement; extern struct ttm_placement vmw_nonfixed_placement; extern struct ttm_bo_driver vmw_bo_driver; extern const struct vmw_sg_table * diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c b/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c index 97d9d2557447b..3923acc3ab1e5 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c @@ -406,7 +406,7 @@ static int vmw_fb_create_bo(struct vmw_private *vmw_priv,
ret = vmw_bo_init(vmw_priv, vmw_bo, size, &vmw_sys_placement, - false, + false, false, &vmw_bo_bo_free); if (unlikely(ret != 0)) goto err_unlock; /* init frees the buffer on failure */ diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c index c0f156078ddae..5e922d9d5f2cc 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c @@ -370,7 +370,7 @@ static int vmw_resource_buf_alloc(struct vmw_resource *res,
ret = vmw_bo_init(res->dev_priv, backup, res->backup_size, res->func->backup_placement, - interruptible, + interruptible, false, &vmw_bo_bo_free); if (unlikely(ret != 0)) goto out_no_bo; @@ -1002,7 +1002,7 @@ int vmw_resource_pin(struct vmw_resource *res, bool interruptible) vbo = res->backup;
ttm_bo_reserve(&vbo->base, interruptible, false, NULL); - if (!vbo->pin_count) { + if (!vbo->base.pin_count) { ret = ttm_bo_validate (&vbo->base, res->func->backup_placement, diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c index 2b6590344468d..9c8109efefbd6 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c @@ -451,8 +451,8 @@ vmw_sou_primary_plane_prepare_fb(struct drm_plane *plane, */ vmw_overlay_pause_all(dev_priv); ret = vmw_bo_init(dev_priv, vps->bo, size, - &vmw_vram_ne_placement, - false, &vmw_bo_bo_free); + &vmw_vram_placement, + false, true, &vmw_bo_bo_free); vmw_overlay_resume_all(dev_priv); if (ret) { vps->bo = NULL; /* vmw_bo_init frees on error */ diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c b/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c index e139fdfd16356..f328aa5839a22 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c @@ -978,8 +978,8 @@ int vmw_compat_shader_add(struct vmw_private *dev_priv, if (unlikely(!buf)) return -ENOMEM;
- ret = vmw_bo_init(dev_priv, buf, size, &vmw_sys_ne_placement, - true, vmw_bo_bo_free); + ret = vmw_bo_init(dev_priv, buf, size, &vmw_sys_placement, + true, true, vmw_bo_bo_free); if (unlikely(ret != 0)) goto out;
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c index 8abeef691ad29..89b3356ec27f0 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c @@ -37,13 +37,6 @@ static const struct ttm_place vram_placement_flags = { .flags = TTM_PL_FLAG_CACHED };
-static const struct ttm_place vram_ne_placement_flags = { - .fpfn = 0, - .lpfn = 0, - .mem_type = TTM_PL_VRAM, - .flags = TTM_PL_FLAG_CACHED | TTM_PL_FLAG_NO_EVICT -}; - static const struct ttm_place sys_placement_flags = { .fpfn = 0, .lpfn = 0, @@ -51,13 +44,6 @@ static const struct ttm_place sys_placement_flags = { .flags = TTM_PL_FLAG_CACHED };
-static const struct ttm_place sys_ne_placement_flags = { - .fpfn = 0, - .lpfn = 0, - .mem_type = TTM_PL_SYSTEM, - .flags = TTM_PL_FLAG_CACHED | TTM_PL_FLAG_NO_EVICT -}; - static const struct ttm_place gmr_placement_flags = { .fpfn = 0, .lpfn = 0, @@ -79,13 +65,6 @@ static const struct ttm_place mob_placement_flags = { .flags = TTM_PL_FLAG_CACHED };
-static const struct ttm_place mob_ne_placement_flags = { - .fpfn = 0, - .lpfn = 0, - .mem_type = VMW_PL_MOB, - .flags = TTM_PL_FLAG_CACHED | TTM_PL_FLAG_NO_EVICT -}; - struct ttm_placement vmw_vram_placement = { .num_placement = 1, .placement = &vram_placement_flags, @@ -158,13 +137,6 @@ struct ttm_placement vmw_vram_sys_placement = { .busy_placement = &sys_placement_flags };
-struct ttm_placement vmw_vram_ne_placement = { - .num_placement = 1, - .placement = &vram_ne_placement_flags, - .num_busy_placement = 1, - .busy_placement = &vram_ne_placement_flags -}; - struct ttm_placement vmw_sys_placement = { .num_placement = 1, .placement = &sys_placement_flags, @@ -172,13 +144,6 @@ struct ttm_placement vmw_sys_placement = { .busy_placement = &sys_placement_flags };
-struct ttm_placement vmw_sys_ne_placement = { - .num_placement = 1, - .placement = &sys_ne_placement_flags, - .num_busy_placement = 1, - .busy_placement = &sys_ne_placement_flags -}; - static const struct ttm_place evictable_placement_flags[] = { { .fpfn = 0, @@ -243,13 +208,6 @@ struct ttm_placement vmw_mob_placement = { .busy_placement = &mob_placement_flags };
-struct ttm_placement vmw_mob_ne_placement = { - .num_placement = 1, - .num_busy_placement = 1, - .placement = &mob_ne_placement_flags, - .busy_placement = &mob_ne_placement_flags -}; - struct ttm_placement vmw_nonfixed_placement = { .num_placement = 3, .placement = nonfixed_placement_flags, diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c index e69bc373ae2e5..f2e2bf6d1421f 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c @@ -540,7 +540,7 @@ int vmw_validation_bo_validate_single(struct ttm_buffer_object *bo, if (atomic_read(&vbo->cpu_writers)) return -EBUSY;
- if (vbo->pin_count > 0) + if (vbo->base.pin_count > 0) return 0;
if (validate_as_mob)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lee Jones lee.jones@linaro.org
[ Upstream commit 43ebfe61c3928573a5ef8d80c2f5300aa5c904c0 ]
Fixes the following W=1 kernel build warning(s):
drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c: In function ‘vmw_cmdbuf_res_revert’: drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c:162:6: warning: variable ‘ret’ set but not used [-Wunused-but-set-variable]
Cc: VMware Graphics linux-graphics-maintainer@vmware.com Cc: Roland Scheidegger sroland@vmware.com Cc: Zack Rusin zackr@vmware.com Cc: David Airlie airlied@linux.ie Cc: Daniel Vetter daniel@ffwll.ch Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones lee.jones@linaro.org Signed-off-by: Zack Rusin zackr@vmware.com Link: https://patchwork.freedesktop.org/patch/msgid/20210115181313.3431493-40-lee.... Stable-dep-of: 517621b70600 ("drm/vmwgfx: Fix possible null pointer derefence with invalid contexts") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c index 44d858ce4ce7f..47b92d0c898a7 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c @@ -160,7 +160,6 @@ void vmw_cmdbuf_res_commit(struct list_head *list) void vmw_cmdbuf_res_revert(struct list_head *list) { struct vmw_cmdbuf_res *entry, *next; - int ret;
list_for_each_entry_safe(entry, next, list, head) { switch (entry->state) { @@ -168,8 +167,7 @@ void vmw_cmdbuf_res_revert(struct list_head *list) vmw_cmdbuf_res_free(entry->man, entry); break; case VMW_CMDBUF_RES_DEL: - ret = drm_ht_insert_item(&entry->man->resources, - &entry->hash); + drm_ht_insert_item(&entry->man->resources, &entry->hash); list_del(&entry->head); list_add_tail(&entry->head, &entry->man->list); entry->state = VMW_CMDBUF_RES_COMMITTED;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zack Rusin zackr@vmware.com
[ Upstream commit 74231041d14030f1ae6582b9233bfe782ac23e33 ]
Fix some minor issues that Coverity spotted in the code. None of that are serious but they're all valid concerns so fixing them makes sense.
Signed-off-by: Zack Rusin zackr@vmware.com Reviewed-by: Roland Scheidegger sroland@vmware.com Reviewed-by: Martin Krastev krastevm@vmware.com Link: https://patchwork.freedesktop.org/patch/msgid/20210609172307.131929-5-zackr@... Stable-dep-of: 517621b70600 ("drm/vmwgfx: Fix possible null pointer derefence with invalid contexts") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/ttm/ttm_memory.c | 2 ++ drivers/gpu/drm/vmwgfx/vmwgfx_binding.c | 20 ++++++++------------ drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c | 2 +- drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c | 4 +++- drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 2 ++ drivers/gpu/drm/vmwgfx/vmwgfx_mob.c | 4 +++- drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 6 ++++-- drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 8 ++++++-- drivers/gpu/drm/vmwgfx/vmwgfx_so.c | 3 ++- drivers/gpu/drm/vmwgfx/vmwgfx_validation.c | 4 ++-- 10 files changed, 33 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_memory.c b/drivers/gpu/drm/ttm/ttm_memory.c index 89d50f38c0f2c..5af52012fc5c6 100644 --- a/drivers/gpu/drm/ttm/ttm_memory.c +++ b/drivers/gpu/drm/ttm/ttm_memory.c @@ -431,8 +431,10 @@ int ttm_mem_global_init(struct ttm_mem_global *glob)
si_meminfo(&si);
+ spin_lock(&glob->lock); /* set it as 0 by default to keep original behavior of OOM */ glob->lower_mem_limit = 0; + spin_unlock(&glob->lock);
ret = ttm_mem_init_kernel_zone(glob, &si); if (unlikely(ret != 0)) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_binding.c b/drivers/gpu/drm/vmwgfx/vmwgfx_binding.c index f41550797970b..4da4bf3b7f0b3 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_binding.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_binding.c @@ -713,7 +713,7 @@ static int vmw_binding_scrub_cb(struct vmw_ctx_bindinfo *bi, bool rebind) * without checking which bindings actually need to be emitted * * @cbs: Pointer to the context's struct vmw_ctx_binding_state - * @bi: Pointer to where the binding info array is stored in @cbs + * @biv: Pointer to where the binding info array is stored in @cbs * @max_num: Maximum number of entries in the @bi array. * * Scans the @bi array for bindings and builds a buffer of view id data. @@ -723,11 +723,9 @@ static int vmw_binding_scrub_cb(struct vmw_ctx_bindinfo *bi, bool rebind) * contains the command data. */ static void vmw_collect_view_ids(struct vmw_ctx_binding_state *cbs, - const struct vmw_ctx_bindinfo *bi, + const struct vmw_ctx_bindinfo_view *biv, u32 max_num) { - const struct vmw_ctx_bindinfo_view *biv = - container_of(bi, struct vmw_ctx_bindinfo_view, bi); unsigned long i;
cbs->bind_cmd_count = 0; @@ -835,7 +833,7 @@ static int vmw_emit_set_sr(struct vmw_ctx_binding_state *cbs, */ static int vmw_emit_set_rt(struct vmw_ctx_binding_state *cbs) { - const struct vmw_ctx_bindinfo *loc = &cbs->render_targets[0].bi; + const struct vmw_ctx_bindinfo_view *loc = &cbs->render_targets[0]; struct { SVGA3dCmdHeader header; SVGA3dCmdDXSetRenderTargets body; @@ -871,7 +869,7 @@ static int vmw_emit_set_rt(struct vmw_ctx_binding_state *cbs) * without checking which bindings actually need to be emitted * * @cbs: Pointer to the context's struct vmw_ctx_binding_state - * @bi: Pointer to where the binding info array is stored in @cbs + * @biso: Pointer to where the binding info array is stored in @cbs * @max_num: Maximum number of entries in the @bi array. * * Scans the @bi array for bindings and builds a buffer of SVGA3dSoTarget data. @@ -881,11 +879,9 @@ static int vmw_emit_set_rt(struct vmw_ctx_binding_state *cbs) * contains the command data. */ static void vmw_collect_so_targets(struct vmw_ctx_binding_state *cbs, - const struct vmw_ctx_bindinfo *bi, + const struct vmw_ctx_bindinfo_so_target *biso, u32 max_num) { - const struct vmw_ctx_bindinfo_so_target *biso = - container_of(bi, struct vmw_ctx_bindinfo_so_target, bi); unsigned long i; SVGA3dSoTarget *so_buffer = (SVGA3dSoTarget *) cbs->bind_cmd_buffer;
@@ -916,7 +912,7 @@ static void vmw_collect_so_targets(struct vmw_ctx_binding_state *cbs, */ static int vmw_emit_set_so_target(struct vmw_ctx_binding_state *cbs) { - const struct vmw_ctx_bindinfo *loc = &cbs->so_targets[0].bi; + const struct vmw_ctx_bindinfo_so_target *loc = &cbs->so_targets[0]; struct { SVGA3dCmdHeader header; SVGA3dCmdDXSetSOTargets body; @@ -1063,7 +1059,7 @@ static int vmw_emit_set_vb(struct vmw_ctx_binding_state *cbs)
static int vmw_emit_set_uav(struct vmw_ctx_binding_state *cbs) { - const struct vmw_ctx_bindinfo *loc = &cbs->ua_views[0].views[0].bi; + const struct vmw_ctx_bindinfo_view *loc = &cbs->ua_views[0].views[0]; struct { SVGA3dCmdHeader header; SVGA3dCmdDXSetUAViews body; @@ -1093,7 +1089,7 @@ static int vmw_emit_set_uav(struct vmw_ctx_binding_state *cbs)
static int vmw_emit_set_cs_uav(struct vmw_ctx_binding_state *cbs) { - const struct vmw_ctx_bindinfo *loc = &cbs->ua_views[1].views[0].bi; + const struct vmw_ctx_bindinfo_view *loc = &cbs->ua_views[1].views[0]; struct { SVGA3dCmdHeader header; SVGA3dCmdDXSetCSUAViews body; diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c index 9a9fe10d829b8..87a39721e5bc0 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c @@ -514,7 +514,7 @@ static void vmw_cmdbuf_work_func(struct work_struct *work) struct vmw_cmdbuf_man *man = container_of(work, struct vmw_cmdbuf_man, work); struct vmw_cmdbuf_header *entry, *next; - uint32_t dummy; + uint32_t dummy = 0; bool send_fence = false; struct list_head restart_head[SVGA_CB_CONTEXT_MAX]; int i; diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c index 47b92d0c898a7..f212368c03129 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c @@ -160,6 +160,7 @@ void vmw_cmdbuf_res_commit(struct list_head *list) void vmw_cmdbuf_res_revert(struct list_head *list) { struct vmw_cmdbuf_res *entry, *next; + int ret;
list_for_each_entry_safe(entry, next, list, head) { switch (entry->state) { @@ -167,7 +168,8 @@ void vmw_cmdbuf_res_revert(struct list_head *list) vmw_cmdbuf_res_free(entry->man, entry); break; case VMW_CMDBUF_RES_DEL: - drm_ht_insert_item(&entry->man->resources, &entry->hash); + ret = drm_ht_insert_item(&entry->man->resources, &entry->hash); + BUG_ON(ret); list_del(&entry->head); list_add_tail(&entry->head, &entry->man->list); entry->state = VMW_CMDBUF_RES_COMMITTED; diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c index 00082c679170a..831291b5d1a03 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c @@ -2535,6 +2535,8 @@ static int vmw_cmd_dx_so_define(struct vmw_private *dev_priv,
so_type = vmw_so_cmd_to_type(header->id); res = vmw_context_cotable(ctx_node->ctx, vmw_so_cotables[so_type]); + if (IS_ERR(res)) + return PTR_ERR(res); cmd = container_of(header, typeof(*cmd), header); ret = vmw_cotable_notify(res, cmd->defined_id);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c b/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c index 7f95ed6aa2241..fb0797b380dd2 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c @@ -494,11 +494,13 @@ static void vmw_mob_pt_setup(struct vmw_mob *mob, { unsigned long num_pt_pages = 0; struct ttm_buffer_object *bo = mob->pt_bo; - struct vmw_piter save_pt_iter; + struct vmw_piter save_pt_iter = {0}; struct vmw_piter pt_iter; const struct vmw_sg_table *vsgt; int ret;
+ BUG_ON(num_data_pages == 0); + ret = ttm_bo_reserve(bo, false, true, NULL); BUG_ON(ret != 0);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c b/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c index 15b5bde693242..751582f5ab0b1 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c @@ -154,6 +154,7 @@ static unsigned long vmw_port_hb_out(struct rpc_channel *channel, /* HB port can't access encrypted memory. */ if (hb && !mem_encrypt_active()) { unsigned long bp = channel->cookie_high; + u32 channel_id = (channel->channel_id << 16);
si = (uintptr_t) msg; di = channel->cookie_low; @@ -161,7 +162,7 @@ static unsigned long vmw_port_hb_out(struct rpc_channel *channel, VMW_PORT_HB_OUT( (MESSAGE_STATUS_SUCCESS << 16) | VMW_PORT_CMD_HB_MSG, msg_len, si, di, - VMWARE_HYPERVISOR_HB | (channel->channel_id << 16) | + VMWARE_HYPERVISOR_HB | channel_id | VMWARE_HYPERVISOR_OUT, VMW_HYPERVISOR_MAGIC, bp, eax, ebx, ecx, edx, si, di); @@ -209,6 +210,7 @@ static unsigned long vmw_port_hb_in(struct rpc_channel *channel, char *reply, /* HB port can't access encrypted memory */ if (hb && !mem_encrypt_active()) { unsigned long bp = channel->cookie_low; + u32 channel_id = (channel->channel_id << 16);
si = channel->cookie_high; di = (uintptr_t) reply; @@ -216,7 +218,7 @@ static unsigned long vmw_port_hb_in(struct rpc_channel *channel, char *reply, VMW_PORT_HB_IN( (MESSAGE_STATUS_SUCCESS << 16) | VMW_PORT_CMD_HB_MSG, reply_len, si, di, - VMWARE_HYPERVISOR_HB | (channel->channel_id << 16), + VMWARE_HYPERVISOR_HB | channel_id, VMW_HYPERVISOR_MAGIC, bp, eax, ebx, ecx, edx, si, di);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c index 5e922d9d5f2cc..26f88d64879f1 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c @@ -114,6 +114,7 @@ static void vmw_resource_release(struct kref *kref) container_of(kref, struct vmw_resource, kref); struct vmw_private *dev_priv = res->dev_priv; int id; + int ret; struct idr *idr = &dev_priv->res_idr[res->func->res_type];
spin_lock(&dev_priv->resource_lock); @@ -122,7 +123,8 @@ static void vmw_resource_release(struct kref *kref) if (res->backup) { struct ttm_buffer_object *bo = &res->backup->base;
- ttm_bo_reserve(bo, false, false, NULL); + ret = ttm_bo_reserve(bo, false, false, NULL); + BUG_ON(ret); if (vmw_resource_mob_attached(res) && res->func->unbind != NULL) { struct ttm_validate_buffer val_buf; @@ -1001,7 +1003,9 @@ int vmw_resource_pin(struct vmw_resource *res, bool interruptible) if (res->backup) { vbo = res->backup;
- ttm_bo_reserve(&vbo->base, interruptible, false, NULL); + ret = ttm_bo_reserve(&vbo->base, interruptible, false, NULL); + if (ret) + goto out_no_validate; if (!vbo->base.pin_count) { ret = ttm_bo_validate (&vbo->base, diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_so.c b/drivers/gpu/drm/vmwgfx/vmwgfx_so.c index 3f97b61dd5d83..9330f1a0f1743 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_so.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_so.c @@ -538,7 +538,8 @@ const SVGACOTableType vmw_so_cotables[] = { [vmw_so_ds] = SVGA_COTABLE_DEPTHSTENCIL, [vmw_so_rs] = SVGA_COTABLE_RASTERIZERSTATE, [vmw_so_ss] = SVGA_COTABLE_SAMPLER, - [vmw_so_so] = SVGA_COTABLE_STREAMOUTPUT + [vmw_so_so] = SVGA_COTABLE_STREAMOUTPUT, + [vmw_so_max]= SVGA_COTABLE_MAX };
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c index f2e2bf6d1421f..cc1cfc827bb9a 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c @@ -585,13 +585,13 @@ int vmw_validation_bo_validate(struct vmw_validation_context *ctx, bool intr) container_of(entry->base.bo, typeof(*vbo), base);
if (entry->cpu_blit) { - struct ttm_operation_ctx ctx = { + struct ttm_operation_ctx ttm_ctx = { .interruptible = intr, .no_wait_gpu = false };
ret = ttm_bo_validate(entry->base.bo, - &vmw_nonfixed_placement, &ctx); + &vmw_nonfixed_placement, &ttm_ctx); } else { ret = vmw_validation_bo_validate_single (entry->base.bo, intr, entry->as_mob);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zack Rusin zack.rusin@broadcom.com
[ Upstream commit 517621b7060096e48e42f545fa6646fc00252eac ]
vmw_context_cotable can return either an error or a null pointer and its usage sometimes went unchecked. Subsequent code would then try to access either a null pointer or an error value.
The invalid dereferences were only possible with malformed userspace apps which never properly initialized the rendering contexts.
Check the results of vmw_context_cotable to fix the invalid derefs.
Thanks: ziming zhang(@ezrak1e) from Ant Group Light-Year Security Lab who was the first person to discover it. Niels De Graef who reported it and helped to track down the poc.
Fixes: 9c079b8ce8bf ("drm/vmwgfx: Adapt execbuf to the new validation api") Cc: stable@vger.kernel.org # v4.20+ Reported-by: Niels De Graef ndegraef@redhat.com Signed-off-by: Zack Rusin zack.rusin@broadcom.com Cc: Martin Krastev martin.krastev@broadcom.com Cc: Maaz Mombasawala maaz.mombasawala@broadcom.com Cc: Ian Forbes ian.forbes@broadcom.com Cc: Broadcom internal kernel review list bcm-kernel-feedback-list@broadcom.com Cc: dri-devel@lists.freedesktop.org Reviewed-by: Maaz Mombasawala maaz.mombasawala@broadcom.com Reviewed-by: Martin Krastev martin.krastev@broadcom.com Link: https://patchwork.freedesktop.org/patch/msgid/20240110200305.94086-1-zack.ru... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c index 831291b5d1a03..616f6cb622783 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c @@ -467,7 +467,7 @@ static int vmw_resource_context_res_add(struct vmw_private *dev_priv, vmw_res_type(ctx) == vmw_res_dx_context) { for (i = 0; i < cotable_max; ++i) { res = vmw_context_cotable(ctx, i); - if (IS_ERR(res)) + if (IS_ERR_OR_NULL(res)) continue;
ret = vmw_execbuf_res_noctx_val_add(sw_context, res, @@ -1272,6 +1272,8 @@ static int vmw_cmd_dx_define_query(struct vmw_private *dev_priv, return -EINVAL;
cotable_res = vmw_context_cotable(ctx_node->ctx, SVGA_COTABLE_DXQUERY); + if (IS_ERR_OR_NULL(cotable_res)) + return cotable_res ? PTR_ERR(cotable_res) : -EINVAL; ret = vmw_cotable_notify(cotable_res, cmd->body.queryId);
return ret; @@ -2450,6 +2452,8 @@ static int vmw_cmd_dx_view_define(struct vmw_private *dev_priv, return ret;
res = vmw_context_cotable(ctx_node->ctx, vmw_view_cotables[view_type]); + if (IS_ERR_OR_NULL(res)) + return res ? PTR_ERR(res) : -EINVAL; ret = vmw_cotable_notify(res, cmd->defined_id); if (unlikely(ret != 0)) return ret; @@ -2535,8 +2539,8 @@ static int vmw_cmd_dx_so_define(struct vmw_private *dev_priv,
so_type = vmw_so_cmd_to_type(header->id); res = vmw_context_cotable(ctx_node->ctx, vmw_so_cotables[so_type]); - if (IS_ERR(res)) - return PTR_ERR(res); + if (IS_ERR_OR_NULL(res)) + return res ? PTR_ERR(res) : -EINVAL; cmd = container_of(header, typeof(*cmd), header); ret = vmw_cotable_notify(res, cmd->defined_id);
@@ -2655,6 +2659,8 @@ static int vmw_cmd_dx_define_shader(struct vmw_private *dev_priv, return -EINVAL;
res = vmw_context_cotable(ctx_node->ctx, SVGA_COTABLE_DXSHADER); + if (IS_ERR_OR_NULL(res)) + return res ? PTR_ERR(res) : -EINVAL; ret = vmw_cotable_notify(res, cmd->body.shaderId); if (ret) return ret; @@ -2976,6 +2982,8 @@ static int vmw_cmd_dx_define_streamoutput(struct vmw_private *dev_priv, }
res = vmw_context_cotable(ctx_node->ctx, SVGA_COTABLE_STREAMOUTPUT); + if (IS_ERR_OR_NULL(res)) + return res ? PTR_ERR(res) : -EINVAL; ret = vmw_cotable_notify(res, cmd->body.soid); if (ret) return ret;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hugo Villeneuve hvilleneuve@dimonoff.com
[ Upstream commit 0d27056c24efd3d63a03f3edfbcfc4827086b110 ]
When trying to instantiate a max14830 device from userspace:
echo max14830 0x60 > /sys/bus/i2c/devices/i2c-2/new_device
we get the following error:
Unable to handle kernel NULL pointer dereference at virtual address... ... Call trace: max310x_i2c_probe+0x48/0x170 [max310x] i2c_device_probe+0x150/0x2a0 ...
Add check for validity of devtype to prevent the error, and abort probe with a meaningful error message.
Fixes: 2e1f2d9a9bdb ("serial: max310x: implement I2C support") Cc: stable@vger.kernel.org Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Signed-off-by: Hugo Villeneuve hvilleneuve@dimonoff.com Link: https://lore.kernel.org/r/20240118152213.2644269-2-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/max310x.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/max310x.c b/drivers/tty/serial/max310x.c index 5570fd3b84e15..363b68555fe62 100644 --- a/drivers/tty/serial/max310x.c +++ b/drivers/tty/serial/max310x.c @@ -1636,13 +1636,16 @@ static unsigned short max310x_i2c_slave_addr(unsigned short addr,
static int max310x_i2c_probe(struct i2c_client *client) { - const struct max310x_devtype *devtype = - device_get_match_data(&client->dev); + const struct max310x_devtype *devtype; struct i2c_client *port_client; struct regmap *regmaps[4]; unsigned int i; u8 port_addr;
+ devtype = device_get_match_data(&client->dev); + if (!devtype) + return dev_err_probe(&client->dev, -ENODEV, "Failed to match device\n"); + if (client->addr < devtype->slave_addr.min || client->addr > devtype->slave_addr.max) return dev_err_probe(&client->dev, -EINVAL,
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gui-Dong Han 2045gemini@gmail.com
[ Upstream commit 36d503ad547d1c75758a6fcdbec2806f1b6aeb41 ]
In xc4000_get_frequency(): *freq = priv->freq_hz + priv->freq_offset; The code accesses priv->freq_hz and priv->freq_offset without holding any lock.
In xc4000_set_params(): // Code that updates priv->freq_hz and priv->freq_offset ...
xc4000_get_frequency() and xc4000_set_params() may execute concurrently, risking inconsistent reads of priv->freq_hz and priv->freq_offset. Since these related data may update during reading, it can result in incorrect frequency calculation, leading to atomicity violations.
This possible bug is found by an experimental static analysis tool developed by our team, BassCheck[1]. This tool analyzes the locking APIs to extract function pairs that can be concurrently executed, and then analyzes the instructions in the paired functions to identify possible concurrency bugs including data races and atomicity violations. The above possible bug is reported when our tool analyzes the source code of Linux 6.2.
To address this issue, it is proposed to add a mutex lock pair in xc4000_get_frequency() to ensure atomicity. With this patch applied, our tool no longer reports the possible bug, with the kernel configuration allyesconfig for x86_64. Due to the lack of associated hardware, we cannot test the patch in runtime testing, and just verify it according to the code logic.
[1] https://sites.google.com/view/basscheck/
Fixes: 4c07e32884ab ("[media] xc4000: Fix get_frequency()") Cc: stable@vger.kernel.org Reported-by: BassCheck bass@buaa.edu.cn Signed-off-by: Gui-Dong Han 2045gemini@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/tuners/xc4000.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/media/tuners/xc4000.c b/drivers/media/tuners/xc4000.c index ef9af052007cb..849df4d1c573c 100644 --- a/drivers/media/tuners/xc4000.c +++ b/drivers/media/tuners/xc4000.c @@ -1517,10 +1517,10 @@ static int xc4000_get_frequency(struct dvb_frontend *fe, u32 *freq) { struct xc4000_priv *priv = fe->tuner_priv;
+ mutex_lock(&priv->lock); *freq = priv->freq_hz + priv->freq_offset;
if (debug) { - mutex_lock(&priv->lock); if ((priv->cur_fw.type & (BASE | FM | DTV6 | DTV7 | DTV78 | DTV8)) == BASE) { u16 snr = 0; @@ -1531,8 +1531,8 @@ static int xc4000_get_frequency(struct dvb_frontend *fe, u32 *freq) return 0; } } - mutex_unlock(&priv->lock); } + mutex_unlock(&priv->lock);
dprintk(1, "%s()\n", __func__);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
[ Upstream commit 3d75b8aa5c29058a512db29da7cbee8052724157 ]
Always flush the per-vCPU async #PF workqueue when a vCPU is clearing its completion queue, e.g. when a VM and all its vCPUs is being destroyed. KVM must ensure that none of its workqueue callbacks is running when the last reference to the KVM _module_ is put. Gifting a reference to the associated VM prevents the workqueue callback from dereferencing freed vCPU/VM memory, but does not prevent the KVM module from being unloaded before the callback completes.
Drop the misguided VM refcount gifting, as calling kvm_put_kvm() from async_pf_execute() if kvm_put_kvm() flushes the async #PF workqueue will result in deadlock. async_pf_execute() can't return until kvm_put_kvm() finishes, and kvm_put_kvm() can't return until async_pf_execute() finishes:
WARNING: CPU: 8 PID: 251 at virt/kvm/kvm_main.c:1435 kvm_put_kvm+0x2d/0x320 [kvm] Modules linked in: vhost_net vhost vhost_iotlb tap kvm_intel kvm irqbypass CPU: 8 PID: 251 Comm: kworker/8:1 Tainted: G W 6.6.0-rc1-e7af8d17224a-x86/gmem-vm #119 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Workqueue: events async_pf_execute [kvm] RIP: 0010:kvm_put_kvm+0x2d/0x320 [kvm] Call Trace: <TASK> async_pf_execute+0x198/0x260 [kvm] process_one_work+0x145/0x2d0 worker_thread+0x27e/0x3a0 kthread+0xba/0xe0 ret_from_fork+0x2d/0x50 ret_from_fork_asm+0x11/0x20 </TASK> ---[ end trace 0000000000000000 ]--- INFO: task kworker/8:1:251 blocked for more than 120 seconds. Tainted: G W 6.6.0-rc1-e7af8d17224a-x86/gmem-vm #119 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/8:1 state:D stack:0 pid:251 ppid:2 flags:0x00004000 Workqueue: events async_pf_execute [kvm] Call Trace: <TASK> __schedule+0x33f/0xa40 schedule+0x53/0xc0 schedule_timeout+0x12a/0x140 __wait_for_common+0x8d/0x1d0 __flush_work.isra.0+0x19f/0x2c0 kvm_clear_async_pf_completion_queue+0x129/0x190 [kvm] kvm_arch_destroy_vm+0x78/0x1b0 [kvm] kvm_put_kvm+0x1c1/0x320 [kvm] async_pf_execute+0x198/0x260 [kvm] process_one_work+0x145/0x2d0 worker_thread+0x27e/0x3a0 kthread+0xba/0xe0 ret_from_fork+0x2d/0x50 ret_from_fork_asm+0x11/0x20 </TASK>
If kvm_clear_async_pf_completion_queue() actually flushes the workqueue, then there's no need to gift async_pf_execute() a reference because all invocations of async_pf_execute() will be forced to complete before the vCPU and its VM are destroyed/freed. And that in turn fixes the module unloading bug as __fput() won't do module_put() on the last vCPU reference until the vCPU has been freed, e.g. if closing the vCPU file also puts the last reference to the KVM module.
Note that kvm_check_async_pf_completion() may also take the work item off the completion queue and so also needs to flush the work queue, as the work will not be seen by kvm_clear_async_pf_completion_queue(). Waiting on the workqueue could theoretically delay a vCPU due to waiting for the work to complete, but that's a very, very small chance, and likely a very small delay. kvm_arch_async_page_present_queued() unconditionally makes a new request, i.e. will effectively delay entering the guest, so the remaining work is really just:
trace_kvm_async_pf_completed(addr, cr2_or_gpa);
__kvm_vcpu_wake_up(vcpu);
mmput(mm);
and mmput() can't drop the last reference to the page tables if the vCPU is still alive, i.e. the vCPU won't get stuck tearing down page tables.
Add a helper to do the flushing, specifically to deal with "wakeup all" work items, as they aren't actually work items, i.e. are never placed in a workqueue. Trying to flush a bogus workqueue entry rightly makes __flush_work() complain (kudos to whoever added that sanity check).
Note, commit 5f6de5cbebee ("KVM: Prevent module exit until all VMs are freed") *tried* to fix the module refcounting issue by having VMs grab a reference to the module, but that only made the bug slightly harder to hit as it gave async_pf_execute() a bit more time to complete before the KVM module could be unloaded.
Fixes: af585b921e5d ("KVM: Halt vcpu if page it tries to access is swapped out") Cc: stable@vger.kernel.org Cc: David Matlack dmatlack@google.com Reviewed-by: Xu Yilun yilun.xu@intel.com Reviewed-by: Vitaly Kuznetsov vkuznets@redhat.com Link: https://lore.kernel.org/r/20240110011533.503302-2-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- virt/kvm/async_pf.c | 31 ++++++++++++++++++++++++++----- 1 file changed, 26 insertions(+), 5 deletions(-)
diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c index dd777688d14a9..952afb1bc83b4 100644 --- a/virt/kvm/async_pf.c +++ b/virt/kvm/async_pf.c @@ -88,7 +88,27 @@ static void async_pf_execute(struct work_struct *work) rcuwait_wake_up(&vcpu->wait);
mmput(mm); - kvm_put_kvm(vcpu->kvm); +} + +static void kvm_flush_and_free_async_pf_work(struct kvm_async_pf *work) +{ + /* + * The async #PF is "done", but KVM must wait for the work item itself, + * i.e. async_pf_execute(), to run to completion. If KVM is a module, + * KVM must ensure *no* code owned by the KVM (the module) can be run + * after the last call to module_put(). Note, flushing the work item + * is always required when the item is taken off the completion queue. + * E.g. even if the vCPU handles the item in the "normal" path, the VM + * could be terminated before async_pf_execute() completes. + * + * Wake all events skip the queue and go straight done, i.e. don't + * need to be flushed (but sanity check that the work wasn't queued). + */ + if (work->wakeup_all) + WARN_ON_ONCE(work->work.func); + else + flush_work(&work->work); + kmem_cache_free(async_pf_cache, work); }
void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu) @@ -115,7 +135,6 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu) #else if (cancel_work_sync(&work->work)) { mmput(work->mm); - kvm_put_kvm(vcpu->kvm); /* == work->vcpu->kvm */ kmem_cache_free(async_pf_cache, work); } #endif @@ -127,7 +146,10 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu) list_first_entry(&vcpu->async_pf.done, typeof(*work), link); list_del(&work->link); - kmem_cache_free(async_pf_cache, work); + + spin_unlock(&vcpu->async_pf.lock); + kvm_flush_and_free_async_pf_work(work); + spin_lock(&vcpu->async_pf.lock); } spin_unlock(&vcpu->async_pf.lock);
@@ -152,7 +174,7 @@ void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu)
list_del(&work->queue); vcpu->async_pf.queued--; - kmem_cache_free(async_pf_cache, work); + kvm_flush_and_free_async_pf_work(work); } }
@@ -187,7 +209,6 @@ bool kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, work->arch = *arch; work->mm = current->mm; mmget(work->mm); - kvm_get_kvm(work->vcpu->kvm);
INIT_WORK(&work->work, async_pf_execute);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 3ed7c61e49d65dacb96db798c0ab6fcd55a1f20f ]
__setup() handlers should return 1 to obsolete_checksetup() in init/main.c to indicate that the boot option has been handled. A return of 0 causes the boot option/value to be listed as an Unknown kernel parameter and added to init's (limited) argument or environment strings. Also, error return codes don't mean anything to obsolete_checksetup() -- only non-zero (usually 1) or zero. So return 1 from setup_nmi_watchdog().
Fixes: e5553a6d0442 ("sparc64: Implement NMI watchdog on capable cpus.") Signed-off-by: Randy Dunlap rdunlap@infradead.org Reported-by: Igor Zhbanov izh1979@gmail.com Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru Cc: "David S. Miller" davem@davemloft.net Cc: sparclinux@vger.kernel.org Cc: Sam Ravnborg sam@ravnborg.org Cc: Andrew Morton akpm@linux-foundation.org Cc: stable@vger.kernel.org Cc: Arnd Bergmann arnd@arndb.de Cc: Andreas Larsson andreas@gaisler.com Signed-off-by: Andreas Larsson andreas@gaisler.com Link: https://lore.kernel.org/r/20240211052802.22612-1-rdunlap@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/sparc/kernel/nmi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/sparc/kernel/nmi.c b/arch/sparc/kernel/nmi.c index 060fff95a305c..fbf25e926f67c 100644 --- a/arch/sparc/kernel/nmi.c +++ b/arch/sparc/kernel/nmi.c @@ -274,7 +274,7 @@ static int __init setup_nmi_watchdog(char *str) if (!strncmp(str, "panic", 5)) panic_on_timeout = 1;
- return 0; + return 1; } __setup("nmi_watchdog=", setup_nmi_watchdog);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 5378f00c935bebb846b1fdb0e79cb76c137c56b5 ]
__setup() handlers should return 1 to obsolete_checksetup() in init/main.c to indicate that the boot option has been handled. A return of 0 causes the boot option/value to be listed as an Unknown kernel parameter and added to init's (limited) argument or environment strings. Also, error return codes don't mean anything to obsolete_checksetup() -- only non-zero (usually 1) or zero. So return 1 from vdso_setup().
Fixes: 9a08862a5d2e ("vDSO for sparc") Signed-off-by: Randy Dunlap rdunlap@infradead.org Reported-by: Igor Zhbanov izh1979@gmail.com Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru Cc: "David S. Miller" davem@davemloft.net Cc: sparclinux@vger.kernel.org Cc: Dan Carpenter dan.carpenter@oracle.com Cc: Nick Alcock nick.alcock@oracle.com Cc: Sam Ravnborg sam@ravnborg.org Cc: Andrew Morton akpm@linux-foundation.org Cc: stable@vger.kernel.org Cc: Arnd Bergmann arnd@arndb.de Cc: Andreas Larsson andreas@gaisler.com Signed-off-by: Andreas Larsson andreas@gaisler.com Link: https://lore.kernel.org/r/20240211052808.22635-1-rdunlap@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/sparc/vdso/vma.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/arch/sparc/vdso/vma.c b/arch/sparc/vdso/vma.c index cc19e09b0fa1e..b073153c711ad 100644 --- a/arch/sparc/vdso/vma.c +++ b/arch/sparc/vdso/vma.c @@ -449,9 +449,8 @@ static __init int vdso_setup(char *s) unsigned long val;
err = kstrtoul(s, 10, &val); - if (err) - return err; - vdso_enabled = val; - return 0; + if (!err) + vdso_enabled = val; + return 1; } __setup("vdso=", vdso_setup);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Svyatoslav Pankratov svyatoslav.pankratov@intel.com
[ Upstream commit 01aed663e6c421aeafc9c330bda630976b50a764 ]
There is no need to free the reset_data structure if the recovery is unsuccessful and the reset is synchronous. The function adf_dev_aer_schedule_reset() handles the cleanup properly. Only asynchronous resets require such structure to be freed inside the reset worker.
Fixes: d8cba25d2c68 ("crypto: qat - Intel(R) QAT driver framework") Signed-off-by: Svyatoslav Pankratov svyatoslav.pankratov@intel.com Signed-off-by: Giovanni Cabiddu giovanni.cabiddu@intel.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Stable-dep-of: 7d42e097607c ("crypto: qat - resolve race condition during AER recovery") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/qat/qat_common/adf_aer.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/crypto/qat/qat_common/adf_aer.c b/drivers/crypto/qat/qat_common/adf_aer.c index d2ae293d0df6a..d1a7809c51c0a 100644 --- a/drivers/crypto/qat/qat_common/adf_aer.c +++ b/drivers/crypto/qat/qat_common/adf_aer.c @@ -95,7 +95,8 @@ static void adf_device_reset_worker(struct work_struct *work) if (adf_dev_init(accel_dev) || adf_dev_start(accel_dev)) { /* The device hanged and we can't restart it so stop here */ dev_err(&GET_DEV(accel_dev), "Restart device failed\n"); - kfree(reset_data); + if (reset_data->mode == ADF_DEV_RESET_ASYNC) + kfree(reset_data); WARN(1, "QAT: device restart failed. Device is unusable\n"); return; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damian Muszynski damian.muszynski@intel.com
[ Upstream commit 7d42e097607c4d246d99225bf2b195b6167a210c ]
During the PCI AER system's error recovery process, the kernel driver may encounter a race condition with freeing the reset_data structure's memory. If the device restart will take more than 10 seconds the function scheduling that restart will exit due to a timeout, and the reset_data structure will be freed. However, this data structure is used for completion notification after the restart is completed, which leads to a UAF bug.
This results in a KFENCE bug notice.
BUG: KFENCE: use-after-free read in adf_device_reset_worker+0x38/0xa0 [intel_qat] Use-after-free read at 0x00000000bc56fddf (in kfence-#142): adf_device_reset_worker+0x38/0xa0 [intel_qat] process_one_work+0x173/0x340
To resolve this race condition, the memory associated to the container of the work_struct is freed on the worker if the timeout expired, otherwise on the function that schedules the worker. The timeout detection can be done by checking if the caller is still waiting for completion or not by using completion_done() function.
Fixes: d8cba25d2c68 ("crypto: qat - Intel(R) QAT driver framework") Cc: stable@vger.kernel.org Signed-off-by: Damian Muszynski damian.muszynski@intel.com Reviewed-by: Giovanni Cabiddu giovanni.cabiddu@intel.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/qat/qat_common/adf_aer.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/drivers/crypto/qat/qat_common/adf_aer.c b/drivers/crypto/qat/qat_common/adf_aer.c index d1a7809c51c0a..dc0feedf52702 100644 --- a/drivers/crypto/qat/qat_common/adf_aer.c +++ b/drivers/crypto/qat/qat_common/adf_aer.c @@ -95,7 +95,8 @@ static void adf_device_reset_worker(struct work_struct *work) if (adf_dev_init(accel_dev) || adf_dev_start(accel_dev)) { /* The device hanged and we can't restart it so stop here */ dev_err(&GET_DEV(accel_dev), "Restart device failed\n"); - if (reset_data->mode == ADF_DEV_RESET_ASYNC) + if (reset_data->mode == ADF_DEV_RESET_ASYNC || + completion_done(&reset_data->compl)) kfree(reset_data); WARN(1, "QAT: device restart failed. Device is unusable\n"); return; @@ -103,11 +104,19 @@ static void adf_device_reset_worker(struct work_struct *work) adf_dev_restarted_notify(accel_dev); clear_bit(ADF_STATUS_RESTARTING, &accel_dev->status);
- /* The dev is back alive. Notify the caller if in sync mode */ - if (reset_data->mode == ADF_DEV_RESET_SYNC) - complete(&reset_data->compl); - else + /* + * The dev is back alive. Notify the caller if in sync mode + * + * If device restart will take a more time than expected, + * the schedule_reset() function can timeout and exit. This can be + * detected by calling the completion_done() function. In this case + * the reset_data structure needs to be freed here. + */ + if (reset_data->mode == ADF_DEV_RESET_ASYNC || + completion_done(&reset_data->compl)) kfree(reset_data); + else + complete(&reset_data->compl); }
static int adf_dev_aer_schedule_reset(struct adf_accel_dev *accel_dev, @@ -140,8 +149,9 @@ static int adf_dev_aer_schedule_reset(struct adf_accel_dev *accel_dev, dev_err(&GET_DEV(accel_dev), "Reset device timeout expired\n"); ret = -EFAULT; + } else { + kfree(reset_data); } - kfree(reset_data); return ret; } return 0;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: SeongJae Park sj@kernel.org
[ Upstream commit 85506aca2eb4ea41223c91c5fe25125953c19b13 ]
While mq_perf_tests runs with the default kselftest timeout limit, which is 45 seconds, the test takes about 60 seconds to complete on i3.metal AWS instances. Hence, the test always times out. Increase the timeout to 180 seconds.
Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test") Cc: stable@vger.kernel.org # 5.4.x Signed-off-by: SeongJae Park sj@kernel.org Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/mqueue/setting | 1 + 1 file changed, 1 insertion(+) create mode 100644 tools/testing/selftests/mqueue/setting
diff --git a/tools/testing/selftests/mqueue/setting b/tools/testing/selftests/mqueue/setting new file mode 100644 index 0000000000000..a953c96aa16e1 --- /dev/null +++ b/tools/testing/selftests/mqueue/setting @@ -0,0 +1 @@ +timeout=180
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li libaokun1@huawei.com
[ Upstream commit 4fbf8bc733d14bceb16dda46a3f5e19c6a9621c5 ]
When yangerkun review commit 93cdf49f6eca ("ext4: Fix best extent lstart adjustment logic in ext4_mb_new_inode_pa()"), it was found that the best extent did not completely cover the original request after adjusting the best extent lstart in ext4_mb_new_inode_pa() as follows:
original request: 2/10(8) normalized request: 0/64(64) best extent: 0/9(9)
When we check if best ex can be kept at start of goal, ac_o_ex.fe_logical is 2 less than the adjusted best extent logical end 9, so we think the adjustment is done. But obviously 0/9(9) doesn't cover 2/10(8), so we should determine here if the original request logical end is less than or equal to the adjusted best extent logical end.
In addition, add a comment stating when adjusted best_ex will not cover the original request, and remove the duplicate assertion because adjusting lstart makes no change to b_ex.fe_len.
Link: https://lore.kernel.org/r/3630fa7f-b432-7afd-5f79-781bc3b2c5ea@huawei.com Fixes: 93cdf49f6eca ("ext4: Fix best extent lstart adjustment logic in ext4_mb_new_inode_pa()") Cc: stable@kernel.org Signed-off-by: yangerkun yangerkun@huawei.com Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Ojaswin Mujoo ojaswin@linux.ibm.com Link: https://lore.kernel.org/r/20240201141845.1879253-1-libaokun1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/mballoc.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 61988b7b5be77..6ce59e2ac2d47 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -4162,10 +4162,16 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac) .fe_len = ac->ac_g_ex.fe_len, }; loff_t orig_goal_end = extent_logical_end(sbi, &ex); + loff_t o_ex_end = extent_logical_end(sbi, &ac->ac_o_ex);
- /* we can't allocate as much as normalizer wants. - * so, found space must get proper lstart - * to cover original request */ + /* + * We can't allocate as much as normalizer wants, so we try + * to get proper lstart to cover the original request, except + * when the goal doesn't cover the original request as below: + * + * orig_ex:2045/2055(10), isize:8417280 -> normalized:0/2048 + * best_ex:0/200(200) -> adjusted: 1848/2048(200) + */ BUG_ON(ac->ac_g_ex.fe_logical > ac->ac_o_ex.fe_logical); BUG_ON(ac->ac_g_ex.fe_len < ac->ac_o_ex.fe_len);
@@ -4177,7 +4183,7 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac) * 1. Check if best ex can be kept at end of goal and still * cover original start * 2. Else, check if best ex can be kept at start of goal and - * still cover original start + * still cover original end * 3. Else, keep the best ex at start of original request. */ ex.fe_len = ac->ac_b_ex.fe_len; @@ -4187,7 +4193,7 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac) goto adjust_bex;
ex.fe_logical = ac->ac_g_ex.fe_logical; - if (ac->ac_o_ex.fe_logical < extent_logical_end(sbi, &ex)) + if (o_ex_end <= extent_logical_end(sbi, &ex)) goto adjust_bex;
ex.fe_logical = ac->ac_o_ex.fe_logical; @@ -4195,7 +4201,6 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac) ac->ac_b_ex.fe_logical = ex.fe_logical;
BUG_ON(ac->ac_o_ex.fe_logical < ac->ac_b_ex.fe_logical); - BUG_ON(ac->ac_o_ex.fe_len > ac->ac_b_ex.fe_len); BUG_ON(extent_logical_end(sbi, &ex) > orig_goal_end); }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal damien.lemoal@wdc.com
[ Upstream commit a805a4fa4fa376bbc145762bb8b09caa2fa8af48 ]
Per ZBC and ZAC specifications, host-managed SMR hard-disks mandate that all writes into sequential write required zones be aligned to the device physical block size. However, NVMe ZNS does not have this constraint and allows write operations into sequential zones to be aligned to the device logical block size. This inconsistency does not help with software portability across device types.
To solve this, introduce the zone_write_granularity queue limit to indicate the alignment constraint, in bytes, of write operations into zones of a zoned block device. This new limit is exported as a read-only sysfs queue attribute and the helper blk_queue_zone_write_granularity() introduced for drivers to set this limit.
The function blk_queue_set_zoned() is modified to set this new limit to the device logical block size by default. NVMe ZNS devices as well as zoned nullb devices use this default value as is. The scsi disk driver is modified to execute the blk_queue_zone_write_granularity() helper to set the zone write granularity of host-managed SMR disks to the disk physical block size.
The accessor functions queue_zone_write_granularity() and bdev_zone_write_granularity() are also introduced.
Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Martin K. Petersen martin.petersen@oracle.com Reviewed-by: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@edc.com Signed-off-by: Jens Axboe axboe@kernel.dk Stable-dep-of: c8f6f88d2592 ("block: Clear zone limits for a non-zoned stacked queue") Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/block/queue-sysfs.rst | 7 ++++++ block/blk-settings.c | 37 ++++++++++++++++++++++++++++- block/blk-sysfs.c | 8 +++++++ drivers/scsi/sd_zbc.c | 8 +++++++ include/linux/blkdev.h | 15 ++++++++++++ 5 files changed, 74 insertions(+), 1 deletion(-)
diff --git a/Documentation/block/queue-sysfs.rst b/Documentation/block/queue-sysfs.rst index 2638d3446b794..c8bf8bc3c03af 100644 --- a/Documentation/block/queue-sysfs.rst +++ b/Documentation/block/queue-sysfs.rst @@ -273,4 +273,11 @@ devices are described in the ZBC (Zoned Block Commands) and ZAC do not support zone commands, they will be treated as regular block devices and zoned will report "none".
+zone_write_granularity (RO) +--------------------------- +This indicates the alignment constraint, in bytes, for write operations in +sequential zones of zoned block devices (devices with a zoned attributed +that reports "host-managed" or "host-aware"). This value is always 0 for +regular block devices. + Jens Axboe jens.axboe@oracle.com, February 2009 diff --git a/block/blk-settings.c b/block/blk-settings.c index c3aa7f8ee3883..ab39169aa2b28 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -60,6 +60,7 @@ void blk_set_default_limits(struct queue_limits *lim) lim->io_opt = 0; lim->misaligned = 0; lim->zoned = BLK_ZONED_NONE; + lim->zone_write_granularity = 0; } EXPORT_SYMBOL(blk_set_default_limits);
@@ -353,6 +354,28 @@ void blk_queue_physical_block_size(struct request_queue *q, unsigned int size) } EXPORT_SYMBOL(blk_queue_physical_block_size);
+/** + * blk_queue_zone_write_granularity - set zone write granularity for the queue + * @q: the request queue for the zoned device + * @size: the zone write granularity size, in bytes + * + * Description: + * This should be set to the lowest possible size allowing to write in + * sequential zones of a zoned block device. + */ +void blk_queue_zone_write_granularity(struct request_queue *q, + unsigned int size) +{ + if (WARN_ON_ONCE(!blk_queue_is_zoned(q))) + return; + + q->limits.zone_write_granularity = size; + + if (q->limits.zone_write_granularity < q->limits.logical_block_size) + q->limits.zone_write_granularity = q->limits.logical_block_size; +} +EXPORT_SYMBOL_GPL(blk_queue_zone_write_granularity); + /** * blk_queue_alignment_offset - set physical block alignment offset * @q: the request queue for the device @@ -630,6 +653,8 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b, t->discard_granularity; }
+ t->zone_write_granularity = max(t->zone_write_granularity, + b->zone_write_granularity); t->zoned = max(t->zoned, b->zoned); return ret; } @@ -846,6 +871,8 @@ EXPORT_SYMBOL_GPL(blk_queue_can_use_dma_map_merging); */ void blk_queue_set_zoned(struct gendisk *disk, enum blk_zoned_model model) { + struct request_queue *q = disk->queue; + switch (model) { case BLK_ZONED_HM: /* @@ -874,7 +901,15 @@ void blk_queue_set_zoned(struct gendisk *disk, enum blk_zoned_model model) break; }
- disk->queue->limits.zoned = model; + q->limits.zoned = model; + if (model != BLK_ZONED_NONE) { + /* + * Set the zone write granularity to the device logical block + * size by default. The driver can change this value if needed. + */ + blk_queue_zone_write_granularity(q, + queue_logical_block_size(q)); + } } EXPORT_SYMBOL_GPL(blk_queue_set_zoned);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 9174137a913c4..ddf23bf3e0f1d 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -219,6 +219,12 @@ static ssize_t queue_write_zeroes_max_show(struct request_queue *q, char *page) (unsigned long long)q->limits.max_write_zeroes_sectors << 9); }
+static ssize_t queue_zone_write_granularity_show(struct request_queue *q, + char *page) +{ + return queue_var_show(queue_zone_write_granularity(q), page); +} + static ssize_t queue_zone_append_max_show(struct request_queue *q, char *page) { unsigned long long max_sectors = q->limits.max_zone_append_sectors; @@ -585,6 +591,7 @@ QUEUE_RO_ENTRY(queue_discard_zeroes_data, "discard_zeroes_data"); QUEUE_RO_ENTRY(queue_write_same_max, "write_same_max_bytes"); QUEUE_RO_ENTRY(queue_write_zeroes_max, "write_zeroes_max_bytes"); QUEUE_RO_ENTRY(queue_zone_append_max, "zone_append_max_bytes"); +QUEUE_RO_ENTRY(queue_zone_write_granularity, "zone_write_granularity");
QUEUE_RO_ENTRY(queue_zoned, "zoned"); QUEUE_RO_ENTRY(queue_nr_zones, "nr_zones"); @@ -639,6 +646,7 @@ static struct attribute *queue_attrs[] = { &queue_write_same_max_entry.attr, &queue_write_zeroes_max_entry.attr, &queue_zone_append_max_entry.attr, + &queue_zone_write_granularity_entry.attr, &queue_nonrot_entry.attr, &queue_zoned_entry.attr, &queue_nr_zones_entry.attr, diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c index 01088f333dbc4..f9cd41703d99b 100644 --- a/drivers/scsi/sd_zbc.c +++ b/drivers/scsi/sd_zbc.c @@ -793,6 +793,14 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf) blk_queue_max_active_zones(q, 0); nr_zones = round_up(sdkp->capacity, zone_blocks) >> ilog2(zone_blocks);
+ /* + * Per ZBC and ZAC specifications, writes in sequential write required + * zones of host-managed devices must be aligned to the device physical + * block size. + */ + if (blk_queue_zoned_model(q) == BLK_ZONED_HM) + blk_queue_zone_write_granularity(q, sdkp->physical_block_size); + /* READ16/WRITE16 is mandatory for ZBC disks */ sdkp->device->use_16_for_rw = 1; sdkp->device->use_10_for_rw = 0; diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 583824f111079..a47e1aebaff24 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -345,6 +345,7 @@ struct queue_limits { unsigned int max_zone_append_sectors; unsigned int discard_granularity; unsigned int discard_alignment; + unsigned int zone_write_granularity;
unsigned short max_segments; unsigned short max_integrity_segments; @@ -1169,6 +1170,8 @@ extern void blk_queue_logical_block_size(struct request_queue *, unsigned int); extern void blk_queue_max_zone_append_sectors(struct request_queue *q, unsigned int max_zone_append_sectors); extern void blk_queue_physical_block_size(struct request_queue *, unsigned int); +void blk_queue_zone_write_granularity(struct request_queue *q, + unsigned int size); extern void blk_queue_alignment_offset(struct request_queue *q, unsigned int alignment); void blk_queue_update_readahead(struct request_queue *q); @@ -1480,6 +1483,18 @@ static inline int bdev_io_opt(struct block_device *bdev) return queue_io_opt(bdev_get_queue(bdev)); }
+static inline unsigned int +queue_zone_write_granularity(const struct request_queue *q) +{ + return q->limits.zone_write_granularity; +} + +static inline unsigned int +bdev_zone_write_granularity(struct block_device *bdev) +{ + return queue_zone_write_granularity(bdev_get_queue(bdev)); +} + static inline int queue_alignment_offset(const struct request_queue *q) { if (q->limits.misaligned)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal dlemoal@kernel.org
[ Upstream commit c8f6f88d25929ad2f290b428efcae3b526f3eab0 ]
Device mapper may create a non-zoned mapped device out of a zoned device (e.g., the dm-zoned target). In such case, some queue limit such as the max_zone_append_sectors and zone_write_granularity endup being non zero values for a block device that is not zoned. Avoid this by clearing these limits in blk_stack_limits() when the stacked zoned limit is false.
Fixes: 3093a479727b ("block: inherit the zoned characteristics in blk_stack_limits") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal dlemoal@kernel.org Link: https://lore.kernel.org/r/20240222131724.1803520-1-dlemoal@kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-settings.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/block/blk-settings.c b/block/blk-settings.c index ab39169aa2b28..ebd373469c807 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -656,6 +656,10 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b, t->zone_write_granularity = max(t->zone_write_granularity, b->zone_write_granularity); t->zoned = max(t->zoned, b->zoned); + if (!t->zoned) { + t->zone_write_granularity = 0; + t->max_zone_append_sectors = 0; + } return ret; } EXPORT_SYMBOL(blk_stack_limits);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Wilcox (Oracle) willy@infradead.org
[ Upstream commit f2d5dcb48f7ba9e3ff249d58fc1fa963d374e66a ]
ilog2() rounds down, so for example when PowerPC 85xx sets CONFIG_NR_CPUS to 24, we will only allocate 4 bits to store the number of CPUs instead of 5. Use bits_per() instead, which rounds up. Found by code inspection. The effect of this would probably be a misaccounting when doing NUMA balancing, so to a user, it would only be a performance penalty. The effects may be more wide-spread; it's hard to tell.
Link: https://lkml.kernel.org/r/20231010145549.1244748-1-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Fixes: 90572890d202 ("mm: numa: Change page last {nid,pid} into {cpu,pid}") Reviewed-by: Rik van Riel riel@surriel.com Acked-by: Mel Gorman mgorman@techsingularity.net Cc: Peter Zijlstra peterz@infradead.org Cc: Ingo Molnar mingo@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bounds.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bounds.c b/kernel/bounds.c index 9795d75b09b23..a94e3769347ee 100644 --- a/kernel/bounds.c +++ b/kernel/bounds.c @@ -19,7 +19,7 @@ int main(void) DEFINE(NR_PAGEFLAGS, __NR_PAGEFLAGS); DEFINE(MAX_NR_ZONES, __MAX_NR_ZONES); #ifdef CONFIG_SMP - DEFINE(NR_CPUS_BITS, ilog2(CONFIG_NR_CPUS)); + DEFINE(NR_CPUS_BITS, bits_per(CONFIG_NR_CPUS)); #endif DEFINE(SPINLOCK_SIZE, sizeof(spinlock_t)); /* End of constants */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
[ Upstream commit fde2497d2bc3a063d8af88b258dbadc86bd7b57c ]
When fat_encode_fh_nostale() encodes file handle without a parent it stores only first 10 bytes of the file handle. However the length of the file handle must be a multiple of 4 so the file handle is actually 12 bytes long and the last two bytes remain uninitialized. This is not great at we potentially leak uninitialized information with the handle to userspace. Properly initialize the full handle length.
Link: https://lkml.kernel.org/r/20240205122626.13701-1-jack@suse.cz Reported-by: syzbot+3ce5dea5b1539ff36769@syzkaller.appspotmail.com Fixes: ea3983ace6b7 ("fat: restructure export_operations") Signed-off-by: Jan Kara jack@suse.cz Acked-by: OGAWA Hirofumi hirofumi@mail.parknet.co.jp Cc: Amir Goldstein amir73il@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fat/nfs.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/fs/fat/nfs.c b/fs/fat/nfs.c index af191371c3529..bab63eeaf9cbc 100644 --- a/fs/fat/nfs.c +++ b/fs/fat/nfs.c @@ -130,6 +130,12 @@ fat_encode_fh_nostale(struct inode *inode, __u32 *fh, int *lenp, fid->parent_i_gen = parent->i_generation; type = FILEID_FAT_WITH_PARENT; *lenp = FAT_FID_SIZE_WITH_PARENT; + } else { + /* + * We need to initialize this field because the fh is actually + * 12 bytes long + */ + fid->parent_i_pos_hi = 0; }
return type;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Wilcox (Oracle) willy@infradead.org
[ Upstream commit 723012cab779eee8228376754e22c6594229bf8f ]
Page cache reads are lockless, so setting the freshly allocated page uptodate before we've overwritten it with the data it's supposed to have in it will allow a simultaneous reader to see old data. Move the call to SetPageUptodate into ubifs_write_end(), which is after we copied the new data into the page.
Fixes: 1e51764a3c2a ("UBIFS: add new flash file system") Cc: stable@vger.kernel.org Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Reviewed-by: Zhihao Cheng chengzhihao1@huawei.com Signed-off-by: Richard Weinberger richard@nod.at Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ubifs/file.c | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c index 19fdcda045890..18df7a82517fa 100644 --- a/fs/ubifs/file.c +++ b/fs/ubifs/file.c @@ -262,9 +262,6 @@ static int write_begin_slow(struct address_space *mapping, return err; } } - - SetPageUptodate(page); - ClearPageError(page); }
if (PagePrivate(page)) @@ -463,9 +460,6 @@ static int ubifs_write_begin(struct file *file, struct address_space *mapping, return err; } } - - SetPageUptodate(page); - ClearPageError(page); }
err = allocate_budget(c, page, ui, appending); @@ -475,10 +469,8 @@ static int ubifs_write_begin(struct file *file, struct address_space *mapping, * If we skipped reading the page because we were going to * write all of it, then it is not up to date. */ - if (skipped_read) { + if (skipped_read) ClearPageChecked(page); - ClearPageUptodate(page); - } /* * Budgeting failed which means it would have to force * write-back but didn't, because we set the @fast flag in the @@ -569,6 +561,9 @@ static int ubifs_write_end(struct file *file, struct address_space *mapping, goto out; }
+ if (len == PAGE_SIZE) + SetPageUptodate(page); + if (!PagePrivate(page)) { attach_page_private(page, (void *)1); atomic_long_inc(&c->dirty_pg_cnt);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Richard Weinberger richard@nod.at
[ Upstream commit 68a24aba7c593eafa8fd00f2f76407b9b32b47a9 ]
If the LEB size is smaller than a volume table record we cannot have volumes. In this case abort attaching.
Cc: Chenyuan Yang cy54@illinois.edu Cc: stable@vger.kernel.org Fixes: 801c135ce73d ("UBI: Unsorted Block Images") Reported-by: Chenyuan Yang cy54@illinois.edu Closes: https://lore.kernel.org/linux-mtd/1433EB7A-FC89-47D6-8F47-23BE41B263B3@illin... Signed-off-by: Richard Weinberger richard@nod.at Reviewed-by: Zhihao Cheng chengzhihao1@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/ubi/vtbl.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/mtd/ubi/vtbl.c b/drivers/mtd/ubi/vtbl.c index f700f0e4f2ec4..6e5489e233dd2 100644 --- a/drivers/mtd/ubi/vtbl.c +++ b/drivers/mtd/ubi/vtbl.c @@ -791,6 +791,12 @@ int ubi_read_volume_table(struct ubi_device *ubi, struct ubi_attach_info *ai) * The number of supported volumes is limited by the eraseblock size * and by the UBI_MAX_VOLUMES constant. */ + + if (ubi->leb_size < UBI_VTBL_RECORD_SIZE) { + ubi_err(ubi, "LEB size too small for a volume record"); + return -EINVAL; + } + ubi->vtbl_slots = ubi->leb_size / UBI_VTBL_RECORD_SIZE; if (ubi->vtbl_slots > UBI_MAX_VOLUMES) ubi->vtbl_slots = UBI_MAX_VOLUMES;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit 7f174ae4f39e8475adcc09d26c5a43394689ad6c ]
Now that the calculation of fastmap size in ubi_calc_fm_size() is incorrect since it miss each user volume's ubi_fm_eba structure and the Internal UBI volume info. Let's correct the calculation.
Cc: stable@vger.kernel.org Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Zhihao Cheng chengzhihao1@huawei.com Signed-off-by: Richard Weinberger richard@nod.at Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/ubi/fastmap.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/mtd/ubi/fastmap.c b/drivers/mtd/ubi/fastmap.c index 6e95c4b1473e6..8081fc760d34f 100644 --- a/drivers/mtd/ubi/fastmap.c +++ b/drivers/mtd/ubi/fastmap.c @@ -86,9 +86,10 @@ size_t ubi_calc_fm_size(struct ubi_device *ubi) sizeof(struct ubi_fm_scan_pool) + sizeof(struct ubi_fm_scan_pool) + (ubi->peb_count * sizeof(struct ubi_fm_ec)) + - (sizeof(struct ubi_fm_eba) + - (ubi->peb_count * sizeof(__be32))) + - sizeof(struct ubi_fm_volhdr) * UBI_MAX_VOLUMES; + ((sizeof(struct ubi_fm_eba) + + sizeof(struct ubi_fm_volhdr)) * + (UBI_MAX_VOLUMES + UBI_INT_VOL_COUNT)) + + (ubi->peb_count * sizeof(__be32)); return roundup(size, ubi->leb_size); }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arseniy Krasnov avkrasnov@salutedevices.com
[ Upstream commit ef6f463599e16924cdd02ce5056ab52879dc008c ]
Scrambling mode is enabled by value (1 << 19). NFC_CMD_SCRAMBLER_ENABLE is already (1 << 19), so there is no need to shift it again in CMDRWGEN macro.
Signed-off-by: Arseniy Krasnov avkrasnov@salutedevices.com Cc: Stable@vger.kernel.org Fixes: 8fae856c5350 ("mtd: rawnand: meson: add support for Amlogic NAND flash controller") Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20240210214551.441610-1-avkrasnov@salutede... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/nand/raw/meson_nand.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c index 6bb0fca4a91d0..9a64190f30e9b 100644 --- a/drivers/mtd/nand/raw/meson_nand.c +++ b/drivers/mtd/nand/raw/meson_nand.c @@ -59,7 +59,7 @@ #define CMDRWGEN(cmd_dir, ran, bch, short_mode, page_size, pages) \ ( \ (cmd_dir) | \ - ((ran) << 19) | \ + (ran) | \ ((bch) << 14) | \ ((short_mode) << 13) | \ (((page_size) & 0x7f) << 6) | \
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: John David Anglin dave.anglin@bell.net
[ Upstream commit 4603fbaa76b5e703b38ac8cc718102834eb6e330 ]
Use add,l to avoid clobbering the C/B bits in the PSW.
Signed-off-by: John David Anglin dave.anglin@bell.net Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Sasha Levin sashal@kernel.org --- arch/parisc/include/asm/assembly.h | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/arch/parisc/include/asm/assembly.h b/arch/parisc/include/asm/assembly.h index a39250cb7dfcf..d3f23ed570c66 100644 --- a/arch/parisc/include/asm/assembly.h +++ b/arch/parisc/include/asm/assembly.h @@ -83,26 +83,28 @@ * version takes two arguments: a src and destination register. * However, the source and destination registers can not be * the same register. + * + * We use add,l to avoid clobbering the C/B bits in the PSW. */
.macro tophys grvirt, grphys - ldil L%(__PAGE_OFFSET), \grphys - sub \grvirt, \grphys, \grphys + ldil L%(-__PAGE_OFFSET), \grphys + addl \grvirt, \grphys, \grphys .endm - + .macro tovirt grphys, grvirt ldil L%(__PAGE_OFFSET), \grvirt - add \grphys, \grvirt, \grvirt + addl \grphys, \grvirt, \grvirt .endm
.macro tophys_r1 gr - ldil L%(__PAGE_OFFSET), %r1 - sub \gr, %r1, \gr + ldil L%(-__PAGE_OFFSET), %r1 + addl \gr, %r1, \gr .endm - + .macro tovirt_r1 gr ldil L%(__PAGE_OFFSET), %r1 - add \gr, %r1, \gr + addl \gr, %r1, \gr .endm
.macro delay value
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit a2abae8f0b638c31bb9799d9dd847306e0d005bd ]
IP checksum unit tests report the following error when run on hppa/hppa64.
# test_ip_fast_csum: ASSERTION FAILED at lib/checksum_kunit.c:463 Expected ( u64)csum_result == ( u64)expected, but ( u64)csum_result == 33754 (0x83da) ( u64)expected == 10946 (0x2ac2) not ok 4 test_ip_fast_csum
0x83da is the expected result if the IP header length is 20 bytes. 0x2ac2 is the expected result if the IP header length is 24 bytes. The test fails with an IP header length of 24 bytes. It appears that ip_fast_csum() always returns the checksum for a 20-byte header, no matter how long the header actually is.
Code analysis shows a suspicious assembler sequence in ip_fast_csum().
" addc %0, %3, %0\n" "1: ldws,ma 4(%1), %3\n" " addib,< 0, %2, 1b\n" <---
While my understanding of HPPA assembler is limited, it does not seem to make much sense to subtract 0 from a register and to expect the result to ever be negative. Subtracting 1 from the length parameter makes more sense. On top of that, the operation should be repeated if and only if the result is still > 0, so change the suspicious instruction to " addib,> -1, %2, 1b\n"
The IP checksum unit test passes after this change.
Cc: Palmer Dabbelt palmer@rivosinc.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck linux@roeck-us.net Tested-by: Charlie Jenkins charlie@rivosinc.com Reviewed-by: Charlie Jenkins charlie@rivosinc.com Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/parisc/include/asm/checksum.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/parisc/include/asm/checksum.h b/arch/parisc/include/asm/checksum.h index 3c43baca7b397..f705e5dd10742 100644 --- a/arch/parisc/include/asm/checksum.h +++ b/arch/parisc/include/asm/checksum.h @@ -40,7 +40,7 @@ static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl) " addc %0, %5, %0\n" " addc %0, %3, %0\n" "1: ldws,ma 4(%1), %3\n" -" addib,< 0, %2, 1b\n" +" addib,> -1, %2, 1b\n" " addc %0, %3, %0\n" "\n" " extru %0, 31, 16, %4\n"
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit 4408ba75e4ba80c91fde7e10bccccf388f5c09be ]
Calculating the IPv6 checksum on 32-bit systems missed overflows when adding the proto+len fields into the checksum. This results in the following unit test failure.
# test_csum_ipv6_magic: ASSERTION FAILED at lib/checksum_kunit.c:506 Expected ( u64)csum_result == ( u64)expected, but ( u64)csum_result == 46722 (0xb682) ( u64)expected == 46721 (0xb681) not ok 5 test_csum_ipv6_magic
This is probably rarely seen in the real world because proto+len are usually small values which will rarely result in overflows when calculating the checksum. However, the unit test code uses large values for the length field, causing the test to fail.
Fix the problem by adding the missing carry into the final checksum.
Cc: Palmer Dabbelt palmer@rivosinc.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck linux@roeck-us.net Tested-by: Charlie Jenkins charlie@rivosinc.com Reviewed-by: Charlie Jenkins charlie@rivosinc.com Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/parisc/include/asm/checksum.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/parisc/include/asm/checksum.h b/arch/parisc/include/asm/checksum.h index f705e5dd10742..e619e67440db9 100644 --- a/arch/parisc/include/asm/checksum.h +++ b/arch/parisc/include/asm/checksum.h @@ -163,7 +163,8 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr, " ldw,ma 4(%2), %7\n" /* 4th daddr */ " addc %6, %0, %0\n" " addc %7, %0, %0\n" -" addc %3, %0, %0\n" /* fold in proto+len, catch carry */ +" addc %3, %0, %0\n" /* fold in proto+len */ +" addc 0, %0, %0\n" /* add carry */
#endif : "=r" (sum), "=r" (saddr), "=r" (daddr), "=r" (len),
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit 4b75b12d70506e31fc02356bbca60f8d5ca012d0 ]
hppa 64-bit systems calculates the IPv6 checksum using 64-bit add operations. The last add folds protocol and length fields into the 64-bit result. While unlikely, this operation can overflow. The overflow can be triggered with a code sequence such as the following.
/* try to trigger massive overflows */ memset(tmp_buf, 0xff, sizeof(struct in6_addr)); csum_result = csum_ipv6_magic((struct in6_addr *)tmp_buf, (struct in6_addr *)tmp_buf, 0xffff, 0xff, 0xffffffff);
Fix the problem by adding any overflows from the final add operation into the calculated checksum. Fortunately, we can do this without additional cost by replacing the add operation used to fold the checksum into 32 bit with "add,dc" to add in the missing carry.
Cc: Palmer Dabbelt palmer@rivosinc.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Charlie Jenkins charlie@rivosinc.com Tested-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/parisc/include/asm/checksum.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/parisc/include/asm/checksum.h b/arch/parisc/include/asm/checksum.h index e619e67440db9..c949aa20fa162 100644 --- a/arch/parisc/include/asm/checksum.h +++ b/arch/parisc/include/asm/checksum.h @@ -137,8 +137,8 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr, " add,dc %3, %0, %0\n" /* fold in proto+len | carry bit */ " extrd,u %0, 31, 32, %4\n"/* copy upper half down */ " depdi 0, 31, 32, %0\n"/* clear upper half */ -" add %4, %0, %0\n" /* fold into 32-bits */ -" addc 0, %0, %0\n" /* add carry */ +" add,dc %4, %0, %0\n" /* fold into 32-bits, plus carry */ +" addc 0, %0, %0\n" /* add final carry */
#else
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit 0568b6f0d863643db2edcc7be31165740c89fa82 ]
IPv6 checksum tests with unaligned addresses on 64-bit builds result in unexpected failures.
Expected expected == csum_result, but expected == 46591 (0xb5ff) csum_result == 46381 (0xb52d) with alignment offset 1
Oddly enough, the problem disappeared after adding test code into the beginning of csum_ipv6_magic().
As it turns out, the 'sum' parameter of csum_ipv6_magic() is declared as __wsum, which is a 32-bit variable. However, it is treated as 64-bit variable in the 64-bit assembler code. Tests showed that the upper 32 bit of the register used to pass the variable are _not_ cleared when entering the function. This can result in checksum calculation errors.
Clearing the upper 32 bit of 'sum' as first operation in the assembler code fixes the problem.
Acked-by: Helge Deller deller@gmx.de Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/parisc/include/asm/checksum.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/parisc/include/asm/checksum.h b/arch/parisc/include/asm/checksum.h index c949aa20fa162..2aceebcd695c8 100644 --- a/arch/parisc/include/asm/checksum.h +++ b/arch/parisc/include/asm/checksum.h @@ -126,6 +126,7 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr, ** Try to keep 4 registers with "live" values ahead of the ALU. */
+" depdi 0, 31, 32, %0\n"/* clear upper half of incoming checksum */ " ldd,ma 8(%1), %4\n" /* get 1st saddr word */ " ldd,ma 8(%2), %5\n" /* get 1st daddr word */ " add %4, %0, %0\n"
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maulik Shah quic_mkshah@quicinc.com
[ Upstream commit 9bc4ffd32ef8943f5c5a42c9637cfd04771d021b ]
psci_init_system_suspend() invokes suspend_set_ops() very early during bootup even before kernel command line for mem_sleep_default is setup. This leads to kernel command line mem_sleep_default=s2idle not working as mem_sleep_current gets changed to deep via suspend_set_ops() and never changes back to s2idle.
Set mem_sleep_current along with mem_sleep_default during kernel command line setup as default suspend mode.
Fixes: faf7ec4a92c0 ("drivers: firmware: psci: add system suspend support") CC: stable@vger.kernel.org # 5.4+ Signed-off-by: Maulik Shah quic_mkshah@quicinc.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/power/suspend.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c index 4aa4d5d3947f1..14e981c0588ad 100644 --- a/kernel/power/suspend.c +++ b/kernel/power/suspend.c @@ -187,6 +187,7 @@ static int __init mem_sleep_default_setup(char *str) if (mem_sleep_labels[state] && !strcmp(str, mem_sleep_labels[state])) { mem_sleep_default = state; + mem_sleep_current = state; break; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gabor Juhos j4g8y7@gmail.com
[ Upstream commit cdbc6e2d8108bc47895e5a901cfcaf799b00ca8d ]
The frequency table arrays are supposed to be terminated with an empty element. Add such entry to the end of the arrays where it is missing in order to avoid possible out-of-bound access when the table is traversed by functions like qcom_find_freq() or qcom_find_freq_floor().
Only compile tested.
Fixes: d9db07f088af ("clk: qcom: Add ipq6018 Global Clock Controller support") Signed-off-by: Gabor Juhos j4g8y7@gmail.com Reviewed-by: Stephen Boyd sboyd@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240229-freq-table-terminator-v1-2-074334f0905c@g... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/gcc-ipq6018.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/clk/qcom/gcc-ipq6018.c b/drivers/clk/qcom/gcc-ipq6018.c index 4c5c7a8f41d08..b9844e41cf99d 100644 --- a/drivers/clk/qcom/gcc-ipq6018.c +++ b/drivers/clk/qcom/gcc-ipq6018.c @@ -1557,6 +1557,7 @@ static struct clk_regmap_div nss_ubi0_div_clk_src = {
static const struct freq_tbl ftbl_pcie_aux_clk_src[] = { F(24000000, P_XO, 1, 0, 0), + { } };
static const struct clk_parent_data gcc_xo_gpll0_core_pi_sleep_clk[] = { @@ -1737,6 +1738,7 @@ static const struct freq_tbl ftbl_sdcc_ice_core_clk_src[] = { F(160000000, P_GPLL0, 5, 0, 0), F(216000000, P_GPLL6, 5, 0, 0), F(308570000, P_GPLL6, 3.5, 0, 0), + { } };
static const struct clk_parent_data gcc_xo_gpll0_gpll6_gpll0_div2[] = {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gabor Juhos j4g8y7@gmail.com
[ Upstream commit 1040ef5ed95d6fd2628bad387d78a61633e09429 ]
The frequency table arrays are supposed to be terminated with an empty element. Add such entry to the end of the arrays where it is missing in order to avoid possible out-of-bound access when the table is traversed by functions like qcom_find_freq() or qcom_find_freq_floor().
Only compile tested.
Fixes: 9607f6224b39 ("clk: qcom: ipq8074: add PCIE, USB and SDCC clocks") Signed-off-by: Gabor Juhos j4g8y7@gmail.com Reviewed-by: Stephen Boyd sboyd@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240229-freq-table-terminator-v1-3-074334f0905c@g... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/gcc-ipq8074.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/clk/qcom/gcc-ipq8074.c b/drivers/clk/qcom/gcc-ipq8074.c index 0393154fea2f9..649e75a41f7af 100644 --- a/drivers/clk/qcom/gcc-ipq8074.c +++ b/drivers/clk/qcom/gcc-ipq8074.c @@ -972,6 +972,7 @@ static struct clk_rcg2 pcie0_axi_clk_src = {
static const struct freq_tbl ftbl_pcie_aux_clk_src[] = { F(19200000, P_XO, 1, 0, 0), + { } };
static struct clk_rcg2 pcie0_aux_clk_src = { @@ -1077,6 +1078,7 @@ static const struct freq_tbl ftbl_sdcc_ice_core_clk_src[] = { F(19200000, P_XO, 1, 0, 0), F(160000000, P_GPLL0, 5, 0, 0), F(308570000, P_GPLL6, 3.5, 0, 0), + { } };
static struct clk_rcg2 sdcc1_ice_core_clk_src = {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gabor Juhos j4g8y7@gmail.com
[ Upstream commit a903cfd38d8dee7e754fb89fd1bebed99e28003d ]
The frequency table arrays are supposed to be terminated with an empty element. Add such entry to the end of the arrays where it is missing in order to avoid possible out-of-bound access when the table is traversed by functions like qcom_find_freq() or qcom_find_freq_floor().
Only compile tested.
Fixes: 2b46cd23a5a2 ("clk: qcom: Add APQ8084 Multimedia Clock Controller (MMCC) support") Signed-off-by: Gabor Juhos j4g8y7@gmail.com Reviewed-by: Stephen Boyd sboyd@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240229-freq-table-terminator-v1-6-074334f0905c@g... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/mmcc-apq8084.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/clk/qcom/mmcc-apq8084.c b/drivers/clk/qcom/mmcc-apq8084.c index fbfcf00067394..c2fd0e8f4bc09 100644 --- a/drivers/clk/qcom/mmcc-apq8084.c +++ b/drivers/clk/qcom/mmcc-apq8084.c @@ -333,6 +333,7 @@ static struct freq_tbl ftbl_mmss_axi_clk[] = { F(333430000, P_MMPLL1, 3.5, 0, 0), F(400000000, P_MMPLL0, 2, 0, 0), F(466800000, P_MMPLL1, 2.5, 0, 0), + { } };
static struct clk_rcg2 mmss_axi_clk_src = { @@ -357,6 +358,7 @@ static struct freq_tbl ftbl_ocmemnoc_clk[] = { F(150000000, P_GPLL0, 4, 0, 0), F(228570000, P_MMPLL0, 3.5, 0, 0), F(320000000, P_MMPLL0, 2.5, 0, 0), + { } };
static struct clk_rcg2 ocmemnoc_clk_src = {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gabor Juhos j4g8y7@gmail.com
[ Upstream commit e2c02a85bf53ae86d79b5fccf0a75ac0b78e0c96 ]
The frequency table arrays are supposed to be terminated with an empty element. Add such entry to the end of the arrays where it is missing in order to avoid possible out-of-bound access when the table is traversed by functions like qcom_find_freq() or qcom_find_freq_floor().
Only compile tested.
Fixes: d8b212014e69 ("clk: qcom: Add support for MSM8974's multimedia clock controller (MMCC)") Signed-off-by: Gabor Juhos j4g8y7@gmail.com Reviewed-by: Stephen Boyd sboyd@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240229-freq-table-terminator-v1-7-074334f0905c@g... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/mmcc-msm8974.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/clk/qcom/mmcc-msm8974.c b/drivers/clk/qcom/mmcc-msm8974.c index 015426262d080..dfc377463a7af 100644 --- a/drivers/clk/qcom/mmcc-msm8974.c +++ b/drivers/clk/qcom/mmcc-msm8974.c @@ -283,6 +283,7 @@ static struct freq_tbl ftbl_mmss_axi_clk[] = { F(291750000, P_MMPLL1, 4, 0, 0), F(400000000, P_MMPLL0, 2, 0, 0), F(466800000, P_MMPLL1, 2.5, 0, 0), + { } };
static struct clk_rcg2 mmss_axi_clk_src = { @@ -307,6 +308,7 @@ static struct freq_tbl ftbl_ocmemnoc_clk[] = { F(150000000, P_GPLL0, 4, 0, 0), F(291750000, P_MMPLL1, 4, 0, 0), F(400000000, P_MMPLL0, 2, 0, 0), + { } };
static struct clk_rcg2 ocmemnoc_clk_src = {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Ellerman mpe@ellerman.id.au
[ Upstream commit 5f491356b7149564ab22323ccce79c8d595bfd0c ]
Binutils 2.38 complains about the use of mfpmr when building ppc6xx_defconfig:
CC arch/powerpc/kernel/pmc.o {standard input}: Assembler messages: {standard input}:45: Error: unrecognized opcode: `mfpmr' {standard input}:56: Error: unrecognized opcode: `mtpmr'
This is because by default the kernel is built with -mcpu=powerpc, and the mt/mfpmr instructions are not defined.
It can be avoided by enabling CONFIG_E300C3_CPU, but just adding that to the defconfig will leave open the possibility of randconfig failures.
So add machine directives around the mt/mfpmr instructions to tell binutils how to assemble them.
Cc: stable@vger.kernel.org Reported-by: Jan-Benedict Glaw jbglaw@lug-owl.de Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/20240229122521.762431-3-mpe@ellerman.id.au Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/reg_fsl_emb.h | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/reg_fsl_emb.h b/arch/powerpc/include/asm/reg_fsl_emb.h index a21f529c43d96..8359c06d92d9f 100644 --- a/arch/powerpc/include/asm/reg_fsl_emb.h +++ b/arch/powerpc/include/asm/reg_fsl_emb.h @@ -12,9 +12,16 @@ #ifndef __ASSEMBLY__ /* Performance Monitor Registers */ #define mfpmr(rn) ({unsigned int rval; \ - asm volatile("mfpmr %0," __stringify(rn) \ + asm volatile(".machine push; " \ + ".machine e300; " \ + "mfpmr %0," __stringify(rn) ";" \ + ".machine pop; " \ : "=r" (rval)); rval;}) -#define mtpmr(rn, v) asm volatile("mtpmr " __stringify(rn) ",%0" : : "r" (v)) +#define mtpmr(rn, v) asm volatile(".machine push; " \ + ".machine e300; " \ + "mtpmr " __stringify(rn) ",%0; " \ + ".machine pop; " \ + : : "r" (v)) #endif /* __ASSEMBLY__ */
/* Freescale Book E Performance Monitor APU Registers */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Vogelbacher daniel@chaospixel.com
[ Upstream commit 3fb7bc4f3a98c48981318b87cf553c5f115fd5ca ]
The GMC IR-USB adapter cable utilizes a FTDI FT232R chip.
Add VID/PID for this adapter so it can be used as serial device via ftdi_sio.
Signed-off-by: Daniel Vogelbacher daniel@chaospixel.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/serial/ftdi_sio.c | 2 ++ drivers/usb/serial/ftdi_sio_ids.h | 6 ++++++ 2 files changed, 8 insertions(+)
diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c index 4d7f4a4ab69fb..66aa999efa6d5 100644 --- a/drivers/usb/serial/ftdi_sio.c +++ b/drivers/usb/serial/ftdi_sio.c @@ -1055,6 +1055,8 @@ static const struct usb_device_id id_table_combined[] = { .driver_info = (kernel_ulong_t)&ftdi_jtag_quirk }, { USB_DEVICE(FTDI_VID, FTDI_FALCONIA_JTAG_UNBUF_PID), .driver_info = (kernel_ulong_t)&ftdi_jtag_quirk }, + /* GMC devices */ + { USB_DEVICE(GMC_VID, GMC_Z216C_PID) }, { } /* Terminating entry */ };
diff --git a/drivers/usb/serial/ftdi_sio_ids.h b/drivers/usb/serial/ftdi_sio_ids.h index 9a0f9fc991246..b2aec1106678a 100644 --- a/drivers/usb/serial/ftdi_sio_ids.h +++ b/drivers/usb/serial/ftdi_sio_ids.h @@ -1599,3 +1599,9 @@ #define UBLOX_VID 0x1546 #define UBLOX_C099F9P_ZED_PID 0x0502 #define UBLOX_C099F9P_ODIN_PID 0x0503 + +/* + * GMC devices + */ +#define GMC_VID 0x1cd7 +#define GMC_Z216C_PID 0x0217 /* GMC Z216C Adapter IR-USB */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cameron Williams cang1@live.co.uk
[ Upstream commit cda704809797a8a86284f9df3eef5e62ec8a3175 ]
Add device ID for a (probably fake) CP2102 UART device.
lsusb -v output:
Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 1.10 bDeviceClass 0 [unknown] bDeviceSubClass 0 [unknown] bDeviceProtocol 0 bMaxPacketSize0 64 idVendor 0x11ca VeriFone Inc idProduct 0x0212 Verifone USB to Printer bcdDevice 1.00 iManufacturer 1 Silicon Labs iProduct 2 Verifone USB to Printer iSerial 3 0001 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 0x0020 bNumInterfaces 1 bConfigurationValue 1 iConfiguration 0 bmAttributes 0x80 (Bus Powered) MaxPower 100mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 0 [unknown] bInterfaceProtocol 0 iInterface 2 Verifone USB to Printer Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x01 EP 1 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 Device Status: 0x0000 (Bus Powered)
Signed-off-by: Cameron Williams cang1@live.co.uk Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/serial/cp210x.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c index d161b64416a48..f4c982a323df2 100644 --- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -181,6 +181,7 @@ static const struct usb_device_id id_table[] = { { USB_DEVICE(0x10C4, 0xF004) }, /* Elan Digital Systems USBcount50 */ { USB_DEVICE(0x10C5, 0xEA61) }, /* Silicon Labs MobiData GPRS USB Modem */ { USB_DEVICE(0x10CE, 0xEA6A) }, /* Silicon Labs MobiData GPRS USB Modem 100EU */ + { USB_DEVICE(0x11CA, 0x0212) }, /* Verifone USB to Printer (UART, CP2102) */ { USB_DEVICE(0x12B8, 0xEC60) }, /* Link G4 ECU */ { USB_DEVICE(0x12B8, 0xEC62) }, /* Link G4+ ECU */ { USB_DEVICE(0x13AD, 0x9999) }, /* Baltech card reader */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian Häggström christian.haggstrom@orexplore.com
[ Upstream commit a0d9d868491a362d421521499d98308c8e3a0398 ]
The radiation meter has the text MGP Instruments PDS-100G or PDS-100GN produced by Mirion Technologies. Tested by forcing the driver association with
echo 10c4 863c > /sys/bus/usb-serial/drivers/cp210x/new_id
and then setting the serial port in 115200 8N1 mode. The device announces ID_USB_VENDOR_ENC=Silicon\x20Labs and ID_USB_MODEL_ENC=PDS100
Signed-off-by: Christian Häggström christian.haggstrom@orexplore.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/serial/cp210x.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c index f4c982a323df2..12d0be4d15101 100644 --- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -148,6 +148,7 @@ static const struct usb_device_id id_table[] = { { USB_DEVICE(0x10C4, 0x85EA) }, /* AC-Services IBUS-IF */ { USB_DEVICE(0x10C4, 0x85EB) }, /* AC-Services CIS-IBUS */ { USB_DEVICE(0x10C4, 0x85F8) }, /* Virtenio Preon32 */ + { USB_DEVICE(0x10C4, 0x863C) }, /* MGP Instruments PDS100 */ { USB_DEVICE(0x10C4, 0x8664) }, /* AC-Services CAN-IF */ { USB_DEVICE(0x10C4, 0x8665) }, /* AC-Services OBD-IF */ { USB_DEVICE(0x10C4, 0x87ED) }, /* IMST USB-Stick for Smart Meter */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aurélien Jacobs aurel@gnuage.org
[ Upstream commit 46809c51565b83881aede6cdf3b0d25254966a41 ]
Update the USB serial option driver to support MeiG Smart SLM320.
ID 2dee:4d41 UNISOC UNISOC-8910
T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 9 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2dee ProdID=4d41 Rev=00.00 S: Manufacturer=UNISOC S: Product=UNISOC-8910 C: #Ifs= 8 Cfg#= 1 Atr=e0 MxPwr=400mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 7 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=08(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Tested successfully a PPP LTE connection using If#= 0. Not sure of the purpose of every other serial interfaces.
Signed-off-by: Aurélien Jacobs aurel@gnuage.org Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/serial/option.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c index 43e8cb17b4c7a..fb1eba835e508 100644 --- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -613,6 +613,11 @@ static void option_instat_callback(struct urb *urb); /* Luat Air72*U series based on UNISOC UIS8910 uses UNISOC's vendor ID */ #define LUAT_PRODUCT_AIR720U 0x4e00
+/* MeiG Smart Technology products */ +#define MEIGSMART_VENDOR_ID 0x2dee +/* MeiG Smart SLM320 based on UNISOC UIS8910 */ +#define MEIGSMART_PRODUCT_SLM320 0x4d41 + /* Device flags */
/* Highest interface number which can be used with NCTRL() and RSVD() */ @@ -2282,6 +2287,7 @@ static const struct usb_device_id option_ids[] = { { USB_DEVICE_AND_INTERFACE_INFO(SIERRA_VENDOR_ID, SIERRA_PRODUCT_EM9191, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, TOZED_PRODUCT_LT70C, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, LUAT_PRODUCT_AIR720U, 0xff, 0, 0) }, + { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SLM320, 0xff, 0, 0) }, { } /* Terminating entry */ }; MODULE_DEVICE_TABLE(usb, option_ids);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Toru Katagiri Toru.Katagiri@tdk.com
[ Upstream commit b1a8da9ff1395c4879b4bd41e55733d944f3d613 ]
TDK NC0110013M and MM0110113M have custom USB IDs for CP210x, so we need to add them to the driver.
Signed-off-by: Toru Katagiri Toru.Katagiri@tdk.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/serial/cp210x.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c index 12d0be4d15101..294f7f01656aa 100644 --- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -60,6 +60,8 @@ static const struct usb_device_id id_table[] = { { USB_DEVICE(0x0471, 0x066A) }, /* AKTAKOM ACE-1001 cable */ { USB_DEVICE(0x0489, 0xE000) }, /* Pirelli Broadband S.p.A, DP-L10 SIP/GSM Mobile */ { USB_DEVICE(0x0489, 0xE003) }, /* Pirelli Broadband S.p.A, DP-L10 SIP/GSM Mobile */ + { USB_DEVICE(0x04BF, 0x1301) }, /* TDK Corporation NC0110013M - Network Controller */ + { USB_DEVICE(0x04BF, 0x1303) }, /* TDK Corporation MM0110113M - i3 Micro Module */ { USB_DEVICE(0x0745, 0x1000) }, /* CipherLab USB CCD Barcode Scanner 1000 */ { USB_DEVICE(0x0846, 0x1100) }, /* NetGear Managed Switch M4100 series, M5300 series, M7100 series */ { USB_DEVICE(0x08e6, 0x5501) }, /* Gemalto Prox-PU/CU contactless smartcard reader */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qingliang Li qingliang.li@mediatek.com
[ Upstream commit e7a7681c859643f3f2476b2a28a494877fd89442 ]
When driver uses pm_runtime_force_suspend() as the system suspend callback function and registers the wake irq with reverse enable ordering, the wake irq will be re-enabled when entering system suspend, triggering an 'Unbalanced enable for IRQ xxx' warning. In this scenario, the call sequence during system suspend is as follows: suspend_devices_and_enter() -> dpm_suspend_start() -> dpm_run_callback() -> pm_runtime_force_suspend() -> dev_pm_enable_wake_irq_check() -> dev_pm_enable_wake_irq_complete()
-> suspend_enter() -> dpm_suspend_noirq() -> device_wakeup_arm_wake_irqs() -> dev_pm_arm_wake_irq()
To fix this issue, complete the setting of WAKE_IRQ_DEDICATED_ENABLED flag in dev_pm_enable_wake_irq_complete() to avoid redundant irq enablement.
Fixes: 8527beb12087 ("PM: sleep: wakeirq: fix wake irq arming") Reviewed-by: Dhruva Gole d-gole@ti.com Signed-off-by: Qingliang Li qingliang.li@mediatek.com Reviewed-by: Johan Hovold johan+linaro@kernel.org Cc: 5.16+ stable@vger.kernel.org # 5.16+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/power/wakeirq.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/base/power/wakeirq.c b/drivers/base/power/wakeirq.c index aea690c64e394..4f4310724fee5 100644 --- a/drivers/base/power/wakeirq.c +++ b/drivers/base/power/wakeirq.c @@ -365,8 +365,10 @@ void dev_pm_enable_wake_irq_complete(struct device *dev) return;
if (wirq->status & WAKE_IRQ_DEDICATED_MANAGED && - wirq->status & WAKE_IRQ_DEDICATED_REVERSE) + wirq->status & WAKE_IRQ_DEDICATED_REVERSE) { enable_irq(wirq->irq); + wirq->status |= WAKE_IRQ_DEDICATED_ENABLED; + } }
/**
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wolfram Sang wsa+renesas@sang-engineering.com
[ Upstream commit e8d1b41e69d72c62865bebe8f441163ec00b3d44 ]
With the to-be-fixed commit, the reset_work handler cleared 'host->mrq' outside of the spinlock protected critical section. That leaves a small race window during execution of 'tmio_mmc_reset()' where the done_work handler could grab a pointer to the now invalid 'host->mrq'. Both would use it to call mmc_request_done() causing problems (see link below).
However, 'host->mrq' cannot simply be cleared earlier inside the critical section. That would allow new mrqs to come in asynchronously while the actual reset of the controller still needs to be done. So, like 'tmio_mmc_set_ios()', an ERR_PTR is used to prevent new mrqs from coming in but still avoiding concurrency between work handlers.
Reported-by: Dirk Behme dirk.behme@de.bosch.com Closes: https://lore.kernel.org/all/20240220061356.3001761-1-dirk.behme@de.bosch.com... Fixes: df3ef2d3c92c ("mmc: protect the tmio_mmc driver against a theoretical race") Signed-off-by: Wolfram Sang wsa+renesas@sang-engineering.com Tested-by: Dirk Behme dirk.behme@de.bosch.com Reviewed-by: Dirk Behme dirk.behme@de.bosch.com Cc: stable@vger.kernel.org # 3.0+ Link: https://lore.kernel.org/r/20240305104423.3177-2-wsa+renesas@sang-engineering... Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/host/tmio_mmc_core.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c index abf36acb2641f..0482920a79681 100644 --- a/drivers/mmc/host/tmio_mmc_core.c +++ b/drivers/mmc/host/tmio_mmc_core.c @@ -216,6 +216,8 @@ static void tmio_mmc_reset_work(struct work_struct *work) else mrq->cmd->error = -ETIMEDOUT;
+ /* No new calls yet, but disallow concurrent tmio_mmc_done_work() */ + host->mrq = ERR_PTR(-EBUSY); host->cmd = NULL; host->data = NULL;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
[ Upstream commit 68ca1b49e430f6534d0774a94147a823e3b8b26e ]
The root inode has a fixed nodeid and generation (1, 0).
Prior to the commit 15db16837a35 ("fuse: fix illegal access to inode with reused nodeid") generation number on lookup was ignored. After this commit lookup with the wrong generation number resulted in the inode being unhashed. This is correct for non-root inodes, but replacing the root inode is wrong and results in weird behavior.
Fix by reverting to the old behavior if ignoring the generation for the root inode, but issuing a warning in dmesg.
Reported-by: Antonio SJ Musumeci trapexit@spawn.link Closes: https://lore.kernel.org/all/CAOQ4uxhek5ytdN8Yz2tNEOg5ea4NkBb4nk0FGPjPk_9nz-V... Fixes: 15db16837a35 ("fuse: fix illegal access to inode with reused nodeid") Cc: stable@vger.kernel.org # v5.14 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fuse/dir.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c index b0c701c007c68..d131f34cd3e13 100644 --- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -451,6 +451,10 @@ int fuse_lookup_name(struct super_block *sb, u64 nodeid, const struct qstr *name goto out_put_forget; if (fuse_invalid_attr(&outarg->attr)) goto out_put_forget; + if (outarg->nodeid == FUSE_ROOT_ID && outarg->generation != 0) { + pr_warn_once("root generation should be zero\n"); + outarg->generation = 0; + }
*inode = fuse_iget(sb, outarg->nodeid, outarg->generation, &outarg->attr, entry_attr_timeout(outarg),
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
[ Upstream commit b1fe686a765e6c0d71811d825b5a1585a202b777 ]
The root inode is assumed to be always hashed. Do not unhash the root inode even if it is marked BAD.
Fixes: 5d069dbe8aaf ("fuse: fix bad inode") Cc: stable@vger.kernel.org # v5.11 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fuse/fuse_i.h | 1 - fs/fuse/inode.c | 7 +++++-- 2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index ceaa6868386e6..33eb5fefc06b4 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -873,7 +873,6 @@ static inline bool fuse_stale_inode(const struct inode *inode, int generation,
static inline void fuse_make_bad(struct inode *inode) { - remove_inode_hash(inode); set_bit(FUSE_I_BAD, &get_fuse_inode(inode)->state); }
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 9ea175ff9c8e6..4a7ebccd359ee 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -352,8 +352,11 @@ struct inode *fuse_iget(struct super_block *sb, u64 nodeid, } else if (fuse_stale_inode(inode, generation, attr)) { /* nodeid was reused, any I/O on the old inode should fail */ fuse_make_bad(inode); - iput(inode); - goto retry; + if (inode != d_inode(sb->s_root)) { + remove_inode_hash(inode); + iput(inode); + goto retry; + } } done: fi = get_fuse_inode(inode);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jameson Thies jthies@google.com
[ Upstream commit 4d0a5a9915793377c0fe1a8d78de6bcd92cea963 ]
Clean up UCSI_CABLE_PROP macros by fixing a bitmask shifting error for plug type and updating the modal support macro for consistent naming.
Fixes: 3cf657f07918 ("usb: typec: ucsi: Remove all bit-fields") Cc: stable@vger.kernel.org Reviewed-by: Benson Leung bleung@chromium.org Reviewed-by: Prashant Malani pmalani@chromium.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Jameson Thies jthies@google.com Link: https://lore.kernel.org/r/20240305025804.1290919-2-jthies@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/typec/ucsi/ucsi.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/typec/ucsi/ucsi.h b/drivers/usb/typec/ucsi/ucsi.h index fce23ad16c6d0..41e1a64da82e8 100644 --- a/drivers/usb/typec/ucsi/ucsi.h +++ b/drivers/usb/typec/ucsi/ucsi.h @@ -219,12 +219,12 @@ struct ucsi_cable_property { #define UCSI_CABLE_PROP_FLAG_VBUS_IN_CABLE BIT(0) #define UCSI_CABLE_PROP_FLAG_ACTIVE_CABLE BIT(1) #define UCSI_CABLE_PROP_FLAG_DIRECTIONALITY BIT(2) -#define UCSI_CABLE_PROP_FLAG_PLUG_TYPE(_f_) ((_f_) & GENMASK(3, 0)) +#define UCSI_CABLE_PROP_FLAG_PLUG_TYPE(_f_) (((_f_) & GENMASK(4, 3)) >> 3) #define UCSI_CABLE_PROPERTY_PLUG_TYPE_A 0 #define UCSI_CABLE_PROPERTY_PLUG_TYPE_B 1 #define UCSI_CABLE_PROPERTY_PLUG_TYPE_C 2 #define UCSI_CABLE_PROPERTY_PLUG_OTHER 3 -#define UCSI_CABLE_PROP_MODE_SUPPORT BIT(5) +#define UCSI_CABLE_PROP_FLAG_MODE_SUPPORT BIT(5) u8 latency; } __packed;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Petr Mladek pmladek@suse.com
[ Upstream commit ed758b30d541e9bf713cd58612a4414e57dc6d73 ]
Put the code enabling a console by default into a separate function called try_enable_default_console().
Rename try_enable_new_console() to try_enable_preferred_console() to make the purpose of the different variants more clear.
It is a code refactoring without any functional change.
Signed-off-by: Petr Mladek pmladek@suse.com Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Link: https://lore.kernel.org/r/20211122132649.12737-2-pmladek@suse.com Stable-dep-of: 801410b26a0e ("serial: Lock console when calling into driver before registration") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/printk/printk.c | 38 +++++++++++++++++++++++--------------- 1 file changed, 23 insertions(+), 15 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 17a310dcb6d96..632d6d5dcfa70 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2693,7 +2693,8 @@ early_param("keep_bootcon", keep_bootcon_setup); * Care need to be taken with consoles that are statically * enabled such as netconsole */ -static int try_enable_new_console(struct console *newcon, bool user_specified) +static int try_enable_preferred_console(struct console *newcon, + bool user_specified) { struct console_cmdline *c; int i, err; @@ -2741,6 +2742,23 @@ static int try_enable_new_console(struct console *newcon, bool user_specified) return -ENOENT; }
+/* Try to enable the console unconditionally */ +static void try_enable_default_console(struct console *newcon) +{ + if (newcon->index < 0) + newcon->index = 0; + + if (newcon->setup && newcon->setup(newcon, NULL) != 0) + return; + + newcon->flags |= CON_ENABLED; + + if (newcon->device) { + newcon->flags |= CON_CONSDEV; + has_preferred_console = true; + } +} + /* * The console driver calls this routine during kernel initialization * to register the console printing procedure with printk() and to @@ -2797,25 +2815,15 @@ void register_console(struct console *newcon) * didn't select a console we take the first one * that registers here. */ - if (!has_preferred_console) { - if (newcon->index < 0) - newcon->index = 0; - if (newcon->setup == NULL || - newcon->setup(newcon, NULL) == 0) { - newcon->flags |= CON_ENABLED; - if (newcon->device) { - newcon->flags |= CON_CONSDEV; - has_preferred_console = true; - } - } - } + if (!has_preferred_console) + try_enable_default_console(newcon);
/* See if this console matches one we selected on the command line */ - err = try_enable_new_console(newcon, true); + err = try_enable_preferred_console(newcon, true);
/* If not, try to match against the platform default(s) */ if (err == -ENOENT) - err = try_enable_new_console(newcon, false); + err = try_enable_preferred_console(newcon, false);
/* printk() messages are not printed to the Braille console. */ if (err || newcon->flags & CON_BRL)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Collingbourne pcc@google.com
[ Upstream commit 801410b26a0e8b8a16f7915b2b55c9528b69ca87 ]
During the handoff from earlycon to the real console driver, we have two separate drivers operating on the same device concurrently. In the case of the 8250 driver these concurrent accesses cause problems due to the driver's use of banked registers, controlled by LCR.DLAB. It is possible for the setup(), config_port(), pm() and set_mctrl() callbacks to set DLAB, which can cause the earlycon code that intends to access TX to instead access DLL, leading to missed output and corruption on the serial line due to unintended modifications to the baud rate.
In particular, for setup() we have:
univ8250_console_setup() -> serial8250_console_setup() -> uart_set_options() -> serial8250_set_termios() -> serial8250_do_set_termios() -> serial8250_do_set_divisor()
For config_port() we have:
serial8250_config_port() -> autoconfig()
For pm() we have:
serial8250_pm() -> serial8250_do_pm() -> serial8250_set_sleep()
For set_mctrl() we have (for some devices):
serial8250_set_mctrl() -> omap8250_set_mctrl() -> __omap8250_set_mctrl()
To avoid such problems, let's make it so that the console is locked during pre-registration calls to these callbacks, which will prevent the earlycon driver from running concurrently.
Remove the partial solution to this problem in the 8250 driver that locked the console only during autoconfig_irq(), as this would result in a deadlock with the new approach. The console continues to be locked during autoconfig_irq() because it can only be called through uart_configure_port().
Although this patch introduces more locking than strictly necessary (and in particular it also locks during the call to rs485_config() which is not affected by this issue as far as I can tell), it follows the principle that it is the responsibility of the generic console code to manage the earlycon handoff by ensuring that earlycon and real console driver code cannot run concurrently, and not the individual drivers.
Signed-off-by: Peter Collingbourne pcc@google.com Reviewed-by: John Ogness john.ogness@linutronix.de Link: https://linux-review.googlesource.com/id/I7cf8124dcebf8618e6b2ee543fa5b25532... Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240304214350.501253-1-pcc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/8250/8250_port.c | 6 ------ drivers/tty/serial/serial_core.c | 12 ++++++++++++ kernel/printk/printk.c | 21 ++++++++++++++++++--- 3 files changed, 30 insertions(+), 9 deletions(-)
diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c index 8b49ac4856d2c..6098e87a34046 100644 --- a/drivers/tty/serial/8250/8250_port.c +++ b/drivers/tty/serial/8250/8250_port.c @@ -1358,9 +1358,6 @@ static void autoconfig_irq(struct uart_8250_port *up) inb_p(ICP); }
- if (uart_console(port)) - console_lock(); - /* forget possible initially masked and pending IRQ */ probe_irq_off(probe_irq_on()); save_mcr = serial8250_in_MCR(up); @@ -1391,9 +1388,6 @@ static void autoconfig_irq(struct uart_8250_port *up) if (port->flags & UPF_FOURPORT) outb_p(save_ICP, ICP);
- if (uart_console(port)) - console_unlock(); - port->irq = (irq > 0) ? irq : 0; }
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c index 40fff38588d4f..10b8785b99827 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -2431,7 +2431,12 @@ uart_configure_port(struct uart_driver *drv, struct uart_state *state, port->type = PORT_UNKNOWN; flags |= UART_CONFIG_TYPE; } + /* Synchronize with possible boot console. */ + if (uart_console(port)) + console_lock(); port->ops->config_port(port, flags); + if (uart_console(port)) + console_unlock(); }
if (port->type != PORT_UNKNOWN) { @@ -2439,6 +2444,10 @@ uart_configure_port(struct uart_driver *drv, struct uart_state *state,
uart_report_port(drv, port);
+ /* Synchronize with possible boot console. */ + if (uart_console(port)) + console_lock(); + /* Power up port for set_mctrl() */ uart_change_pm(state, UART_PM_STATE_ON);
@@ -2455,6 +2464,9 @@ uart_configure_port(struct uart_driver *drv, struct uart_state *state, port->rs485_config(port, &port->rs485); spin_unlock_irqrestore(&port->lock, flags);
+ if (uart_console(port)) + console_unlock(); + /* * If this driver supports console, and it hasn't been * successfully registered yet, try to re-register it. diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 632d6d5dcfa70..255654ea12447 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2684,6 +2684,21 @@ static int __init keep_bootcon_setup(char *str)
early_param("keep_bootcon", keep_bootcon_setup);
+static int console_call_setup(struct console *newcon, char *options) +{ + int err; + + if (!newcon->setup) + return 0; + + /* Synchronize with possible boot console. */ + console_lock(); + err = newcon->setup(newcon, options); + console_unlock(); + + return err; +} + /* * This is called by register_console() to try to match * the newly registered console with any of the ones selected @@ -2719,8 +2734,8 @@ static int try_enable_preferred_console(struct console *newcon, if (_braille_register_console(newcon, c)) return 0;
- if (newcon->setup && - (err = newcon->setup(newcon, c->options)) != 0) + err = console_call_setup(newcon, c->options); + if (err) return err; } newcon->flags |= CON_ENABLED; @@ -2748,7 +2763,7 @@ static void try_enable_default_console(struct console *newcon) if (newcon->index < 0) newcon->index = 0;
- if (newcon->setup && newcon->setup(newcon, NULL) != 0) + if (console_call_setup(newcon, NULL) != 0) return;
newcon->flags |= CON_ENABLED;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
[ Upstream commit ae6bd7f9b46a29af52ebfac25d395757e2031d0d ]
At contains_pending_extent() the value of the end offset of a chunk we found in the device's allocation state io tree is inclusive, so when we calculate the length we pass to the in_range() macro, we must sum 1 to the expression "physical_end - physical_offset".
In practice the wrong calculation should be harmless as chunks sizes are never 1 byte and we should never have 1 byte ranges of unallocated space. Nevertheless fix the wrong calculation.
Reported-by: Alex Lyakas alex.lyakas@zadara.com Link: https://lore.kernel.org/linux-btrfs/CAOcd+r30e-f4R-5x-S7sV22RJPe7+pgwherA6xq... Fixes: 1c11b63eff2a ("btrfs: replace pending/pinned chunks lists with io tree") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index eaf5cd043dace..9a05313c69f33 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1471,7 +1471,7 @@ static bool contains_pending_extent(struct btrfs_device *device, u64 *start,
if (in_range(physical_start, *start, len) || in_range(*start, physical_start, - physical_end - physical_start)) { + physical_end + 1 - physical_start)) { *start = physical_end + 1; return true; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit 097d9d414433315122f759ee6c2d8a7417a8ff0f ]
When the driver core calls pci_device_remove(), there is a driver bound to the device, so pci_dev->driver is never NULL.
Remove the unnecessary test of pci_dev->driver.
Link: https://lore.kernel.org/r/20211004125935.2300113-2-u.kleine-koenig@pengutron... Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Christoph Hellwig hch@lst.de Stable-dep-of: 9d5286d4e7f6 ("PCI/PM: Drain runtime-idle callbacks before driver removal") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci-driver.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index c22cc20db1a74..dbfeb5c148755 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -444,16 +444,14 @@ static int pci_device_remove(struct device *dev) struct pci_dev *pci_dev = to_pci_dev(dev); struct pci_driver *drv = pci_dev->driver;
- if (drv) { - if (drv->remove) { - pm_runtime_get_sync(dev); - drv->remove(pci_dev); - pm_runtime_put_noidle(dev); - } - pcibios_free_irq(pci_dev); - pci_dev->driver = NULL; - pci_iov_remove(pci_dev); + if (drv->remove) { + pm_runtime_get_sync(dev); + drv->remove(pci_dev); + pm_runtime_put_noidle(dev); } + pcibios_free_irq(pci_dev); + pci_dev->driver = NULL; + pci_iov_remove(pci_dev);
/* Undo the runtime PM settings in local_pci_probe() */ pm_runtime_put_sync(dev);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
[ Upstream commit 9d5286d4e7f68beab450deddbb6a32edd5ecf4bf ]
A race condition between the .runtime_idle() callback and the .remove() callback in the rtsx_pcr PCI driver leads to a kernel crash due to an unhandled page fault [1].
The problem is that rtsx_pci_runtime_idle() is not expected to be running after pm_runtime_get_sync() has been called, but the latter doesn't really guarantee that. It only guarantees that the suspend and resume callbacks will not be running when it returns.
However, if a .runtime_idle() callback is already running when pm_runtime_get_sync() is called, the latter will notice that the runtime PM status of the device is RPM_ACTIVE and it will return right away without waiting for the former to complete. In fact, it cannot wait for .runtime_idle() to complete because it may be called from that callback (it arguably does not make much sense to do that, but it is not strictly prohibited).
Thus in general, whoever is providing a .runtime_idle() callback needs to protect it from running in parallel with whatever code runs after pm_runtime_get_sync(). [Note that .runtime_idle() will not start after pm_runtime_get_sync() has returned, but it may continue running then if it has started earlier.]
One way to address that race condition is to call pm_runtime_barrier() after pm_runtime_get_sync() (not before it, because a nonzero value of the runtime PM usage counter is necessary to prevent runtime PM callbacks from being invoked) to wait for the .runtime_idle() callback to complete should it be running at that point. A suitable place for doing that is in pci_device_remove() which calls pm_runtime_get_sync() before removing the driver, so it may as well call pm_runtime_barrier() subsequently, which will prevent the race in question from occurring, not just in the rtsx_pcr driver, but in any PCI drivers providing .runtime_idle() callbacks.
Link: https://lore.kernel.org/lkml/20240229062201.49500-1-kai.heng.feng@canonical.... # [1] Link: https://lore.kernel.org/r/5761426.DvuYhMxLoT@kreacher Reported-by: Kai-Heng Feng kai.heng.feng@canonical.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Tested-by: Ricky Wu ricky_wu@realtek.com Acked-by: Kai-Heng Feng kai.heng.feng@canonical.com Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci-driver.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index dbfeb5c148755..ae675c5a47151 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -446,6 +446,13 @@ static int pci_device_remove(struct device *dev)
if (drv->remove) { pm_runtime_get_sync(dev); + /* + * If the driver provides a .runtime_idle() callback and it has + * started to run already, it may continue to run in parallel + * with the code below, so wait until all of the runtime PM + * activity has completed. + */ + pm_runtime_barrier(dev); drv->remove(pci_dev); pm_runtime_put_noidle(dev); }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean V Kelley sean.v.kelley@intel.com
[ Upstream commit 90655631988f8f501529e6de5f13614389717ead ]
Extend support for Root Complex Event Collectors by decoding and caching the RCEC Endpoint Association Extended Capabilities when enumerating. Use that cached information for later error source reporting. See PCIe r5.0, sec 7.9.10.
Co-developed-by: Qiuxu Zhuo qiuxu.zhuo@intel.com Link: https://lore.kernel.org/r/20201121001036.8560-4-sean.v.kelley@intel.com Tested-by: Jonathan Cameron Jonathan.Cameron@huawei.com # non-native/no RCEC Signed-off-by: Qiuxu Zhuo qiuxu.zhuo@intel.com Signed-off-by: Sean V Kelley sean.v.kelley@intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Stable-dep-of: 627c6db20703 ("PCI/DPC: Quirk PIO log size for Intel Raptor Lake Root Ports") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci.h | 17 +++++++++++ drivers/pci/pcie/Makefile | 2 +- drivers/pci/pcie/rcec.c | 59 +++++++++++++++++++++++++++++++++++++++ drivers/pci/probe.c | 2 ++ include/linux/pci.h | 4 +++ 5 files changed, 83 insertions(+), 1 deletion(-) create mode 100644 drivers/pci/pcie/rcec.c
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index 32fa07bfc448e..da40f29036d65 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -442,6 +442,15 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info); void aer_print_error(struct pci_dev *dev, struct aer_err_info *info); #endif /* CONFIG_PCIEAER */
+#ifdef CONFIG_PCIEPORTBUS +/* Cached RCEC Endpoint Association */ +struct rcec_ea { + u8 nextbusn; + u8 lastbusn; + u32 bitmap; +}; +#endif + #ifdef CONFIG_PCIE_DPC void pci_save_dpc_state(struct pci_dev *dev); void pci_restore_dpc_state(struct pci_dev *dev); @@ -456,6 +465,14 @@ static inline void pci_dpc_init(struct pci_dev *pdev) {} static inline bool pci_dpc_recovered(struct pci_dev *pdev) { return false; } #endif
+#ifdef CONFIG_PCIEPORTBUS +void pci_rcec_init(struct pci_dev *dev); +void pci_rcec_exit(struct pci_dev *dev); +#else +static inline void pci_rcec_init(struct pci_dev *dev) {} +static inline void pci_rcec_exit(struct pci_dev *dev) {} +#endif + #ifdef CONFIG_PCI_ATS /* Address Translation Service */ void pci_ats_init(struct pci_dev *dev); diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile index 9a7085668466f..b2980db88cc09 100644 --- a/drivers/pci/pcie/Makefile +++ b/drivers/pci/pcie/Makefile @@ -2,7 +2,7 @@ # # Makefile for PCI Express features and port driver
-pcieportdrv-y := portdrv_core.o portdrv_pci.o err.o +pcieportdrv-y := portdrv_core.o portdrv_pci.o err.o rcec.o
obj-$(CONFIG_PCIEPORTBUS) += pcieportdrv.o
diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c new file mode 100644 index 0000000000000..038e9d706d5fd --- /dev/null +++ b/drivers/pci/pcie/rcec.c @@ -0,0 +1,59 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Root Complex Event Collector Support + * + * Authors: + * Sean V Kelley sean.v.kelley@intel.com + * Qiuxu Zhuo qiuxu.zhuo@intel.com + * + * Copyright (C) 2020 Intel Corp. + */ + +#include <linux/kernel.h> +#include <linux/pci.h> +#include <linux/pci_regs.h> + +#include "../pci.h" + +void pci_rcec_init(struct pci_dev *dev) +{ + struct rcec_ea *rcec_ea; + u32 rcec, hdr, busn; + u8 ver; + + /* Only for Root Complex Event Collectors */ + if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_EC) + return; + + rcec = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_RCEC); + if (!rcec) + return; + + rcec_ea = kzalloc(sizeof(*rcec_ea), GFP_KERNEL); + if (!rcec_ea) + return; + + pci_read_config_dword(dev, rcec + PCI_RCEC_RCIEP_BITMAP, + &rcec_ea->bitmap); + + /* Check whether RCEC BUSN register is present */ + pci_read_config_dword(dev, rcec, &hdr); + ver = PCI_EXT_CAP_VER(hdr); + if (ver >= PCI_RCEC_BUSN_REG_VER) { + pci_read_config_dword(dev, rcec + PCI_RCEC_BUSN, &busn); + rcec_ea->nextbusn = PCI_RCEC_BUSN_NEXT(busn); + rcec_ea->lastbusn = PCI_RCEC_BUSN_LAST(busn); + } else { + /* Avoid later ver check by setting nextbusn */ + rcec_ea->nextbusn = 0xff; + rcec_ea->lastbusn = 0x00; + } + + dev->rcec_ea = rcec_ea; +} + +void pci_rcec_exit(struct pci_dev *dev) +{ + kfree(dev->rcec_ea); + dev->rcec_ea = NULL; +} diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index ece90a23936d2..ab106d2a99479 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2216,6 +2216,7 @@ static void pci_configure_device(struct pci_dev *dev) static void pci_release_capabilities(struct pci_dev *dev) { pci_aer_exit(dev); + pci_rcec_exit(dev); pci_vpd_release(dev); pci_iov_release(dev); pci_free_cap_save_buffers(dev); @@ -2416,6 +2417,7 @@ static void pci_init_capabilities(struct pci_dev *dev) pci_ptm_init(dev); /* Precision Time Measurement */ pci_aer_init(dev); /* Advanced Error Reporting */ pci_dpc_init(dev); /* Downstream Port Containment */ + pci_rcec_init(dev); /* Root Complex Event Collector */
pcie_report_downtraining(dev);
diff --git a/include/linux/pci.h b/include/linux/pci.h index bf46453475e31..1e3df93b39ca9 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -306,6 +306,7 @@ struct pcie_link_state; struct pci_vpd; struct pci_sriov; struct pci_p2pdma; +struct rcec_ea;
/* The pci_dev structure describes PCI devices */ struct pci_dev { @@ -328,6 +329,9 @@ struct pci_dev { #ifdef CONFIG_PCIEAER u16 aer_cap; /* AER capability offset */ struct aer_stats *aer_stats; /* AER stats for this device */ +#endif +#ifdef CONFIG_PCIEPORTBUS + struct rcec_ea *rcec_ea; /* RCEC cached endpoint association */ #endif u8 pcie_cap; /* PCIe capability offset */ u8 msi_cap; /* MSI capability offset */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amey Narkhede ameynarkhede03@gmail.com
[ Upstream commit 69139244806537f9d51364f37fe146bb2ee88a05 ]
Add a new member called devcap in struct pci_dev for caching the PCIe Device Capabilities register to avoid reading PCI_EXP_DEVCAP multiple times.
Refactor pcie_has_flr() to use cached device capabilities.
Link: https://lore.kernel.org/r/20210817180500.1253-2-ameynarkhede03@gmail.com Signed-off-by: Amey Narkhede ameynarkhede03@gmail.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Raphael Norwitz raphael.norwitz@nutanix.com Stable-dep-of: 627c6db20703 ("PCI/DPC: Quirk PIO log size for Intel Raptor Lake Root Ports") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci.c | 6 ++---- drivers/pci/probe.c | 5 +++-- include/linux/pci.h | 1 + 3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 1f8106ec70945..d1631109b1422 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -31,6 +31,7 @@ #include <linux/vmalloc.h> #include <asm/dma.h> #include <linux/aer.h> +#include <linux/bitfield.h> #include "pci.h"
DEFINE_MUTEX(pci_slot_mutex); @@ -4572,13 +4573,10 @@ EXPORT_SYMBOL(pci_wait_for_pending_transaction); */ bool pcie_has_flr(struct pci_dev *dev) { - u32 cap; - if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET) return false;
- pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap); - return cap & PCI_EXP_DEVCAP_FLR; + return FIELD_GET(PCI_EXP_DEVCAP_FLR, dev->devcap) == 1; } EXPORT_SYMBOL_GPL(pcie_has_flr);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index ab106d2a99479..02a75f3b59208 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -19,6 +19,7 @@ #include <linux/hypervisor.h> #include <linux/irqdomain.h> #include <linux/pm_runtime.h> +#include <linux/bitfield.h> #include "pci.h"
#define CARDBUS_LATENCY_TIMER 176 /* secondary latency timer */ @@ -1496,8 +1497,8 @@ void set_pcie_port_type(struct pci_dev *pdev) pdev->pcie_cap = pos; pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, ®16); pdev->pcie_flags_reg = reg16; - pci_read_config_word(pdev, pos + PCI_EXP_DEVCAP, ®16); - pdev->pcie_mpss = reg16 & PCI_EXP_DEVCAP_PAYLOAD; + pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP, &pdev->devcap); + pdev->pcie_mpss = FIELD_GET(PCI_EXP_DEVCAP_PAYLOAD, pdev->devcap);
parent = pci_upstream_bridge(pdev); if (!parent) diff --git a/include/linux/pci.h b/include/linux/pci.h index 1e3df93b39ca9..75f29838d25cf 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -333,6 +333,7 @@ struct pci_dev { #ifdef CONFIG_PCIEPORTBUS struct rcec_ea *rcec_ea; /* RCEC cached endpoint association */ #endif + u32 devcap; /* PCIe Device Capabilities */ u8 pcie_cap; /* PCIe capability offset */ u8 msi_cap; /* MSI capability offset */ u8 msix_cap; /* MSI-X capability offset */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bjorn Helgaas bhelgaas@google.com
[ Upstream commit 500b55b05d0a21c4adddf4c3b29ee6f32b502046 ]
Per PCIe r5, sec 7.5.1.2.4, a device must not claim accesses to its Expansion ROM unless both the Memory Space Enable and the Expansion ROM Enable bit are set. But apparently some Intel I210 NICs don't work correctly if the ROM BAR overlaps another BAR, even if the Expansion ROM is disabled.
Michael reported that on a Kontron SMARC-sAL28 ARM64 system with U-Boot v2021.01-rc3, the ROM BAR overlaps BAR 3, and networking doesn't work at all:
BAR 0: 0x40000000 (32-bit, non-prefetchable) [size=1M] BAR 3: 0x40200000 (32-bit, non-prefetchable) [size=16K] ROM: 0x40200000 (disabled) [size=1M]
NETDEV WATCHDOG: enP2p1s0 (igb): transmit queue 0 timed out Hardware name: Kontron SMARC-sAL28 (Single PHY) on SMARC Eval 2.0 carrier (DT) igb 0002:01:00.0 enP2p1s0: Reset adapter
Previously, pci_std_update_resource() wrote the assigned ROM address to the BAR only when the ROM was enabled. This meant that the I210 ROM BAR could be left with an address assigned by firmware, which might overlap with other BARs.
Quirk these I210 devices so pci_std_update_resource() always writes the assigned address to the ROM BAR, whether or not the ROM is enabled.
Link: https://lore.kernel.org/r/20211223163754.GA1267351@bhelgaas Link: https://lore.kernel.org/r/20201230185317.30915-1-michael@walle.cc Link: https://bugzilla.kernel.org/show_bug.cgi?id=211105 Reported-by: Michael Walle michael@walle.cc Tested-by: Michael Walle michael@walle.cc Signed-off-by: Bjorn Helgaas bhelgaas@google.com Stable-dep-of: 627c6db20703 ("PCI/DPC: Quirk PIO log size for Intel Raptor Lake Root Ports") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 10 ++++++++++ drivers/pci/setup-res.c | 8 ++++++-- include/linux/pci.h | 1 + 3 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 646807a443e2d..d7c4149855eb6 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5852,3 +5852,13 @@ static void nvidia_ion_ahci_fixup(struct pci_dev *pdev) pdev->dev_flags |= PCI_DEV_FLAGS_HAS_MSI_MASKING; } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NVIDIA, 0x0ab8, nvidia_ion_ahci_fixup); + +static void rom_bar_overlap_defect(struct pci_dev *dev) +{ + pci_info(dev, "working around ROM BAR overlap defect\n"); + dev->rom_bar_overlap = 1; +} +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1533, rom_bar_overlap_defect); +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1536, rom_bar_overlap_defect); +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1537, rom_bar_overlap_defect); +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1538, rom_bar_overlap_defect); diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c index 875d50c16f19d..b492e67c3d871 100644 --- a/drivers/pci/setup-res.c +++ b/drivers/pci/setup-res.c @@ -75,12 +75,16 @@ static void pci_std_update_resource(struct pci_dev *dev, int resno) * as zero when disabled, so don't update ROM BARs unless * they're enabled. See * https://lore.kernel.org/r/43147B3D.1030309@vc.cvut.cz/ + * But we must update ROM BAR for buggy devices where even a + * disabled ROM can conflict with other BARs. */ - if (!(res->flags & IORESOURCE_ROM_ENABLE)) + if (!(res->flags & IORESOURCE_ROM_ENABLE) && + !dev->rom_bar_overlap) return;
reg = dev->rom_base_reg; - new |= PCI_ROM_ADDRESS_ENABLE; + if (res->flags & IORESOURCE_ROM_ENABLE) + new |= PCI_ROM_ADDRESS_ENABLE; } else return;
diff --git a/include/linux/pci.h b/include/linux/pci.h index 75f29838d25cf..5b24a6fbfa0be 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -454,6 +454,7 @@ struct pci_dev { unsigned int link_active_reporting:1;/* Device capable of reporting link active */ unsigned int no_vf_scan:1; /* Don't scan for VFs after IOV enablement */ unsigned int no_command_memory:1; /* No PCI_COMMAND_MEMORY */ + unsigned int rom_bar_overlap:1; /* ROM BAR disable broken */ pci_dev_flags_t dev_flags; atomic_t enable_cnt; /* pci_enable_device has been called */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mika Westerberg mika.westerberg@linux.intel.com
[ Upstream commit 03038d84ace72678a9944524508f218a00377dc0 ]
Intel DG2 discrete graphics PCIe endpoints advertise L1 acceptable exit latency to be < 1us even though they can actually tolerate unlimited exit latencies just fine. Quirk the L1 acceptable exit latency for these endpoints to be unlimited so ASPM L1 can be enabled.
[bhelgaas: use FIELD_GET/FIELD_PREP, wordsmith comment & commit log] Link: https://lore.kernel.org/r/20220405093810.76613-1-mika.westerberg@linux.intel... Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Rodrigo Vivi rodrigo.vivi@intel.com Stable-dep-of: 627c6db20703 ("PCI/DPC: Quirk PIO log size for Intel Raptor Lake Root Ports") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 47 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 47 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index d7c4149855eb6..3f494d6653284 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -12,6 +12,7 @@ * file, where their drivers can use them. */
+#include <linux/bitfield.h> #include <linux/types.h> #include <linux/kernel.h> #include <linux/export.h> @@ -5862,3 +5863,49 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1533, rom_bar_overlap_defect); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1536, rom_bar_overlap_defect); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1537, rom_bar_overlap_defect); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1538, rom_bar_overlap_defect); + +#ifdef CONFIG_PCIEASPM +/* + * Several Intel DG2 graphics devices advertise that they can only tolerate + * 1us latency when transitioning from L1 to L0, which may prevent ASPM L1 + * from being enabled. But in fact these devices can tolerate unlimited + * latency. Override their Device Capabilities value to allow ASPM L1 to + * be enabled. + */ +static void aspm_l1_acceptable_latency(struct pci_dev *dev) +{ + u32 l1_lat = FIELD_GET(PCI_EXP_DEVCAP_L1, dev->devcap); + + if (l1_lat < 7) { + dev->devcap |= FIELD_PREP(PCI_EXP_DEVCAP_L1, 7); + pci_info(dev, "ASPM: overriding L1 acceptable latency from %#x to 0x7\n", + l1_lat); + } +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f80, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f81, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f82, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f83, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f84, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f85, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f86, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f87, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f88, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5690, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5691, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5692, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5693, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5694, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5695, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a0, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a1, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a2, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a3, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a4, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a5, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a6, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b0, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b1, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c0, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c1, aspm_l1_acceptable_latency); +#endif
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mika Westerberg mika.westerberg@linux.intel.com
[ Upstream commit 5459c0b7046752e519a646e1c2404852bb628459 ]
Some Root Ports on Intel Tiger Lake and Alder Lake systems support the RP Extensions for DPC and the RP PIO Log registers but incorrectly advertise an RP PIO Log Size of zero. This means the kernel complains that:
DPC: RP PIO log size 0 is invalid
and if DPC is triggered, the DPC driver will not dump the RP PIO Log registers when it should.
This is caused by a BIOS bug and should be fixed the BIOS for future CPUs.
Add a quirk to set the correct RP PIO Log size for the affected Root Ports.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=209943 Link: https://lore.kernel.org/r/20220816102042.69125-1-mika.westerberg@linux.intel... Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Stable-dep-of: 627c6db20703 ("PCI/DPC: Quirk PIO log size for Intel Raptor Lake Root Ports") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/dpc.c | 15 ++++++++++----- drivers/pci/quirks.c | 36 ++++++++++++++++++++++++++++++++++++ 2 files changed, 46 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c index cf0d4ba2e157a..ab83f78f3eb1d 100644 --- a/drivers/pci/pcie/dpc.c +++ b/drivers/pci/pcie/dpc.c @@ -335,11 +335,16 @@ void pci_dpc_init(struct pci_dev *pdev) return;
pdev->dpc_rp_extensions = true; - pdev->dpc_rp_log_size = (cap & PCI_EXP_DPC_RP_PIO_LOG_SIZE) >> 8; - if (pdev->dpc_rp_log_size < 4 || pdev->dpc_rp_log_size > 9) { - pci_err(pdev, "RP PIO log size %u is invalid\n", - pdev->dpc_rp_log_size); - pdev->dpc_rp_log_size = 0; + + /* Quirks may set dpc_rp_log_size if device or firmware is buggy */ + if (!pdev->dpc_rp_log_size) { + pdev->dpc_rp_log_size = + (cap & PCI_EXP_DPC_RP_PIO_LOG_SIZE) >> 8; + if (pdev->dpc_rp_log_size < 4 || pdev->dpc_rp_log_size > 9) { + pci_err(pdev, "RP PIO log size %u is invalid\n", + pdev->dpc_rp_log_size); + pdev->dpc_rp_log_size = 0; + } } }
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 3f494d6653284..a467e3ce6fb10 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5909,3 +5909,39 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b1, aspm_l1_acceptable_latency DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c0, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c1, aspm_l1_acceptable_latency); #endif + +#ifdef CONFIG_PCIE_DPC +/* + * Intel Tiger Lake and Alder Lake BIOS has a bug that clears the DPC + * RP PIO Log Size of the integrated Thunderbolt PCIe Root Ports. + */ +static void dpc_log_size(struct pci_dev *dev) +{ + u16 dpc, val; + + dpc = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC); + if (!dpc) + return; + + pci_read_config_word(dev, dpc + PCI_EXP_DPC_CAP, &val); + if (!(val & PCI_EXP_DPC_CAP_RP_EXT)) + return; + + if (!((val & PCI_EXP_DPC_RP_PIO_LOG_SIZE) >> 8)) { + pci_info(dev, "Overriding RP PIO Log Size to 4\n"); + dev->dpc_rp_log_size = 4; + } +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x461f, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x462f, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x463f, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x466e, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a23, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a25, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a27, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a29, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2b, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2d, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2f, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a31, dpc_log_size); +#endif
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Menzel pmenzel@molgen.mpg.de
[ Upstream commit 627c6db20703b5d18d928464f411d0d4ec327508 ]
Commit 5459c0b70467 ("PCI/DPC: Quirk PIO log size for certain Intel Root Ports") and commit 3b8803494a06 ("PCI/DPC: Quirk PIO log size for Intel Ice Lake Root Ports") add quirks for Ice, Tiger and Alder Lake Root Ports. System firmware for Raptor Lake still has the bug, so Linux logs the warning below on several Raptor Lake systems like Dell Precision 3581 with Intel Raptor Lake processor (0W18NX) system firmware/BIOS version 1.10.1.
pci 0000:00:07.0: [8086:a76e] type 01 class 0x060400 pci 0000:00:07.0: DPC: RP PIO log size 0 is invalid pci 0000:00:07.1: [8086:a73f] type 01 class 0x060400 pci 0000:00:07.1: DPC: RP PIO log size 0 is invalid
Apply the quirk for Raptor Lake Root Ports as well.
This also enables the DPC driver to dump the RP PIO Log registers when DPC is triggered.
Link: https://lore.kernel.org/r/20240305113057.56468-1-pmenzel@molgen.mpg.de Reported-by: Niels van Aert nvaert1986@hotmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218560 Signed-off-by: Paul Menzel pmenzel@molgen.mpg.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Cc: Mika Westerberg mika.westerberg@linux.intel.com Cc: Niels van Aert nvaert1986@hotmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index a467e3ce6fb10..eca4ed45bb4ee 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5944,4 +5944,6 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2b, dpc_log_size); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2d, dpc_log_size); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2f, dpc_log_size); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a31, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0xa73f, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0xa76e, dpc_log_size); #endif
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Song Liu song@kernel.org
[ Upstream commit 3445139e3a594be77eff48bc17eff67cf983daed ]
This reverts commit bed9e27baf52a09b7ba2a3714f1e24e17ced386d.
The original set [1][2] was expected to undo a suboptimal fix in [2], and replace it with a better fix [1]. However, as reported by Dan Moulding [2] causes an issue with raid5 with journal device.
Revert [2] for now to close the issue. We will follow up on another issue reported by Juxiao Bi, as [2] is expected to fix it. We believe this is a good trade-off, because the latter issue happens less freqently.
In the meanwhile, we will NOT revert [1], as it contains the right logic.
[1] commit d6e035aad6c0 ("md: bypass block throttle for superblock update") [2] commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")
Reported-by: Dan Moulding dan@danm.net Closes: https://lore.kernel.org/linux-raid/20240123005700.9302-1-dan@danm.net/ Fixes: bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"") Cc: stable@vger.kernel.org # v5.19+ Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Yu Kuai yukuai3@huawei.com Signed-off-by: Song Liu song@kernel.org Reviewed-by: Yu Kuai yukuai3@huawei.com Link: https://lore.kernel.org/r/20240125082131.788600-1-song@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/raid5.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 00995e60d46b1..9f114b9d8dc6b 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -36,6 +36,7 @@ */
#include <linux/blkdev.h> +#include <linux/delay.h> #include <linux/kthread.h> #include <linux/raid/pq.h> #include <linux/async_tx.h> @@ -6519,7 +6520,18 @@ static void raid5d(struct md_thread *thread) spin_unlock_irq(&conf->device_lock); md_check_recovery(mddev); spin_lock_irq(&conf->device_lock); + + /* + * Waiting on MD_SB_CHANGE_PENDING below may deadlock + * seeing md_check_recovery() is needed to clear + * the flag when using mdmon. + */ + continue; } + + wait_event_lock_irq(mddev->sb_wait, + !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags), + conf->device_lock); } pr_debug("%d stripes handled\n", handled);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit 95009ae904b1e9dca8db6f649f2d7c18a6e42c75 ]
The lockdep assert is added by commit a448af25becf ("md/raid10: remove rcu protection to access rdev from conf") in print_conf(). And I didn't notice that dm-raid is calling "pers->hot_add_disk" without holding 'reconfig_mutex'.
"pers->hot_add_disk" read and write many fields that is protected by 'reconfig_mutex', and raid_resume() already grab the lock in other contex. Hence fix this problem by protecting "pers->host_add_disk" with the lock.
Fixes: 9092c02d9435 ("DM RAID: Add ability to restore transiently failed devices on resume") Fixes: a448af25becf ("md/raid10: remove rcu protection to access rdev from conf") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Xiao Ni xni@redhat.com Acked-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Song Liu song@kernel.org Link: https://lore.kernel.org/r/20240305072306.2562024-10-yukuai1@huaweicloud.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-raid.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index e523ecdf947f4..99995b1804b32 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -4019,7 +4019,9 @@ static void raid_resume(struct dm_target *ti) * Take this opportunity to check whether any failed * devices are reachable again. */ + mddev_lock_nointr(mddev); attempt_restore_of_faulty_devices(rs); + mddev_unlock(mddev); }
if (test_and_clear_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags)) {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fedor Pchelkin pchelkin@ispras.ru
[ Upstream commit e8a1e58345cf40b7b272e08ac7b32328b2543e40 ]
mac802154_llsec_key_del() can free resources of a key directly without following the RCU rules for waiting before the end of a grace period. This may lead to use-after-free in case llsec_lookup_key() is traversing the list of keys in parallel with a key deletion:
refcount_t: addition on 0; use-after-free. WARNING: CPU: 4 PID: 16000 at lib/refcount.c:25 refcount_warn_saturate+0x162/0x2a0 Modules linked in: CPU: 4 PID: 16000 Comm: wpan-ping Not tainted 6.7.0 #19 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 RIP: 0010:refcount_warn_saturate+0x162/0x2a0 Call Trace: <TASK> llsec_lookup_key.isra.0+0x890/0x9e0 mac802154_llsec_encrypt+0x30c/0x9c0 ieee802154_subif_start_xmit+0x24/0x1e0 dev_hard_start_xmit+0x13e/0x690 sch_direct_xmit+0x2ae/0xbc0 __dev_queue_xmit+0x11dd/0x3c20 dgram_sendmsg+0x90b/0xd60 __sys_sendto+0x466/0x4c0 __x64_sys_sendto+0xe0/0x1c0 do_syscall_64+0x45/0xf0 entry_SYSCALL_64_after_hwframe+0x6e/0x76
Also, ieee802154_llsec_key_entry structures are not freed by mac802154_llsec_key_del():
unreferenced object 0xffff8880613b6980 (size 64): comm "iwpan", pid 2176, jiffies 4294761134 (age 60.475s) hex dump (first 32 bytes): 78 0d 8f 18 80 88 ff ff 22 01 00 00 00 00 ad de x......."....... 00 00 00 00 00 00 00 00 03 00 cd ab 00 00 00 00 ................ backtrace: [<ffffffff81dcfa62>] __kmem_cache_alloc_node+0x1e2/0x2d0 [<ffffffff81c43865>] kmalloc_trace+0x25/0xc0 [<ffffffff88968b09>] mac802154_llsec_key_add+0xac9/0xcf0 [<ffffffff8896e41a>] ieee802154_add_llsec_key+0x5a/0x80 [<ffffffff8892adc6>] nl802154_add_llsec_key+0x426/0x5b0 [<ffffffff86ff293e>] genl_family_rcv_msg_doit+0x1fe/0x2f0 [<ffffffff86ff46d1>] genl_rcv_msg+0x531/0x7d0 [<ffffffff86fee7a9>] netlink_rcv_skb+0x169/0x440 [<ffffffff86ff1d88>] genl_rcv+0x28/0x40 [<ffffffff86fec15c>] netlink_unicast+0x53c/0x820 [<ffffffff86fecd8b>] netlink_sendmsg+0x93b/0xe60 [<ffffffff86b91b35>] ____sys_sendmsg+0xac5/0xca0 [<ffffffff86b9c3dd>] ___sys_sendmsg+0x11d/0x1c0 [<ffffffff86b9c65a>] __sys_sendmsg+0xfa/0x1d0 [<ffffffff88eadbf5>] do_syscall_64+0x45/0xf0 [<ffffffff890000ea>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
Handle the proper resource release in the RCU callback function mac802154_llsec_key_del_rcu().
Note that if llsec_lookup_key() finds a key, it gets a refcount via llsec_key_get() and locally copies key id from key_entry (which is a list element). So it's safe to call llsec_key_put() and free the list entry after the RCU grace period elapses.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 5d637d5aabd8 ("mac802154: add llsec structures and mutators") Cc: stable@vger.kernel.org Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Acked-by: Alexander Aring aahringo@redhat.com Message-ID: 20240228163840.6667-1-pchelkin@ispras.ru Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/cfg802154.h | 1 + net/mac802154/llsec.c | 18 +++++++++++++----- 2 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/include/net/cfg802154.h b/include/net/cfg802154.h index 6ed07844eb244..5290781abba3d 100644 --- a/include/net/cfg802154.h +++ b/include/net/cfg802154.h @@ -257,6 +257,7 @@ struct ieee802154_llsec_key {
struct ieee802154_llsec_key_entry { struct list_head list; + struct rcu_head rcu;
struct ieee802154_llsec_key_id id; struct ieee802154_llsec_key *key; diff --git a/net/mac802154/llsec.c b/net/mac802154/llsec.c index 55550ead2ced8..a4cc9d077c59c 100644 --- a/net/mac802154/llsec.c +++ b/net/mac802154/llsec.c @@ -265,19 +265,27 @@ int mac802154_llsec_key_add(struct mac802154_llsec *sec, return -ENOMEM; }
+static void mac802154_llsec_key_del_rcu(struct rcu_head *rcu) +{ + struct ieee802154_llsec_key_entry *pos; + struct mac802154_llsec_key *mkey; + + pos = container_of(rcu, struct ieee802154_llsec_key_entry, rcu); + mkey = container_of(pos->key, struct mac802154_llsec_key, key); + + llsec_key_put(mkey); + kfree_sensitive(pos); +} + int mac802154_llsec_key_del(struct mac802154_llsec *sec, const struct ieee802154_llsec_key_id *key) { struct ieee802154_llsec_key_entry *pos;
list_for_each_entry(pos, &sec->table.keys, list) { - struct mac802154_llsec_key *mkey; - - mkey = container_of(pos->key, struct mac802154_llsec_key, key); - if (llsec_key_id_equal(&pos->id, key)) { list_del_rcu(&pos->list); - llsec_key_put(mkey); + call_rcu(&pos->rcu, mac802154_llsec_key_del_rcu); return 0; } }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryan Roberts ryan.roberts@arm.com
[ Upstream commit 82b1c07a0af603e3c47b906c8e991dc96f01688e ]
There was previously a theoretical window where swapoff() could run and teardown a swap_info_struct while a call to free_swap_and_cache() was running in another thread. This could cause, amongst other bad possibilities, swap_page_trans_huge_swapped() (called by free_swap_and_cache()) to access the freed memory for swap_map.
This is a theoretical problem and I haven't been able to provoke it from a test case. But there has been agreement based on code review that this is possible (see link below).
Fix it by using get_swap_device()/put_swap_device(), which will stall swapoff(). There was an extra check in _swap_info_get() to confirm that the swap entry was not free. This isn't present in get_swap_device() because it doesn't make sense in general due to the race between getting the reference and swapoff. So I've added an equivalent check directly in free_swap_and_cache().
Details of how to provoke one possible issue (thanks to David Hildenbrand for deriving this):
--8<-----
__swap_entry_free() might be the last user and result in "count == SWAP_HAS_CACHE".
swapoff->try_to_unuse() will stop as soon as soon as si->inuse_pages==0.
So the question is: could someone reclaim the folio and turn si->inuse_pages==0, before we completed swap_page_trans_huge_swapped().
Imagine the following: 2 MiB folio in the swapcache. Only 2 subpages are still references by swap entries.
Process 1 still references subpage 0 via swap entry. Process 2 still references subpage 1 via swap entry.
Process 1 quits. Calls free_swap_and_cache(). -> count == SWAP_HAS_CACHE [then, preempted in the hypervisor etc.]
Process 2 quits. Calls free_swap_and_cache(). -> count == SWAP_HAS_CACHE
Process 2 goes ahead, passes swap_page_trans_huge_swapped(), and calls __try_to_reclaim_swap().
__try_to_reclaim_swap()->folio_free_swap()->delete_from_swap_cache()-> put_swap_folio()->free_swap_slot()->swapcache_free_entries()-> swap_entry_free()->swap_range_free()-> ... WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
What stops swapoff to succeed after process 2 reclaimed the swap cache but before process1 finished its call to swap_page_trans_huge_swapped()?
--8<-----
Link: https://lkml.kernel.org/r/20240306140356.3974886-1-ryan.roberts@arm.com Fixes: 7c00bafee87c ("mm/swap: free swap slots in batch") Closes: https://lore.kernel.org/linux-mm/65a66eb9-41f8-4790-8db2-0c70ea15979f@redhat... Signed-off-by: Ryan Roberts ryan.roberts@arm.com Cc: David Hildenbrand david@redhat.com Cc: "Huang, Ying" ying.huang@intel.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/swapfile.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c index 86ade667a7af6..4ca1d04d8732f 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1271,6 +1271,11 @@ static unsigned char __swap_entry_free_locked(struct swap_info_struct *p, }
/* + * Note that when only holding the PTL, swapoff might succeed immediately + * after freeing a swap entry. Therefore, immediately after + * __swap_entry_free(), the swap info might become stale and should not + * be touched without a prior get_swap_device(). + * * Check whether swap entry is valid in the swap device. If so, * return pointer to swap_info_struct, and keep the swap entry valid * via preventing the swap device from being swapoff, until @@ -1797,13 +1802,19 @@ int free_swap_and_cache(swp_entry_t entry) if (non_swap_entry(entry)) return 1;
- p = _swap_info_get(entry); + p = get_swap_device(entry); if (p) { + if (WARN_ON(data_race(!p->swap_map[swp_offset(entry)]))) { + put_swap_device(p); + return 0; + } + count = __swap_entry_free(p, entry); if (count == SWAP_HAS_CACHE && !swap_page_trans_huge_swapped(p, entry)) __try_to_reclaim_swap(p, swp_offset(entry), TTRS_UNMAPPED | TTRS_FULL); + put_swap_device(p); } return p != NULL; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dominique Martinet dominique.martinet@atmark-techno.com
[ Upstream commit 4af59a8df5ea930038cd3355e822f5eedf4accc1 ]
Commit e7794c14fd73 ("mmc: rpmb: fixes pause retune on all RPMB partitions.") added a mask check for 'part_type', but the mask used was wrong leading to the code intended for rpmb also being executed for GP3.
On some MMCs (but not all) this would make gp3 partition inaccessible: armadillo:~# head -c 1 < /dev/mmcblk2gp3 head: standard input: I/O error armadillo:~# dmesg -c [ 422.976583] mmc2: running CQE recovery [ 423.058182] mmc2: running CQE recovery [ 423.137607] mmc2: running CQE recovery [ 423.137802] blk_update_request: I/O error, dev mmcblk2gp3, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 4 prio class 0 [ 423.237125] mmc2: running CQE recovery [ 423.318206] mmc2: running CQE recovery [ 423.397680] mmc2: running CQE recovery [ 423.397837] blk_update_request: I/O error, dev mmcblk2gp3, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0 [ 423.408287] Buffer I/O error on dev mmcblk2gp3, logical block 0, async page read
the part_type values of interest here are defined as follow: main 0 boot0 1 boot1 2 rpmb 3 gp0 4 gp1 5 gp2 6 gp3 7
so mask with EXT_CSD_PART_CONFIG_ACC_MASK (7) to correctly identify rpmb
Fixes: e7794c14fd73 ("mmc: rpmb: fixes pause retune on all RPMB partitions.") Cc: stable@vger.kernel.org Cc: Jorge Ramirez-Ortiz jorge@foundries.io Signed-off-by: Dominique Martinet dominique.martinet@atmark-techno.com Reviewed-by: Linus Walleij linus.walleij@linaro.org Link: https://lore.kernel.org/r/20240306-mmc-partswitch-v1-1-bf116985d950@codewrec... Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/core/block.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c index 2058f31a1bce6..d826a1bb39f2c 100644 --- a/drivers/mmc/core/block.c +++ b/drivers/mmc/core/block.c @@ -823,10 +823,11 @@ static const struct block_device_operations mmc_bdops = { static int mmc_blk_part_switch_pre(struct mmc_card *card, unsigned int part_type) { - const unsigned int mask = EXT_CSD_PART_CONFIG_ACC_RPMB; + const unsigned int mask = EXT_CSD_PART_CONFIG_ACC_MASK; + const unsigned int rpmb = EXT_CSD_PART_CONFIG_ACC_RPMB; int ret = 0;
- if ((part_type & mask) == mask) { + if ((part_type & mask) == rpmb) { if (card->ext_csd.cmdq_en) { ret = mmc_cmdq_disable(card); if (ret) @@ -841,10 +842,11 @@ static int mmc_blk_part_switch_pre(struct mmc_card *card, static int mmc_blk_part_switch_post(struct mmc_card *card, unsigned int part_type) { - const unsigned int mask = EXT_CSD_PART_CONFIG_ACC_RPMB; + const unsigned int mask = EXT_CSD_PART_CONFIG_ACC_MASK; + const unsigned int rpmb = EXT_CSD_PART_CONFIG_ACC_RPMB; int ret = 0;
- if ((part_type & mask) == mask) { + if ((part_type & mask) == rpmb) { mmc_retune_unpause(card->host); if (card->reenable_cmdq && !card->ext_csd.cmdq_en) ret = mmc_cmdq_enable(card);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian Gmeiner cgmeiner@igalia.com
[ Upstream commit b735ee173f84d5d0d0733c53946a83c12d770d05 ]
The hwdb selection logic as a feature that allows it to mark some fields as 'don't care'. If we match with such a field we memcpy(..) the current etnaviv_chip_identity into ident.
This step can overwrite some id values read from the GPU with the 'don't care' value.
Fix this issue by restoring the affected values after the memcpy(..).
As this is crucial for user space to know when this feature works as expected increment the minor version too.
Fixes: 4078a1186dd3 ("drm/etnaviv: update hwdb selection logic") Cc: stable@vger.kernel.org Signed-off-by: Christian Gmeiner cgmeiner@igalia.com Reviewed-by: Tomeu Vizoso tomeu@tomeuvizoso.net Signed-off-by: Lucas Stach l.stach@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/etnaviv/etnaviv_drv.c | 2 +- drivers/gpu/drm/etnaviv/etnaviv_hwdb.c | 9 +++++++++ 2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c index a9a3afaef9a1c..edf9387069cdc 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c @@ -511,7 +511,7 @@ static struct drm_driver etnaviv_drm_driver = { .desc = "etnaviv DRM", .date = "20151214", .major = 1, - .minor = 3, + .minor = 4, };
/* diff --git a/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c b/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c index 167971a09be79..e9b777ab3be7f 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c @@ -73,6 +73,9 @@ static const struct etnaviv_chip_identity etnaviv_chip_identities[] = { bool etnaviv_fill_identity_from_hwdb(struct etnaviv_gpu *gpu) { struct etnaviv_chip_identity *ident = &gpu->identity; + const u32 product_id = ident->product_id; + const u32 customer_id = ident->customer_id; + const u32 eco_id = ident->eco_id; int i;
for (i = 0; i < ARRAY_SIZE(etnaviv_chip_identities); i++) { @@ -86,6 +89,12 @@ bool etnaviv_fill_identity_from_hwdb(struct etnaviv_gpu *gpu) etnaviv_chip_identities[i].eco_id == ~0U)) { memcpy(ident, &etnaviv_chip_identities[i], sizeof(*ident)); + + /* Restore some id values as ~0U aka 'don't care' might been used. */ + ident->product_id = product_id; + ident->customer_id = customer_id; + ident->eco_id = eco_id; + return true; } }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josua Mayer josua@solid-run.com
[ Upstream commit 3f003fda98a7a8d5f399057d92e6ed56b468657c ]
Add of_match table for "ti,amc6821" compatible string. This fixes automatic driver loading by userspace when using device-tree, and if built as a module like major linux distributions do.
While devices probe just fine with i2c_device_id table, userspace can't match the "ti,amc6821" compatible string from dt with the plain "amc6821" device id. As a result, the kernel module can not be loaded.
Cc: stable@vger.kernel.org Signed-off-by: Josua Mayer josua@solid-run.com Link: https://lore.kernel.org/r/20240307-amc6821-of-match-v1-1-5f40464a3110@solid-... [groeck: Cleaned up patch description] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/amc6821.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/hwmon/amc6821.c b/drivers/hwmon/amc6821.c index 6b1ce2242c618..60dfdb0f55231 100644 --- a/drivers/hwmon/amc6821.c +++ b/drivers/hwmon/amc6821.c @@ -934,10 +934,21 @@ static const struct i2c_device_id amc6821_id[] = {
MODULE_DEVICE_TABLE(i2c, amc6821_id);
+static const struct of_device_id __maybe_unused amc6821_of_match[] = { + { + .compatible = "ti,amc6821", + .data = (void *)amc6821, + }, + { } +}; + +MODULE_DEVICE_TABLE(of, amc6821_of_match); + static struct i2c_driver amc6821_driver = { .class = I2C_CLASS_HWMON, .driver = { .name = "amc6821", + .of_match_table = of_match_ptr(amc6821_of_match), }, .probe_new = amc6821_probe, .id_table = amc6821_id,
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maximilian Heyne mheyne@amazon.de
[ Upstream commit a6b3bfe176e8a5b05ec4447404e412c2a3fc92cc ]
We observed a corruption during on-line resize of a file system that is larger than 16 TiB with 4k block size. With having more then 2^32 blocks resize_inode is turned off by default by mke2fs. The issue can be reproduced on a smaller file system for convenience by explicitly turning off resize_inode. An on-line resize across an 8 GiB boundary (the size of a meta block group in this setup) then leads to a corruption:
dev=/dev/<some_dev> # should be >= 16 GiB mkdir -p /corruption /sbin/mke2fs -t ext4 -b 4096 -O ^resize_inode $dev $((2 * 2**21 - 2**15)) mount -t ext4 $dev /corruption
dd if=/dev/zero bs=4096 of=/corruption/test count=$((2*2**21 - 4*2**15)) sha1sum /corruption/test # 79d2658b39dcfd77274e435b0934028adafaab11 /corruption/test
/sbin/resize2fs $dev $((2*2**21)) # drop page cache to force reload the block from disk echo 1 > /proc/sys/vm/drop_caches
sha1sum /corruption/test # 3c2abc63cbf1a94c9e6977e0fbd72cd832c4d5c3 /corruption/test
2^21 = 2^15*2^6 equals 8 GiB whereof 2^15 is the number of blocks per block group and 2^6 are the number of block groups that make a meta block group.
The last checksum might be different depending on how the file is laid out across the physical blocks. The actual corruption occurs at physical block 63*2^15 = 2064384 which would be the location of the backup of the meta block group's block descriptor. During the on-line resize the file system will be converted to meta_bg starting at s_first_meta_bg which is 2 in the example - meaning all block groups after 16 GiB. However, in ext4_flex_group_add we might add block groups that are not part of the first meta block group yet. In the reproducer we achieved this by substracting the size of a whole block group from the point where the meta block group would start. This must be considered when updating the backup block group descriptors to follow the non-meta_bg layout. The fix is to add a test whether the group to add is already part of the meta block group or not.
Fixes: 01f795f9e0d67 ("ext4: add online resizing support for meta_bg and 64-bit file systems") Cc: stable@vger.kernel.org Signed-off-by: Maximilian Heyne mheyne@amazon.de Tested-by: Srivathsa Dara srivathsa.d.dara@oracle.com Reviewed-by: Srivathsa Dara srivathsa.d.dara@oracle.com Link: https://lore.kernel.org/r/20240215155009.94493-1-mheyne@amazon.de Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/resize.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c index 06e0eaf2ea4e1..5d45ec29e61a6 100644 --- a/fs/ext4/resize.c +++ b/fs/ext4/resize.c @@ -1545,7 +1545,8 @@ static int ext4_flex_group_add(struct super_block *sb, int gdb_num = group / EXT4_DESC_PER_BLOCK(sb); int gdb_num_end = ((group + flex_gd->count - 1) / EXT4_DESC_PER_BLOCK(sb)); - int meta_bg = ext4_has_feature_meta_bg(sb); + int meta_bg = ext4_has_feature_meta_bg(sb) && + gdb_num >= le32_to_cpu(es->s_first_meta_bg); sector_t padding_blocks = meta_bg ? 0 : sbi->s_sbh->b_blocknr - ext4_group_first_block_no(sb, 0); sector_t old_gdb = 0;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jerome Brunet jbrunet@baylibre.com
[ Upstream commit cbd38332c140829ab752ba4e727f98be5c257f18 ]
clang-16 warns about casting functions to incompatible types, as is done here to call clk_disable_unprepare:
drivers/nvmem/meson-efuse.c:78:12: error: cast from 'void (*)(struct clk *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict] 78 | (void(*)(void *))clk_disable_unprepare,
The pattern of getting, enabling and setting a disable callback for a clock can be replaced with devm_clk_get_enabled(), which also fixes this warning.
Fixes: 611fbca1c861 ("nvmem: meson-efuse: add peripheral clock") Cc: Stable@vger.kernel.org Reported-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Jerome Brunet jbrunet@baylibre.com Reviewed-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Acked-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Justin Stitt justinstitt@google.com Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20240224114023.85535-2-srinivas.kandagatla@linaro.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvmem/meson-efuse.c | 25 +++---------------------- 1 file changed, 3 insertions(+), 22 deletions(-)
diff --git a/drivers/nvmem/meson-efuse.c b/drivers/nvmem/meson-efuse.c index d6b533497ce1a..ba2714bef8d0e 100644 --- a/drivers/nvmem/meson-efuse.c +++ b/drivers/nvmem/meson-efuse.c @@ -47,7 +47,6 @@ static int meson_efuse_probe(struct platform_device *pdev) struct nvmem_config *econfig; struct clk *clk; unsigned int size; - int ret;
sm_np = of_parse_phandle(pdev->dev.of_node, "secure-monitor", 0); if (!sm_np) { @@ -60,27 +59,9 @@ static int meson_efuse_probe(struct platform_device *pdev) if (!fw) return -EPROBE_DEFER;
- clk = devm_clk_get(dev, NULL); - if (IS_ERR(clk)) { - ret = PTR_ERR(clk); - if (ret != -EPROBE_DEFER) - dev_err(dev, "failed to get efuse gate"); - return ret; - } - - ret = clk_prepare_enable(clk); - if (ret) { - dev_err(dev, "failed to enable gate"); - return ret; - } - - ret = devm_add_action_or_reset(dev, - (void(*)(void *))clk_disable_unprepare, - clk); - if (ret) { - dev_err(dev, "failed to add disable callback"); - return ret; - } + clk = devm_clk_get_enabled(dev, NULL); + if (IS_ERR(clk)) + return dev_err_probe(dev, PTR_ERR(clk), "failed to get efuse gate");
if (meson_sm_call(fw, SM_EFUSE_USER_MAX, &size, 0, 0, 0, 0, 0) < 0) { dev_err(dev, "failed to get max user");
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 89ffa4cccec54467446f141a79b9e36893079fb8 ]
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove().
Note that the upper limit of ida_simple_get() is exclusive, but the one of ida_alloc_range() is inclusive. So change this change allows one more device. Previously address 0xFE was never used.
Fixes: 46a2bb5a7f7e ("slimbus: core: Add slim controllers support") Cc: Stable@vger.kernel.org Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20240224114137.85781-2-srinivas.kandagatla@linaro.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/slimbus/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/slimbus/core.c b/drivers/slimbus/core.c index 1d2bc181da050..69f6178f294c8 100644 --- a/drivers/slimbus/core.c +++ b/drivers/slimbus/core.c @@ -438,8 +438,8 @@ static int slim_device_alloc_laddr(struct slim_device *sbdev, if (ret < 0) goto err; } else if (report_present) { - ret = ida_simple_get(&ctrl->laddr_ida, - 0, SLIM_LA_MANAGER - 1, GFP_KERNEL); + ret = ida_alloc_max(&ctrl->laddr_ida, + SLIM_LA_MANAGER - 1, GFP_KERNEL); if (ret < 0) goto err;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wayne Chang waynec@nvidia.com
[ Upstream commit d843f031d9e90462253015bc0bd9e3852d206bf2 ]
This patch introduces a new API, tegra_xusb_padctl_get_port_number, to the Tegra XUSB Pad Controller driver. This API is used to identify the USB port that is associated with a given PHY.
The function takes a PHY pointer for either a USB2 PHY or USB3 PHY as input and returns the corresponding port number. If the PHY pointer is invalid, it returns -ENODEV.
Cc: stable@vger.kernel.org Signed-off-by: Wayne Chang waynec@nvidia.com Reviewed-by: Jon Hunter jonathanh@nvidia.com Tested-by: Jon Hunter jonathanh@nvidia.com Link: https://lore.kernel.org/r/20240307030328.1487748-2-waynec@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/phy/tegra/xusb.c | 13 +++++++++++++ include/linux/phy/tegra/xusb.h | 2 ++ 2 files changed, 15 insertions(+)
diff --git a/drivers/phy/tegra/xusb.c b/drivers/phy/tegra/xusb.c index 8f11b293c48d1..856397def89ac 100644 --- a/drivers/phy/tegra/xusb.c +++ b/drivers/phy/tegra/xusb.c @@ -1399,6 +1399,19 @@ int tegra_xusb_padctl_get_usb3_companion(struct tegra_xusb_padctl *padctl, } EXPORT_SYMBOL_GPL(tegra_xusb_padctl_get_usb3_companion);
+int tegra_xusb_padctl_get_port_number(struct phy *phy) +{ + struct tegra_xusb_lane *lane; + + if (!phy) + return -ENODEV; + + lane = phy_get_drvdata(phy); + + return lane->index; +} +EXPORT_SYMBOL_GPL(tegra_xusb_padctl_get_port_number); + MODULE_AUTHOR("Thierry Reding treding@nvidia.com"); MODULE_DESCRIPTION("Tegra XUSB Pad Controller driver"); MODULE_LICENSE("GPL v2"); diff --git a/include/linux/phy/tegra/xusb.h b/include/linux/phy/tegra/xusb.h index 71d956935405f..86ccd90e425c7 100644 --- a/include/linux/phy/tegra/xusb.h +++ b/include/linux/phy/tegra/xusb.h @@ -23,4 +23,6 @@ int tegra_xusb_padctl_set_vbus_override(struct tegra_xusb_padctl *padctl, int tegra_phy_xusb_utmi_port_reset(struct phy *phy); int tegra_xusb_padctl_get_usb3_companion(struct tegra_xusb_padctl *padctl, unsigned int port); +int tegra_xusb_padctl_get_port_number(struct phy *phy); + #endif /* PHY_TEGRA_XUSB_H */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jon Hunter jonathanh@nvidia.com
[ Upstream commit 77b57218ac2f37da4e8b72e78f002944b9f85091 ]
Rather than testing if the error code is -EPROBE_DEFER before printing an error message, use dev_err_probe() instead to simplify the code.
Signed-off-by: Jon Hunter jonathanh@nvidia.com Link: https://lore.kernel.org/r/20210519163553.212682-2-jonathanh@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: 84fa943d93c3 ("usb: gadget: tegra-xudc: Fix USB3 PHY retrieval logic") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/udc/tegra-xudc.c | 20 ++++++-------------- 1 file changed, 6 insertions(+), 14 deletions(-)
diff --git a/drivers/usb/gadget/udc/tegra-xudc.c b/drivers/usb/gadget/udc/tegra-xudc.c index c5f0fbb8ffe47..23e31eec442dc 100644 --- a/drivers/usb/gadget/udc/tegra-xudc.c +++ b/drivers/usb/gadget/udc/tegra-xudc.c @@ -3508,10 +3508,8 @@ static int tegra_xudc_phy_get(struct tegra_xudc *xudc) xudc->utmi_phy[i] = devm_phy_optional_get(xudc->dev, phy_name); if (IS_ERR(xudc->utmi_phy[i])) { err = PTR_ERR(xudc->utmi_phy[i]); - if (err != -EPROBE_DEFER) - dev_err(xudc->dev, "failed to get usb2-%d PHY: %d\n", - i, err); - + dev_err_probe(xudc->dev, err, + "failed to get usb2-%d PHY\n", i); goto clean_up; } else if (xudc->utmi_phy[i]) { /* Get usb-phy, if utmi phy is available */ @@ -3538,10 +3536,8 @@ static int tegra_xudc_phy_get(struct tegra_xudc *xudc) xudc->usb3_phy[i] = devm_phy_optional_get(xudc->dev, phy_name); if (IS_ERR(xudc->usb3_phy[i])) { err = PTR_ERR(xudc->usb3_phy[i]); - if (err != -EPROBE_DEFER) - dev_err(xudc->dev, "failed to get usb3-%d PHY: %d\n", - usb3, err); - + dev_err_probe(xudc->dev, err, + "failed to get usb3-%d PHY\n", usb3); goto clean_up; } else if (xudc->usb3_phy[i]) dev_dbg(xudc->dev, "usb3-%d PHY registered", usb3); @@ -3781,9 +3777,7 @@ static int tegra_xudc_probe(struct platform_device *pdev)
err = devm_clk_bulk_get(&pdev->dev, xudc->soc->num_clks, xudc->clks); if (err) { - if (err != -EPROBE_DEFER) - dev_err(xudc->dev, "failed to request clocks: %d\n", err); - + dev_err_probe(xudc->dev, err, "failed to request clocks\n"); return err; }
@@ -3798,9 +3792,7 @@ static int tegra_xudc_probe(struct platform_device *pdev) err = devm_regulator_bulk_get(&pdev->dev, xudc->soc->num_supplies, xudc->supplies); if (err) { - if (err != -EPROBE_DEFER) - dev_err(xudc->dev, "failed to request regulators: %d\n", err); - + dev_err_probe(xudc->dev, err, "failed to request regulators\n"); return err; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wayne Chang waynec@nvidia.com
[ Upstream commit 84fa943d93c31ee978355e6c6c69592dae3c9f59 ]
This commit resolves an issue in the tegra-xudc USB gadget driver that incorrectly fetched USB3 PHY instances. The problem stemmed from the assumption of a one-to-one correspondence between USB2 and USB3 PHY names and their association with physical USB ports in the device tree.
Previously, the driver associated USB3 PHY names directly with the USB3 instance number, leading to mismatches when mapping the physical USB ports. For instance, if using USB3-1 PHY, the driver expect the corresponding PHY name as 'usb3-1'. However, the physical USB ports in the device tree were designated as USB2-0 and USB3-0 as we only have one device controller, causing a misalignment.
This commit rectifies the issue by adjusting the PHY naming logic. Now, the driver correctly correlates the USB2 and USB3 PHY instances, allowing the USB2-0 and USB3-1 PHYs to form a physical USB port pair while accurately reflecting their configuration in the device tree by naming them USB2-0 and USB3-0, respectively.
The change ensures that the PHY and PHY names align appropriately, resolving the mismatch between physical USB ports and their associated names in the device tree.
Fixes: b4e19931c98a ("usb: gadget: tegra-xudc: Support multiple device modes") Cc: stable@vger.kernel.org Signed-off-by: Wayne Chang waynec@nvidia.com Reviewed-by: Jon Hunter jonathanh@nvidia.com Tested-by: Jon Hunter jonathanh@nvidia.com Link: https://lore.kernel.org/r/20240307030328.1487748-3-waynec@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/udc/tegra-xudc.c | 39 ++++++++++++++++++----------- 1 file changed, 25 insertions(+), 14 deletions(-)
diff --git a/drivers/usb/gadget/udc/tegra-xudc.c b/drivers/usb/gadget/udc/tegra-xudc.c index 23e31eec442dc..9d3e36c867e83 100644 --- a/drivers/usb/gadget/udc/tegra-xudc.c +++ b/drivers/usb/gadget/udc/tegra-xudc.c @@ -3480,8 +3480,8 @@ static void tegra_xudc_device_params_init(struct tegra_xudc *xudc)
static int tegra_xudc_phy_get(struct tegra_xudc *xudc) { - int err = 0, usb3; - unsigned int i; + int err = 0, usb3_companion_port; + unsigned int i, j;
xudc->utmi_phy = devm_kcalloc(xudc->dev, xudc->soc->num_phys, sizeof(*xudc->utmi_phy), GFP_KERNEL); @@ -3509,7 +3509,7 @@ static int tegra_xudc_phy_get(struct tegra_xudc *xudc) if (IS_ERR(xudc->utmi_phy[i])) { err = PTR_ERR(xudc->utmi_phy[i]); dev_err_probe(xudc->dev, err, - "failed to get usb2-%d PHY\n", i); + "failed to get PHY for phy-name usb2-%d\n", i); goto clean_up; } else if (xudc->utmi_phy[i]) { /* Get usb-phy, if utmi phy is available */ @@ -3528,19 +3528,30 @@ static int tegra_xudc_phy_get(struct tegra_xudc *xudc) }
/* Get USB3 phy */ - usb3 = tegra_xusb_padctl_get_usb3_companion(xudc->padctl, i); - if (usb3 < 0) + usb3_companion_port = tegra_xusb_padctl_get_usb3_companion(xudc->padctl, i); + if (usb3_companion_port < 0) continue;
- snprintf(phy_name, sizeof(phy_name), "usb3-%d", usb3); - xudc->usb3_phy[i] = devm_phy_optional_get(xudc->dev, phy_name); - if (IS_ERR(xudc->usb3_phy[i])) { - err = PTR_ERR(xudc->usb3_phy[i]); - dev_err_probe(xudc->dev, err, - "failed to get usb3-%d PHY\n", usb3); - goto clean_up; - } else if (xudc->usb3_phy[i]) - dev_dbg(xudc->dev, "usb3-%d PHY registered", usb3); + for (j = 0; j < xudc->soc->num_phys; j++) { + snprintf(phy_name, sizeof(phy_name), "usb3-%d", j); + xudc->usb3_phy[i] = devm_phy_optional_get(xudc->dev, phy_name); + if (IS_ERR(xudc->usb3_phy[i])) { + err = PTR_ERR(xudc->usb3_phy[i]); + dev_err_probe(xudc->dev, err, + "failed to get PHY for phy-name usb3-%d\n", j); + goto clean_up; + } else if (xudc->usb3_phy[i]) { + int usb2_port = + tegra_xusb_padctl_get_port_number(xudc->utmi_phy[i]); + int usb3_port = + tegra_xusb_padctl_get_port_number(xudc->usb3_phy[i]); + if (usb3_port == usb3_companion_port) { + dev_dbg(xudc->dev, "USB2 port %d is paired with USB3 port %d for device mode port %d\n", + usb2_port, usb3_port, i); + break; + } + } + } }
return err;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Samuel Thibault samuel.thibault@ens-lyon.org
[ Upstream commit b6c8dafc9d86eb77e502bb018ec4105e8d2fbf78 ]
When userland echoes 8bit characters to /dev/synth with e.g.
echo -e '\xe9' > /dev/synth
synth_write would get characters beyond 0x7f, and thus negative when char is signed. When given to synth_buffer_add which takes a u16, this would sign-extend and produce a U+ffxy character rather than U+xy. Users thus get garbled text instead of accents in their output.
Let's fix this by making sure that we read unsigned characters.
Signed-off-by: Samuel Thibault samuel.thibault@ens-lyon.org Fixes: 89fc2ae80bb1 ("speakup: extend synth buffer to 16bit unicode characters") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240204155736.2oh4ot7tiaa2wpbh@begin Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/accessibility/speakup/synth.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/accessibility/speakup/synth.c b/drivers/accessibility/speakup/synth.c index ac47dbac72075..82cfc5ec6bdf9 100644 --- a/drivers/accessibility/speakup/synth.c +++ b/drivers/accessibility/speakup/synth.c @@ -208,8 +208,10 @@ void spk_do_flush(void) wake_up_process(speakup_task); }
-void synth_write(const char *buf, size_t count) +void synth_write(const char *_buf, size_t count) { + const unsigned char *buf = (const unsigned char *) _buf; + while (count--) synth_buffer_add(*buf++); synth_start();
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean V Kelley sean.v.kelley@intel.com
[ Upstream commit aa344bc8b727b47b4350b59d8166216a3f351e55 ]
In some cases a bridge may not exist as the hardware controlling may be handled only by firmware and so is not visible to the OS. This scenario is also possible in future use cases involving non-native use of RCECs by firmware. In this scenario, we expect the platform to retain control of the bridge and to clear error status itself.
Clear error status only when the OS has native control of AER.
Signed-off-by: Sean V Kelley sean.v.kelley@intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Stable-dep-of: 002bf2fbc00e ("PCI/AER: Block runtime suspend when handling errors") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/err.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c index 984aa023c753f..a806dfd94586c 100644 --- a/drivers/pci/pcie/err.c +++ b/drivers/pci/pcie/err.c @@ -176,6 +176,7 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, int type = pci_pcie_type(dev); struct pci_dev *bridge; pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER; + struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
/* * If the error was detected by a Root Port, Downstream Port, or @@ -227,9 +228,17 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, pci_dbg(bridge, "broadcast resume message\n"); pci_walk_bridge(bridge, report_resume, &status);
- if (pcie_aer_is_native(bridge)) + /* + * If we have native control of AER, clear error status in the Root + * Port or Downstream Port that signaled the error. If the + * platform retained control of AER, it is responsible for clearing + * this status. In that case, the signaling device may not even be + * visible to the OS. + */ + if (host->native_aer || pcie_ports_native) { pcie_clear_device_status(bridge); - pci_aer_clear_nonfatal_status(bridge); + pci_aer_clear_nonfatal_status(bridge); + } pci_info(bridge, "device recovery successful\n"); return status;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stanislaw Gruszka stanislaw.gruszka@linux.intel.com
[ Upstream commit 002bf2fbc00e5c4b95fb167287e2ae7d1973281e ]
PM runtime can be done simultaneously with AER error handling. Avoid that by using pm_runtime_get_sync() before and pm_runtime_put() after reset in pcie_do_recovery() for all recovering devices.
pm_runtime_get_sync() will increase dev->power.usage_count counter to prevent any possible future request to runtime suspend a device. It will also resume a device, if it was previously in D3hot state.
I tested with igc device by doing simultaneous aer_inject and rpm suspend/resume via /sys/bus/pci/devices/PCI_ID/power/control and can reproduce:
igc 0000:02:00.0: not ready 65535ms after bus reset; giving up pcieport 0000:00:1c.2: AER: Root Port link has been reset (-25) pcieport 0000:00:1c.2: AER: subordinate device reset failed pcieport 0000:00:1c.2: AER: device recovery failed igc 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible
The problem disappears when this patch is applied.
Link: https://lore.kernel.org/r/20240212120135.146068-1-stanislaw.gruszka@linux.in... Signed-off-by: Stanislaw Gruszka stanislaw.gruszka@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Acked-by: Rafael J. Wysocki rafael@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/err.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c index a806dfd94586c..bb4febcf45ba1 100644 --- a/drivers/pci/pcie/err.c +++ b/drivers/pci/pcie/err.c @@ -13,6 +13,7 @@ #define dev_fmt(fmt) "AER: " fmt
#include <linux/pci.h> +#include <linux/pm_runtime.h> #include <linux/module.h> #include <linux/kernel.h> #include <linux/errno.h> @@ -79,6 +80,18 @@ static int report_error_detected(struct pci_dev *dev, return 0; }
+static int pci_pm_runtime_get_sync(struct pci_dev *pdev, void *data) +{ + pm_runtime_get_sync(&pdev->dev); + return 0; +} + +static int pci_pm_runtime_put(struct pci_dev *pdev, void *data) +{ + pm_runtime_put(&pdev->dev); + return 0; +} + static int report_frozen_detected(struct pci_dev *dev, void *data) { return report_error_detected(dev, pci_channel_io_frozen, data); @@ -194,6 +207,8 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, else bridge = pci_upstream_bridge(dev);
+ pci_walk_bridge(bridge, pci_pm_runtime_get_sync, NULL); + pci_dbg(bridge, "broadcast error_detected message\n"); if (state == pci_channel_io_frozen) { pci_walk_bridge(bridge, report_frozen_detected, &status); @@ -239,10 +254,15 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, pcie_clear_device_status(bridge); pci_aer_clear_nonfatal_status(bridge); } + + pci_walk_bridge(bridge, pci_pm_runtime_put, NULL); + pci_info(bridge, "device recovery successful\n"); return status;
failed: + pci_walk_bridge(bridge, pci_pm_runtime_put, NULL); + pci_uevent_ers(bridge, PCI_ERS_RESULT_DISCONNECT);
/* TODO: Should kernel panic here? */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 17f46b803d4f23c66cacce81db35fef3adb8f2af ]
In production we have been hitting the following warning consistently
------------[ cut here ]------------ refcount_t: underflow; use-after-free. WARNING: CPU: 17 PID: 1800359 at lib/refcount.c:28 refcount_warn_saturate+0x9c/0xe0 Workqueue: nfsiod nfs_direct_write_schedule_work [nfs] RIP: 0010:refcount_warn_saturate+0x9c/0xe0 PKRU: 55555554 Call Trace: <TASK> ? __warn+0x9f/0x130 ? refcount_warn_saturate+0x9c/0xe0 ? report_bug+0xcc/0x150 ? handle_bug+0x3d/0x70 ? exc_invalid_op+0x16/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0x9c/0xe0 nfs_direct_write_schedule_work+0x237/0x250 [nfs] process_one_work+0x12f/0x4a0 worker_thread+0x14e/0x3b0 ? ZSTD_getCParams_internal+0x220/0x220 kthread+0xdc/0x120 ? __btf_name_valid+0xa0/0xa0 ret_from_fork+0x1f/0x30
This is because we're completing the nfs_direct_request twice in a row.
The source of this is when we have our commit requests to submit, we process them and send them off, and then in the completion path for the commit requests we have
if (nfs_commit_end(cinfo.mds)) nfs_direct_write_complete(dreq);
However since we're submitting asynchronous requests we sometimes have one that completes before we submit the next one, so we end up calling complete on the nfs_direct_request twice.
The only other place we use nfs_generic_commit_list() is in __nfs_commit_inode, which wraps this call in a
nfs_commit_begin(); nfs_commit_end();
Which is a common pattern for this style of completion handling, one that is also repeated in the direct code with get_dreq()/put_dreq() calls around where we process events as well as in the completion paths.
Fix this by using the same pattern for the commit requests.
Before with my 200 node rocksdb stress running this warning would pop every 10ish minutes. With my patch the stress test has been running for several hours without popping.
Signed-off-by: Josef Bacik josef@toxicpanda.com Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/direct.c | 11 +++++++++-- fs/nfs/write.c | 2 +- include/linux/nfs_fs.h | 1 + 3 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c index 5d86ffa72ceab..499519f0f6ecd 100644 --- a/fs/nfs/direct.c +++ b/fs/nfs/direct.c @@ -678,10 +678,17 @@ static void nfs_direct_commit_schedule(struct nfs_direct_req *dreq) LIST_HEAD(mds_list);
nfs_init_cinfo_from_dreq(&cinfo, dreq); + nfs_commit_begin(cinfo.mds); nfs_scan_commit(dreq->inode, &mds_list, &cinfo); res = nfs_generic_commit_list(dreq->inode, &mds_list, 0, &cinfo); - if (res < 0) /* res == -ENOMEM */ - nfs_direct_write_reschedule(dreq); + if (res < 0) { /* res == -ENOMEM */ + spin_lock(&dreq->lock); + if (dreq->flags == 0) + dreq->flags = NFS_ODIRECT_RESCHED_WRITES; + spin_unlock(&dreq->lock); + } + if (nfs_commit_end(cinfo.mds)) + nfs_direct_write_complete(dreq); }
static void nfs_direct_write_clear_reqs(struct nfs_direct_req *dreq) diff --git a/fs/nfs/write.c b/fs/nfs/write.c index d3cd099ffb6e1..4cf0606919794 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -1637,7 +1637,7 @@ static int wait_on_commit(struct nfs_mds_commit_info *cinfo) !atomic_read(&cinfo->rpcs_out)); }
-static void nfs_commit_begin(struct nfs_mds_commit_info *cinfo) +void nfs_commit_begin(struct nfs_mds_commit_info *cinfo) { atomic_inc(&cinfo->rpcs_out); } diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index e39342945a80b..7488864589a7a 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -553,6 +553,7 @@ extern int nfs_wb_page_cancel(struct inode *inode, struct page* page); extern int nfs_commit_inode(struct inode *, int); extern struct nfs_commit_data *nfs_commitdata_alloc(void); extern void nfs_commit_free(struct nfs_commit_data *data); +void nfs_commit_begin(struct nfs_mds_commit_info *cinfo); bool nfs_commit_end(struct nfs_mds_commit_info *cinfo);
static inline int
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Chancellor nathan@kernel.org
[ Upstream commit 75b5ab134bb5f657ef7979a59106dce0657e8d87 ]
Clang enables -Wenum-enum-conversion and -Wenum-compare-conditional under -Wenum-conversion. A recent change in Clang strengthened these warnings and they appear frequently in common builds, primarily due to several instances in common headers but there are quite a few drivers that have individual instances as well.
include/linux/vmstat.h:508:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion] 508 | return vmstat_text[NR_VM_ZONE_STAT_ITEMS + | ~~~~~~~~~~~~~~~~~~~~~ ^ 509 | item]; | ~~~~
drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c:955:24: warning: conditional expression between different enumeration types ('enum iwl_mac_beacon_flags' and 'enum iwl_mac_beacon_flags_v1') [-Wenum-compare-conditional] 955 | flags |= is_new_rate ? IWL_MAC_BEACON_CCK | ^ ~~~~~~~~~~~~~~~~~~ 956 | : IWL_MAC_BEACON_CCK_V1; | ~~~~~~~~~~~~~~~~~~~~~ drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c:1120:21: warning: conditional expression between different enumeration types ('enum iwl_mac_beacon_flags' and 'enum iwl_mac_beacon_flags_v1') [-Wenum-compare-conditional] 1120 | 0) > 10 ? | ^ 1121 | IWL_MAC_BEACON_FILS : | ~~~~~~~~~~~~~~~~~~~ 1122 | IWL_MAC_BEACON_FILS_V1; | ~~~~~~~~~~~~~~~~~~~~~~
Doing arithmetic between or returning two different types of enums could be a bug, so each of the instance of the warning needs to be evaluated. Unfortunately, as mentioned above, there are many instances of this warning in many different configurations, which can break the build when CONFIG_WERROR is enabled.
To avoid introducing new instances of the warnings while cleaning up the disruption for the majority of users, disable these warnings for the default build while leaving them on for W=1 builds.
Cc: stable@vger.kernel.org Closes: https://github.com/ClangBuiltLinux/linux/issues/2002 Link: https://github.com/llvm/llvm-project/commit/8c2ae42b3e1c6aa7c18f873edcebff7c... Acked-by: Yonghong Song yonghong.song@linux.dev Signed-off-by: Nathan Chancellor nathan@kernel.org Acked-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/Makefile.extrawarn | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/scripts/Makefile.extrawarn b/scripts/Makefile.extrawarn index fe327a4532ddb..eb01e8d07e280 100644 --- a/scripts/Makefile.extrawarn +++ b/scripts/Makefile.extrawarn @@ -53,6 +53,8 @@ KBUILD_CFLAGS += $(call cc-disable-warning, pointer-to-enum-cast) KBUILD_CFLAGS += -Wno-tautological-constant-out-of-range-compare KBUILD_CFLAGS += $(call cc-disable-warning, unaligned-access) KBUILD_CFLAGS += $(call cc-disable-warning, cast-function-type-strict) +KBUILD_CFLAGS += -Wno-enum-compare-conditional +KBUILD_CFLAGS += -Wno-enum-enum-conversion endif
endif
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Niklas Cassel cassel@kernel.org
[ Upstream commit 72e34b8593e08a0ee759b7a038e0b178418ea6f8 ]
The commit message in commit fc9a77040b04 ("PCI: designware-ep: Configure Resizable BAR cap to advertise the smallest size") claims that it modifies the Resizable BAR capability to only advertise support for 1 MB size BARs.
However, the commit writes all zeroes to PCI_REBAR_CAP (the register which contains the possible BAR sizes that a BAR be resized to).
According to the spec, it is illegal to not have a bit set in PCI_REBAR_CAP, and 1 MB is the smallest size allowed.
Set bit 4 in PCI_REBAR_CAP, so that we actually advertise support for a 1 MB BAR size.
Before: Capabilities: [2e8 v1] Physical Resizable BAR BAR 0: current size: 1MB BAR 1: current size: 1MB BAR 2: current size: 1MB BAR 3: current size: 1MB BAR 4: current size: 1MB BAR 5: current size: 1MB After: Capabilities: [2e8 v1] Physical Resizable BAR BAR 0: current size: 1MB, supported: 1MB BAR 1: current size: 1MB, supported: 1MB BAR 2: current size: 1MB, supported: 1MB BAR 3: current size: 1MB, supported: 1MB BAR 4: current size: 1MB, supported: 1MB BAR 5: current size: 1MB, supported: 1MB
Fixes: fc9a77040b04 ("PCI: designware-ep: Configure Resizable BAR cap to advertise the smallest size") Link: https://lore.kernel.org/linux-pci/20240307111520.3303774-1-cassel@kernel.org Signed-off-by: Niklas Cassel cassel@kernel.org Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Reviewed-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Cc: stable@vger.kernel.org # 5.2 Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/dwc/pcie-designware-ep.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c index 339318e790e21..8ed1df61f9c7c 100644 --- a/drivers/pci/controller/dwc/pcie-designware-ep.c +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c @@ -662,8 +662,13 @@ int dw_pcie_ep_init_complete(struct dw_pcie_ep *ep) nbars = (reg & PCI_REBAR_CTRL_NBAR_MASK) >> PCI_REBAR_CTRL_NBAR_SHIFT;
+ /* + * PCIe r6.0, sec 7.8.6.2 require us to support at least one + * size in the range from 1 MB to 512 GB. Advertise support + * for 1 MB BAR size only. + */ for (i = 0; i < nbars; i++, offset += PCI_REBAR_CTRL) - dw_pcie_writel_dbi(pci, offset + PCI_REBAR_CAP, 0x0); + dw_pcie_writel_dbi(pci, offset + PCI_REBAR_CAP, BIT(4)); }
dw_pcie_setup(pci);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Williamson alex.williamson@redhat.com
[ Upstream commit fcdc0d3d40bc26c105acf8467f7d9018970944ae ]
irqfds for mask and unmask that are not specifically disabled by the user are leaked. Remove any irqfds during cleanup
Cc: Eric Auger eric.auger@redhat.com Cc: stable@vger.kernel.org Fixes: a7fa7c77cf15 ("vfio/platform: implement IRQ masking/unmasking via an eventfd") Reviewed-by: Kevin Tian kevin.tian@intel.com Reviewed-by: Eric Auger eric.auger@redhat.com Link: https://lore.kernel.org/r/20240308230557.805580-6-alex.williamson@redhat.com Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vfio/platform/vfio_platform_irq.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c index c5b09ec0a3c98..f2893f2fcaabd 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -321,8 +321,11 @@ void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev) { int i;
- for (i = 0; i < vdev->num_irqs; i++) + for (i = 0; i < vdev->num_irqs; i++) { + vfio_virqfd_disable(&vdev->irqs[i].mask); + vfio_virqfd_disable(&vdev->irqs[i].unmask); vfio_set_trigger(vdev, i, -1, NULL); + }
vdev->num_irqs = 0; kfree(vdev->irqs);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
[ Upstream commit b3594573681b53316ec0365332681a30463edfd6 ]
A task can wait on a ring buffer for when it fills up to a specific watermark. The writer will check the minimum watermark that waiters are waiting for and if the ring buffer is past that, it will wake up all the waiters.
The waiters are in a wait loop, and will first check if a signal is pending and then check if the ring buffer is at the desired level where it should break out of the loop.
If a file that uses a ring buffer closes, and there's threads waiting on the ring buffer, it needs to wake up those threads. To do this, a "wait_index" was used.
Before entering the wait loop, the waiter will read the wait_index. On wakeup, it will check if the wait_index is different than when it entered the loop, and will exit the loop if it is. The waker will only need to update the wait_index before waking up the waiters.
This had a couple of bugs. One trivial one and one broken by design.
The trivial bug was that the waiter checked the wait_index after the schedule() call. It had to be checked between the prepare_to_wait() and the schedule() which it was not.
The main bug is that the first check to set the default wait_index will always be outside the prepare_to_wait() and the schedule(). That's because the ring_buffer_wait() doesn't have enough context to know if it should break out of the loop.
The loop itself is not needed, because all the callers to the ring_buffer_wait() also has their own loop, as the callers have a better sense of what the context is to decide whether to break out of the loop or not.
Just have the ring_buffer_wait() block once, and if it gets woken up, exit the function and let the callers decide what to do next.
Link: https://lore.kernel.org/all/CAHk-=whs5MdtNjzFkTyaUy=vHi=qwWgPi0JgTe6OYUYMNSR... Link: https://lore.kernel.org/linux-trace-kernel/20240308202431.792933613@goodmis....
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: linke li lilinke99@qq.com Cc: Rabin Vincent rabin@rab.in Fixes: e30f53aad2202 ("tracing: Do not busy wait in buffer splice") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Stable-dep-of: 761d9473e27f ("ring-buffer: Do not set shortest_full when full target is hit") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/ring_buffer.c | 139 ++++++++++++++++++------------------- 1 file changed, 68 insertions(+), 71 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 4a43b8846b49f..936b560989a3e 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -415,7 +415,6 @@ struct rb_irq_work { struct irq_work work; wait_queue_head_t waiters; wait_queue_head_t full_waiters; - long wait_index; bool waiters_pending; bool full_waiters_pending; bool wakeup_full; @@ -862,14 +861,40 @@ void ring_buffer_wake_waiters(struct trace_buffer *buffer, int cpu) rbwork = &cpu_buffer->irq_work; }
- rbwork->wait_index++; - /* make sure the waiters see the new index */ - smp_wmb(); - /* This can be called in any context */ irq_work_queue(&rbwork->work); }
+static bool rb_watermark_hit(struct trace_buffer *buffer, int cpu, int full) +{ + struct ring_buffer_per_cpu *cpu_buffer; + bool ret = false; + + /* Reads of all CPUs always waits for any data */ + if (cpu == RING_BUFFER_ALL_CPUS) + return !ring_buffer_empty(buffer); + + cpu_buffer = buffer->buffers[cpu]; + + if (!ring_buffer_empty_cpu(buffer, cpu)) { + unsigned long flags; + bool pagebusy; + + if (!full) + return true; + + raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags); + pagebusy = cpu_buffer->reader_page == cpu_buffer->commit_page; + ret = !pagebusy && full_hit(buffer, cpu, full); + + if (!cpu_buffer->shortest_full || + cpu_buffer->shortest_full > full) + cpu_buffer->shortest_full = full; + raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); + } + return ret; +} + /** * ring_buffer_wait - wait for input to the ring buffer * @buffer: buffer to wait on @@ -885,7 +910,6 @@ int ring_buffer_wait(struct trace_buffer *buffer, int cpu, int full) struct ring_buffer_per_cpu *cpu_buffer; DEFINE_WAIT(wait); struct rb_irq_work *work; - long wait_index; int ret = 0;
/* @@ -904,81 +928,54 @@ int ring_buffer_wait(struct trace_buffer *buffer, int cpu, int full) work = &cpu_buffer->irq_work; }
- wait_index = READ_ONCE(work->wait_index); - - while (true) { - if (full) - prepare_to_wait(&work->full_waiters, &wait, TASK_INTERRUPTIBLE); - else - prepare_to_wait(&work->waiters, &wait, TASK_INTERRUPTIBLE); - - /* - * The events can happen in critical sections where - * checking a work queue can cause deadlocks. - * After adding a task to the queue, this flag is set - * only to notify events to try to wake up the queue - * using irq_work. - * - * We don't clear it even if the buffer is no longer - * empty. The flag only causes the next event to run - * irq_work to do the work queue wake up. The worse - * that can happen if we race with !trace_empty() is that - * an event will cause an irq_work to try to wake up - * an empty queue. - * - * There's no reason to protect this flag either, as - * the work queue and irq_work logic will do the necessary - * synchronization for the wake ups. The only thing - * that is necessary is that the wake up happens after - * a task has been queued. It's OK for spurious wake ups. - */ - if (full) - work->full_waiters_pending = true; - else - work->waiters_pending = true; - - if (signal_pending(current)) { - ret = -EINTR; - break; - } - - if (cpu == RING_BUFFER_ALL_CPUS && !ring_buffer_empty(buffer)) - break; - - if (cpu != RING_BUFFER_ALL_CPUS && - !ring_buffer_empty_cpu(buffer, cpu)) { - unsigned long flags; - bool pagebusy; - bool done; - - if (!full) - break; - - raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags); - pagebusy = cpu_buffer->reader_page == cpu_buffer->commit_page; - done = !pagebusy && full_hit(buffer, cpu, full); + if (full) + prepare_to_wait(&work->full_waiters, &wait, TASK_INTERRUPTIBLE); + else + prepare_to_wait(&work->waiters, &wait, TASK_INTERRUPTIBLE);
- if (!cpu_buffer->shortest_full || - cpu_buffer->shortest_full > full) - cpu_buffer->shortest_full = full; - raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); - if (done) - break; - } + /* + * The events can happen in critical sections where + * checking a work queue can cause deadlocks. + * After adding a task to the queue, this flag is set + * only to notify events to try to wake up the queue + * using irq_work. + * + * We don't clear it even if the buffer is no longer + * empty. The flag only causes the next event to run + * irq_work to do the work queue wake up. The worse + * that can happen if we race with !trace_empty() is that + * an event will cause an irq_work to try to wake up + * an empty queue. + * + * There's no reason to protect this flag either, as + * the work queue and irq_work logic will do the necessary + * synchronization for the wake ups. The only thing + * that is necessary is that the wake up happens after + * a task has been queued. It's OK for spurious wake ups. + */ + if (full) + work->full_waiters_pending = true; + else + work->waiters_pending = true;
- schedule(); + if (rb_watermark_hit(buffer, cpu, full)) + goto out;
- /* Make sure to see the new wait index */ - smp_rmb(); - if (wait_index != work->wait_index) - break; + if (signal_pending(current)) { + ret = -EINTR; + goto out; }
+ schedule(); + out: if (full) finish_wait(&work->full_waiters, &wait); else finish_wait(&work->waiters, &wait);
+ if (!ret && !rb_watermark_hit(buffer, cpu, full) && signal_pending(current)) + ret = -EINTR; + return ret; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
[ Upstream commit 761d9473e27f0c8782895013a3e7b52a37c8bcfc ]
The rb_watermark_hit() checks if the amount of data in the ring buffer is above the percentage level passed in by the "full" variable. If it is, it returns true.
But it also sets the "shortest_full" field of the cpu_buffer that informs writers that it needs to call the irq_work if the amount of data on the ring buffer is above the requested amount.
The rb_watermark_hit() always sets the shortest_full even if the amount in the ring buffer is what it wants. As it is not going to wait, because it has what it wants, there's no reason to set shortest_full.
Link: https://lore.kernel.org/linux-trace-kernel/20240312115641.6aa8ba08@gandalf.l...
Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Fixes: 42fb0a1e84ff5 ("tracing/ring-buffer: Have polling block on watermark") Reviewed-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/ring_buffer.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 936b560989a3e..5b665e5991bf7 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -887,9 +887,10 @@ static bool rb_watermark_hit(struct trace_buffer *buffer, int cpu, int full) pagebusy = cpu_buffer->reader_page == cpu_buffer->commit_page; ret = !pagebusy && full_hit(buffer, cpu, full);
- if (!cpu_buffer->shortest_full || - cpu_buffer->shortest_full > full) - cpu_buffer->shortest_full = full; + if (!ret && (!cpu_buffer->shortest_full || + cpu_buffer->shortest_full > full)) { + cpu_buffer->shortest_full = full; + } raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); } return ret;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
[ Upstream commit 68282dd930ea38b068ce2c109d12405f40df3f93 ]
The "shortest_full" variable is used to keep track of the waiter that is waiting for the smallest amount on the ring buffer before being woken up. When a tasks waits on the ring buffer, it passes in a "full" value that is a percentage. 0 means wake up on any data. 1-100 means wake up from 1% to 100% full buffer.
As all waiters are on the same wait queue, the wake up happens for the waiter with the smallest percentage.
The problem is that the smallest_full on the cpu_buffer that stores the smallest amount doesn't get reset when all the waiters are woken up. It does get reset when the ring buffer is reset (echo > /sys/kernel/tracing/trace).
This means that tasks may be woken up more often then when they want to be. Instead, have the shortest_full field get reset just before waking up all the tasks. If the tasks wait again, they will update the shortest_full before sleeping.
Also add locking around setting of shortest_full in the poll logic, and change "work" to "rbwork" to match the variable name for rb_irq_work structures that are used in other places.
Link: https://lore.kernel.org/linux-trace-kernel/20240308202431.948914369@goodmis....
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: linke li lilinke99@qq.com Cc: Rabin Vincent rabin@rab.in Fixes: 2c2b0a78b3739 ("ring-buffer: Add percentage of ring buffer full to wake up reader") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Stable-dep-of: 8145f1c35fa6 ("ring-buffer: Fix full_waiters_pending in poll") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/ring_buffer.c | 30 +++++++++++++++++++++++------- 1 file changed, 23 insertions(+), 7 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 5b665e5991bf7..5465f4c950f27 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -831,8 +831,19 @@ static void rb_wake_up_waiters(struct irq_work *work)
wake_up_all(&rbwork->waiters); if (rbwork->full_waiters_pending || rbwork->wakeup_full) { + /* Only cpu_buffer sets the above flags */ + struct ring_buffer_per_cpu *cpu_buffer = + container_of(rbwork, struct ring_buffer_per_cpu, irq_work); + + /* Called from interrupt context */ + raw_spin_lock(&cpu_buffer->reader_lock); rbwork->wakeup_full = false; rbwork->full_waiters_pending = false; + + /* Waking up all waiters, they will reset the shortest full */ + cpu_buffer->shortest_full = 0; + raw_spin_unlock(&cpu_buffer->reader_lock); + wake_up_all(&rbwork->full_waiters); } } @@ -999,28 +1010,33 @@ __poll_t ring_buffer_poll_wait(struct trace_buffer *buffer, int cpu, struct file *filp, poll_table *poll_table, int full) { struct ring_buffer_per_cpu *cpu_buffer; - struct rb_irq_work *work; + struct rb_irq_work *rbwork;
if (cpu == RING_BUFFER_ALL_CPUS) { - work = &buffer->irq_work; + rbwork = &buffer->irq_work; full = 0; } else { if (!cpumask_test_cpu(cpu, buffer->cpumask)) return EPOLLERR;
cpu_buffer = buffer->buffers[cpu]; - work = &cpu_buffer->irq_work; + rbwork = &cpu_buffer->irq_work; }
if (full) { - poll_wait(filp, &work->full_waiters, poll_table); - work->full_waiters_pending = true; + unsigned long flags; + + poll_wait(filp, &rbwork->full_waiters, poll_table); + + raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags); + rbwork->full_waiters_pending = true; if (!cpu_buffer->shortest_full || cpu_buffer->shortest_full > full) cpu_buffer->shortest_full = full; + raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); } else { - poll_wait(filp, &work->waiters, poll_table); - work->waiters_pending = true; + poll_wait(filp, &rbwork->waiters, poll_table); + rbwork->waiters_pending = true; }
/*
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
[ Upstream commit 8145f1c35fa648da662078efab299c4467b85ad5 ]
If a reader of the ring buffer is doing a poll, and waiting for the ring buffer to hit a specific watermark, there could be a case where it gets into an infinite ping-pong loop.
The poll code has:
rbwork->full_waiters_pending = true; if (!cpu_buffer->shortest_full || cpu_buffer->shortest_full > full) cpu_buffer->shortest_full = full;
The writer will see full_waiters_pending and check if the ring buffer is filled over the percentage of the shortest_full value. If it is, it calls an irq_work to wake up all the waiters.
But the code could get into a circular loop:
CPU 0 CPU 1 ----- ----- [ Poll ] [ shortest_full = 0 ] rbwork->full_waiters_pending = true; if (rbwork->full_waiters_pending && [ buffer percent ] > shortest_full) { rbwork->wakeup_full = true; [ queue_irqwork ]
cpu_buffer->shortest_full = full;
[ IRQ work ] if (rbwork->wakeup_full) { cpu_buffer->shortest_full = 0; wakeup poll waiters; [woken] if ([ buffer percent ] > full) break; rbwork->full_waiters_pending = true; if (rbwork->full_waiters_pending && [ buffer percent ] > shortest_full) { rbwork->wakeup_full = true; [ queue_irqwork ]
cpu_buffer->shortest_full = full;
[ IRQ work ] if (rbwork->wakeup_full) { cpu_buffer->shortest_full = 0; wakeup poll waiters; [woken]
[ Wash, rinse, repeat! ]
In the poll, the shortest_full needs to be set before the full_pending_waiters, as once that is set, the writer will compare the current shortest_full (which is incorrect) to decide to call the irq_work, which will reset the shortest_full (expecting the readers to update it).
Also move the setting of full_waiters_pending after the check if the ring buffer has the required percentage filled. There's no reason to tell the writer to wake up waiters if there are no waiters.
Link: https://lore.kernel.org/linux-trace-kernel/20240312131952.630922155@goodmis....
Cc: stable@vger.kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Andrew Morton akpm@linux-foundation.org Fixes: 42fb0a1e84ff5 ("tracing/ring-buffer: Have polling block on watermark") Reviewed-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/ring_buffer.c | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 5465f4c950f27..fb87def73a3a4 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -1029,16 +1029,32 @@ __poll_t ring_buffer_poll_wait(struct trace_buffer *buffer, int cpu, poll_wait(filp, &rbwork->full_waiters, poll_table);
raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags); - rbwork->full_waiters_pending = true; if (!cpu_buffer->shortest_full || cpu_buffer->shortest_full > full) cpu_buffer->shortest_full = full; raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); - } else { - poll_wait(filp, &rbwork->waiters, poll_table); - rbwork->waiters_pending = true; + if (full_hit(buffer, cpu, full)) + return EPOLLIN | EPOLLRDNORM; + /* + * Only allow full_waiters_pending update to be seen after + * the shortest_full is set. If the writer sees the + * full_waiters_pending flag set, it will compare the + * amount in the ring buffer to shortest_full. If the amount + * in the ring buffer is greater than the shortest_full + * percent, it will call the irq_work handler to wake up + * this list. The irq_handler will reset shortest_full + * back to zero. That's done under the reader_lock, but + * the below smp_mb() makes sure that the update to + * full_waiters_pending doesn't leak up into the above. + */ + smp_mb(); + rbwork->full_waiters_pending = true; + return 0; }
+ poll_wait(filp, &rbwork->waiters, poll_table); + rbwork->waiters_pending = true; + /* * There's a tight race between setting the waiters_pending and * checking if the ring buffer is empty. Once the waiters_pending bit @@ -1054,9 +1070,6 @@ __poll_t ring_buffer_poll_wait(struct trace_buffer *buffer, int cpu, */ smp_mb();
- if (full) - return full_hit(buffer, cpu, full) ? EPOLLIN | EPOLLRDNORM : 0; - if ((cpu == RING_BUFFER_ALL_CPUS && !ring_buffer_empty(buffer)) || (cpu != RING_BUFFER_ALL_CPUS && !ring_buffer_empty_cpu(buffer, cpu))) return EPOLLIN | EPOLLRDNORM;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Anderson sean.anderson@linux.dev
[ Upstream commit 584c2a9184a33a40fceee838f856de3cffa19be3 ]
smp_call_function_single disables IRQs when executing the callback. To prevent deadlocks, we must disable IRQs when taking cgr_lock elsewhere. This is already done by qman_update_cgr and qman_delete_cgr; fix the other lockers.
Fixes: 96f413f47677 ("soc/fsl/qbman: fix issue in qman_delete_cgr_safe()") CC: stable@vger.kernel.org Signed-off-by: Sean Anderson sean.anderson@linux.dev Reviewed-by: Camelia Groza camelia.groza@nxp.com Tested-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/fsl/qbman/qman.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c index feb97470699d9..d8267e6c31a50 100644 --- a/drivers/soc/fsl/qbman/qman.c +++ b/drivers/soc/fsl/qbman/qman.c @@ -1456,11 +1456,11 @@ static void qm_congestion_task(struct work_struct *work) union qm_mc_result *mcr; struct qman_cgr *cgr;
- spin_lock(&p->cgr_lock); + spin_lock_irq(&p->cgr_lock); qm_mc_start(&p->p); qm_mc_commit(&p->p, QM_MCC_VERB_QUERYCONGESTION); if (!qm_mc_result_timeout(&p->p, &mcr)) { - spin_unlock(&p->cgr_lock); + spin_unlock_irq(&p->cgr_lock); dev_crit(p->config->dev, "QUERYCONGESTION timeout\n"); qman_p_irqsource_add(p, QM_PIRQ_CSCI); return; @@ -1476,7 +1476,7 @@ static void qm_congestion_task(struct work_struct *work) list_for_each_entry(cgr, &p->cgr_cbs, node) if (cgr->cb && qman_cgrs_get(&c, cgr->cgrid)) cgr->cb(p, cgr, qman_cgrs_get(&rr, cgr->cgrid)); - spin_unlock(&p->cgr_lock); + spin_unlock_irq(&p->cgr_lock); qman_p_irqsource_add(p, QM_PIRQ_CSCI); }
@@ -2440,7 +2440,7 @@ int qman_create_cgr(struct qman_cgr *cgr, u32 flags, preempt_enable();
cgr->chan = p->config->channel; - spin_lock(&p->cgr_lock); + spin_lock_irq(&p->cgr_lock);
if (opts) { struct qm_mcc_initcgr local_opts = *opts; @@ -2477,7 +2477,7 @@ int qman_create_cgr(struct qman_cgr *cgr, u32 flags, qman_cgrs_get(&p->cgrs[1], cgr->cgrid)) cgr->cb(p, cgr, 1); out: - spin_unlock(&p->cgr_lock); + spin_unlock_irq(&p->cgr_lock); put_affine_portal(); return ret; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Anderson sean.anderson@seco.com
[ Upstream commit d0e17a4653cebc2c8a20251c837dd1fcec5014d9 ]
This breaks out/combines get_affine_portal and the cgr sanity check in preparation for the next commit. No functional change intended.
Signed-off-by: Sean Anderson sean.anderson@seco.com Acked-by: Camelia Groza camelia.groza@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: fbec4e7fed89 ("soc: fsl: qbman: Use raw spinlock for cgr_lock") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/fsl/qbman/qman.c | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-)
diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c index d8267e6c31a50..d581576c9a861 100644 --- a/drivers/soc/fsl/qbman/qman.c +++ b/drivers/soc/fsl/qbman/qman.c @@ -2483,13 +2483,8 @@ int qman_create_cgr(struct qman_cgr *cgr, u32 flags, } EXPORT_SYMBOL(qman_create_cgr);
-int qman_delete_cgr(struct qman_cgr *cgr) +static struct qman_portal *qman_cgr_get_affine_portal(struct qman_cgr *cgr) { - unsigned long irqflags; - struct qm_mcr_querycgr cgr_state; - struct qm_mcc_initcgr local_opts; - int ret = 0; - struct qman_cgr *i; struct qman_portal *p = get_affine_portal();
if (cgr->chan != p->config->channel) { @@ -2497,10 +2492,25 @@ int qman_delete_cgr(struct qman_cgr *cgr) dev_err(p->config->dev, "CGR not owned by current portal"); dev_dbg(p->config->dev, " create 0x%x, delete 0x%x\n", cgr->chan, p->config->channel); - - ret = -EINVAL; - goto put_portal; + put_affine_portal(); + return NULL; } + + return p; +} + +int qman_delete_cgr(struct qman_cgr *cgr) +{ + unsigned long irqflags; + struct qm_mcr_querycgr cgr_state; + struct qm_mcc_initcgr local_opts; + int ret = 0; + struct qman_cgr *i; + struct qman_portal *p = qman_cgr_get_affine_portal(cgr); + + if (!p) + return -EINVAL; + memset(&local_opts, 0, sizeof(struct qm_mcc_initcgr)); spin_lock_irqsave(&p->cgr_lock, irqflags); list_del(&cgr->node); @@ -2528,7 +2538,6 @@ int qman_delete_cgr(struct qman_cgr *cgr) list_add(&cgr->node, &p->cgr_cbs); release_lock: spin_unlock_irqrestore(&p->cgr_lock, irqflags); -put_portal: put_affine_portal(); return ret; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Anderson sean.anderson@seco.com
[ Upstream commit 914f8b228ede709274b8c80514b352248ec9da00 ]
This adds a function to update a CGR with new parameters. qman_create_cgr can almost be used for this (with flags=0), but it's not suitable because it also registers the callback function. The _safe variant was modeled off of qman_cgr_delete_safe. However, we handle multiple arguments and a return value.
Signed-off-by: Sean Anderson sean.anderson@seco.com Acked-by: Camelia Groza camelia.groza@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: fbec4e7fed89 ("soc: fsl: qbman: Use raw spinlock for cgr_lock") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/fsl/qbman/qman.c | 48 ++++++++++++++++++++++++++++++++++++ include/soc/fsl/qman.h | 9 +++++++ 2 files changed, 57 insertions(+)
diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c index d581576c9a861..3346593e0bc62 100644 --- a/drivers/soc/fsl/qbman/qman.c +++ b/drivers/soc/fsl/qbman/qman.c @@ -2568,6 +2568,54 @@ void qman_delete_cgr_safe(struct qman_cgr *cgr) } EXPORT_SYMBOL(qman_delete_cgr_safe);
+static int qman_update_cgr(struct qman_cgr *cgr, struct qm_mcc_initcgr *opts) +{ + int ret; + unsigned long irqflags; + struct qman_portal *p = qman_cgr_get_affine_portal(cgr); + + if (!p) + return -EINVAL; + + spin_lock_irqsave(&p->cgr_lock, irqflags); + ret = qm_modify_cgr(cgr, 0, opts); + spin_unlock_irqrestore(&p->cgr_lock, irqflags); + put_affine_portal(); + return ret; +} + +struct update_cgr_params { + struct qman_cgr *cgr; + struct qm_mcc_initcgr *opts; + int ret; +}; + +static void qman_update_cgr_smp_call(void *p) +{ + struct update_cgr_params *params = p; + + params->ret = qman_update_cgr(params->cgr, params->opts); +} + +int qman_update_cgr_safe(struct qman_cgr *cgr, struct qm_mcc_initcgr *opts) +{ + struct update_cgr_params params = { + .cgr = cgr, + .opts = opts, + }; + + preempt_disable(); + if (qman_cgr_cpus[cgr->cgrid] != smp_processor_id()) + smp_call_function_single(qman_cgr_cpus[cgr->cgrid], + qman_update_cgr_smp_call, ¶ms, + true); + else + params.ret = qman_update_cgr(cgr, opts); + preempt_enable(); + return params.ret; +} +EXPORT_SYMBOL(qman_update_cgr_safe); + /* Cleanup FQs */
static int _qm_mr_consume_and_match_verb(struct qm_portal *p, int v) diff --git a/include/soc/fsl/qman.h b/include/soc/fsl/qman.h index 9f484113cfda7..3cecbfdb0f8c2 100644 --- a/include/soc/fsl/qman.h +++ b/include/soc/fsl/qman.h @@ -1170,6 +1170,15 @@ int qman_delete_cgr(struct qman_cgr *cgr); */ void qman_delete_cgr_safe(struct qman_cgr *cgr);
+/** + * qman_update_cgr_safe - Modifies a congestion group object from any CPU + * @cgr: the 'cgr' object to modify + * @opts: state of the CGR settings + * + * This will select the proper CPU and modify the CGR settings. + */ +int qman_update_cgr_safe(struct qman_cgr *cgr, struct qm_mcc_initcgr *opts); + /** * qman_query_cgr_congested - Queries CGR's congestion status * @cgr: the 'cgr' object to query
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Anderson sean.anderson@linux.dev
[ Upstream commit fbec4e7fed89b579f2483041fabf9650fb0dd6bc ]
smp_call_function always runs its callback in hard IRQ context, even on PREEMPT_RT, where spinlocks can sleep. So we need to use a raw spinlock for cgr_lock to ensure we aren't waiting on a sleeping task.
Although this bug has existed for a while, it was not apparent until commit ef2a8d5478b9 ("net: dpaa: Adjust queue depth on rate change") which invokes smp_call_function_single via qman_update_cgr_safe every time a link goes up or down.
Fixes: 96f413f47677 ("soc/fsl/qbman: fix issue in qman_delete_cgr_safe()") CC: stable@vger.kernel.org Reported-by: Vladimir Oltean vladimir.oltean@nxp.com Closes: https://lore.kernel.org/all/20230323153935.nofnjucqjqnz34ej@skbuf/ Reported-by: Steffen Trumtrar s.trumtrar@pengutronix.de Closes: https://lore.kernel.org/linux-arm-kernel/87wmsyvclu.fsf@pengutronix.de/ Signed-off-by: Sean Anderson sean.anderson@linux.dev Reviewed-by: Camelia Groza camelia.groza@nxp.com Tested-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/fsl/qbman/qman.c | 25 ++++++++++++++----------- 1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c index 3346593e0bc62..7abc9b6a04ab6 100644 --- a/drivers/soc/fsl/qbman/qman.c +++ b/drivers/soc/fsl/qbman/qman.c @@ -991,7 +991,7 @@ struct qman_portal { /* linked-list of CSCN handlers. */ struct list_head cgr_cbs; /* list lock */ - spinlock_t cgr_lock; + raw_spinlock_t cgr_lock; struct work_struct congestion_work; struct work_struct mr_work; char irqname[MAX_IRQNAME]; @@ -1281,7 +1281,7 @@ static int qman_create_portal(struct qman_portal *portal, /* if the given mask is NULL, assume all CGRs can be seen */ qman_cgrs_fill(&portal->cgrs[0]); INIT_LIST_HEAD(&portal->cgr_cbs); - spin_lock_init(&portal->cgr_lock); + raw_spin_lock_init(&portal->cgr_lock); INIT_WORK(&portal->congestion_work, qm_congestion_task); INIT_WORK(&portal->mr_work, qm_mr_process_task); portal->bits = 0; @@ -1456,11 +1456,14 @@ static void qm_congestion_task(struct work_struct *work) union qm_mc_result *mcr; struct qman_cgr *cgr;
- spin_lock_irq(&p->cgr_lock); + /* + * FIXME: QM_MCR_TIMEOUT is 10ms, which is too long for a raw spinlock! + */ + raw_spin_lock_irq(&p->cgr_lock); qm_mc_start(&p->p); qm_mc_commit(&p->p, QM_MCC_VERB_QUERYCONGESTION); if (!qm_mc_result_timeout(&p->p, &mcr)) { - spin_unlock_irq(&p->cgr_lock); + raw_spin_unlock_irq(&p->cgr_lock); dev_crit(p->config->dev, "QUERYCONGESTION timeout\n"); qman_p_irqsource_add(p, QM_PIRQ_CSCI); return; @@ -1476,7 +1479,7 @@ static void qm_congestion_task(struct work_struct *work) list_for_each_entry(cgr, &p->cgr_cbs, node) if (cgr->cb && qman_cgrs_get(&c, cgr->cgrid)) cgr->cb(p, cgr, qman_cgrs_get(&rr, cgr->cgrid)); - spin_unlock_irq(&p->cgr_lock); + raw_spin_unlock_irq(&p->cgr_lock); qman_p_irqsource_add(p, QM_PIRQ_CSCI); }
@@ -2440,7 +2443,7 @@ int qman_create_cgr(struct qman_cgr *cgr, u32 flags, preempt_enable();
cgr->chan = p->config->channel; - spin_lock_irq(&p->cgr_lock); + raw_spin_lock_irq(&p->cgr_lock);
if (opts) { struct qm_mcc_initcgr local_opts = *opts; @@ -2477,7 +2480,7 @@ int qman_create_cgr(struct qman_cgr *cgr, u32 flags, qman_cgrs_get(&p->cgrs[1], cgr->cgrid)) cgr->cb(p, cgr, 1); out: - spin_unlock_irq(&p->cgr_lock); + raw_spin_unlock_irq(&p->cgr_lock); put_affine_portal(); return ret; } @@ -2512,7 +2515,7 @@ int qman_delete_cgr(struct qman_cgr *cgr) return -EINVAL;
memset(&local_opts, 0, sizeof(struct qm_mcc_initcgr)); - spin_lock_irqsave(&p->cgr_lock, irqflags); + raw_spin_lock_irqsave(&p->cgr_lock, irqflags); list_del(&cgr->node); /* * If there are no other CGR objects for this CGRID in the list, @@ -2537,7 +2540,7 @@ int qman_delete_cgr(struct qman_cgr *cgr) /* add back to the list */ list_add(&cgr->node, &p->cgr_cbs); release_lock: - spin_unlock_irqrestore(&p->cgr_lock, irqflags); + raw_spin_unlock_irqrestore(&p->cgr_lock, irqflags); put_affine_portal(); return ret; } @@ -2577,9 +2580,9 @@ static int qman_update_cgr(struct qman_cgr *cgr, struct qm_mcc_initcgr *opts) if (!p) return -EINVAL;
- spin_lock_irqsave(&p->cgr_lock, irqflags); + raw_spin_lock_irqsave(&p->cgr_lock, irqflags); ret = qm_modify_cgr(cgr, 0, opts); - spin_unlock_irqrestore(&p->cgr_lock, irqflags); + raw_spin_unlock_irqrestore(&p->cgr_lock, irqflags); put_affine_portal(); return ret; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harald Freudenberger freude@linux.ibm.com
[ Upstream commit 50ed48c80fecbe17218afed4f8bed005c802976c ]
Tests with hot-plugging crytpo cards on KVM guests with debug kernel build revealed an use after free for the load field of the struct zcrypt_card. The reason was an incorrect reference handling of the zcrypt card object which could lead to a free of the zcrypt card object while it was still in use.
This is an example of the slab message:
kernel: 0x00000000885a7512-0x00000000885a7513 @offset=1298. First byte 0x68 instead of 0x6b kernel: Allocated in zcrypt_card_alloc+0x36/0x70 [zcrypt] age=18046 cpu=3 pid=43 kernel: kmalloc_trace+0x3f2/0x470 kernel: zcrypt_card_alloc+0x36/0x70 [zcrypt] kernel: zcrypt_cex4_card_probe+0x26/0x380 [zcrypt_cex4] kernel: ap_device_probe+0x15c/0x290 kernel: really_probe+0xd2/0x468 kernel: driver_probe_device+0x40/0xf0 kernel: __device_attach_driver+0xc0/0x140 kernel: bus_for_each_drv+0x8c/0xd0 kernel: __device_attach+0x114/0x198 kernel: bus_probe_device+0xb4/0xc8 kernel: device_add+0x4d2/0x6e0 kernel: ap_scan_adapter+0x3d0/0x7c0 kernel: ap_scan_bus+0x5a/0x3b0 kernel: ap_scan_bus_wq_callback+0x40/0x60 kernel: process_one_work+0x26e/0x620 kernel: worker_thread+0x21c/0x440 kernel: Freed in zcrypt_card_put+0x54/0x80 [zcrypt] age=9024 cpu=3 pid=43 kernel: kfree+0x37e/0x418 kernel: zcrypt_card_put+0x54/0x80 [zcrypt] kernel: ap_device_remove+0x4c/0xe0 kernel: device_release_driver_internal+0x1c4/0x270 kernel: bus_remove_device+0x100/0x188 kernel: device_del+0x164/0x3c0 kernel: device_unregister+0x30/0x90 kernel: ap_scan_adapter+0xc8/0x7c0 kernel: ap_scan_bus+0x5a/0x3b0 kernel: ap_scan_bus_wq_callback+0x40/0x60 kernel: process_one_work+0x26e/0x620 kernel: worker_thread+0x21c/0x440 kernel: kthread+0x150/0x168 kernel: __ret_from_fork+0x3c/0x58 kernel: ret_from_fork+0xa/0x30 kernel: Slab 0x00000372022169c0 objects=20 used=18 fp=0x00000000885a7c88 flags=0x3ffff00000000a00(workingset|slab|node=0|zone=1|lastcpupid=0x1ffff) kernel: Object 0x00000000885a74b8 @offset=1208 fp=0x00000000885a7c88 kernel: Redzone 00000000885a74b0: bb bb bb bb bb bb bb bb ........ kernel: Object 00000000885a74b8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a74c8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a74d8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a74e8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a74f8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a7508: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 68 4b 6b 6b 6b a5 kkkkkkkkkkhKkkk. kernel: Redzone 00000000885a7518: bb bb bb bb bb bb bb bb ........ kernel: Padding 00000000885a756c: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZ kernel: CPU: 0 PID: 387 Comm: systemd-udevd Not tainted 6.8.0-HF #2 kernel: Hardware name: IBM 3931 A01 704 (KVM/Linux) kernel: Call Trace: kernel: [<00000000ca5ab5b8>] dump_stack_lvl+0x90/0x120 kernel: [<00000000c99d78bc>] check_bytes_and_report+0x114/0x140 kernel: [<00000000c99d53cc>] check_object+0x334/0x3f8 kernel: [<00000000c99d820c>] alloc_debug_processing+0xc4/0x1f8 kernel: [<00000000c99d852e>] get_partial_node.part.0+0x1ee/0x3e0 kernel: [<00000000c99d94ec>] ___slab_alloc+0xaf4/0x13c8 kernel: [<00000000c99d9e38>] __slab_alloc.constprop.0+0x78/0xb8 kernel: [<00000000c99dc8dc>] __kmalloc+0x434/0x590 kernel: [<00000000c9b4c0ce>] ext4_htree_store_dirent+0x4e/0x1c0 kernel: [<00000000c9b908a2>] htree_dirblock_to_tree+0x17a/0x3f0 kernel: [<00000000c9b919dc>] ext4_htree_fill_tree+0x134/0x400 kernel: [<00000000c9b4b3d0>] ext4_dx_readdir+0x160/0x2f0 kernel: [<00000000c9b4bedc>] ext4_readdir+0x5f4/0x760 kernel: [<00000000c9a7efc4>] iterate_dir+0xb4/0x280 kernel: [<00000000c9a7f1ea>] __do_sys_getdents64+0x5a/0x120 kernel: [<00000000ca5d6946>] __do_syscall+0x256/0x310 kernel: [<00000000ca5eea10>] system_call+0x70/0x98 kernel: INFO: lockdep is turned off. kernel: FIX kmalloc-96: Restoring Poison 0x00000000885a7512-0x00000000885a7513=0x6b kernel: FIX kmalloc-96: Marking all objects used
The fix is simple: Before use of the queue not only the queue object but also the card object needs to increase it's reference count with a call to zcrypt_card_get(). Similar after use of the queue not only the queue but also the card object's reference count is decreased with zcrypt_card_put().
Signed-off-by: Harald Freudenberger freude@linux.ibm.com Reviewed-by: Holger Dengler dengler@linux.ibm.com Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/s390/crypto/zcrypt_api.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/s390/crypto/zcrypt_api.c b/drivers/s390/crypto/zcrypt_api.c index b518009715eeb..dde10a5c7d509 100644 --- a/drivers/s390/crypto/zcrypt_api.c +++ b/drivers/s390/crypto/zcrypt_api.c @@ -576,6 +576,7 @@ static inline struct zcrypt_queue *zcrypt_pick_queue(struct zcrypt_card *zc, { if (!zq || !try_module_get(zq->queue->ap_dev.drv->driver.owner)) return NULL; + zcrypt_card_get(zc); zcrypt_queue_get(zq); get_device(&zq->queue->ap_dev.device); atomic_add(weight, &zc->load); @@ -595,6 +596,7 @@ static inline void zcrypt_drop_queue(struct zcrypt_card *zc, atomic_sub(weight, &zq->load); put_device(&zq->queue->ap_dev.device); zcrypt_queue_put(zq); + zcrypt_card_put(zc); module_put(mod); }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula jani.nikula@intel.com
[ Upstream commit fc4e97726530241d96dd7db72eb65979217422c9 ]
None of the callers of drm_panel_get_modes() expect it to return negative error codes. Either they propagate the return value in their struct drm_connector_helper_funcs .get_modes() hook (which is also not supposed to return negative codes), or add it to other counts leading to bogus values.
On the other hand, many of the struct drm_panel_funcs .get_modes() hooks do return negative error codes, so handle them gracefully instead of propagating further.
Return 0 for no modes, whatever the reason.
Cc: Neil Armstrong neil.armstrong@linaro.org Cc: Jessica Zhang quic_jesszhan@quicinc.com Cc: Sam Ravnborg sam@ravnborg.org Cc: stable@vger.kernel.org Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Reviewed-by: Jessica Zhang quic_jesszhan@quicinc.com Acked-by: Thomas Zimmermann tzimmermann@suse.de Link: https://patchwork.freedesktop.org/patch/msgid/79f559b72d8c493940417304e222a4... Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/drm_panel.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/drm_panel.c b/drivers/gpu/drm/drm_panel.c index f634371c717a8..7fd3de89ed079 100644 --- a/drivers/gpu/drm/drm_panel.c +++ b/drivers/gpu/drm/drm_panel.c @@ -207,19 +207,24 @@ EXPORT_SYMBOL(drm_panel_disable); * The modes probed from the panel are automatically added to the connector * that the panel is attached to. * - * Return: The number of modes available from the panel on success or a - * negative error code on failure. + * Return: The number of modes available from the panel on success, or 0 on + * failure (no modes). */ int drm_panel_get_modes(struct drm_panel *panel, struct drm_connector *connector) { if (!panel) - return -EINVAL; + return 0;
- if (panel->funcs && panel->funcs->get_modes) - return panel->funcs->get_modes(panel, connector); + if (panel->funcs && panel->funcs->get_modes) { + int num;
- return -EOPNOTSUPP; + num = panel->funcs->get_modes(panel, connector); + if (num > 0) + return num; + } + + return 0; } EXPORT_SYMBOL(drm_panel_get_modes);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula jani.nikula@intel.com
[ Upstream commit 13d5b040363c7ec0ac29c2de9cf661a24a8aa531 ]
The .get_modes() hooks aren't supposed to return negative error codes. Return 0 for no modes, whatever the reason.
Cc: Inki Dae inki.dae@samsung.com Cc: Seung-Woo Kim sw0312.kim@samsung.com Cc: Kyungmin Park kyungmin.park@samsung.com Cc: stable@vger.kernel.org Acked-by: Thomas Zimmermann tzimmermann@suse.de Link: https://patchwork.freedesktop.org/patch/msgid/d8665f620d9c252aa7d5a4811ff6b1... Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/exynos/exynos_drm_vidi.c | 4 ++-- drivers/gpu/drm/exynos/exynos_hdmi.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/exynos/exynos_drm_vidi.c b/drivers/gpu/drm/exynos/exynos_drm_vidi.c index e96436e11a36c..e1ffe8a28b649 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_vidi.c +++ b/drivers/gpu/drm/exynos/exynos_drm_vidi.c @@ -315,14 +315,14 @@ static int vidi_get_modes(struct drm_connector *connector) */ if (!ctx->raw_edid) { DRM_DEV_DEBUG_KMS(ctx->dev, "raw_edid is null.\n"); - return -EFAULT; + return 0; }
edid_len = (1 + ctx->raw_edid->extensions) * EDID_LENGTH; edid = kmemdup(ctx->raw_edid, edid_len, GFP_KERNEL); if (!edid) { DRM_DEV_DEBUG_KMS(ctx->dev, "failed to allocate edid\n"); - return -ENOMEM; + return 0; }
drm_connector_update_edid_property(connector, edid); diff --git a/drivers/gpu/drm/exynos/exynos_hdmi.c b/drivers/gpu/drm/exynos/exynos_hdmi.c index 981bffacda243..576fcf1807164 100644 --- a/drivers/gpu/drm/exynos/exynos_hdmi.c +++ b/drivers/gpu/drm/exynos/exynos_hdmi.c @@ -878,11 +878,11 @@ static int hdmi_get_modes(struct drm_connector *connector) int ret;
if (!hdata->ddc_adpt) - return -ENODEV; + return 0;
edid = drm_get_edid(connector, hdata->ddc_adpt); if (!edid) - return -ENODEV; + return 0;
hdata->dvi_mode = !drm_detect_hdmi_monitor(edid); DRM_DEV_DEBUG_KMS(hdata->dev, "%s : width[%d] x height[%d]\n",
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula jani.nikula@intel.com
[ Upstream commit c2da9ada64962fcd2e6395ed9987b9874ea032d3 ]
The .get_modes() hooks aren't supposed to return negative error codes. Return 0 for no modes, whatever the reason.
Cc: Philipp Zabel p.zabel@pengutronix.de Cc: stable@vger.kernel.org Acked-by: Philipp Zabel p.zabel@pengutronix.de Acked-by: Thomas Zimmermann tzimmermann@suse.de Link: https://patchwork.freedesktop.org/patch/msgid/311f6eec96d47949b16a670529f4d8... Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/imx/parallel-display.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/imx/parallel-display.c b/drivers/gpu/drm/imx/parallel-display.c index b61bfa84b6bbd..bcd6b9ee8eea6 100644 --- a/drivers/gpu/drm/imx/parallel-display.c +++ b/drivers/gpu/drm/imx/parallel-display.c @@ -65,14 +65,14 @@ static int imx_pd_connector_get_modes(struct drm_connector *connector) int ret;
if (!mode) - return -EINVAL; + return 0;
ret = of_get_drm_display_mode(np, &imxpd->mode, &imxpd->bus_flags, OF_USE_NATIVE_MODE); if (ret) { drm_mode_destroy(connector->dev, mode); - return ret; + return 0; }
drm_mode_copy(mode, &imxpd->mode);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula jani.nikula@intel.com
[ Upstream commit abf493988e380f25242c1023275c68bd3579c9ce ]
The .get_modes() hooks aren't supposed to return negative error codes. Return 0 for no modes, whatever the reason.
Cc: Maxime Ripard mripard@kernel.org Cc: stable@vger.kernel.org Acked-by: Maxime Ripard mripard@kernel.org Acked-by: Thomas Zimmermann tzimmermann@suse.de Link: https://patchwork.freedesktop.org/patch/msgid/dcda6d4003e2c6192987916b35c730... Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vc4/vc4_hdmi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c index 7e8620838de9c..6d01258349faa 100644 --- a/drivers/gpu/drm/vc4/vc4_hdmi.c +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c @@ -197,7 +197,7 @@ static int vc4_hdmi_connector_get_modes(struct drm_connector *connector) edid = drm_get_edid(connector, vc4_hdmi->ddc); cec_s_phys_addr_from_edid(vc4_hdmi->cec_adap, edid); if (!edid) - return -ENODEV; + return 0;
vc4_encoder->hdmi_monitor = drm_detect_hdmi_monitor(edid);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qiang Zhang qiang4.zhang@intel.com
[ Upstream commit 82634d7e24271698e50a3ec811e5f50de790a65f ]
memtest failed to find bad memory when compiled with clang. So use {WRITE,READ}_ONCE to access memory to avoid compiler over optimization.
Link: https://lkml.kernel.org/r/20240312080422.691222-1-qiang4.zhang@intel.com Signed-off-by: Qiang Zhang qiang4.zhang@intel.com Cc: Bill Wendling morbo@google.com Cc: Justin Stitt justinstitt@google.com Cc: Nathan Chancellor nathan@kernel.org Cc: Nick Desaulniers ndesaulniers@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/memtest.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/memtest.c b/mm/memtest.c index f53ace709ccd8..d407373f225b4 100644 --- a/mm/memtest.c +++ b/mm/memtest.c @@ -46,10 +46,10 @@ static void __init memtest(u64 pattern, phys_addr_t start_phys, phys_addr_t size last_bad = 0;
for (p = start; p < end; p++) - *p = pattern; + WRITE_ONCE(*p, pattern);
for (p = start; p < end; p++, start_phys_aligned += incr) { - if (*p == pattern) + if (READ_ONCE(*p) == pattern) continue; if (start_phys_aligned == last_bad + incr) { last_bad += incr;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryusuke Konishi konishi.ryusuke@gmail.com
[ Upstream commit f2f26b4a84a0ef41791bd2d70861c8eac748f4ba ]
Patch series "nilfs2: fix kernel bug at submit_bh_wbc()".
This resolves a kernel BUG reported by syzbot. Since there are two flaws involved, I've made each one a separate patch.
The first patch alone resolves the syzbot-reported bug, but I think both fixes should be sent to stable, so I've tagged them as such.
This patch (of 2):
Syzbot has reported a kernel bug in submit_bh_wbc() when writing file data to a nilfs2 file system whose metadata is corrupted.
There are two flaws involved in this issue.
The first flaw is that when nilfs_get_block() locates a data block using btree or direct mapping, if the disk address translation routine nilfs_dat_translate() fails with internal code -ENOENT due to DAT metadata corruption, it can be passed back to nilfs_get_block(). This causes nilfs_get_block() to misidentify an existing block as non-existent, causing both data block lookup and insertion to fail inconsistently.
The second flaw is that nilfs_get_block() returns a successful status in this inconsistent state. This causes the caller __block_write_begin_int() or others to request a read even though the buffer is not mapped, resulting in a BUG_ON check for the BH_Mapped flag in submit_bh_wbc() failing.
This fixes the first issue by changing the return value to code -EINVAL when a conversion using DAT fails with code -ENOENT, avoiding the conflicting condition that leads to the kernel bug described above. Here, code -EINVAL indicates that metadata corruption was detected during the block lookup, which will be properly handled as a file system error and converted to -EIO when passing through the nilfs2 bmap layer.
Link: https://lkml.kernel.org/r/20240313105827.5296-1-konishi.ryusuke@gmail.com Link: https://lkml.kernel.org/r/20240313105827.5296-2-konishi.ryusuke@gmail.com Fixes: c3a7abf06ce7 ("nilfs2: support contiguous lookup of blocks") Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Reported-by: syzbot+cfed5b56649bddf80d6e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=cfed5b56649bddf80d6e Tested-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/btree.c | 9 +++++++-- fs/nilfs2/direct.c | 9 +++++++-- 2 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/fs/nilfs2/btree.c b/fs/nilfs2/btree.c index 65cd599cb2ab6..4905b7cd7bf33 100644 --- a/fs/nilfs2/btree.c +++ b/fs/nilfs2/btree.c @@ -724,7 +724,7 @@ static int nilfs_btree_lookup_contig(const struct nilfs_bmap *btree, dat = nilfs_bmap_get_dat(btree); ret = nilfs_dat_translate(dat, ptr, &blocknr); if (ret < 0) - goto out; + goto dat_error; ptr = blocknr; } cnt = 1; @@ -743,7 +743,7 @@ static int nilfs_btree_lookup_contig(const struct nilfs_bmap *btree, if (dat) { ret = nilfs_dat_translate(dat, ptr2, &blocknr); if (ret < 0) - goto out; + goto dat_error; ptr2 = blocknr; } if (ptr2 != ptr + cnt || ++cnt == maxblocks) @@ -782,6 +782,11 @@ static int nilfs_btree_lookup_contig(const struct nilfs_bmap *btree, out: nilfs_btree_free_path(path); return ret; + + dat_error: + if (ret == -ENOENT) + ret = -EINVAL; /* Notify bmap layer of metadata corruption */ + goto out; }
static void nilfs_btree_promote_key(struct nilfs_bmap *btree, diff --git a/fs/nilfs2/direct.c b/fs/nilfs2/direct.c index f353101955e3b..7faf8c285d6c9 100644 --- a/fs/nilfs2/direct.c +++ b/fs/nilfs2/direct.c @@ -66,7 +66,7 @@ static int nilfs_direct_lookup_contig(const struct nilfs_bmap *direct, dat = nilfs_bmap_get_dat(direct); ret = nilfs_dat_translate(dat, ptr, &blocknr); if (ret < 0) - return ret; + goto dat_error; ptr = blocknr; }
@@ -79,7 +79,7 @@ static int nilfs_direct_lookup_contig(const struct nilfs_bmap *direct, if (dat) { ret = nilfs_dat_translate(dat, ptr2, &blocknr); if (ret < 0) - return ret; + goto dat_error; ptr2 = blocknr; } if (ptr2 != ptr + cnt) @@ -87,6 +87,11 @@ static int nilfs_direct_lookup_contig(const struct nilfs_bmap *direct, } *ptrp = ptr; return cnt; + + dat_error: + if (ret == -ENOENT) + ret = -EINVAL; /* Notify bmap layer of metadata corruption */ + return ret; }
static __u64
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryusuke Konishi konishi.ryusuke@gmail.com
[ Upstream commit 269cdf353b5bdd15f1a079671b0f889113865f20 ]
Fix a bug where nilfs_get_block() returns a successful status when searching and inserting the specified block both fail inconsistently. If this inconsistent behavior is not due to a previously fixed bug, then an unexpected race is occurring, so return a temporary error -EAGAIN instead.
This prevents callers such as __block_write_begin_int() from requesting a read into a buffer that is not mapped, which would cause the BUG_ON check for the BH_Mapped flag in submit_bh_wbc() to fail.
Link: https://lkml.kernel.org/r/20240313105827.5296-3-konishi.ryusuke@gmail.com Fixes: 1f5abe7e7dbc ("nilfs2: replace BUG_ON and BUG calls triggerable from ioctl") Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c index d144e08a9003a..06f4deb550c9f 100644 --- a/fs/nilfs2/inode.c +++ b/fs/nilfs2/inode.c @@ -112,7 +112,7 @@ int nilfs_get_block(struct inode *inode, sector_t blkoff, "%s (ino=%lu): a race condition while inserting a data block at offset=%llu", __func__, inode->i_ino, (unsigned long long)blkoff); - err = 0; + err = -EAGAIN; } nilfs_transaction_abort(inode->i_sb); goto out;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Szyprowski m.szyprowski@samsung.com
[ Upstream commit d2399501c2c081eac703ca9597ceb83c7875a537 ]
Commit 0499a78369ad ("ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512") changed the handling of cpumasks on ARM 64bit, what resulted in the strange issues and warnings during cpufreq-dt initialization on some big.LITTLE platforms.
This was caused by mixing OPPs between big and LITTLE cores, because OPP-sharing information between big and LITTLE cores is computed on cpumask, which in turn was not zeroed on allocation. Fix this by switching to zalloc_cpumask_var() call.
Fixes: dc279ac6e5b4 ("cpufreq: dt: Refactor initialization to handle probe deferral properly") CC: stable@vger.kernel.org # v5.10+ Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com Reviewed-by: Christoph Lameter (Ampere) cl@linux.com Reviewed-by: Dhruva Gole d-gole@ti.com Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpufreq/cpufreq-dt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c index e363ae04aac62..44cc596ca0a5e 100644 --- a/drivers/cpufreq/cpufreq-dt.c +++ b/drivers/cpufreq/cpufreq-dt.c @@ -251,7 +251,7 @@ static int dt_cpufreq_early_init(struct device *dev, int cpu) if (!priv) return -ENOMEM;
- if (!alloc_cpumask_var(&priv->cpus, GFP_KERNEL)) + if (!zalloc_cpumask_var(&priv->cpus, GFP_KERNEL)) return -ENOMEM;
priv->cpu_dev = cpu_dev;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov (AMD) bp@alien8.de
[ Upstream commit 5c84b051bd4e777cf37aaff983277e58c99618d5 ]
Update them to the correct revision numbers.
Fixes: 522b1d69219d ("x86/cpu/amd: Add a Zenbleed fix") Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Cc: stable@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/cpu/amd.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index f29c6bed9d657..3b02cb8b05338 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -1049,11 +1049,11 @@ static bool cpu_has_zenbleed_microcode(void) u32 good_rev = 0;
switch (boot_cpu_data.x86_model) { - case 0x30 ... 0x3f: good_rev = 0x0830107a; break; - case 0x60 ... 0x67: good_rev = 0x0860010b; break; - case 0x68 ... 0x6f: good_rev = 0x08608105; break; - case 0x70 ... 0x7f: good_rev = 0x08701032; break; - case 0xa0 ... 0xaf: good_rev = 0x08a00008; break; + case 0x30 ... 0x3f: good_rev = 0x0830107b; break; + case 0x60 ... 0x67: good_rev = 0x0860010c; break; + case 0x68 ... 0x6f: good_rev = 0x08608107; break; + case 0x70 ... 0x7f: good_rev = 0x08701033; break; + case 0xa0 ... 0xaf: good_rev = 0x08a00009; break;
default: return false;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
[ Upstream commit 3f9952e8d80cca2da3b47ecd5ad9ec16cfd1a649 ]
The __string() and __assign_str() helper macros of the TRACE_EVENT() macro are going through some optimizations where only the source string of __string() will be used and the __assign_str() source will be ignored and later removed.
To make sure that there's no issues, a new check is added between the __string() src argument and the __assign_str() src argument that does a strcmp() to make sure they are the same string.
The hclgevf trace events have:
__assign_str(devname, &hdev->nic.kinfo.netdev->name);
Which triggers the warning:
hclgevf_trace.h:34:39: error: passing argument 1 of ‘strcmp’ from incompatible pointer type [-Werror=incompatible-pointer-types] 34 | __assign_str(devname, &hdev->nic.kinfo.netdev->name); [..] arch/x86/include/asm/string_64.h:75:24: note: expected ‘const char *’ but argument is of type ‘char (*)[16]’ 75 | int strcmp(const char *cs, const char *ct); | ~~~~~~~~~~~~^~
Because __assign_str() now has:
WARN_ON_ONCE(__builtin_constant_p(src) ? \ strcmp((src), __data_offsets.dst##_ptr_) : \ (src) != __data_offsets.dst##_ptr_); \
The problem is the '&' on hdev->nic.kinfo.netdev->name. That's because that name is:
char name[IFNAMSIZ]
Where passing an address '&' of a char array is not compatible with strcmp().
The '&' is not necessary, remove it.
Link: https://lore.kernel.org/linux-trace-kernel/20240313093454.3909afe7@gandalf.l...
Cc: netdev netdev@vger.kernel.org Cc: Yisen Zhuang yisen.zhuang@huawei.com Cc: Salil Mehta salil.mehta@huawei.com Cc: "David S. Miller" davem@davemloft.net Cc: Eric Dumazet edumazet@google.com Cc: Jakub Kicinski kuba@kernel.org Cc: Yufeng Mo moyufeng@huawei.com Cc: Huazhong Tan tanhuazhong@huawei.com Cc: stable@vger.kernel.org Acked-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Jijie Shao shaojijie@huawei.com Fixes: d8355240cf8fb ("net: hns3: add trace event support for PF/VF mailbox") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_trace.h | 8 ++++---- .../net/ethernet/hisilicon/hns3/hns3vf/hclgevf_trace.h | 8 ++++---- 2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_trace.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_trace.h index 5b0b71bd61200..e8e67321e9632 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_trace.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_trace.h @@ -24,7 +24,7 @@ TRACE_EVENT(hclge_pf_mbx_get, __field(u8, code) __field(u8, subcode) __string(pciname, pci_name(hdev->pdev)) - __string(devname, &hdev->vport[0].nic.kinfo.netdev->name) + __string(devname, hdev->vport[0].nic.kinfo.netdev->name) __array(u32, mbx_data, PF_GET_MBX_LEN) ),
@@ -33,7 +33,7 @@ TRACE_EVENT(hclge_pf_mbx_get, __entry->code = req->msg.code; __entry->subcode = req->msg.subcode; __assign_str(pciname, pci_name(hdev->pdev)); - __assign_str(devname, &hdev->vport[0].nic.kinfo.netdev->name); + __assign_str(devname, hdev->vport[0].nic.kinfo.netdev->name); memcpy(__entry->mbx_data, req, sizeof(struct hclge_mbx_vf_to_pf_cmd)); ), @@ -56,7 +56,7 @@ TRACE_EVENT(hclge_pf_mbx_send, __field(u8, vfid) __field(u16, code) __string(pciname, pci_name(hdev->pdev)) - __string(devname, &hdev->vport[0].nic.kinfo.netdev->name) + __string(devname, hdev->vport[0].nic.kinfo.netdev->name) __array(u32, mbx_data, PF_SEND_MBX_LEN) ),
@@ -64,7 +64,7 @@ TRACE_EVENT(hclge_pf_mbx_send, __entry->vfid = req->dest_vfid; __entry->code = req->msg.code; __assign_str(pciname, pci_name(hdev->pdev)); - __assign_str(devname, &hdev->vport[0].nic.kinfo.netdev->name); + __assign_str(devname, hdev->vport[0].nic.kinfo.netdev->name); memcpy(__entry->mbx_data, req, sizeof(struct hclge_mbx_pf_to_vf_cmd)); ), diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_trace.h b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_trace.h index e4bfb6191fef5..a208af567909f 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_trace.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_trace.h @@ -23,7 +23,7 @@ TRACE_EVENT(hclge_vf_mbx_get, __field(u8, vfid) __field(u16, code) __string(pciname, pci_name(hdev->pdev)) - __string(devname, &hdev->nic.kinfo.netdev->name) + __string(devname, hdev->nic.kinfo.netdev->name) __array(u32, mbx_data, VF_GET_MBX_LEN) ),
@@ -31,7 +31,7 @@ TRACE_EVENT(hclge_vf_mbx_get, __entry->vfid = req->dest_vfid; __entry->code = req->msg.code; __assign_str(pciname, pci_name(hdev->pdev)); - __assign_str(devname, &hdev->nic.kinfo.netdev->name); + __assign_str(devname, hdev->nic.kinfo.netdev->name); memcpy(__entry->mbx_data, req, sizeof(struct hclge_mbx_pf_to_vf_cmd)); ), @@ -55,7 +55,7 @@ TRACE_EVENT(hclge_vf_mbx_send, __field(u8, code) __field(u8, subcode) __string(pciname, pci_name(hdev->pdev)) - __string(devname, &hdev->nic.kinfo.netdev->name) + __string(devname, hdev->nic.kinfo.netdev->name) __array(u32, mbx_data, VF_SEND_MBX_LEN) ),
@@ -64,7 +64,7 @@ TRACE_EVENT(hclge_vf_mbx_send, __entry->code = req->msg.code; __entry->subcode = req->msg.subcode; __assign_str(pciname, pci_name(hdev->pdev)); - __assign_str(devname, &hdev->nic.kinfo.netdev->name); + __assign_str(devname, hdev->nic.kinfo.netdev->name); memcpy(__entry->mbx_data, req, sizeof(struct hclge_mbx_vf_to_pf_cmd)); ),
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason A. Donenfeld Jason@zx2c4.com
[ Upstream commit 55b6c738673871c9b0edae05d0c97995c1ff08c4 ]
If all peers are removed via wg_peer_remove_all(), rather than setting peer_list to empty, the peer is added to a temporary list with a head on the stack of wg_peer_remove_all(). If a netlink dump is resumed and the cursored peer is one that has been removed via wg_peer_remove_all(), it will iterate from that peer and then attempt to dump freed peers.
Fix this by instead checking peer->is_dead, which was explictly created for this purpose. Also move up the device_update_lock lockdep assertion, since reading is_dead relies on that.
It can be reproduced by a small script like:
echo "Setting config..." ip link add dev wg0 type wireguard wg setconf wg0 /big-config ( while true; do echo "Showing config..." wg showconf wg0 > /dev/null done ) & sleep 4 wg setconf wg0 <(printf "[Peer]\nPublicKey=$(wg genkey)\n")
Resulting in:
BUG: KASAN: slab-use-after-free in __lock_acquire+0x182a/0x1b20 Read of size 8 at addr ffff88811956ec70 by task wg/59 CPU: 2 PID: 59 Comm: wg Not tainted 6.8.0-rc2-debug+ #5 Call Trace: <TASK> dump_stack_lvl+0x47/0x70 print_address_description.constprop.0+0x2c/0x380 print_report+0xab/0x250 kasan_report+0xba/0xf0 __lock_acquire+0x182a/0x1b20 lock_acquire+0x191/0x4b0 down_read+0x80/0x440 get_peer+0x140/0xcb0 wg_get_device_dump+0x471/0x1130
Cc: stable@vger.kernel.org Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Reported-by: Lillian Berry lillian@star-ark.net Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Reviewed-by: Jiri Pirko jiri@nvidia.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireguard/netlink.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c index f5bc279c9a8c2..6523f9d5a1527 100644 --- a/drivers/net/wireguard/netlink.c +++ b/drivers/net/wireguard/netlink.c @@ -255,17 +255,17 @@ static int wg_get_device_dump(struct sk_buff *skb, struct netlink_callback *cb) if (!peers_nest) goto out; ret = 0; - /* If the last cursor was removed via list_del_init in peer_remove, then + lockdep_assert_held(&wg->device_update_lock); + /* If the last cursor was removed in peer_remove or peer_remove_all, then * we just treat this the same as there being no more peers left. The * reason is that seq_nr should indicate to userspace that this isn't a * coherent dump anyway, so they'll try again. */ if (list_empty(&wg->peer_list) || - (ctx->next_peer && list_empty(&ctx->next_peer->peer_list))) { + (ctx->next_peer && ctx->next_peer->is_dead)) { nla_nest_cancel(skb, peers_nest); goto out; } - lockdep_assert_held(&wg->device_update_lock); peer = list_prepare_entry(ctx->next_peer, &wg->peer_list, peer_list); list_for_each_entry_continue(peer, &wg->peer_list, peer_list) { if (get_peer(peer, skb, ctx)) {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason A. Donenfeld Jason@zx2c4.com
[ Upstream commit 71cbd32e3db82ea4a74e3ef9aeeaa6971969c86f ]
The previous commit fixed a bug that led to a NULL peer->device being dereferenced. It's actually easier and faster performance-wise to instead get the device from ctx->wg. This semantically makes more sense too, since ctx->wg->peer_allowedips.seq is compared with ctx->allowedips_seq, basing them both in ctx. This also acts as a defence in depth provision against freed peers.
Cc: stable@vger.kernel.org Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Reviewed-by: Jiri Pirko jiri@nvidia.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireguard/netlink.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c index 6523f9d5a1527..9dc02fa51ed09 100644 --- a/drivers/net/wireguard/netlink.c +++ b/drivers/net/wireguard/netlink.c @@ -164,8 +164,8 @@ get_peer(struct wg_peer *peer, struct sk_buff *skb, struct dump_ctx *ctx) if (!allowedips_node) goto no_allowedips; if (!ctx->allowedips_seq) - ctx->allowedips_seq = peer->device->peer_allowedips.seq; - else if (ctx->allowedips_seq != peer->device->peer_allowedips.seq) + ctx->allowedips_seq = ctx->wg->peer_allowedips.seq; + else if (ctx->allowedips_seq != ctx->wg->peer_allowedips.seq) goto no_allowedips;
allowedips_nest = nla_nest_start(skb, WGPEER_A_ALLOWEDIPS);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrey Jr. Melnikov temnota.am@gmail.com
[ Upstream commit 9815e39617541ef52d0dfac4be274ad378c6dc09 ]
The ASM1064 SATA host controller always reports wrongly, that it has 24 ports. But in reality, it only has four ports.
before: ahci 0000:04:00.0: SSS flag set, parallel bus scan disabled ahci 0000:04:00.0: AHCI 0001.0301 32 slots 24 ports 6 Gbps 0xffff0f impl SATA mode ahci 0000:04:00.0: flags: 64bit ncq sntf stag pm led only pio sxs deso sadm sds apst
after: ahci 0000:04:00.0: ASM1064 has only four ports ahci 0000:04:00.0: forcing port_map 0xffff0f -> 0xf ahci 0000:04:00.0: SSS flag set, parallel bus scan disabled ahci 0000:04:00.0: AHCI 0001.0301 32 slots 24 ports 6 Gbps 0xf impl SATA mode ahci 0000:04:00.0: flags: 64bit ncq sntf stag pm led only pio sxs deso sadm sds apst
Signed-off-by: "Andrey Jr. Melnikov" temnota.am@gmail.com Signed-off-by: Niklas Cassel cassel@kernel.org Stable-dep-of: 6cd8adc3e189 ("ahci: asm1064: asm1166: don't limit reported ports") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/ahci.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c index 6f7f8e41404dc..5df344e26c110 100644 --- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -662,9 +662,17 @@ MODULE_PARM_DESC(mobile_lpm_policy, "Default LPM policy for mobile chipsets"); static void ahci_pci_save_initial_config(struct pci_dev *pdev, struct ahci_host_priv *hpriv) { - if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA && pdev->device == 0x1166) { - dev_info(&pdev->dev, "ASM1166 has only six ports\n"); - hpriv->saved_port_map = 0x3f; + if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA) { + switch (pdev->device) { + case 0x1166: + dev_info(&pdev->dev, "ASM1166 has only six ports\n"); + hpriv->saved_port_map = 0x3f; + break; + case 0x1064: + dev_info(&pdev->dev, "ASM1064 has only four ports\n"); + hpriv->saved_port_map = 0xf; + break; + } }
if (pdev->vendor == PCI_VENDOR_ID_JMICRON && pdev->device == 0x2361) {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Conrad Kostecki conikost@gentoo.org
[ Upstream commit 6cd8adc3e18960f6e59d797285ed34ef473cc896 ]
Previously, patches have been added to limit the reported count of SATA ports for asm1064 and asm1166 SATA controllers, as those controllers do report more ports than physically having.
While it is allowed to report more ports than physically having in CAP.NP, it is not allowed to report more ports than physically having in the PI (Ports Implemented) register, which is what these HBAs do. (This is a AHCI spec violation.)
Unfortunately, it seems that the PMP implementation in these ASMedia HBAs is also violating the AHCI and SATA-IO PMP specification.
What these HBAs do is that they do not report that they support PMP (CAP.SPM (Supports Port Multiplier) is not set).
Instead, they have decided to add extra "virtual" ports in the PI register that is used if a port multiplier is connected to any of the physical ports of the HBA.
Enumerating the devices behind the PMP as specified in the AHCI and SATA-IO specifications, by using PMP READ and PMP WRITE commands to the physical ports of the HBA is not possible, you have to use the "virtual" ports.
This is of course bad, because this gives us no way to detect the device and vendor ID of the PMP actually connected to the HBA, which means that we can not apply the proper PMP quirks for the PMP that is connected to the HBA.
Limiting the port map will thus stop these controllers from working with SATA Port Multipliers.
This patch reverts both patches for asm1064 and asm1166, so old behavior is restored and SATA PMP will work again, but it will also reintroduce the (minutes long) extra boot time for the ASMedia controllers that do not have a PMP connected (either on the PCIe card itself, or an external PMP).
However, a longer boot time for some, is the lesser evil compared to some other users not being able to detect their drives at all.
Fixes: 0077a504e1a4 ("ahci: asm1166: correct count of reported ports") Fixes: 9815e3961754 ("ahci: asm1064: correct count of reported ports") Cc: stable@vger.kernel.org Reported-by: Matt cryptearth@googlemail.com Signed-off-by: Conrad Kostecki conikost@gentoo.org Reviewed-by: Hans de Goede hdegoede@redhat.com [cassel: rewrote commit message] Signed-off-by: Niklas Cassel cassel@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/ahci.c | 13 ------------- 1 file changed, 13 deletions(-)
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c index 5df344e26c110..55a07dbe1d8a6 100644 --- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -662,19 +662,6 @@ MODULE_PARM_DESC(mobile_lpm_policy, "Default LPM policy for mobile chipsets"); static void ahci_pci_save_initial_config(struct pci_dev *pdev, struct ahci_host_priv *hpriv) { - if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA) { - switch (pdev->device) { - case 0x1166: - dev_info(&pdev->dev, "ASM1166 has only six ports\n"); - hpriv->saved_port_map = 0x3f; - break; - case 0x1064: - dev_info(&pdev->dev, "ASM1064 has only four ports\n"); - hpriv->saved_port_map = 0xf; - break; - } - } - if (pdev->vendor == PCI_VENDOR_ID_JMICRON && pdev->device == 0x2361) { dev_info(&pdev->dev, "JMB361 has only one port\n"); hpriv->force_port_map = 1;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rodrigo Siqueira Rodrigo.Siqueira@amd.com
[ Upstream commit e64b3f55e458ce7e2087a0051f47edabf74545e7 ]
[WHY & HOW] If the display is null when creating an HDCP session, return a proper error code.
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c b/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c index 972f2600f967f..05d96fa681354 100644 --- a/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c +++ b/drivers/gpu/drm/amd/display/modules/hdcp/hdcp_psp.c @@ -405,6 +405,9 @@ enum mod_hdcp_status mod_hdcp_hdcp2_create_session(struct mod_hdcp *hdcp) hdcp_cmd = (struct ta_hdcp_shared_memory *)psp->hdcp_context.hdcp_shared_buf; memset(hdcp_cmd, 0, sizeof(struct ta_hdcp_shared_memory));
+ if (!display) + return MOD_HDCP_STATUS_DISPLAY_NOT_FOUND; + hdcp_cmd->in_msg.hdcp2_create_session_v2.display_handle = display->index;
if (hdcp->connection.link.adjust.hdcp2.force_type == MOD_HDCP_FORCE_TYPE_0)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leo Ma hanghong.ma@amd.com
[ Upstream commit 69e3be6893a7e668660b05a966bead82bbddb01d ]
[Why] When mode switching is triggered there is momentary noise visible on some HDMI TV or displays.
[How] Wait for 2 frames to make sure we have enough time to send out AV mute and sink receives a full frame.
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Reviewed-by: Wenjing Liu wenjing.liu@amd.com Acked-by: Wayne Lin wayne.lin@amd.com Signed-off-by: Leo Ma hanghong.ma@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c index 22c77e96f6a54..02d22f62c0031 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c @@ -641,10 +641,20 @@ void dcn30_set_avmute(struct pipe_ctx *pipe_ctx, bool enable) if (pipe_ctx == NULL) return;
- if (dc_is_hdmi_tmds_signal(pipe_ctx->stream->signal) && pipe_ctx->stream_res.stream_enc != NULL) + if (dc_is_hdmi_tmds_signal(pipe_ctx->stream->signal) && pipe_ctx->stream_res.stream_enc != NULL) { pipe_ctx->stream_res.stream_enc->funcs->set_avmute( pipe_ctx->stream_res.stream_enc, enable); + + /* Wait for two frame to make sure AV mute is sent out */ + if (enable) { + pipe_ctx->stream_res.tg->funcs->wait_for_state(pipe_ctx->stream_res.tg, CRTC_STATE_VACTIVE); + pipe_ctx->stream_res.tg->funcs->wait_for_state(pipe_ctx->stream_res.tg, CRTC_STATE_VBLANK); + pipe_ctx->stream_res.tg->funcs->wait_for_state(pipe_ctx->stream_res.tg, CRTC_STATE_VACTIVE); + pipe_ctx->stream_res.tg->funcs->wait_for_state(pipe_ctx->stream_res.tg, CRTC_STATE_VBLANK); + pipe_ctx->stream_res.tg->funcs->wait_for_state(pipe_ctx->stream_res.tg, CRTC_STATE_VACTIVE); + } + } }
void dcn30_update_info_frame(struct pipe_ctx *pipe_ctx)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka mpatocka@redhat.com
[ Upstream commit 6e7132ed3c07bd8a6ce3db4bb307ef2852b322dc ]
There was reported lockup when we exit a snapshot with many exceptions. Fix this by adding "cond_resched" to the loop that frees the exceptions.
Reported-by: John Pittman jpittman@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-snap.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c index 41735a25d50aa..de73fe79640fd 100644 --- a/drivers/md/dm-snap.c +++ b/drivers/md/dm-snap.c @@ -685,8 +685,10 @@ static void dm_exception_table_exit(struct dm_exception_table *et, for (i = 0; i < size; i++) { slot = et->table + i;
- hlist_bl_for_each_entry_safe(ex, pos, n, slot, hash_list) + hlist_bl_for_each_entry_safe(ex, pos, n, slot, hash_list) { kmem_cache_free(mem, ex); + cond_resched(); + } }
vfree(et->table);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xu Wang vulab@iscas.ac.cn
[ Upstream commit b6bf4776d9e2ed4b2552d1c252fff8de3786309a ]
Remove unnecessary cast in the argument to kfree.
Signed-off-by: Xu Wang vulab@iscas.ac.cn Link: https://lore.kernel.org/r/20201023085533.4792-1-vulab@iscas.ac.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: e3f269ed0acc ("x86/pm: Work around false positive kmemleak report in msr_build_context()") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/neterion/vxge/vxge-config.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/neterion/vxge/vxge-config.c b/drivers/net/ethernet/neterion/vxge/vxge-config.c index f5d48d7c4ce28..da48dd85770c0 100644 --- a/drivers/net/ethernet/neterion/vxge/vxge-config.c +++ b/drivers/net/ethernet/neterion/vxge/vxge-config.c @@ -1121,7 +1121,7 @@ static void __vxge_hw_blockpool_destroy(struct __vxge_hw_blockpool *blockpool)
list_for_each_safe(p, n, &blockpool->free_entry_list) { list_del(&((struct __vxge_hw_blockpool_entry *)p)->item); - kfree((void *)p); + kfree(p); }
return;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Lutomirski luto@kernel.org
[ Upstream commit 3fb0fdb3bbe7aed495109b3296b06c2409734023 ]
On 32-bit kernels, the stackprotector canary is quite nasty -- it is stored at %gs:(20), which is nasty because 32-bit kernels use %fs for percpu storage. It's even nastier because it means that whether %gs contains userspace state or kernel state while running kernel code depends on whether stackprotector is enabled (this is CONFIG_X86_32_LAZY_GS), and this setting radically changes the way that segment selectors work. Supporting both variants is a maintenance and testing mess.
Merely rearranging so that percpu and the stack canary share the same segment would be messy as the 32-bit percpu address layout isn't currently compatible with putting a variable at a fixed offset.
Fortunately, GCC 8.1 added options that allow the stack canary to be accessed as %fs:__stack_chk_guard, effectively turning it into an ordinary percpu variable. This lets us get rid of all of the code to manage the stack canary GDT descriptor and the CONFIG_X86_32_LAZY_GS mess.
(That name is special. We could use any symbol we want for the %fs-relative mode, but for CONFIG_SMP=n, gcc refuses to let us use any name other than __stack_chk_guard.)
Forcibly disable stackprotector on older compilers that don't support the new options and turn the stack canary into a percpu variable. The "lazy GS" approach is now used for all 32-bit configurations.
Also makes load_gs_index() work on 32-bit kernels. On 64-bit kernels, it loads the GS selector and updates the user GSBASE accordingly. (This is unchanged.) On 32-bit kernels, it loads the GS selector and updates GSBASE, which is now always the user base. This means that the overall effect is the same on 32-bit and 64-bit, which avoids some ifdeffery.
[ bp: Massage commit message. ]
Signed-off-by: Andy Lutomirski luto@kernel.org Signed-off-by: Borislav Petkov bp@suse.de Link: https://lkml.kernel.org/r/c0ff7dba14041c7e5d1cae5d4df052f03759bef3.161324384... Stable-dep-of: e3f269ed0acc ("x86/pm: Work around false positive kmemleak report in msr_build_context()") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/Kconfig | 7 +- arch/x86/Makefile | 8 +++ arch/x86/entry/entry_32.S | 56 ++-------------- arch/x86/include/asm/processor.h | 15 ++--- arch/x86/include/asm/ptrace.h | 5 +- arch/x86/include/asm/segment.h | 30 +++------ arch/x86/include/asm/stackprotector.h | 79 +++++------------------ arch/x86/include/asm/suspend_32.h | 6 +- arch/x86/kernel/asm-offsets_32.c | 5 -- arch/x86/kernel/cpu/common.c | 5 +- arch/x86/kernel/doublefault_32.c | 4 +- arch/x86/kernel/head_32.S | 18 +----- arch/x86/kernel/setup_percpu.c | 1 - arch/x86/kernel/tls.c | 8 +-- arch/x86/lib/insn-eval.c | 4 -- arch/x86/platform/pvh/head.S | 14 ---- arch/x86/power/cpu.c | 6 +- arch/x86/xen/enlighten_pv.c | 1 - scripts/gcc-x86_32-has-stack-protector.sh | 6 +- 19 files changed, 60 insertions(+), 218 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 6dc670e363939..47c94e9de03e4 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -352,10 +352,6 @@ config X86_64_SMP def_bool y depends on X86_64 && SMP
-config X86_32_LAZY_GS - def_bool y - depends on X86_32 && !STACKPROTECTOR - config ARCH_SUPPORTS_UPROBES def_bool y
@@ -378,7 +374,8 @@ config CC_HAS_SANE_STACKPROTECTOR default $(success,$(srctree)/scripts/gcc-x86_32-has-stack-protector.sh $(CC)) help We have to make sure stack protector is unconditionally disabled if - the compiler produces broken code. + the compiler produces broken code or if it does not let us control + the segment on 32-bit kernels.
menu "Processor type and features"
diff --git a/arch/x86/Makefile b/arch/x86/Makefile index 1f796050c6dde..8b9fa777f513b 100644 --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -87,6 +87,14 @@ ifeq ($(CONFIG_X86_32),y)
# temporary until string.h is fixed KBUILD_CFLAGS += -ffreestanding + + ifeq ($(CONFIG_STACKPROTECTOR),y) + ifeq ($(CONFIG_SMP),y) + KBUILD_CFLAGS += -mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard + else + KBUILD_CFLAGS += -mstack-protector-guard=global + endif + endif else BITS := 64 UTS_MACHINE := x86_64 diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index 70bd81b6c612e..10b7c62a3e97a 100644 --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -20,7 +20,7 @@ * 1C(%esp) - %ds * 20(%esp) - %es * 24(%esp) - %fs - * 28(%esp) - %gs saved iff !CONFIG_X86_32_LAZY_GS + * 28(%esp) - unused -- was %gs on old stackprotector kernels * 2C(%esp) - orig_eax * 30(%esp) - %eip * 34(%esp) - %cs @@ -56,14 +56,9 @@ /* * User gs save/restore * - * %gs is used for userland TLS and kernel only uses it for stack - * canary which is required to be at %gs:20 by gcc. Read the comment - * at the top of stackprotector.h for more info. - * - * Local labels 98 and 99 are used. + * This is leftover junk from CONFIG_X86_32_LAZY_GS. A subsequent patch + * will remove it entirely. */ -#ifdef CONFIG_X86_32_LAZY_GS - /* unfortunately push/pop can't be no-op */ .macro PUSH_GS pushl $0 @@ -86,49 +81,6 @@ .macro SET_KERNEL_GS reg .endm
-#else /* CONFIG_X86_32_LAZY_GS */ - -.macro PUSH_GS - pushl %gs -.endm - -.macro POP_GS pop=0 -98: popl %gs - .if \pop <> 0 - add $\pop, %esp - .endif -.endm -.macro POP_GS_EX -.pushsection .fixup, "ax" -99: movl $0, (%esp) - jmp 98b -.popsection - _ASM_EXTABLE(98b, 99b) -.endm - -.macro PTGS_TO_GS -98: mov PT_GS(%esp), %gs -.endm -.macro PTGS_TO_GS_EX -.pushsection .fixup, "ax" -99: movl $0, PT_GS(%esp) - jmp 98b -.popsection - _ASM_EXTABLE(98b, 99b) -.endm - -.macro GS_TO_REG reg - movl %gs, \reg -.endm -.macro REG_TO_PTGS reg - movl \reg, PT_GS(%esp) -.endm -.macro SET_KERNEL_GS reg - movl $(__KERNEL_STACK_CANARY), \reg - movl \reg, %gs -.endm - -#endif /* CONFIG_X86_32_LAZY_GS */
/* Unconditionally switch to user cr3 */ .macro SWITCH_TO_USER_CR3 scratch_reg:req @@ -779,7 +731,7 @@ SYM_CODE_START(__switch_to_asm)
#ifdef CONFIG_STACKPROTECTOR movl TASK_stack_canary(%edx), %ebx - movl %ebx, PER_CPU_VAR(stack_canary)+stack_canary_offset + movl %ebx, PER_CPU_VAR(__stack_chk_guard) #endif
/* diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index d7e017b0b4c3b..6dc3c5f0be076 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -441,6 +441,9 @@ struct fixed_percpu_data { * GCC hardcodes the stack canary as %gs:40. Since the * irq_stack is the object at %gs:0, we reserve the bottom * 48 bytes of the irq stack for the canary. + * + * Once we are willing to require -mstack-protector-guard-symbol= + * support for x86_64 stackprotector, we can get rid of this. */ char gs_base[40]; unsigned long stack_canary; @@ -461,17 +464,7 @@ extern asmlinkage void ignore_sysret(void); void current_save_fsgs(void); #else /* X86_64 */ #ifdef CONFIG_STACKPROTECTOR -/* - * Make sure stack canary segment base is cached-aligned: - * "For Intel Atom processors, avoid non zero segment base address - * that is not aligned to cache line boundary at all cost." - * (Optim Ref Manual Assembly/Compiler Coding Rule 15.) - */ -struct stack_canary { - char __pad[20]; /* canary at %gs:20 */ - unsigned long canary; -}; -DECLARE_PER_CPU_ALIGNED(struct stack_canary, stack_canary); +DECLARE_PER_CPU(unsigned long, __stack_chk_guard); #endif /* Per CPU softirq stack pointer */ DECLARE_PER_CPU(struct irq_stack *, softirq_stack_ptr); diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h index 409f661481e11..b94f615600d57 100644 --- a/arch/x86/include/asm/ptrace.h +++ b/arch/x86/include/asm/ptrace.h @@ -37,7 +37,10 @@ struct pt_regs { unsigned short __esh; unsigned short fs; unsigned short __fsh; - /* On interrupt, gs and __gsh store the vector number. */ + /* + * On interrupt, gs and __gsh store the vector number. They never + * store gs any more. + */ unsigned short gs; unsigned short __gsh; /* On interrupt, this is the error code. */ diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h index 7fdd4facfce71..72044026eb3c2 100644 --- a/arch/x86/include/asm/segment.h +++ b/arch/x86/include/asm/segment.h @@ -95,7 +95,7 @@ * * 26 - ESPFIX small SS * 27 - per-cpu [ offset to per-cpu data area ] - * 28 - stack_canary-20 [ for stack protector ] <=== cacheline #8 + * 28 - unused * 29 - unused * 30 - unused * 31 - TSS for double fault handler @@ -118,7 +118,6 @@
#define GDT_ENTRY_ESPFIX_SS 26 #define GDT_ENTRY_PERCPU 27 -#define GDT_ENTRY_STACK_CANARY 28
#define GDT_ENTRY_DOUBLEFAULT_TSS 31
@@ -158,12 +157,6 @@ # define __KERNEL_PERCPU 0 #endif
-#ifdef CONFIG_STACKPROTECTOR -# define __KERNEL_STACK_CANARY (GDT_ENTRY_STACK_CANARY*8) -#else -# define __KERNEL_STACK_CANARY 0 -#endif - #else /* 64-bit: */
#include <asm/cache.h> @@ -364,22 +357,15 @@ static inline void __loadsegment_fs(unsigned short value) asm("mov %%" #seg ",%0":"=r" (value) : : "memory")
/* - * x86-32 user GS accessors: + * x86-32 user GS accessors. This is ugly and could do with some cleaning up. */ #ifdef CONFIG_X86_32 -# ifdef CONFIG_X86_32_LAZY_GS -# define get_user_gs(regs) (u16)({ unsigned long v; savesegment(gs, v); v; }) -# define set_user_gs(regs, v) loadsegment(gs, (unsigned long)(v)) -# define task_user_gs(tsk) ((tsk)->thread.gs) -# define lazy_save_gs(v) savesegment(gs, (v)) -# define lazy_load_gs(v) loadsegment(gs, (v)) -# else /* X86_32_LAZY_GS */ -# define get_user_gs(regs) (u16)((regs)->gs) -# define set_user_gs(regs, v) do { (regs)->gs = (v); } while (0) -# define task_user_gs(tsk) (task_pt_regs(tsk)->gs) -# define lazy_save_gs(v) do { } while (0) -# define lazy_load_gs(v) do { } while (0) -# endif /* X86_32_LAZY_GS */ +# define get_user_gs(regs) (u16)({ unsigned long v; savesegment(gs, v); v; }) +# define set_user_gs(regs, v) loadsegment(gs, (unsigned long)(v)) +# define task_user_gs(tsk) ((tsk)->thread.gs) +# define lazy_save_gs(v) savesegment(gs, (v)) +# define lazy_load_gs(v) loadsegment(gs, (v)) +# define load_gs_index(v) loadsegment(gs, (v)) #endif /* X86_32 */
#endif /* !__ASSEMBLY__ */ diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h index 7fb482f0f25b0..b6ffe58c70fab 100644 --- a/arch/x86/include/asm/stackprotector.h +++ b/arch/x86/include/asm/stackprotector.h @@ -5,30 +5,23 @@ * Stack protector works by putting predefined pattern at the start of * the stack frame and verifying that it hasn't been overwritten when * returning from the function. The pattern is called stack canary - * and unfortunately gcc requires it to be at a fixed offset from %gs. - * On x86_64, the offset is 40 bytes and on x86_32 20 bytes. x86_64 - * and x86_32 use segment registers differently and thus handles this - * requirement differently. + * and unfortunately gcc historically required it to be at a fixed offset + * from the percpu segment base. On x86_64, the offset is 40 bytes. * - * On x86_64, %gs is shared by percpu area and stack canary. All - * percpu symbols are zero based and %gs points to the base of percpu - * area. The first occupant of the percpu area is always - * fixed_percpu_data which contains stack_canary at offset 40. Userland - * %gs is always saved and restored on kernel entry and exit using - * swapgs, so stack protector doesn't add any complexity there. + * The same segment is shared by percpu area and stack canary. On + * x86_64, percpu symbols are zero based and %gs (64-bit) points to the + * base of percpu area. The first occupant of the percpu area is always + * fixed_percpu_data which contains stack_canary at the approproate + * offset. On x86_32, the stack canary is just a regular percpu + * variable. * - * On x86_32, it's slightly more complicated. As in x86_64, %gs is - * used for userland TLS. Unfortunately, some processors are much - * slower at loading segment registers with different value when - * entering and leaving the kernel, so the kernel uses %fs for percpu - * area and manages %gs lazily so that %gs is switched only when - * necessary, usually during task switch. + * Putting percpu data in %fs on 32-bit is a minor optimization compared to + * using %gs. Since 32-bit userspace normally has %fs == 0, we are likely + * to load 0 into %fs on exit to usermode, whereas with percpu data in + * %gs, we are likely to load a non-null %gs on return to user mode. * - * As gcc requires the stack canary at %gs:20, %gs can't be managed - * lazily if stack protector is enabled, so the kernel saves and - * restores userland %gs on kernel entry and exit. This behavior is - * controlled by CONFIG_X86_32_LAZY_GS and accessors are defined in - * system.h to hide the details. + * Once we are willing to require GCC 8.1 or better for 64-bit stackprotector + * support, we can remove some of this complexity. */
#ifndef _ASM_STACKPROTECTOR_H @@ -44,14 +37,6 @@ #include <linux/random.h> #include <linux/sched.h>
-/* - * 24 byte read-only segment initializer for stack canary. Linker - * can't handle the address bit shifting. Address will be set in - * head_32 for boot CPU and setup_per_cpu_areas() for others. - */ -#define GDT_STACK_CANARY_INIT \ - [GDT_ENTRY_STACK_CANARY] = GDT_ENTRY_INIT(0x4090, 0, 0x18), - /* * Initialize the stackprotector canary value. * @@ -86,7 +71,7 @@ static __always_inline void boot_init_stack_canary(void) #ifdef CONFIG_X86_64 this_cpu_write(fixed_percpu_data.stack_canary, canary); #else - this_cpu_write(stack_canary.canary, canary); + this_cpu_write(__stack_chk_guard, canary); #endif }
@@ -95,48 +80,16 @@ static inline void cpu_init_stack_canary(int cpu, struct task_struct *idle) #ifdef CONFIG_X86_64 per_cpu(fixed_percpu_data.stack_canary, cpu) = idle->stack_canary; #else - per_cpu(stack_canary.canary, cpu) = idle->stack_canary; -#endif -} - -static inline void setup_stack_canary_segment(int cpu) -{ -#ifdef CONFIG_X86_32 - unsigned long canary = (unsigned long)&per_cpu(stack_canary, cpu); - struct desc_struct *gdt_table = get_cpu_gdt_rw(cpu); - struct desc_struct desc; - - desc = gdt_table[GDT_ENTRY_STACK_CANARY]; - set_desc_base(&desc, canary); - write_gdt_entry(gdt_table, GDT_ENTRY_STACK_CANARY, &desc, DESCTYPE_S); -#endif -} - -static inline void load_stack_canary_segment(void) -{ -#ifdef CONFIG_X86_32 - asm("mov %0, %%gs" : : "r" (__KERNEL_STACK_CANARY) : "memory"); + per_cpu(__stack_chk_guard, cpu) = idle->stack_canary; #endif }
#else /* STACKPROTECTOR */
-#define GDT_STACK_CANARY_INIT - /* dummy boot_init_stack_canary() is defined in linux/stackprotector.h */
-static inline void setup_stack_canary_segment(int cpu) -{ } - static inline void cpu_init_stack_canary(int cpu, struct task_struct *idle) { }
-static inline void load_stack_canary_segment(void) -{ -#ifdef CONFIG_X86_32 - asm volatile ("mov %0, %%gs" : : "r" (0)); -#endif -} - #endif /* STACKPROTECTOR */ #endif /* _ASM_STACKPROTECTOR_H */ diff --git a/arch/x86/include/asm/suspend_32.h b/arch/x86/include/asm/suspend_32.h index 3b97aa9215430..a800abb1a9925 100644 --- a/arch/x86/include/asm/suspend_32.h +++ b/arch/x86/include/asm/suspend_32.h @@ -13,12 +13,10 @@ /* image of the saved processor state */ struct saved_context { /* - * On x86_32, all segment registers, with the possible exception of - * gs, are saved at kernel entry in pt_regs. + * On x86_32, all segment registers except gs are saved at kernel + * entry in pt_regs. */ -#ifdef CONFIG_X86_32_LAZY_GS u16 gs; -#endif unsigned long cr0, cr2, cr3, cr4; u64 misc_enable; struct saved_msrs saved_msrs; diff --git a/arch/x86/kernel/asm-offsets_32.c b/arch/x86/kernel/asm-offsets_32.c index 6e043f295a605..2b411cd00a4e2 100644 --- a/arch/x86/kernel/asm-offsets_32.c +++ b/arch/x86/kernel/asm-offsets_32.c @@ -53,11 +53,6 @@ void foo(void) offsetof(struct cpu_entry_area, tss.x86_tss.sp1) - offsetofend(struct cpu_entry_area, entry_stack_page.stack));
-#ifdef CONFIG_STACKPROTECTOR - BLANK(); - OFFSET(stack_canary_offset, stack_canary, canary); -#endif - BLANK(); DEFINE(EFI_svam, offsetof(efi_runtime_services_t, set_virtual_address_map)); } diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 0c72ff732aa08..33002cb5a1c62 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -166,7 +166,6 @@ DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = {
[GDT_ENTRY_ESPFIX_SS] = GDT_ENTRY_INIT(0xc092, 0, 0xfffff), [GDT_ENTRY_PERCPU] = GDT_ENTRY_INIT(0xc092, 0, 0xfffff), - GDT_STACK_CANARY_INIT #endif } }; EXPORT_PER_CPU_SYMBOL_GPL(gdt_page); @@ -600,7 +599,6 @@ void load_percpu_segment(int cpu) __loadsegment_simple(gs, 0); wrmsrl(MSR_GS_BASE, cpu_kernelmode_gs_base(cpu)); #endif - load_stack_canary_segment(); }
#ifdef CONFIG_X86_32 @@ -1940,7 +1938,8 @@ DEFINE_PER_CPU(unsigned long, cpu_current_top_of_stack) = EXPORT_PER_CPU_SYMBOL(cpu_current_top_of_stack);
#ifdef CONFIG_STACKPROTECTOR -DEFINE_PER_CPU_ALIGNED(struct stack_canary, stack_canary); +DEFINE_PER_CPU(unsigned long, __stack_chk_guard); +EXPORT_PER_CPU_SYMBOL(__stack_chk_guard); #endif
#endif /* CONFIG_X86_64 */ diff --git a/arch/x86/kernel/doublefault_32.c b/arch/x86/kernel/doublefault_32.c index 759d392cbe9f0..d1d49e3d536b8 100644 --- a/arch/x86/kernel/doublefault_32.c +++ b/arch/x86/kernel/doublefault_32.c @@ -100,9 +100,7 @@ DEFINE_PER_CPU_PAGE_ALIGNED(struct doublefault_stack, doublefault_stack) = { .ss = __KERNEL_DS, .ds = __USER_DS, .fs = __KERNEL_PERCPU, -#ifndef CONFIG_X86_32_LAZY_GS - .gs = __KERNEL_STACK_CANARY, -#endif + .gs = 0,
.__cr3 = __pa_nodebug(swapper_pg_dir), }, diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S index 3f1691b89231f..0359333f6bdee 100644 --- a/arch/x86/kernel/head_32.S +++ b/arch/x86/kernel/head_32.S @@ -319,8 +319,8 @@ SYM_FUNC_START(startup_32_smp) movl $(__KERNEL_PERCPU), %eax movl %eax,%fs # set this cpu's percpu
- movl $(__KERNEL_STACK_CANARY),%eax - movl %eax,%gs + xorl %eax,%eax + movl %eax,%gs # clear possible garbage in %gs
xorl %eax,%eax # Clear LDT lldt %ax @@ -340,20 +340,6 @@ SYM_FUNC_END(startup_32_smp) */ __INIT setup_once: -#ifdef CONFIG_STACKPROTECTOR - /* - * Configure the stack canary. The linker can't handle this by - * relocation. Manually set base address in stack canary - * segment descriptor. - */ - movl $gdt_page,%eax - movl $stack_canary,%ecx - movw %cx, 8 * GDT_ENTRY_STACK_CANARY + 2(%eax) - shrl $16, %ecx - movb %cl, 8 * GDT_ENTRY_STACK_CANARY + 4(%eax) - movb %ch, 8 * GDT_ENTRY_STACK_CANARY + 7(%eax) -#endif - andl $0,setup_once_ref /* Once is enough, thanks */ RET
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c index fd945ce78554e..0941d2f44f2a2 100644 --- a/arch/x86/kernel/setup_percpu.c +++ b/arch/x86/kernel/setup_percpu.c @@ -224,7 +224,6 @@ void __init setup_per_cpu_areas(void) per_cpu(this_cpu_off, cpu) = per_cpu_offset(cpu); per_cpu(cpu_number, cpu) = cpu; setup_percpu_segment(cpu); - setup_stack_canary_segment(cpu); /* * Copy data used in early init routines from the * initial arrays to the per cpu data areas. These diff --git a/arch/x86/kernel/tls.c b/arch/x86/kernel/tls.c index 64a496a0687f6..3c883e0642424 100644 --- a/arch/x86/kernel/tls.c +++ b/arch/x86/kernel/tls.c @@ -164,17 +164,11 @@ int do_set_thread_area(struct task_struct *p, int idx, savesegment(fs, sel); if (sel == modified_sel) loadsegment(fs, sel); - - savesegment(gs, sel); - if (sel == modified_sel) - load_gs_index(sel); #endif
-#ifdef CONFIG_X86_32_LAZY_GS savesegment(gs, sel); if (sel == modified_sel) - loadsegment(gs, sel); -#endif + load_gs_index(sel); } else { #ifdef CONFIG_X86_64 if (p->thread.fsindex == modified_sel) diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c index ffc8b7dcf1feb..6ed542e310adc 100644 --- a/arch/x86/lib/insn-eval.c +++ b/arch/x86/lib/insn-eval.c @@ -404,10 +404,6 @@ static short get_segment_selector(struct pt_regs *regs, int seg_reg_idx) case INAT_SEG_REG_FS: return (unsigned short)(regs->fs & 0xffff); case INAT_SEG_REG_GS: - /* - * GS may or may not be in regs as per CONFIG_X86_32_LAZY_GS. - * The macro below takes care of both cases. - */ return get_user_gs(regs); case INAT_SEG_REG_IGNORE: default: diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S index 43b4d864817ec..afbf0bb252da5 100644 --- a/arch/x86/platform/pvh/head.S +++ b/arch/x86/platform/pvh/head.S @@ -45,10 +45,8 @@
#define PVH_GDT_ENTRY_CS 1 #define PVH_GDT_ENTRY_DS 2 -#define PVH_GDT_ENTRY_CANARY 3 #define PVH_CS_SEL (PVH_GDT_ENTRY_CS * 8) #define PVH_DS_SEL (PVH_GDT_ENTRY_DS * 8) -#define PVH_CANARY_SEL (PVH_GDT_ENTRY_CANARY * 8)
SYM_CODE_START_LOCAL(pvh_start_xen) cld @@ -109,17 +107,6 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
#else /* CONFIG_X86_64 */
- /* Set base address in stack canary descriptor. */ - movl $_pa(gdt_start),%eax - movl $_pa(canary),%ecx - movw %cx, (PVH_GDT_ENTRY_CANARY * 8) + 2(%eax) - shrl $16, %ecx - movb %cl, (PVH_GDT_ENTRY_CANARY * 8) + 4(%eax) - movb %ch, (PVH_GDT_ENTRY_CANARY * 8) + 7(%eax) - - mov $PVH_CANARY_SEL,%eax - mov %eax,%gs - call mk_early_pgtbl_32
mov $_pa(initial_page_table), %eax @@ -163,7 +150,6 @@ SYM_DATA_START_LOCAL(gdt_start) .quad GDT_ENTRY(0xc09a, 0, 0xfffff) /* PVH_CS_SEL */ #endif .quad GDT_ENTRY(0xc092, 0, 0xfffff) /* PVH_DS_SEL */ - .quad GDT_ENTRY(0x4090, 0, 0x18) /* PVH_CANARY_SEL */ SYM_DATA_END_LABEL(gdt_start, SYM_L_LOCAL, gdt_end)
.balign 16 diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c index 4e4e76ecd3ecd..84c7b2312ea9e 100644 --- a/arch/x86/power/cpu.c +++ b/arch/x86/power/cpu.c @@ -101,11 +101,8 @@ static void __save_processor_state(struct saved_context *ctxt) /* * segment registers */ -#ifdef CONFIG_X86_32_LAZY_GS savesegment(gs, ctxt->gs); -#endif #ifdef CONFIG_X86_64 - savesegment(gs, ctxt->gs); savesegment(fs, ctxt->fs); savesegment(ds, ctxt->ds); savesegment(es, ctxt->es); @@ -234,7 +231,6 @@ static void notrace __restore_processor_state(struct saved_context *ctxt) wrmsrl(MSR_GS_BASE, ctxt->kernelmode_gs_base); #else loadsegment(fs, __KERNEL_PERCPU); - loadsegment(gs, __KERNEL_STACK_CANARY); #endif
/* Restore the TSS, RO GDT, LDT, and usermode-relevant MSRs. */ @@ -257,7 +253,7 @@ static void notrace __restore_processor_state(struct saved_context *ctxt) */ wrmsrl(MSR_FS_BASE, ctxt->fs_base); wrmsrl(MSR_KERNEL_GS_BASE, ctxt->usermode_gs_base); -#elif defined(CONFIG_X86_32_LAZY_GS) +#else loadsegment(gs, ctxt->gs); #endif
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index 815030b7f6fa8..94804670caab8 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -1193,7 +1193,6 @@ static void __init xen_setup_gdt(int cpu) pv_ops.cpu.write_gdt_entry = xen_write_gdt_entry_boot; pv_ops.cpu.load_gdt = xen_load_gdt_boot;
- setup_stack_canary_segment(cpu); switch_to_new_gdt(cpu);
pv_ops.cpu.write_gdt_entry = xen_write_gdt_entry; diff --git a/scripts/gcc-x86_32-has-stack-protector.sh b/scripts/gcc-x86_32-has-stack-protector.sh index f5c1194952540..825c75c5b7150 100755 --- a/scripts/gcc-x86_32-has-stack-protector.sh +++ b/scripts/gcc-x86_32-has-stack-protector.sh @@ -1,4 +1,8 @@ #!/bin/sh # SPDX-License-Identifier: GPL-2.0
-echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -c -m32 -O0 -fstack-protector - -o - 2> /dev/null | grep -q "%gs" +# This requires GCC 8.1 or better. Specifically, we require +# -mstack-protector-guard-reg, added by +# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81708 + +echo "int foo(void) { char X[200]; return 3; }" | $* -S -x c -c -m32 -O0 -fstack-protector -mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard - -o - 2> /dev/null | grep -q "%fs"
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anton Altaparmakov anton@tuxera.com
[ Upstream commit e3f269ed0accbb22aa8f25d2daffa23c3fccd407 ]
Since:
7ee18d677989 ("x86/power: Make restore_processor_context() sane")
kmemleak reports this issue:
unreferenced object 0xf68241e0 (size 32): comm "swapper/0", pid 1, jiffies 4294668610 (age 68.432s) hex dump (first 32 bytes): 00 cc cc cc 29 10 01 c0 00 00 00 00 00 00 00 00 ....)........... 00 42 82 f6 cc cc cc cc cc cc cc cc cc cc cc cc .B.............. backtrace: [<461c1d50>] __kmem_cache_alloc_node+0x106/0x260 [<ea65e13b>] __kmalloc+0x54/0x160 [<c3858cd2>] msr_build_context.constprop.0+0x35/0x100 [<46635aff>] pm_check_save_msr+0x63/0x80 [<6b6bb938>] do_one_initcall+0x41/0x1f0 [<3f3add60>] kernel_init_freeable+0x199/0x1e8 [<3b538fde>] kernel_init+0x1a/0x110 [<938ae2b2>] ret_from_fork+0x1c/0x28
Which is a false positive.
Reproducer:
- Run rsync of whole kernel tree (multiple times if needed). - start a kmemleak scan - Note this is just an example: a lot of our internal tests hit these.
The root cause is similar to the fix in:
b0b592cf0836 x86/pm: Fix false positive kmemleak report in msr_build_context()
ie. the alignment within the packed struct saved_context which has everything unaligned as there is only "u16 gs;" at start of struct where in the past there were four u16 there thus aligning everything afterwards. The issue is with the fact that Kmemleak only searches for pointers that are aligned (see how pointers are scanned in kmemleak.c) so when the struct members are not aligned it doesn't see them.
Testing:
We run a lot of tests with our CI, and after applying this fix we do not see any kmemleak issues any more whilst without it we see hundreds of the above report. From a single, simple test run consisting of 416 individual test cases on kernel 5.10 x86 with kmemleak enabled we got 20 failures due to this, which is quite a lot. With this fix applied we get zero kmemleak related failures.
Fixes: 7ee18d677989 ("x86/power: Make restore_processor_context() sane") Signed-off-by: Anton Altaparmakov anton@tuxera.com Signed-off-by: Ingo Molnar mingo@kernel.org Acked-by: "Rafael J. Wysocki" rafael@kernel.org Cc: stable@vger.kernel.org Cc: Linus Torvalds torvalds@linux-foundation.org Link: https://lore.kernel.org/r/20240314142656.17699-1-anton@tuxera.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/suspend_32.h | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/suspend_32.h b/arch/x86/include/asm/suspend_32.h index a800abb1a9925..d8416b3bf832e 100644 --- a/arch/x86/include/asm/suspend_32.h +++ b/arch/x86/include/asm/suspend_32.h @@ -12,11 +12,6 @@
/* image of the saved processor state */ struct saved_context { - /* - * On x86_32, all segment registers except gs are saved at kernel - * entry in pt_regs. - */ - u16 gs; unsigned long cr0, cr2, cr3, cr4; u64 misc_enable; struct saved_msrs saved_msrs; @@ -27,6 +22,11 @@ struct saved_context { unsigned long tr; unsigned long safety; unsigned long return_address; + /* + * On x86_32, all segment registers except gs are saved at kernel + * entry in pt_regs. + */ + u16 gs; bool misc_enable_saved; } __attribute__((packed));
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Salvatore Bonaccorso carnil@debian.org
The backport of commit 3080ea5553cc ("stddef: Introduce DECLARE_FLEX_ARRAY() helper") to 5.10.y (as a prerequisite of another fix) modified scripts/kernel-doc and introduced a syntax error:
Global symbol "$args" requires explicit package name (did you forget to declare "my $args"?) at ./scripts/kernel-doc line 1236. Global symbol "$args" requires explicit package name (did you forget to declare "my $args"?) at ./scripts/kernel-doc line 1236. Execution of ./scripts/kernel-doc aborted due to compilation errors.
Note: The issue could be fixed in the 5.10.y series as well by backporting e86bdb24375a ("scripts: kernel-doc: reduce repeated regex expressions into variables") but just replacing the undeclared args back to ([^,)]+) was the most straightforward approach. The issue is specific to the backport to the 5.10.y series. Thus there is as well no upstream commit for this change.
Fixes: 443b16ee3d9c ("stddef: Introduce DECLARE_FLEX_ARRAY() helper") # 5.10.y Reported-by: Ben Hutchings ben@decadent.org.uk Link: https://lore.kernel.org/regressions/ZeHKjjPGoyv_b2Tg@eldamar.lan/T/#u Link: https://bugs.debian.org/1064035 Signed-off-by: Salvatore Bonaccorso carnil@debian.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- scripts/kernel-doc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/scripts/kernel-doc +++ b/scripts/kernel-doc @@ -1233,7 +1233,7 @@ sub dump_struct($$) { # replace DECLARE_KFIFO_PTR $members =~ s/DECLARE_KFIFO_PTR\s*(([^,)]+),\s*([^,)]+))/$2 *$1/gos; # replace DECLARE_FLEX_ARRAY - $members =~ s/(?:__)?DECLARE_FLEX_ARRAY\s*($args,\s*$args)/$1 $2[]/gos; + $members =~ s/(?:__)?DECLARE_FLEX_ARRAY\s*(([^,)]+),\s*([^,)]+))/$1 $2[]/gos; my $declaration = $members;
# Split nested struct/union elements as newer ones
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Abbott abbotti@mev.co.uk
commit f53641a6e849034a44bf80f50245a75d7a376025 upstream.
The comedi_test devices have a couple of timers (ai_timer and ao_timer) that can be started to simulate hardware interrupts. Their expiry functions normally reschedule the timer. The driver code calls either del_timer_sync() or del_timer() to delete the timers from the queue, but does not currently prevent the timers from rescheduling themselves so synchronized deletion may be ineffective.
Add a couple of boolean members (one for each timer: ai_timer_enable and ao_timer_enable) to the device private data structure to indicate whether the timers are allowed to reschedule themselves. Set the member to true when adding the timer to the queue, and to false when deleting the timer from the queue in the waveform_ai_cancel() and waveform_ao_cancel() functions.
The del_timer_sync() function is also called from the waveform_detach() function, but the timer enable members will already be set to false when that function is called, so no change is needed there.
Fixes: 403fe7f34e33 ("staging: comedi: comedi_test: fix timer race conditions") Cc: stable@vger.kernel.org # 4.4+ Signed-off-by: Ian Abbott abbotti@mev.co.uk Link: https://lore.kernel.org/r/20240214100747.16203-1-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/comedi/drivers/comedi_test.c | 30 +++++++++++++++++++++++---- 1 file changed, 26 insertions(+), 4 deletions(-)
--- a/drivers/staging/comedi/drivers/comedi_test.c +++ b/drivers/staging/comedi/drivers/comedi_test.c @@ -87,6 +87,8 @@ struct waveform_private { struct comedi_device *dev; /* parent comedi device */ u64 ao_last_scan_time; /* time of previous AO scan in usec */ unsigned int ao_scan_period; /* AO scan period in usec */ + bool ai_timer_enable:1; /* should AI timer be running? */ + bool ao_timer_enable:1; /* should AO timer be running? */ unsigned short ao_loopbacks[N_CHANS]; };
@@ -236,8 +238,12 @@ static void waveform_ai_timer(struct tim time_increment = devpriv->ai_convert_time - now; else time_increment = 1; - mod_timer(&devpriv->ai_timer, - jiffies + usecs_to_jiffies(time_increment)); + spin_lock(&dev->spinlock); + if (devpriv->ai_timer_enable) { + mod_timer(&devpriv->ai_timer, + jiffies + usecs_to_jiffies(time_increment)); + } + spin_unlock(&dev->spinlock); }
overrun: @@ -393,9 +399,12 @@ static int waveform_ai_cmd(struct comedi * Seem to need an extra jiffy here, otherwise timer expires slightly * early! */ + spin_lock_bh(&dev->spinlock); + devpriv->ai_timer_enable = true; devpriv->ai_timer.expires = jiffies + usecs_to_jiffies(devpriv->ai_convert_period) + 1; add_timer(&devpriv->ai_timer); + spin_unlock_bh(&dev->spinlock); return 0; }
@@ -404,6 +413,9 @@ static int waveform_ai_cancel(struct com { struct waveform_private *devpriv = dev->private;
+ spin_lock_bh(&dev->spinlock); + devpriv->ai_timer_enable = false; + spin_unlock_bh(&dev->spinlock); if (in_softirq()) { /* Assume we were called from the timer routine itself. */ del_timer(&devpriv->ai_timer); @@ -495,8 +507,12 @@ static void waveform_ao_timer(struct tim unsigned int time_inc = devpriv->ao_last_scan_time + devpriv->ao_scan_period - now;
- mod_timer(&devpriv->ao_timer, - jiffies + usecs_to_jiffies(time_inc)); + spin_lock(&dev->spinlock); + if (devpriv->ao_timer_enable) { + mod_timer(&devpriv->ao_timer, + jiffies + usecs_to_jiffies(time_inc)); + } + spin_unlock(&dev->spinlock); }
underrun: @@ -517,9 +533,12 @@ static int waveform_ao_inttrig_start(str async->inttrig = NULL;
devpriv->ao_last_scan_time = ktime_to_us(ktime_get()); + spin_lock_bh(&dev->spinlock); + devpriv->ao_timer_enable = true; devpriv->ao_timer.expires = jiffies + usecs_to_jiffies(devpriv->ao_scan_period); add_timer(&devpriv->ao_timer); + spin_unlock_bh(&dev->spinlock);
return 1; } @@ -604,6 +623,9 @@ static int waveform_ao_cancel(struct com struct waveform_private *devpriv = dev->private;
s->async->inttrig = NULL; + spin_lock_bh(&dev->spinlock); + devpriv->ao_timer_enable = false; + spin_unlock_bh(&dev->spinlock); if (in_softirq()) { /* Assume we were called from the timer routine itself. */ del_timer(&devpriv->ao_timer);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
In commit 9127599c075c ("cpufreq: brcmstb-avs-cpufreq: add check for cpufreq_cpu_get's return value"), build warnings occur because a variable is created after some logic, resulting in:
drivers/cpufreq/brcmstb-avs-cpufreq.c: In function 'brcm_avs_cpufreq_get': drivers/cpufreq/brcmstb-avs-cpufreq.c:486:9: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 486 | struct private_data *priv = policy->driver_data; | ^~~~~~ cc1: all warnings being treated as errors make[2]: *** [scripts/Makefile.build:289: drivers/cpufreq/brcmstb-avs-cpufreq.o] Error 1 make[1]: *** [scripts/Makefile.build:552: drivers/cpufreq] Error 2 make[1]: *** Waiting for unfinished jobs.... make: *** [Makefile:1907: drivers] Error 2
Fix this up.
Link: https://lore.kernel.org/r/e114d9e5-26af-42be-9baa-72c3a6ec8fe5@oracle.com Link: https://lore.kernel.org/stable/20240327015023.GC7502@linuxonhyperv3.guj3yctz... Reported-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Reported-by: Linux Kernel Functional Testing lkft@linaro.org Fixes: 9127599c075c ("cpufreq: brcmstb-avs-cpufreq: add check for cpufreq_cpu_get's return value") Cc: Anastasia Belova abelova@astralinux.ru Cc: Viresh Kumar viresh.kumar@linaro.org Cc: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cpufreq/brcmstb-avs-cpufreq.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/cpufreq/brcmstb-avs-cpufreq.c +++ b/drivers/cpufreq/brcmstb-avs-cpufreq.c @@ -481,10 +481,11 @@ static bool brcm_avs_is_firmware_loaded( static unsigned int brcm_avs_cpufreq_get(unsigned int cpu) { struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); + struct private_data *priv; + if (!policy) return 0; - struct private_data *priv = policy->driver_data; - + priv = policy->driver_data; cpufreq_cpu_put(policy);
return brcm_avs_get_frequency(priv->base);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
commit 552705a3650bbf46a22b1adedc1b04181490fc36 upstream.
While the rhashtable set gc runs asynchronously, a race allows it to collect elements from anonymous sets with timeouts while it is being released from the commit path.
Mingi Cho originally reported this issue in a different path in 6.1.x with a pipapo set with low timeouts which is not possible upstream since 7395dfacfff6 ("netfilter: nf_tables: use timestamp to check for set element timeout").
Fix this by setting on the dead flag for anonymous sets to skip async gc in this case.
According to 08e4c8c5919f ("netfilter: nf_tables: mark newset as dead on transaction abort"), Florian plans to accelerate abort path by releasing objects via workqueue, therefore, this sets on the dead flag for abort path too.
Cc: stable@vger.kernel.org Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Reported-by: Mingi Cho mgcho.minic@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nf_tables_api.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -4754,6 +4754,7 @@ static void nf_tables_unbind_set(const s
if (list_empty(&set->bindings) && nft_set_is_anonymous(set)) { list_del_rcu(&set->list); + set->dead = 1; if (event) nf_tables_set_notify(ctx, set, NFT_MSG_DELSET, GFP_KERNEL);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
commit 16603605b667b70da974bea8216c93e7db043bf1 upstream.
Anonymous sets are never used with timeout from userspace, reject this. Exception to this rule is NFT_SET_EVAL to ensure legacy meters still work.
Cc: stable@vger.kernel.org Fixes: 761da2935d6e ("netfilter: nf_tables: add set timeout API support") Reported-by: lonial con kongln9170@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nf_tables_api.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -4410,6 +4410,9 @@ static int nf_tables_newset(struct net * if ((flags & (NFT_SET_EVAL | NFT_SET_OBJECT)) == (NFT_SET_EVAL | NFT_SET_OBJECT)) return -EOPNOTSUPP; + if ((flags & (NFT_SET_ANONYMOUS | NFT_SET_TIMEOUT | NFT_SET_EVAL)) == + (NFT_SET_ANONYMOUS | NFT_SET_TIMEOUT)) + return -EOPNOTSUPP; }
dtype = 0;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
commit 5f4fc4bd5cddb4770ab120ce44f02695c4505562 upstream.
This set combination is weird: it allows for elements to be added/deleted, but once bound to the rule it cannot be updated anymore. Eventually, all elements expire, leading to an empty set which cannot be updated anymore. Reject this flags combination.
Cc: stable@vger.kernel.org Fixes: 761da2935d6e ("netfilter: nf_tables: add set timeout API support") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nf_tables_api.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -4413,6 +4413,9 @@ static int nf_tables_newset(struct net * if ((flags & (NFT_SET_ANONYMOUS | NFT_SET_TIMEOUT | NFT_SET_EVAL)) == (NFT_SET_ANONYMOUS | NFT_SET_TIMEOUT)) return -EOPNOTSUPP; + if ((flags & (NFT_SET_CONSTANT | NFT_SET_TIMEOUT)) == + (NFT_SET_CONSTANT | NFT_SET_TIMEOUT)) + return -EOPNOTSUPP; }
dtype = 0;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Kelley mhklinux@outlook.com
commit b8209544296edbd1af186e2ea9c648642c37b18c upstream.
The VMBUS_RING_SIZE macro adds space for a ring buffer header to the requested ring buffer size. The header size is always 1 page, and so its size varies based on the PAGE_SIZE for which the kernel is built. If the requested ring buffer size is a large power-of-2 size and the header size is small, the resulting size is inefficient in its use of memory. For example, a 512 Kbyte ring buffer with a 4 Kbyte page size results in a 516 Kbyte allocation, which is rounded to up 1 Mbyte by the memory allocator, and wastes 508 Kbytes of memory.
In such situations, the exact size of the ring buffer isn't that important, and it's OK to allocate the 4 Kbyte header at the beginning of the 512 Kbytes, leaving the ring buffer itself with just 508 Kbytes. The memory allocation can be 512 Kbytes instead of 1 Mbyte and nothing is wasted.
Update VMBUS_RING_SIZE to implement this approach for "large" ring buffer sizes. "Large" is somewhat arbitrarily defined as 8 times the size of the ring buffer header (which is of size PAGE_SIZE). For example, for 4 Kbyte PAGE_SIZE, ring buffers of 32 Kbytes and larger use the first 4 Kbytes as the ring buffer header. For 64 Kbyte PAGE_SIZE, ring buffers of 512 Kbytes and larger use the first 64 Kbytes as the ring buffer header. In both cases, smaller sizes add space for the header so the ring size isn't reduced too much by using part of the space for the header. For example, with a 64 Kbyte page size, we don't want a 128 Kbyte ring buffer to be reduced to 64 Kbytes by allocating half of the space for the header. In such a case, the memory allocation is less efficient, but it's the best that can be done.
While the new algorithm slightly changes the amount of space allocated for ring buffers by drivers that use VMBUS_RING_SIZE, the devices aren't known to be sensitive to small changes in ring buffer size, so there shouldn't be any effect.
Fixes: c1135c7fd0e9 ("Drivers: hv: vmbus: Introduce types of GPADL") Fixes: 6941f67ad37d ("hv_netvsc: Calculate correct ring size when PAGE_SIZE is not 4 Kbytes") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218502 Cc: stable@vger.kernel.org Signed-off-by: Michael Kelley mhklinux@outlook.com Reviewed-by: Saurabh Sengar ssengar@linux.microsoft.com Reviewed-by: Dexuan Cui decui@microsoft.com Tested-by: Souradeep Chakrabarti schakrabarti@linux.microsoft.com Link: https://lore.kernel.org/r/20240229004533.313662-1-mhklinux@outlook.com Signed-off-by: Wei Liu wei.liu@kernel.org Message-ID: 20240229004533.313662-1-mhklinux@outlook.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/hyperv.h | 22 +++++++++++++++++++++- 1 file changed, 21 insertions(+), 1 deletion(-)
--- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -164,8 +164,28 @@ struct hv_ring_buffer { u8 buffer[]; } __packed;
+ +/* + * If the requested ring buffer size is at least 8 times the size of the + * header, steal space from the ring buffer for the header. Otherwise, add + * space for the header so that is doesn't take too much of the ring buffer + * space. + * + * The factor of 8 is somewhat arbitrary. The goal is to prevent adding a + * relatively small header (4 Kbytes on x86) to a large-ish power-of-2 ring + * buffer size (such as 128 Kbytes) and so end up making a nearly twice as + * large allocation that will be almost half wasted. As a contrasting example, + * on ARM64 with 64 Kbyte page size, we don't want to take 64 Kbytes for the + * header from a 128 Kbyte allocation, leaving only 64 Kbytes for the ring. + * In this latter case, we must add 64 Kbytes for the header and not worry + * about what's wasted. + */ +#define VMBUS_HEADER_ADJ(payload_sz) \ + ((payload_sz) >= 8 * sizeof(struct hv_ring_buffer) ? \ + 0 : sizeof(struct hv_ring_buffer)) + /* Calculate the proper size of a ringbuffer, it must be page-aligned */ -#define VMBUS_RING_SIZE(payload_sz) PAGE_ALIGN(sizeof(struct hv_ring_buffer) + \ +#define VMBUS_RING_SIZE(payload_sz) PAGE_ALIGN(VMBUS_HEADER_ADJ(payload_sz) + \ (payload_sz))
struct hv_ring_buffer_info {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Chancellor nathan@kernel.org
commit 1a807e46aa93ebad1dfbed4f82dc3bf779423a6e upstream.
After a couple recent changes in LLVM, there is a warning (or error with CONFIG_WERROR=y or W=e) from the compile time fortify source routines, specifically the memset() in copy_to_user_tmpl().
In file included from net/xfrm/xfrm_user.c:14: ... include/linux/fortify-string.h:438:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning] 438 | __write_overflow_field(p_size_field, size); | ^ 1 error generated.
While ->xfrm_nr has been validated against XFRM_MAX_DEPTH when its value is first assigned in copy_templates() by calling validate_tmpl() first (so there should not be any issue in practice), LLVM/clang cannot really deduce that across the boundaries of these functions. Without that knowledge, it cannot assume that the loop stops before i is greater than XFRM_MAX_DEPTH, which would indeed result a stack buffer overflow in the memset().
To make the bounds of ->xfrm_nr clear to the compiler and add additional defense in case copy_to_user_tmpl() is ever used in a path where ->xfrm_nr has not been properly validated against XFRM_MAX_DEPTH first, add an explicit bound check and early return, which clears up the warning.
Cc: stable@vger.kernel.org Link: https://github.com/ClangBuiltLinux/linux/issues/1985 Signed-off-by: Nathan Chancellor nathan@kernel.org Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/xfrm/xfrm_user.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -1753,6 +1753,9 @@ static int copy_to_user_tmpl(struct xfrm if (xp->xfrm_nr == 0) return 0;
+ if (xp->xfrm_nr > XFRM_MAX_DEPTH) + return -ENOBUFS; + for (i = 0; i < xp->xfrm_nr; i++) { struct xfrm_user_tmpl *up = &vec[i]; struct xfrm_tmpl *kp = &xp->xfrm_vec[i];
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 5ef1d8c1ddbf696e47b226e11888eaf8d9e8e807 upstream.
Do the cache flush of converted pages in svm_register_enc_region() before dropping kvm->lock to fix use-after-free issues where region and/or its array of pages could be freed by a different task, e.g. if userspace has __unregister_enc_region_locked() already queued up for the region.
Note, the "obvious" alternative of using local variables doesn't fully resolve the bug, as region->pages is also dynamically allocated. I.e. the region structure itself would be fine, but region->pages could be freed.
Flushing multiple pages under kvm->lock is unfortunate, but the entire flow is a rare slow path, and the manual flush is only needed on CPUs that lack coherency for encrypted memory.
Fixes: 19a23da53932 ("Fix unsynchronized access to sev members through svm_register_enc_region") Reported-by: Gabe Kirkpatrick gkirkpatrick@google.com Cc: Josh Eads josheads@google.com Cc: Peter Gonda pgonda@google.com Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20240217013430.2079561-1-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/sev.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
--- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -1024,20 +1024,22 @@ int svm_register_enc_region(struct kvm * goto e_free; }
- region->uaddr = range->addr; - region->size = range->size; - - list_add_tail(®ion->list, &sev->regions_list); - mutex_unlock(&kvm->lock); - /* * The guest may change the memory encryption attribute from C=0 -> C=1 * or vice versa for this memory range. Lets make sure caches are * flushed to ensure that guest data gets written into memory with - * correct C-bit. + * correct C-bit. Note, this must be done before dropping kvm->lock, + * as region and its array of pages can be freed by a different task + * once kvm->lock is released. */ sev_clflush_pages(region->pages, region->npages);
+ region->uaddr = range->addr; + region->size = range->size; + + list_add_tail(®ion->list, &sev->regions_list); + mutex_unlock(&kvm->lock); + return ret;
e_free:
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kailang Yang kailang@realtek.com
commit d397b6e56151099cf3b1f7bfccb204a6a8591720 upstream.
Headset Mic will no show at resume back. This patch will fix this issue.
Fixes: d7f32791a9fc ("ALSA: hda/realtek - Add headset Mic support for Lenovo ALC897 platform") Cc: stable@vger.kernel.org Signed-off-by: Kailang Yang kailang@realtek.com Link: https://lore.kernel.org/r/4713d48a372e47f98bba0c6120fd8254@realtek.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -10692,8 +10692,7 @@ static void alc897_hp_automute_hook(stru
snd_hda_gen_hp_automute(codec, jack); vref = spec->gen.hp_jack_present ? (PIN_HP | AC_PINCTL_VREF_100) : PIN_HP; - snd_hda_codec_write(codec, 0x1b, 0, AC_VERB_SET_PIN_WIDGET_CONTROL, - vref); + snd_hda_set_pin_ctl(codec, 0x1b, vref); }
static void alc897_fixup_lenovo_headset_mic(struct hda_codec *codec, @@ -10702,6 +10701,10 @@ static void alc897_fixup_lenovo_headset_ struct alc_spec *spec = codec->spec; if (action == HDA_FIXUP_ACT_PRE_PROBE) { spec->gen.hp_automute_hook = alc897_hp_automute_hook; + spec->no_shutup_pins = 1; + } + if (action == HDA_FIXUP_ACT_PROBE) { + snd_hda_set_pin_ctl_cache(codec, 0x1a, PIN_IN | AC_PINCTL_VREF_100); } }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alan Stern stern@rowland.harvard.edu
commit 014bcf41d946b36a8f0b8e9b5d9529efbb822f49 upstream.
The isd200 sub-driver in usb-storage uses the HEADS and SECTORS values in the ATA ID information to calculate cylinder and head values when creating a CDB for READ or WRITE commands. The calculation involves division and modulus operations, which will cause a crash if either of these values is 0. While this never happens with a genuine device, it could happen with a flawed or subversive emulation, as reported by the syzbot fuzzer.
Protect against this possibility by refusing to bind to the device if either the ATA_ID_HEADS or ATA_ID_SECTORS value in the device's ID information is 0. This requires isd200_Initialization() to return a negative error code when initialization fails; currently it always returns 0 (even when there is an error).
Signed-off-by: Alan Stern stern@rowland.harvard.edu Reported-and-tested-by: syzbot+28748250ab47a8f04100@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-usb/0000000000003eb868061245ba7f@google.com/ Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Reviewed-by: PrasannaKumar Muralidharan prasannatsmkumar@gmail.com Reviewed-by: Martin K. Petersen martin.petersen@oracle.com Link: https://lore.kernel.org/r/b1e605ea-333f-4ac0-9511-da04f411763e@rowland.harva... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/storage/isd200.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-)
--- a/drivers/usb/storage/isd200.c +++ b/drivers/usb/storage/isd200.c @@ -1105,7 +1105,7 @@ static void isd200_dump_driveid(struct u static int isd200_get_inquiry_data( struct us_data *us ) { struct isd200_info *info = (struct isd200_info *)us->extra; - int retStatus = ISD200_GOOD; + int retStatus; u16 *id = info->id;
usb_stor_dbg(us, "Entering isd200_get_inquiry_data\n"); @@ -1137,6 +1137,13 @@ static int isd200_get_inquiry_data( stru isd200_fix_driveid(id); isd200_dump_driveid(us, id);
+ /* Prevent division by 0 in isd200_scsi_to_ata() */ + if (id[ATA_ID_HEADS] == 0 || id[ATA_ID_SECTORS] == 0) { + usb_stor_dbg(us, " Invalid ATA Identify data\n"); + retStatus = ISD200_ERROR; + goto Done; + } + memset(&info->InquiryData, 0, sizeof(info->InquiryData));
/* Standard IDE interface only supports disks */ @@ -1202,6 +1209,7 @@ static int isd200_get_inquiry_data( stru } }
+ Done: usb_stor_dbg(us, "Leaving isd200_get_inquiry_data %08X\n", retStatus);
return(retStatus); @@ -1481,22 +1489,27 @@ static int isd200_init_info(struct us_da
static int isd200_Initialization(struct us_data *us) { + int rc = 0; + usb_stor_dbg(us, "ISD200 Initialization...\n");
/* Initialize ISD200 info struct */
- if (isd200_init_info(us) == ISD200_ERROR) { + if (isd200_init_info(us) < 0) { usb_stor_dbg(us, "ERROR Initializing ISD200 Info struct\n"); + rc = -ENOMEM; } else { /* Get device specific data */
- if (isd200_get_inquiry_data(us) != ISD200_GOOD) + if (isd200_get_inquiry_data(us) != ISD200_GOOD) { usb_stor_dbg(us, "ISD200 Initialization Failure\n"); - else + rc = -EINVAL; + } else { usb_stor_dbg(us, "ISD200 Initialization complete\n"); + } }
- return 0; + return rc; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krishna Kurapati quic_kriskura@quicinc.com
commit f90ce1e04cbcc76639d6cba0fdbd820cd80b3c70 upstream.
While connecting to a Linux host with CDC_NCM_NTB_DEF_SIZE_TX set to 65536, it has been observed that we receive short packets, which come at interval of 5-10 seconds sometimes and have block length zero but still contain 1-2 valid datagrams present.
According to the NCM spec:
"If wBlockLength = 0x0000, the block is terminated by a short packet. In this case, the USB transfer must still be shorter than dwNtbInMaxSize or dwNtbOutMaxSize. If exactly dwNtbInMaxSize or dwNtbOutMaxSize bytes are sent, and the size is a multiple of wMaxPacketSize for the given pipe, then no ZLP shall be sent.
wBlockLength= 0x0000 must be used with extreme care, because of the possibility that the host and device may get out of sync, and because of test issues.
wBlockLength = 0x0000 allows the sender to reduce latency by starting to send a very large NTB, and then shortening it when the sender discovers that there’s not sufficient data to justify sending a large NTB"
However, there is a potential issue with the current implementation, as it checks for the occurrence of multiple NTBs in a single giveback by verifying if the leftover bytes to be processed is zero or not. If the block length reads zero, we would process the same NTB infintely because the leftover bytes is never zero and it leads to a crash. Fix this by bailing out if block length reads zero.
Cc: stable@vger.kernel.org Fixes: 427694cfaafa ("usb: gadget: ncm: Handle decoding of multiple NTB's in unwrap call") Signed-off-by: Krishna Kurapati quic_kriskura@quicinc.com Reviewed-by: Maciej Żenczykowski maze@google.com Link: https://lore.kernel.org/r/20240228115441.2105585-1-quic_kriskura@quicinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/f_ncm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/usb/gadget/function/f_ncm.c +++ b/drivers/usb/gadget/function/f_ncm.c @@ -1357,7 +1357,7 @@ parse_ntb: if (to_process == 1 && (*(unsigned char *)(ntb_ptr + block_len) == 0x00)) { to_process--; - } else if (to_process > 0) { + } else if ((to_process > 0) && (block_len != 0)) { ntb_ptr = (unsigned char *)(ntb_ptr + block_len); goto parse_ntb; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mathias Nyman mathias.nyman@linux.intel.com
commit 69c63350e573367f9c8594162288cffa8a26d0d1 upstream.
Unused USB ports may have bogus location data in ACPI PLD tables. This causes port peering failures as these unused USB2 and USB3 ports location may match.
Due to these failures the driver prints a "usb: port power management may be unreliable" warning, and unnecessarily blocks port power off during runtime suspend.
This was debugged on a couple DELL systems where the unused ports all returned zeroes in their location data. Similar bugreports exist for other systems.
Don't try to peer or match ports that have connect type set to USB_PORT_NOT_USED.
Fixes: 3bfd659baec8 ("usb: find internal hub tier mismatch via acpi") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218465 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218486 Tested-by: Paul Menzel pmenzel@molgen.mpg.de Link: https://lore.kernel.org/linux-usb/5406d361-f5b7-4309-b0e6-8c94408f7d75@molge... Cc: stable@vger.kernel.org # v3.16+ Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218490 Link: https://lore.kernel.org/r/20240222233343.71856-1-mathias.nyman@linux.intel.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/core/port.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/usb/core/port.c +++ b/drivers/usb/core/port.c @@ -450,7 +450,7 @@ static int match_location(struct usb_dev struct usb_hub *peer_hub = usb_hub_to_struct_hub(peer_hdev); struct usb_device *hdev = to_usb_device(port_dev->dev.parent->parent);
- if (!peer_hub) + if (!peer_hub || port_dev->connect_type == USB_PORT_NOT_USED) return 0;
hcd = bus_to_hcd(hdev->bus); @@ -461,7 +461,8 @@ static int match_location(struct usb_dev
for (port1 = 1; port1 <= peer_hdev->maxchild; port1++) { peer = peer_hub->ports[port1 - 1]; - if (peer && peer->location == port_dev->location) { + if (peer && peer->connect_type != USB_PORT_NOT_USED && + peer->location == port_dev->location) { link_peers_report(port_dev, peer); return 1; /* done */ }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sherry Sun sherry.sun@nxp.com
commit 74cb7e0355fae9641f825afa389d3fba3b617714 upstream.
If the remote uart device is not connected or not enabled after booting up, the CTS line is high by default. At this time, if we enable the flow control when opening the device(for example, using “stty -F /dev/ttyLP4 crtscts” command), there will be a pending idle preamble(first writing 0 and then writing 1 to UARTCTRL_TE will queue an idle preamble) that cannot be sent out, resulting in the uart port fail to close(waiting for TX empty), so the user space stty will have to wait for a long time or forever.
This is an LPUART IP bug(idle preamble has higher priority than CTS), here add a workaround patch to enable TX CTS after enabling UARTCTRL_TE, so that the idle preamble does not get stuck due to CTS is deasserted.
Fixes: 380c966c093e ("tty: serial: fsl_lpuart: add 32-bit register interface support") Cc: stable stable@kernel.org Signed-off-by: Sherry Sun sherry.sun@nxp.com Reviewed-by: Alexander Sverdlin alexander.sverdlin@siemens.com Link: https://lore.kernel.org/r/20240305015706.1050769-1-sherry.sun@nxp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/fsl_lpuart.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/tty/serial/fsl_lpuart.c +++ b/drivers/tty/serial/fsl_lpuart.c @@ -2178,9 +2178,12 @@ lpuart32_set_termios(struct uart_port *p UARTCTRL);
lpuart32_serial_setbrg(sport, baud); - lpuart32_write(&sport->port, modem, UARTMODIR); - lpuart32_write(&sport->port, ctrl, UARTCTRL); + /* disable CTS before enabling UARTCTRL_TE to avoid pending idle preamble */ + lpuart32_write(&sport->port, modem & ~UARTMODIR_TXCTSE, UARTMODIR); /* restore control register */ + lpuart32_write(&sport->port, ctrl, UARTCTRL); + /* re-enable the CTS if needed */ + lpuart32_write(&sport->port, modem, UARTMODIR);
if (old && sport->lpuart_dma_rx_use) { if (!lpuart_start_rx_dma(sport))
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Usyskin alexander.usyskin@intel.com
commit 7a9b9012043e126f6d6f4683e67409312d1b707b upstream.
Add Arrow Lake S device id.
Cc: stable@vger.kernel.org Signed-off-by: Alexander Usyskin alexander.usyskin@intel.com Signed-off-by: Tomas Winkler tomas.winkler@intel.com Link: https://lore.kernel.org/r/20240211103912.117105-1-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/mei/hw-me-regs.h | 1 + drivers/misc/mei/pci-me.c | 1 + 2 files changed, 2 insertions(+)
--- a/drivers/misc/mei/hw-me-regs.h +++ b/drivers/misc/mei/hw-me-regs.h @@ -112,6 +112,7 @@ #define MEI_DEV_ID_RPL_S 0x7A68 /* Raptor Lake Point S */
#define MEI_DEV_ID_MTL_M 0x7E70 /* Meteor Lake Point M */ +#define MEI_DEV_ID_ARL_S 0x7F68 /* Arrow Lake Point S */
/* * MEI HW Section --- a/drivers/misc/mei/pci-me.c +++ b/drivers/misc/mei/pci-me.c @@ -118,6 +118,7 @@ static const struct pci_device_id mei_me {MEI_PCI_DEVICE(MEI_DEV_ID_RPL_S, MEI_ME_PCH15_CFG)},
{MEI_PCI_DEVICE(MEI_DEV_ID_MTL_M, MEI_ME_PCH15_CFG)}, + {MEI_PCI_DEVICE(MEI_DEV_ID_ARL_S, MEI_ME_PCH15_CFG)},
/* required last entry */ {0, }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Usyskin alexander.usyskin@intel.com
commit 8436f25802ec028ac7254990893f3e01926d9b79 upstream.
Add Arrow Lake H device id.
Cc: stable@vger.kernel.org Signed-off-by: Alexander Usyskin alexander.usyskin@intel.com Signed-off-by: Tomas Winkler tomas.winkler@intel.com Link: https://lore.kernel.org/r/20240211103912.117105-2-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/mei/hw-me-regs.h | 1 + drivers/misc/mei/pci-me.c | 1 + 2 files changed, 2 insertions(+)
--- a/drivers/misc/mei/hw-me-regs.h +++ b/drivers/misc/mei/hw-me-regs.h @@ -113,6 +113,7 @@
#define MEI_DEV_ID_MTL_M 0x7E70 /* Meteor Lake Point M */ #define MEI_DEV_ID_ARL_S 0x7F68 /* Arrow Lake Point S */ +#define MEI_DEV_ID_ARL_H 0x7770 /* Arrow Lake Point H */
/* * MEI HW Section --- a/drivers/misc/mei/pci-me.c +++ b/drivers/misc/mei/pci-me.c @@ -119,6 +119,7 @@ static const struct pci_device_id mei_me
{MEI_PCI_DEVICE(MEI_DEV_ID_MTL_M, MEI_ME_PCH15_CFG)}, {MEI_PCI_DEVICE(MEI_DEV_ID_ARL_S, MEI_ME_PCH15_CFG)}, + {MEI_PCI_DEVICE(MEI_DEV_ID_ARL_H, MEI_ME_PCH15_CFG)},
/* required last entry */ {0, }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Pitre nico@fluxnic.net
commit 1581dafaf0d34bc9c428a794a22110d7046d186d upstream.
This is the same issue that was fixed for the VGA text buffer in commit 39cdb68c64d8 ("vt: fix memory overlapping when deleting chars in the buffer"). The cure is also the same i.e. replace memcpy() with memmove() due to the overlaping buffers.
Signed-off-by: Nicolas Pitre nico@fluxnic.net Fixes: 81732c3b2fed ("tty vt: Fix line garbage in virtual console on command line edition") Cc: stable stable@kernel.org Link: https://lore.kernel.org/r/sn184on2-3p0q-0qrq-0218-895349s4753o@syhkavp.arg Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/vt/vt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -398,7 +398,7 @@ static void vc_uniscr_delete(struct vc_d char32_t *ln = uniscr->lines[vc->state.y]; unsigned int x = vc->state.x, cols = vc->vc_cols;
- memcpy(&ln[x], &ln[x + nr], (cols - x - nr) * sizeof(*ln)); + memmove(&ln[x], &ln[x + nr], (cols - x - nr) * sizeof(*ln)); memset32(&ln[cols - nr], ' ', nr); } }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bart Van Assche bvanassche@acm.org
commit 961ebd120565cb60cebe21cb634fbc456022db4a upstream.
The first kiocb_set_cancel_fn() argument may point at a struct kiocb that is not embedded inside struct aio_kiocb. With the current code, depending on the compiler, the req->ki_ctx read happens either before the IOCB_AIO_RW test or after that test. Move the req->ki_ctx read such that it is guaranteed that the IOCB_AIO_RW test happens first.
Reported-by: Eric Biggers ebiggers@kernel.org Cc: Benjamin LaHaise ben@communityfibre.ca Cc: Eric Biggers ebiggers@google.com Cc: Christoph Hellwig hch@lst.de Cc: Avi Kivity avi@scylladb.com Cc: Sandeep Dhavale dhavale@google.com Cc: Jens Axboe axboe@kernel.dk Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Kent Overstreet kent.overstreet@linux.dev Cc: stable@vger.kernel.org Fixes: b820de741ae4 ("fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio") Signed-off-by: Bart Van Assche bvanassche@acm.org Link: https://lore.kernel.org/r/20240304235715.3790858-1-bvanassche@acm.org Reviewed-by: Jens Axboe axboe@kernel.dk Reviewed-by: Eric Biggers ebiggers@google.com Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/aio.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/fs/aio.c +++ b/fs/aio.c @@ -565,8 +565,8 @@ static int aio_setup_ring(struct kioctx
void kiocb_set_cancel_fn(struct kiocb *iocb, kiocb_cancel_fn *cancel) { - struct aio_kiocb *req = container_of(iocb, struct aio_kiocb, rw); - struct kioctx *ctx = req->ki_ctx; + struct aio_kiocb *req; + struct kioctx *ctx; unsigned long flags;
/* @@ -576,9 +576,13 @@ void kiocb_set_cancel_fn(struct kiocb *i if (!(iocb->ki_flags & IOCB_AIO_RW)) return;
+ req = container_of(iocb, struct aio_kiocb, rw); + if (WARN_ON_ONCE(!list_empty(&req->ki_list))) return;
+ ctx = req->ki_ctx; + spin_lock_irqsave(&ctx->ctx_lock, flags); list_add_tail(&req->ki_list, &ctx->active_reqs); req->ki_cancel = cancel;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sumit Garg sumit.garg@linaro.org
commit 95915ba4b987cf2b222b0f251280228a1ff977ac upstream.
The error path while failing to register devices on the TEE bus has a bug leading to kernel panic as follows:
[ 15.398930] Unable to handle kernel paging request at virtual address ffff07ed00626d7c [ 15.406913] Mem abort info: [ 15.409722] ESR = 0x0000000096000005 [ 15.413490] EC = 0x25: DABT (current EL), IL = 32 bits [ 15.418814] SET = 0, FnV = 0 [ 15.421878] EA = 0, S1PTW = 0 [ 15.425031] FSC = 0x05: level 1 translation fault [ 15.429922] Data abort info: [ 15.432813] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 [ 15.438310] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 15.443372] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 15.448697] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000d9e3e000 [ 15.455413] [ffff07ed00626d7c] pgd=1800000bffdf9003, p4d=1800000bffdf9003, pud=0000000000000000 [ 15.464146] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
Commit 7269cba53d90 ("tee: optee: Fix supplicant based device enumeration") lead to the introduction of this bug. So fix it appropriately.
Reported-by: Mikko Rapeli mikko.rapeli@linaro.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218542 Fixes: 7269cba53d90 ("tee: optee: Fix supplicant based device enumeration") Cc: stable@vger.kernel.org Signed-off-by: Sumit Garg sumit.garg@linaro.org Signed-off-by: Jens Wiklander jens.wiklander@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tee/optee/device.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/tee/optee/device.c +++ b/drivers/tee/optee/device.c @@ -90,13 +90,14 @@ static int optee_register_device(const u if (rc) { pr_err("device registration failed, err: %d\n", rc); put_device(&optee_device->dev); + return rc; }
if (func == PTA_CMD_GET_DEVICES_SUPP) device_create_file(&optee_device->dev, &dev_attr_need_supplicant);
- return rc; + return 0; }
static int __optee_enumerate_devices(u32 func)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maximilian Heyne mheyne@amazon.de
commit fa765c4b4aed2d64266b694520ecb025c862c5a9 upstream.
shutdown_pirq and startup_pirq are not taking the irq_mapping_update_lock because they can't due to lock inversion. Both are called with the irq_desc->lock being taking. The lock order, however, is first irq_mapping_update_lock and then irq_desc->lock.
This opens multiple races: - shutdown_pirq can be interrupted by a function that allocates an event channel:
CPU0 CPU1 shutdown_pirq { xen_evtchn_close(e) __startup_pirq { EVTCHNOP_bind_pirq -> returns just freed evtchn e set_evtchn_to_irq(e, irq) } xen_irq_info_cleanup() { set_evtchn_to_irq(e, -1) } }
Assume here event channel e refers here to the same event channel number. After this race the evtchn_to_irq mapping for e is invalid (-1).
- __startup_pirq races with __unbind_from_irq in a similar way. Because __startup_pirq doesn't take irq_mapping_update_lock it can grab the evtchn that __unbind_from_irq is currently freeing and cleaning up. In this case even though the event channel is allocated, its mapping can be unset in evtchn_to_irq.
The fix is to first cleanup the mappings and then close the event channel. In this way, when an event channel gets allocated it's potential previous evtchn_to_irq mappings are guaranteed to be unset already. This is also the reverse order of the allocation where first the event channel is allocated and then the mappings are setup.
On a 5.10 kernel prior to commit 3fcdaf3d7634 ("xen/events: modify internal [un]bind interfaces"), we hit a BUG like the following during probing of NVMe devices. The issue is that during nvme_setup_io_queues, pci_free_irq is called for every device which results in a call to shutdown_pirq. With many nvme devices it's therefore likely to hit this race during boot because there will be multiple calls to shutdown_pirq and startup_pirq are running potentially in parallel.
------------[ cut here ]------------ blkfront: xvda: barrier or flush: disabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled kernel BUG at drivers/xen/events/events_base.c:499! invalid opcode: 0000 [#1] SMP PTI CPU: 44 PID: 375 Comm: kworker/u257:23 Not tainted 5.10.201-191.748.amzn2.x86_64 #1 Hardware name: Xen HVM domU, BIOS 4.11.amazon 08/24/2006 Workqueue: nvme-reset-wq nvme_reset_work RIP: 0010:bind_evtchn_to_cpu+0xdf/0xf0 Code: 5d 41 5e c3 cc cc cc cc 44 89 f7 e8 2b 55 ad ff 49 89 c5 48 85 c0 0f 84 64 ff ff ff 4c 8b 68 30 41 83 fe ff 0f 85 60 ff ff ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 RSP: 0000:ffffc9000d533b08 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006 RDX: 0000000000000028 RSI: 00000000ffffffff RDI: 00000000ffffffff RBP: ffff888107419680 R08: 0000000000000000 R09: ffffffff82d72b00 R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000001ed R13: 0000000000000000 R14: 00000000ffffffff R15: 0000000000000002 FS: 0000000000000000(0000) GS:ffff88bc8b500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000002610001 CR4: 00000000001706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? show_trace_log_lvl+0x1c1/0x2d9 ? show_trace_log_lvl+0x1c1/0x2d9 ? set_affinity_irq+0xdc/0x1c0 ? __die_body.cold+0x8/0xd ? die+0x2b/0x50 ? do_trap+0x90/0x110 ? bind_evtchn_to_cpu+0xdf/0xf0 ? do_error_trap+0x65/0x80 ? bind_evtchn_to_cpu+0xdf/0xf0 ? exc_invalid_op+0x4e/0x70 ? bind_evtchn_to_cpu+0xdf/0xf0 ? asm_exc_invalid_op+0x12/0x20 ? bind_evtchn_to_cpu+0xdf/0xf0 ? bind_evtchn_to_cpu+0xc5/0xf0 set_affinity_irq+0xdc/0x1c0 irq_do_set_affinity+0x1d7/0x1f0 irq_setup_affinity+0xd6/0x1a0 irq_startup+0x8a/0xf0 __setup_irq+0x639/0x6d0 ? nvme_suspend+0x150/0x150 request_threaded_irq+0x10c/0x180 ? nvme_suspend+0x150/0x150 pci_request_irq+0xa8/0xf0 ? __blk_mq_free_request+0x74/0xa0 queue_request_irq+0x6f/0x80 nvme_create_queue+0x1af/0x200 nvme_create_io_queues+0xbd/0xf0 nvme_setup_io_queues+0x246/0x320 ? nvme_irq_check+0x30/0x30 nvme_reset_work+0x1c8/0x400 process_one_work+0x1b0/0x350 worker_thread+0x49/0x310 ? process_one_work+0x350/0x350 kthread+0x11b/0x140 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x22/0x30 Modules linked in: ---[ end trace a11715de1eee1873 ]---
Fixes: d46a78b05c0e ("xen: implement pirq type event channels") Cc: stable@vger.kernel.org Co-debugged-by: Andrew Panyakin apanyaki@amazon.com Signed-off-by: Maximilian Heyne mheyne@amazon.de Reviewed-by: Juergen Gross jgross@suse.com Link: https://lore.kernel.org/r/20240124163130.31324-1-mheyne@amazon.de Signed-off-by: Juergen Gross jgross@suse.com [apanyaki: backport to v5.10-stable] Signed-off-by: Andrew Paniakin apanyaki@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/events/events_base.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -885,8 +885,8 @@ static void shutdown_pirq(struct irq_dat return;
do_mask(info, EVT_MASK_REASON_EXPLICIT); - xen_evtchn_close(evtchn); xen_irq_info_cleanup(info); + xen_evtchn_close(evtchn); }
static void enable_pirq(struct irq_data *data) @@ -929,8 +929,6 @@ static void __unbind_from_irq(unsigned i if (VALID_EVTCHN(evtchn)) { unsigned int cpu = cpu_from_irq(irq);
- xen_evtchn_close(evtchn); - switch (type_from_irq(irq)) { case IRQT_VIRQ: per_cpu(virq_to_irq, cpu)[virq_from_irq(irq)] = -1; @@ -943,6 +941,7 @@ static void __unbind_from_irq(unsigned i }
xen_irq_info_cleanup(info); + xen_evtchn_close(evtchn); }
xen_free_irq(irq);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Ogness john.ogness@linutronix.de
[ Upstream commit 8076972468584d4a21dab9aa50e388b3ea9ad8c7 ]
console_trylock_spinning() may takeover the console lock from a schedulable context. Update @console_may_schedule to make sure it reflects a trylock acquire.
Reported-by: Mukesh Ojha quic_mojha@quicinc.com Closes: https://lore.kernel.org/lkml/20240222090538.23017-1-quic_mojha@quicinc.com Fixes: dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes") Signed-off-by: John Ogness john.ogness@linutronix.de Reviewed-by: Mukesh Ojha quic_mojha@quicinc.com Reviewed-by: Petr Mladek pmladek@suse.com Link: https://lore.kernel.org/r/875xybmo2z.fsf@jogness.linutronix.de Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/printk/printk.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 255654ea12447..a8af93cbc2936 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1866,6 +1866,12 @@ static int console_trylock_spinning(void) */ mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_);
+ /* + * Update @console_may_schedule for trylock because the previous + * owner may have been schedulable. + */ + console_may_schedule = 0; + return 1; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Goldwyn Rodrigues rgoldwyn@suse.com
commit c853a5783ebe123847886d432354931874367292 upstream.
Instead of using kmalloc() to allocate btrfs_ioctl_defrag_range_args, allocate btrfs_ioctl_defrag_range_args on stack, the size is reasonably small and ioctls are called in process context.
sizeof(btrfs_ioctl_defrag_range_args) = 48
Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Goldwyn Rodrigues rgoldwyn@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com [ This patch is needed to fix a memory leak of "range" that was introduced when commit 173431b274a9 ("btrfs: defrag: reject unknown flags of btrfs_ioctl_defrag_range_args") was backported to kernels lacking this patch. Now with these two patches applied in reverse order, range->flags needed to change back to range.flags. This bug was discovered and resolved using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc.] Signed-off-by: Maximilian Heyne mheyne@amazon.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/ioctl.c | 25 ++++++++----------------- 1 file changed, 8 insertions(+), 17 deletions(-)
--- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3148,7 +3148,7 @@ static int btrfs_ioctl_defrag(struct fil { struct inode *inode = file_inode(file); struct btrfs_root *root = BTRFS_I(inode)->root; - struct btrfs_ioctl_defrag_range_args *range; + struct btrfs_ioctl_defrag_range_args range = {0}; int ret;
ret = mnt_want_write_file(file); @@ -3180,37 +3180,28 @@ static int btrfs_ioctl_defrag(struct fil goto out; }
- range = kzalloc(sizeof(*range), GFP_KERNEL); - if (!range) { - ret = -ENOMEM; - goto out; - } - if (argp) { - if (copy_from_user(range, argp, - sizeof(*range))) { + if (copy_from_user(&range, argp, sizeof(range))) { ret = -EFAULT; - kfree(range); goto out; } - if (range->flags & ~BTRFS_DEFRAG_RANGE_FLAGS_SUPP) { + if (range.flags & ~BTRFS_DEFRAG_RANGE_FLAGS_SUPP) { ret = -EOPNOTSUPP; goto out; } /* compression requires us to start the IO */ - if ((range->flags & BTRFS_DEFRAG_RANGE_COMPRESS)) { - range->flags |= BTRFS_DEFRAG_RANGE_START_IO; - range->extent_thresh = (u32)-1; + if ((range.flags & BTRFS_DEFRAG_RANGE_COMPRESS)) { + range.flags |= BTRFS_DEFRAG_RANGE_START_IO; + range.extent_thresh = (u32)-1; } } else { /* the rest are all set to zero by kzalloc */ - range->len = (u64)-1; + range.len = (u64)-1; } ret = btrfs_defrag_file(file_inode(file), file, - range, BTRFS_OLDEST_GENERATION, 0); + &range, BTRFS_OLDEST_GENERATION, 0); if (ret > 0) ret = 0; - kfree(range); break; default: ret = -EINVAL;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
From: "H. Peter Anvin (Intel)" hpa@zytor.com
commit f87bc8dc7a7c438c70f97b4e51c76a183313272e upstream.
Add a macro _ASM_RIP() to add a (%rip) suffix on 64 bits only. This is useful for immediate memory references where one doesn't want gcc to possibly use a register indirection as it may in the case of an "m" constraint.
Signed-off-by: H. Peter Anvin (Intel) hpa@zytor.com Signed-off-by: Borislav Petkov bp@suse.de Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Link: https://lkml.kernel.org/r/20210910195910.2542662-3-hpa@zytor.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/asm.h | 5 +++++ 1 file changed, 5 insertions(+)
--- a/arch/x86/include/asm/asm.h +++ b/arch/x86/include/asm/asm.h @@ -6,12 +6,14 @@ # define __ASM_FORM(x) x # define __ASM_FORM_RAW(x) x # define __ASM_FORM_COMMA(x) x, +# define __ASM_REGPFX % #else #include <linux/stringify.h>
# define __ASM_FORM(x) " " __stringify(x) " " # define __ASM_FORM_RAW(x) __stringify(x) # define __ASM_FORM_COMMA(x) " " __stringify(x) "," +# define __ASM_REGPFX %% #endif
#ifndef __x86_64__ @@ -48,6 +50,9 @@ #define _ASM_SI __ASM_REG(si) #define _ASM_DI __ASM_REG(di)
+/* Adds a (%rip) suffix on 64 bits only; for immediate memory references */ +#define _ASM_RIP(x) __ASM_SEL_RAW(x, x (__ASM_REGPFX rip)) + #ifndef __x86_64__ /* 32 bit */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit baf8361e54550a48a7087b603313ad013cc13386 upstream.
MDS mitigation requires clearing the CPU buffers before returning to user. This needs to be done late in the exit-to-user path. Current location of VERW leaves a possibility of kernel data ending up in CPU buffers for memory accesses done after VERW such as:
1. Kernel data accessed by an NMI between VERW and return-to-user can remain in CPU buffers since NMI returning to kernel does not execute VERW to clear CPU buffers. 2. Alyssa reported that after VERW is executed, CONFIG_GCC_PLUGIN_STACKLEAK=y scrubs the stack used by a system call. Memory accesses during stack scrubbing can move kernel stack contents into CPU buffers. 3. When caller saved registers are restored after a return from function executing VERW, the kernel stack accesses can remain in CPU buffers(since they occur after VERW).
To fix this VERW needs to be moved very late in exit-to-user path.
In preparation for moving VERW to entry/exit asm code, create macros that can be used in asm. Also make VERW patching depend on a new feature flag X86_FEATURE_CLEAR_CPU_BUF.
[pawan: - Runtime patch jmp instead of verw in macro CLEAR_CPU_BUFFERS due to lack of relative addressing support for relocations in kernels < v6.5. - Add UNWIND_HINT_EMPTY to avoid warning: arch/x86/entry/entry.o: warning: objtool: mds_verw_sel+0x0: unreachable instruction]
Reported-by: Alyssa Milburn alyssa.milburn@intel.com Suggested-by: Andrew Cooper andrew.cooper3@citrix.com Suggested-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Link: https://lore.kernel.org/all/20240213-delay-verw-v8-1-a6216d83edb7%40linux.in... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/entry/entry.S | 23 +++++++++++++++++++++++ arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/nospec-branch.h | 15 +++++++++++++++ 3 files changed, 39 insertions(+)
--- a/arch/x86/entry/entry.S +++ b/arch/x86/entry/entry.S @@ -6,6 +6,9 @@ #include <linux/linkage.h> #include <asm/export.h> #include <asm/msr-index.h> +#include <asm/unwind_hints.h> +#include <asm/segment.h> +#include <asm/cache.h>
.pushsection .noinstr.text, "ax"
@@ -20,3 +23,23 @@ SYM_FUNC_END(entry_ibpb) EXPORT_SYMBOL_GPL(entry_ibpb);
.popsection + +/* + * Define the VERW operand that is disguised as entry code so that + * it can be referenced with KPTI enabled. This ensure VERW can be + * used late in exit-to-user path after page tables are switched. + */ +.pushsection .entry.text, "ax" + +.align L1_CACHE_BYTES, 0xcc +SYM_CODE_START_NOALIGN(mds_verw_sel) + UNWIND_HINT_EMPTY + ANNOTATE_NOENDBR + .word __KERNEL_DS +.align L1_CACHE_BYTES, 0xcc +SYM_CODE_END(mds_verw_sel); +/* For KVM */ +EXPORT_SYMBOL_GPL(mds_verw_sel); + +.popsection + --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -300,6 +300,7 @@ #define X86_FEATURE_USE_IBPB_FW (11*32+16) /* "" Use IBPB during runtime firmware calls */ #define X86_FEATURE_RSB_VMEXIT_LITE (11*32+17) /* "" Fill RSB on VM exit when EIBRS is enabled */ #define X86_FEATURE_MSR_TSX_CTRL (11*32+18) /* "" MSR IA32_TSX_CTRL (Intel) implemented */ +#define X86_FEATURE_CLEAR_CPU_BUF (11*32+19) /* "" Clear CPU buffers using VERW */
#define X86_FEATURE_SRSO (11*32+24) /* "" AMD BTB untrain RETs */ #define X86_FEATURE_SRSO_ALIAS (11*32+25) /* "" AMD BTB untrain RETs through aliasing */ --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -182,6 +182,19 @@ #endif .endm
+/* + * Macro to execute VERW instruction that mitigate transient data sampling + * attacks such as MDS. On affected systems a microcode update overloaded VERW + * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF. + * + * Note: Only the memory operand variant of VERW clears the CPU buffers. + */ +.macro CLEAR_CPU_BUFFERS + ALTERNATIVE "jmp .Lskip_verw_@", "", X86_FEATURE_CLEAR_CPU_BUF + verw _ASM_RIP(mds_verw_sel) +.Lskip_verw_@: +.endm + #else /* __ASSEMBLY__ */
#define ANNOTATE_RETPOLINE_SAFE \ @@ -362,6 +375,8 @@ DECLARE_STATIC_KEY_FALSE(mds_idle_clear)
DECLARE_STATIC_KEY_FALSE(mmio_stale_data_clear);
+extern u16 mds_verw_sel; + #include <asm/segment.h>
/**
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 3c7501722e6b31a6e56edd23cea5e77dbb9ffd1a upstream.
Mitigation for MDS is to use VERW instruction to clear any secrets in CPU Buffers. Any memory accesses after VERW execution can still remain in CPU buffers. It is safer to execute VERW late in return to user path to minimize the window in which kernel data can end up in CPU buffers. There are not many kernel secrets to be had after SWITCH_TO_USER_CR3.
Add support for deploying VERW mitigation after user register state is restored. This helps minimize the chances of kernel data ending up into CPU buffers after executing VERW.
Note that the mitigation at the new location is not yet enabled.
Corner case not handled ======================= Interrupts returning to kernel don't clear CPUs buffers since the exit-to-user path is expected to do that anyways. But, there could be a case when an NMI is generated in kernel after the exit-to-user path has cleared the buffers. This case is not handled and NMI returning to kernel don't clear CPU buffers because:
1. It is rare to get an NMI after VERW, but before returning to user. 2. For an unprivileged user, there is no known way to make that NMI less rare or target it. 3. It would take a large number of these precisely-timed NMIs to mount an actual attack. There's presumably not enough bandwidth. 4. The NMI in question occurs after a VERW, i.e. when user state is restored and most interesting data is already scrubbed. Whats left is only the data that NMI touches, and that may or may not be of any interest.
[ pawan: resolved conflict in syscall_return_via_sysret, added CLEAR_CPU_BUFFERS to USERGS_SYSRET64 ]
Suggested-by: Dave Hansen dave.hansen@intel.com Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Link: https://lore.kernel.org/all/20240213-delay-verw-v8-2-a6216d83edb7%40linux.in... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/entry/entry_64.S | 10 ++++++++++ arch/x86/entry/entry_64_compat.S | 1 + arch/x86/include/asm/irqflags.h | 1 + 3 files changed, 12 insertions(+)
--- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -615,6 +615,7 @@ SYM_INNER_LABEL(swapgs_restore_regs_and_ /* Restore RDI. */ popq %rdi SWAPGS + CLEAR_CPU_BUFFERS INTERRUPT_RETURN
@@ -721,6 +722,8 @@ native_irq_return_ldt: */ popq %rax /* Restore user RAX */
+ CLEAR_CPU_BUFFERS + /* * RSP now points to an ordinary IRET frame, except that the page * is read-only and RSP[31:16] are preloaded with the userspace @@ -1488,6 +1491,12 @@ nmi_restore: movq $0, 5*8(%rsp) /* clear "NMI executing" */
/* + * Skip CLEAR_CPU_BUFFERS here, since it only helps in rare cases like + * NMI in kernel after user state is restored. For an unprivileged user + * these conditions are hard to meet. + */ + + /* * iretq reads the "iret" frame and exits the NMI stack in a * single instruction. We are returning to kernel mode, so this * cannot result in a fault. Similarly, we don't need to worry @@ -1504,6 +1513,7 @@ SYM_CODE_END(asm_exc_nmi) SYM_CODE_START(ignore_sysret) UNWIND_HINT_EMPTY mov $-ENOSYS, %eax + CLEAR_CPU_BUFFERS sysretl SYM_CODE_END(ignore_sysret) #endif --- a/arch/x86/entry/entry_64_compat.S +++ b/arch/x86/entry/entry_64_compat.S @@ -319,6 +319,7 @@ sysret32_from_system_call: xorl %r9d, %r9d xorl %r10d, %r10d swapgs + CLEAR_CPU_BUFFERS sysretl SYM_CODE_END(entry_SYSCALL_compat)
--- a/arch/x86/include/asm/irqflags.h +++ b/arch/x86/include/asm/irqflags.h @@ -134,6 +134,7 @@ static __always_inline unsigned long arc #define INTERRUPT_RETURN jmp native_iret #define USERGS_SYSRET64 \ swapgs; \ + CLEAR_CPU_BUFFERS; \ sysretq; #define USERGS_SYSRET32 \ swapgs; \
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit a0e2dab44d22b913b4c228c8b52b2a104434b0b3 upstream.
As done for entry_64, add support for executing VERW late in exit to user path for 32-bit mode.
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Link: https://lore.kernel.org/all/20240213-delay-verw-v8-3-a6216d83edb7%40linux.in... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/entry/entry_32.S | 3 +++ 1 file changed, 3 insertions(+)
--- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -949,6 +949,7 @@ SYM_FUNC_START(entry_SYSENTER_32) BUG_IF_WRONG_CR3 no_user_check=1 popfl popl %eax + CLEAR_CPU_BUFFERS
/* * Return back to the vDSO, which will pop ecx and edx. @@ -1021,6 +1022,7 @@ restore_all_switch_stack:
/* Restore user state */ RESTORE_REGS pop=4 # skip orig_eax/error_code + CLEAR_CPU_BUFFERS .Lirq_return: /* * ARCH_HAS_MEMBARRIER_SYNC_CORE rely on IRET core serialization @@ -1219,6 +1221,7 @@ SYM_CODE_START(asm_exc_nmi)
/* Not on SYSENTER stack. */ call exc_nmi + CLEAR_CPU_BUFFERS jmp .Lnmi_return
.Lnmi_from_sysenter_stack:
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 6613d82e617dd7eb8b0c40b2fe3acea655b1d611 upstream.
The VERW mitigation at exit-to-user is enabled via a static branch mds_user_clear. This static branch is never toggled after boot, and can be safely replaced with an ALTERNATIVE() which is convenient to use in asm.
Switch to ALTERNATIVE() to use the VERW mitigation late in exit-to-user path. Also remove the now redundant VERW in exc_nmi() and arch_exit_to_user_mode().
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Link: https://lore.kernel.org/all/20240213-delay-verw-v8-4-a6216d83edb7%40linux.in... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/x86/mds.rst | 36 +++++++++++++++++++++++++---------- arch/x86/include/asm/entry-common.h | 1 arch/x86/include/asm/nospec-branch.h | 12 ----------- arch/x86/kernel/cpu/bugs.c | 15 +++++--------- arch/x86/kernel/nmi.c | 3 -- arch/x86/kvm/vmx/vmx.c | 2 - 6 files changed, 33 insertions(+), 36 deletions(-)
--- a/Documentation/x86/mds.rst +++ b/Documentation/x86/mds.rst @@ -95,6 +95,9 @@ The kernel provides a function to invoke
mds_clear_cpu_buffers()
+Also macro CLEAR_CPU_BUFFERS can be used in ASM late in exit-to-user path. +Other than CFLAGS.ZF, this macro doesn't clobber any registers. + The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state (idle) transitions.
@@ -138,17 +141,30 @@ Mitigation points
When transitioning from kernel to user space the CPU buffers are flushed on affected CPUs when the mitigation is not disabled on the kernel - command line. The migitation is enabled through the static key - mds_user_clear. + command line. The mitigation is enabled through the feature flag + X86_FEATURE_CLEAR_CPU_BUF.
- The mitigation is invoked in prepare_exit_to_usermode() which covers - all but one of the kernel to user space transitions. The exception - is when we return from a Non Maskable Interrupt (NMI), which is - handled directly in do_nmi(). - - (The reason that NMI is special is that prepare_exit_to_usermode() can - enable IRQs. In NMI context, NMIs are blocked, and we don't want to - enable IRQs with NMIs blocked.) + The mitigation is invoked just before transitioning to userspace after + user registers are restored. This is done to minimize the window in + which kernel data could be accessed after VERW e.g. via an NMI after + VERW. + + **Corner case not handled** + Interrupts returning to kernel don't clear CPUs buffers since the + exit-to-user path is expected to do that anyways. But, there could be + a case when an NMI is generated in kernel after the exit-to-user path + has cleared the buffers. This case is not handled and NMI returning to + kernel don't clear CPU buffers because: + + 1. It is rare to get an NMI after VERW, but before returning to userspace. + 2. For an unprivileged user, there is no known way to make that NMI + less rare or target it. + 3. It would take a large number of these precisely-timed NMIs to mount + an actual attack. There's presumably not enough bandwidth. + 4. The NMI in question occurs after a VERW, i.e. when user state is + restored and most interesting data is already scrubbed. Whats left + is only the data that NMI touches, and that may or may not be of + any interest.
2. C-State transition --- a/arch/x86/include/asm/entry-common.h +++ b/arch/x86/include/asm/entry-common.h @@ -77,7 +77,6 @@ static inline void arch_exit_to_user_mod
static __always_inline void arch_exit_to_user_mode(void) { - mds_user_clear_cpu_buffers(); amd_clear_divider(); } #define arch_exit_to_user_mode arch_exit_to_user_mode --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -370,7 +370,6 @@ DECLARE_STATIC_KEY_FALSE(switch_to_cond_ DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb); DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
-DECLARE_STATIC_KEY_FALSE(mds_user_clear); DECLARE_STATIC_KEY_FALSE(mds_idle_clear);
DECLARE_STATIC_KEY_FALSE(mmio_stale_data_clear); @@ -403,17 +402,6 @@ static __always_inline void mds_clear_cp }
/** - * mds_user_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability - * - * Clear CPU buffers if the corresponding static key is enabled - */ -static __always_inline void mds_user_clear_cpu_buffers(void) -{ - if (static_branch_likely(&mds_user_clear)) - mds_clear_cpu_buffers(); -} - -/** * mds_idle_clear_cpu_buffers - Mitigation for MDS vulnerability * * Clear CPU buffers if the corresponding static key is enabled --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -109,9 +109,6 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_cond_i /* Control unconditional IBPB in switch_mm() */ DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
-/* Control MDS CPU buffer clear before returning to user space */ -DEFINE_STATIC_KEY_FALSE(mds_user_clear); -EXPORT_SYMBOL_GPL(mds_user_clear); /* Control MDS CPU buffer clear before idling (halt, mwait) */ DEFINE_STATIC_KEY_FALSE(mds_idle_clear); EXPORT_SYMBOL_GPL(mds_idle_clear); @@ -249,7 +246,7 @@ static void __init mds_select_mitigation if (!boot_cpu_has(X86_FEATURE_MD_CLEAR)) mds_mitigation = MDS_MITIGATION_VMWERV;
- static_branch_enable(&mds_user_clear); + setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF);
if (!boot_cpu_has(X86_BUG_MSBDS_ONLY) && (mds_nosmt || cpu_mitigations_auto_nosmt())) @@ -353,7 +350,7 @@ static void __init taa_select_mitigation * For guests that can't determine whether the correct microcode is * present on host, enable the mitigation for UCODE_NEEDED as well. */ - static_branch_enable(&mds_user_clear); + setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF);
if (taa_nosmt || cpu_mitigations_auto_nosmt()) cpu_smt_disable(false); @@ -421,7 +418,7 @@ static void __init mmio_select_mitigatio */ if (boot_cpu_has_bug(X86_BUG_MDS) || (boot_cpu_has_bug(X86_BUG_TAA) && boot_cpu_has(X86_FEATURE_RTM))) - static_branch_enable(&mds_user_clear); + setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF); else static_branch_enable(&mmio_stale_data_clear);
@@ -481,12 +478,12 @@ static void __init md_clear_update_mitig if (cpu_mitigations_off()) return;
- if (!static_key_enabled(&mds_user_clear)) + if (!boot_cpu_has(X86_FEATURE_CLEAR_CPU_BUF)) goto out;
/* - * mds_user_clear is now enabled. Update MDS, TAA and MMIO Stale Data - * mitigation, if necessary. + * X86_FEATURE_CLEAR_CPU_BUF is now enabled. Update MDS, TAA and MMIO + * Stale Data mitigation, if necessary. */ if (mds_mitigation == MDS_MITIGATION_OFF && boot_cpu_has_bug(X86_BUG_MDS)) { --- a/arch/x86/kernel/nmi.c +++ b/arch/x86/kernel/nmi.c @@ -519,9 +519,6 @@ nmi_restart: write_cr2(this_cpu_read(nmi_cr2)); if (this_cpu_dec_return(nmi_state)) goto nmi_restart; - - if (user_mode(regs)) - mds_user_clear_cpu_buffers(); }
#if defined(CONFIG_X86_64) && IS_ENABLED(CONFIG_KVM_INTEL) --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6795,7 +6795,7 @@ static noinstr void vmx_vcpu_enter_exit( /* L1D Flush includes CPU buffer clear to mitigate MDS */ if (static_branch_unlikely(&vmx_l1d_should_flush)) vmx_l1d_flush(vcpu); - else if (static_branch_unlikely(&mds_user_clear)) + else if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF)) mds_clear_cpu_buffers(); else if (static_branch_unlikely(&mmio_stale_data_clear) && kvm_arch_has_assigned_device(vcpu->kvm))
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
From: Sean Christopherson seanjc@google.com
commit 706a189dcf74d3b3f955e9384785e726ed6c7c80 upstream.
Use EFLAGS.CF instead of EFLAGS.ZF to track whether to use VMRESUME versus VMLAUNCH. Freeing up EFLAGS.ZF will allow doing VERW, which clobbers ZF, for MDS mitigations as late as possible without needing to duplicate VERW for both paths.
[ pawan: resolved merge conflict in __vmx_vcpu_run in backport. ]
Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Nikolay Borisov nik.borisov@suse.com Link: https://lore.kernel.org/all/20240213-delay-verw-v8-5-a6216d83edb7%40linux.in... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/vmx/run_flags.h | 7 +++++-- arch/x86/kvm/vmx/vmenter.S | 6 +++--- 2 files changed, 8 insertions(+), 5 deletions(-)
--- a/arch/x86/kvm/vmx/run_flags.h +++ b/arch/x86/kvm/vmx/run_flags.h @@ -2,7 +2,10 @@ #ifndef __KVM_X86_VMX_RUN_FLAGS_H #define __KVM_X86_VMX_RUN_FLAGS_H
-#define VMX_RUN_VMRESUME (1 << 0) -#define VMX_RUN_SAVE_SPEC_CTRL (1 << 1) +#define VMX_RUN_VMRESUME_SHIFT 0 +#define VMX_RUN_SAVE_SPEC_CTRL_SHIFT 1 + +#define VMX_RUN_VMRESUME BIT(VMX_RUN_VMRESUME_SHIFT) +#define VMX_RUN_SAVE_SPEC_CTRL BIT(VMX_RUN_SAVE_SPEC_CTRL_SHIFT)
#endif /* __KVM_X86_VMX_RUN_FLAGS_H */ --- a/arch/x86/kvm/vmx/vmenter.S +++ b/arch/x86/kvm/vmx/vmenter.S @@ -77,7 +77,7 @@ SYM_FUNC_START(__vmx_vcpu_run) mov (%_ASM_SP), %_ASM_AX
/* Check if vmlaunch or vmresume is needed */ - testb $VMX_RUN_VMRESUME, %bl + bt $VMX_RUN_VMRESUME_SHIFT, %bx
/* Load guest registers. Don't clobber flags. */ mov VCPU_RCX(%_ASM_AX), %_ASM_CX @@ -99,8 +99,8 @@ SYM_FUNC_START(__vmx_vcpu_run) /* Load guest RAX. This kills the @regs pointer! */ mov VCPU_RAX(%_ASM_AX), %_ASM_AX
- /* Check EFLAGS.ZF from 'testb' above */ - jz .Lvmlaunch + /* Check EFLAGS.CF from the VMX_RUN_VMRESUME bit test above. */ + jnc .Lvmlaunch
/* * After a successful VMRESUME/VMLAUNCH, control flow "magically"
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 43fb862de8f628c5db5e96831c915b9aebf62d33 upstream.
During VMentry VERW is executed to mitigate MDS. After VERW, any memory access like register push onto stack may put host data in MDS affected CPU buffers. A guest can then use MDS to sample host data.
Although likelihood of secrets surviving in registers at current VERW callsite is less, but it can't be ruled out. Harden the MDS mitigation by moving the VERW mitigation late in VMentry path.
Note that VERW for MMIO Stale Data mitigation is unchanged because of the complexity of per-guest conditional VERW which is not easy to handle that late in asm with no GPRs available. If the CPU is also affected by MDS, VERW is unconditionally executed late in asm regardless of guest having MMIO access.
[ pawan: conflict resolved in backport ]
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Acked-by: Sean Christopherson seanjc@google.com Link: https://lore.kernel.org/all/20240213-delay-verw-v8-6-a6216d83edb7%40linux.in... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/vmx/vmenter.S | 3 +++ arch/x86/kvm/vmx/vmx.c | 12 ++++++++---- 2 files changed, 11 insertions(+), 4 deletions(-)
--- a/arch/x86/kvm/vmx/vmenter.S +++ b/arch/x86/kvm/vmx/vmenter.S @@ -99,6 +99,9 @@ SYM_FUNC_START(__vmx_vcpu_run) /* Load guest RAX. This kills the @regs pointer! */ mov VCPU_RAX(%_ASM_AX), %_ASM_AX
+ /* Clobbers EFLAGS.ZF */ + CLEAR_CPU_BUFFERS + /* Check EFLAGS.CF from the VMX_RUN_VMRESUME bit test above. */ jnc .Lvmlaunch
--- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -397,7 +397,8 @@ static __always_inline void vmx_enable_f
static void vmx_update_fb_clear_dis(struct kvm_vcpu *vcpu, struct vcpu_vmx *vmx) { - vmx->disable_fb_clear = vmx_fb_clear_ctrl_available; + vmx->disable_fb_clear = !cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF) && + vmx_fb_clear_ctrl_available;
/* * If guest will not execute VERW, there is no need to set FB_CLEAR_DIS @@ -6792,11 +6793,14 @@ static noinstr void vmx_vcpu_enter_exit( guest_enter_irqoff(); lockdep_hardirqs_on(CALLER_ADDR0);
- /* L1D Flush includes CPU buffer clear to mitigate MDS */ + /* + * L1D Flush includes CPU buffer clear to mitigate MDS, but VERW + * mitigation for MDS is done late in VMentry and is still + * executed in spite of L1D Flush. This is because an extra VERW + * should not matter much after the big hammer L1D Flush. + */ if (static_branch_unlikely(&vmx_l1d_should_flush)) vmx_l1d_flush(vcpu); - else if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF)) - mds_clear_cpu_buffers(); else if (static_branch_unlikely(&mmio_stale_data_clear) && kvm_arch_has_assigned_device(vcpu->kvm)) mds_clear_cpu_buffers();
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit e95df4ec0c0c9791941f112db699fae794b9862a upstream.
Currently MMIO Stale Data mitigation for CPUs not affected by MDS/TAA is to only deploy VERW at VMentry by enabling mmio_stale_data_clear static branch. No mitigation is needed for kernel->user transitions. If such CPUs are also affected by RFDS, its mitigation may set X86_FEATURE_CLEAR_CPU_BUF to deploy VERW at kernel->user and VMentry. This could result in duplicate VERW at VMentry.
Fix this by disabling mmio_stale_data_clear static branch when X86_FEATURE_CLEAR_CPU_BUF is enabled.
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Dave Hansen dave.hansen@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/bugs.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -419,6 +419,13 @@ static void __init mmio_select_mitigatio if (boot_cpu_has_bug(X86_BUG_MDS) || (boot_cpu_has_bug(X86_BUG_TAA) && boot_cpu_has(X86_FEATURE_RTM))) setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF); + + /* + * X86_FEATURE_CLEAR_CPU_BUF could be enabled by other VERW based + * mitigations, disable KVM-only mitigation in that case. + */ + if (boot_cpu_has(X86_FEATURE_CLEAR_CPU_BUF)) + static_branch_disable(&mmio_stale_data_clear); else static_branch_enable(&mmio_stale_data_clear);
@@ -495,8 +502,11 @@ static void __init md_clear_update_mitig taa_mitigation = TAA_MITIGATION_VERW; taa_select_mitigation(); } - if (mmio_mitigation == MMIO_MITIGATION_OFF && - boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA)) { + /* + * MMIO_MITIGATION_OFF is not checked here so that mmio_stale_data_clear + * gets updated correctly as per X86_FEATURE_CLEAR_CPU_BUF state. + */ + if (boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA)) { mmio_mitigation = MMIO_MITIGATION_VERW; mmio_select_mitigation(); }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 4e42765d1be01111df0c0275bbaf1db1acef346e upstream.
Add the documentation for transient execution vulnerability Register File Data Sampling (RFDS) that affects Intel Atom CPUs.
[ pawan: s/ATOM_GRACEMONT/ALDERLAKE_N/ ]
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Thomas Gleixner tglx@linutronix.de Acked-by: Josh Poimboeuf jpoimboe@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/admin-guide/hw-vuln/index.rst | 1 Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst | 104 +++++++++++ 2 files changed, 105 insertions(+)
--- a/Documentation/admin-guide/hw-vuln/index.rst +++ b/Documentation/admin-guide/hw-vuln/index.rst @@ -18,3 +18,4 @@ are configurable at compile, boot or run processor_mmio_stale_data.rst gather_data_sampling.rst srso + reg-file-data-sampling --- /dev/null +++ b/Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst @@ -0,0 +1,104 @@ +================================== +Register File Data Sampling (RFDS) +================================== + +Register File Data Sampling (RFDS) is a microarchitectural vulnerability that +only affects Intel Atom parts(also branded as E-cores). RFDS may allow +a malicious actor to infer data values previously used in floating point +registers, vector registers, or integer registers. RFDS does not provide the +ability to choose which data is inferred. CVE-2023-28746 is assigned to RFDS. + +Affected Processors +=================== +Below is the list of affected Intel processors [#f1]_: + + =================== ============ + Common name Family_Model + =================== ============ + ATOM_GOLDMONT 06_5CH + ATOM_GOLDMONT_D 06_5FH + ATOM_GOLDMONT_PLUS 06_7AH + ATOM_TREMONT_D 06_86H + ATOM_TREMONT 06_96H + ALDERLAKE 06_97H + ALDERLAKE_L 06_9AH + ATOM_TREMONT_L 06_9CH + RAPTORLAKE 06_B7H + RAPTORLAKE_P 06_BAH + ALDERLAKE_N 06_BEH + RAPTORLAKE_S 06_BFH + =================== ============ + +As an exception to this table, Intel Xeon E family parts ALDERLAKE(06_97H) and +RAPTORLAKE(06_B7H) codenamed Catlow are not affected. They are reported as +vulnerable in Linux because they share the same family/model with an affected +part. Unlike their affected counterparts, they do not enumerate RFDS_CLEAR or +CPUID.HYBRID. This information could be used to distinguish between the +affected and unaffected parts, but it is deemed not worth adding complexity as +the reporting is fixed automatically when these parts enumerate RFDS_NO. + +Mitigation +========== +Intel released a microcode update that enables software to clear sensitive +information using the VERW instruction. Like MDS, RFDS deploys the same +mitigation strategy to force the CPU to clear the affected buffers before an +attacker can extract the secrets. This is achieved by using the otherwise +unused and obsolete VERW instruction in combination with a microcode update. +The microcode clears the affected CPU buffers when the VERW instruction is +executed. + +Mitigation points +----------------- +VERW is executed by the kernel before returning to user space, and by KVM +before VMentry. None of the affected cores support SMT, so VERW is not required +at C-state transitions. + +New bits in IA32_ARCH_CAPABILITIES +---------------------------------- +Newer processors and microcode update on existing affected processors added new +bits to IA32_ARCH_CAPABILITIES MSR. These bits can be used to enumerate +vulnerability and mitigation capability: + +- Bit 27 - RFDS_NO - When set, processor is not affected by RFDS. +- Bit 28 - RFDS_CLEAR - When set, processor is affected by RFDS, and has the + microcode that clears the affected buffers on VERW execution. + +Mitigation control on the kernel command line +--------------------------------------------- +The kernel command line allows to control RFDS mitigation at boot time with the +parameter "reg_file_data_sampling=". The valid arguments are: + + ========== ================================================================= + on If the CPU is vulnerable, enable mitigation; CPU buffer clearing + on exit to userspace and before entering a VM. + off Disables mitigation. + ========== ================================================================= + +Mitigation default is selected by CONFIG_MITIGATION_RFDS. + +Mitigation status information +----------------------------- +The Linux kernel provides a sysfs interface to enumerate the current +vulnerability status of the system: whether the system is vulnerable, and +which mitigations are active. The relevant sysfs file is: + + /sys/devices/system/cpu/vulnerabilities/reg_file_data_sampling + +The possible values in this file are: + + .. list-table:: + + * - 'Not affected' + - The processor is not vulnerable + * - 'Vulnerable' + - The processor is vulnerable, but no mitigation enabled + * - 'Vulnerable: No microcode' + - The processor is vulnerable but microcode is not updated. + * - 'Mitigation: Clear Register File' + - The processor is vulnerable and the CPU buffer clearing mitigation is + enabled. + +References +---------- +.. [#f1] Affected Processors + https://www.intel.com/content/www/us/en/developer/topic-technology/software-...
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 8076fcde016c9c0e0660543e67bff86cb48a7c9c upstream.
RFDS is a CPU vulnerability that may allow userspace to infer kernel stale data previously used in floating point registers, vector registers and integer registers. RFDS only affects certain Intel Atom processors.
Intel released a microcode update that uses VERW instruction to clear the affected CPU buffers. Unlike MDS, none of the affected cores support SMT.
Add RFDS bug infrastructure and enable the VERW based mitigation by default, that clears the affected buffers just before exiting to userspace. Also add sysfs reporting and cmdline parameter "reg_file_data_sampling" to control the mitigation.
For details see: Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst
[ pawan: - Resolved conflicts in sysfs reporting. - s/ATOM_GRACEMONT/ALDERLAKE_N/ATOM_GRACEMONT is called ALDERLAKE_N in 6.6. ]
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Thomas Gleixner tglx@linutronix.de Acked-by: Josh Poimboeuf jpoimboe@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/ABI/testing/sysfs-devices-system-cpu | 1 Documentation/admin-guide/kernel-parameters.txt | 21 +++++ arch/x86/Kconfig | 11 ++ arch/x86/include/asm/cpufeatures.h | 1 arch/x86/include/asm/msr-index.h | 8 ++ arch/x86/kernel/cpu/bugs.c | 78 ++++++++++++++++++++- arch/x86/kernel/cpu/common.c | 38 +++++++++- drivers/base/cpu.c | 8 ++ include/linux/cpu.h | 2 9 files changed, 162 insertions(+), 6 deletions(-)
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -507,6 +507,7 @@ What: /sys/devices/system/cpu/vulnerabi /sys/devices/system/cpu/vulnerabilities/mds /sys/devices/system/cpu/vulnerabilities/meltdown /sys/devices/system/cpu/vulnerabilities/mmio_stale_data + /sys/devices/system/cpu/vulnerabilities/reg_file_data_sampling /sys/devices/system/cpu/vulnerabilities/retbleed /sys/devices/system/cpu/vulnerabilities/spec_store_bypass /sys/devices/system/cpu/vulnerabilities/spectre_v1 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -993,6 +993,26 @@ The filter can be disabled or changed to another driver later using sysfs.
+ reg_file_data_sampling= + [X86] Controls mitigation for Register File Data + Sampling (RFDS) vulnerability. RFDS is a CPU + vulnerability which may allow userspace to infer + kernel data values previously stored in floating point + registers, vector registers, or integer registers. + RFDS only affects Intel Atom processors. + + on: Turns ON the mitigation. + off: Turns OFF the mitigation. + + This parameter overrides the compile time default set + by CONFIG_MITIGATION_RFDS. Mitigation cannot be + disabled when other VERW based mitigations (like MDS) + are enabled. In order to disable RFDS mitigation all + VERW based mitigations need to be disabled. + + For details see: + Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst + driver_async_probe= [KNL] List of driver names to be probed asynchronously. Format: <driver_name1>,<driver_name2>... @@ -2919,6 +2939,7 @@ nopti [X86,PPC] nospectre_v1 [X86,PPC] nospectre_v2 [X86,PPC,S390,ARM64] + reg_file_data_sampling=off [X86] retbleed=off [X86] spec_store_bypass_disable=off [X86,PPC] spectre_v2_user=off [X86] --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2508,6 +2508,17 @@ config GDS_FORCE_MITIGATION
If in doubt, say N.
+config MITIGATION_RFDS + bool "RFDS Mitigation" + depends on CPU_SUP_INTEL + default y + help + Enable mitigation for Register File Data Sampling (RFDS) by default. + RFDS is a hardware vulnerability which affects Intel Atom CPUs. It + allows unprivileged speculative access to stale data previously + stored in floating point, vector and integer registers. + See also file:Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst + endif
config ARCH_HAS_ADD_PAGES --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -454,4 +454,5 @@ /* BUG word 2 */ #define X86_BUG_SRSO X86_BUG(1*32 + 0) /* AMD SRSO bug */ #define X86_BUG_DIV0 X86_BUG(1*32 + 1) /* AMD DIV0 speculation bug */ +#define X86_BUG_RFDS X86_BUG(1*32 + 2) /* CPU is vulnerable to Register File Data Sampling */ #endif /* _ASM_X86_CPUFEATURES_H */ --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -168,6 +168,14 @@ * CPU is not vulnerable to Gather * Data Sampling (GDS). */ +#define ARCH_CAP_RFDS_NO BIT(27) /* + * Not susceptible to Register + * File Data Sampling. + */ +#define ARCH_CAP_RFDS_CLEAR BIT(28) /* + * VERW clears CPU Register + * File. + */
#define MSR_IA32_FLUSH_CMD 0x0000010b #define L1D_FLUSH BIT(0) /* --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -478,6 +478,57 @@ static int __init mmio_stale_data_parse_ early_param("mmio_stale_data", mmio_stale_data_parse_cmdline);
#undef pr_fmt +#define pr_fmt(fmt) "Register File Data Sampling: " fmt + +enum rfds_mitigations { + RFDS_MITIGATION_OFF, + RFDS_MITIGATION_VERW, + RFDS_MITIGATION_UCODE_NEEDED, +}; + +/* Default mitigation for Register File Data Sampling */ +static enum rfds_mitigations rfds_mitigation __ro_after_init = + IS_ENABLED(CONFIG_MITIGATION_RFDS) ? RFDS_MITIGATION_VERW : RFDS_MITIGATION_OFF; + +static const char * const rfds_strings[] = { + [RFDS_MITIGATION_OFF] = "Vulnerable", + [RFDS_MITIGATION_VERW] = "Mitigation: Clear Register File", + [RFDS_MITIGATION_UCODE_NEEDED] = "Vulnerable: No microcode", +}; + +static void __init rfds_select_mitigation(void) +{ + if (!boot_cpu_has_bug(X86_BUG_RFDS) || cpu_mitigations_off()) { + rfds_mitigation = RFDS_MITIGATION_OFF; + return; + } + if (rfds_mitigation == RFDS_MITIGATION_OFF) + return; + + if (x86_read_arch_cap_msr() & ARCH_CAP_RFDS_CLEAR) + setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF); + else + rfds_mitigation = RFDS_MITIGATION_UCODE_NEEDED; +} + +static __init int rfds_parse_cmdline(char *str) +{ + if (!str) + return -EINVAL; + + if (!boot_cpu_has_bug(X86_BUG_RFDS)) + return 0; + + if (!strcmp(str, "off")) + rfds_mitigation = RFDS_MITIGATION_OFF; + else if (!strcmp(str, "on")) + rfds_mitigation = RFDS_MITIGATION_VERW; + + return 0; +} +early_param("reg_file_data_sampling", rfds_parse_cmdline); + +#undef pr_fmt #define pr_fmt(fmt) "" fmt
static void __init md_clear_update_mitigation(void) @@ -510,6 +561,11 @@ static void __init md_clear_update_mitig mmio_mitigation = MMIO_MITIGATION_VERW; mmio_select_mitigation(); } + if (rfds_mitigation == RFDS_MITIGATION_OFF && + boot_cpu_has_bug(X86_BUG_RFDS)) { + rfds_mitigation = RFDS_MITIGATION_VERW; + rfds_select_mitigation(); + } out: if (boot_cpu_has_bug(X86_BUG_MDS)) pr_info("MDS: %s\n", mds_strings[mds_mitigation]); @@ -519,6 +575,8 @@ out: pr_info("MMIO Stale Data: %s\n", mmio_strings[mmio_mitigation]); else if (boot_cpu_has_bug(X86_BUG_MMIO_UNKNOWN)) pr_info("MMIO Stale Data: Unknown: No mitigations\n"); + if (boot_cpu_has_bug(X86_BUG_RFDS)) + pr_info("Register File Data Sampling: %s\n", rfds_strings[rfds_mitigation]); }
static void __init md_clear_select_mitigation(void) @@ -526,11 +584,12 @@ static void __init md_clear_select_mitig mds_select_mitigation(); taa_select_mitigation(); mmio_select_mitigation(); + rfds_select_mitigation();
/* - * As MDS, TAA and MMIO Stale Data mitigations are inter-related, update - * and print their mitigation after MDS, TAA and MMIO Stale Data - * mitigation selection is done. + * As these mitigations are inter-related and rely on VERW instruction + * to clear the microarchitural buffers, update and print their status + * after mitigation selection is done for each of these vulnerabilities. */ md_clear_update_mitigation(); } @@ -2530,6 +2589,11 @@ static ssize_t mmio_stale_data_show_stat sched_smt_active() ? "vulnerable" : "disabled"); }
+static ssize_t rfds_show_state(char *buf) +{ + return sysfs_emit(buf, "%s\n", rfds_strings[rfds_mitigation]); +} + static char *stibp_state(void) { if (spectre_v2_in_eibrs_mode(spectre_v2_enabled)) @@ -2690,6 +2754,9 @@ static ssize_t cpu_show_common(struct de case X86_BUG_SRSO: return srso_show_state(buf);
+ case X86_BUG_RFDS: + return rfds_show_state(buf); + default: break; } @@ -2764,4 +2831,9 @@ ssize_t cpu_show_spec_rstack_overflow(st { return cpu_show_common(dev, attr, buf, X86_BUG_SRSO); } + +ssize_t cpu_show_reg_file_data_sampling(struct device *dev, struct device_attribute *attr, char *buf) +{ + return cpu_show_common(dev, attr, buf, X86_BUG_RFDS); +} #endif --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1132,6 +1132,8 @@ static const __initconst struct x86_cpu_ #define SRSO BIT(5) /* CPU is affected by GDS */ #define GDS BIT(6) +/* CPU is affected by Register File Data Sampling */ +#define RFDS BIT(7)
static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS), @@ -1159,9 +1161,18 @@ static const struct x86_cpu_id cpu_vuln_ VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS), VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED), VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS), - VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPING_ANY, MMIO | MMIO_SBDS), - VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_D, X86_STEPPING_ANY, MMIO), - VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS), + VULNBL_INTEL_STEPPINGS(ALDERLAKE, X86_STEPPING_ANY, RFDS), + VULNBL_INTEL_STEPPINGS(ALDERLAKE_L, X86_STEPPING_ANY, RFDS), + VULNBL_INTEL_STEPPINGS(RAPTORLAKE, X86_STEPPING_ANY, RFDS), + VULNBL_INTEL_STEPPINGS(RAPTORLAKE_P, X86_STEPPING_ANY, RFDS), + VULNBL_INTEL_STEPPINGS(RAPTORLAKE_S, X86_STEPPING_ANY, RFDS), + VULNBL_INTEL_STEPPINGS(ALDERLAKE_N, X86_STEPPING_ANY, RFDS), + VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RFDS), + VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_D, X86_STEPPING_ANY, MMIO | RFDS), + VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RFDS), + VULNBL_INTEL_STEPPINGS(ATOM_GOLDMONT, X86_STEPPING_ANY, RFDS), + VULNBL_INTEL_STEPPINGS(ATOM_GOLDMONT_D, X86_STEPPING_ANY, RFDS), + VULNBL_INTEL_STEPPINGS(ATOM_GOLDMONT_PLUS, X86_STEPPING_ANY, RFDS),
VULNBL_AMD(0x15, RETBLEED), VULNBL_AMD(0x16, RETBLEED), @@ -1195,6 +1206,24 @@ static bool arch_cap_mmio_immune(u64 ia3 ia32_cap & ARCH_CAP_SBDR_SSDP_NO); }
+static bool __init vulnerable_to_rfds(u64 ia32_cap) +{ + /* The "immunity" bit trumps everything else: */ + if (ia32_cap & ARCH_CAP_RFDS_NO) + return false; + + /* + * VMMs set ARCH_CAP_RFDS_CLEAR for processors not in the blacklist to + * indicate that mitigation is needed because guest is running on a + * vulnerable hardware or may migrate to such hardware: + */ + if (ia32_cap & ARCH_CAP_RFDS_CLEAR) + return true; + + /* Only consult the blacklist when there is no enumeration: */ + return cpu_matches(cpu_vuln_blacklist, RFDS); +} + static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) { u64 ia32_cap = x86_read_arch_cap_msr(); @@ -1303,6 +1332,9 @@ static void __init cpu_set_bug_bits(stru setup_force_cpu_bug(X86_BUG_SRSO); }
+ if (vulnerable_to_rfds(ia32_cap)) + setup_force_cpu_bug(X86_BUG_RFDS); + if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN)) return;
--- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -591,6 +591,12 @@ ssize_t __weak cpu_show_spec_rstack_over return sysfs_emit(buf, "Not affected\n"); }
+ssize_t __weak cpu_show_reg_file_data_sampling(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sysfs_emit(buf, "Not affected\n"); +} + static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL); @@ -604,6 +610,7 @@ static DEVICE_ATTR(mmio_stale_data, 0444 static DEVICE_ATTR(retbleed, 0444, cpu_show_retbleed, NULL); static DEVICE_ATTR(gather_data_sampling, 0444, cpu_show_gds, NULL); static DEVICE_ATTR(spec_rstack_overflow, 0444, cpu_show_spec_rstack_overflow, NULL); +static DEVICE_ATTR(reg_file_data_sampling, 0444, cpu_show_reg_file_data_sampling, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_meltdown.attr, @@ -619,6 +626,7 @@ static struct attribute *cpu_root_vulner &dev_attr_retbleed.attr, &dev_attr_gather_data_sampling.attr, &dev_attr_spec_rstack_overflow.attr, + &dev_attr_reg_file_data_sampling.attr, NULL };
--- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -74,6 +74,8 @@ extern ssize_t cpu_show_spec_rstack_over struct device_attribute *attr, char *buf); extern ssize_t cpu_show_gds(struct device *dev, struct device_attribute *attr, char *buf); +extern ssize_t cpu_show_reg_file_data_sampling(struct device *dev, + struct device_attribute *attr, char *buf);
extern __printf(4, 5) struct device *cpu_device_create(struct device *parent, void *drvdata,
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 2a0180129d726a4b953232175857d442651b55a0 upstream.
Mitigation for RFDS requires RFDS_CLEAR capability which is enumerated by MSR_IA32_ARCH_CAPABILITIES bit 27. If the host has it set, export it to guests so that they can deploy the mitigation.
RFDS_NO indicates that the system is not vulnerable to RFDS, export it to guests so that they don't deploy the mitigation unnecessarily. When the host is not affected by X86_BUG_RFDS, but has RFDS_NO=0, synthesize RFDS_NO to the guest.
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Thomas Gleixner tglx@linutronix.de Acked-by: Josh Poimboeuf jpoimboe@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/x86.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1389,7 +1389,8 @@ static unsigned int num_msr_based_featur ARCH_CAP_SKIP_VMENTRY_L1DFLUSH | ARCH_CAP_SSB_NO | ARCH_CAP_MDS_NO | \ ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \ ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \ - ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | ARCH_CAP_GDS_NO) + ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | ARCH_CAP_GDS_NO | \ + ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR)
static u64 kvm_get_arch_capabilities(void) { @@ -1426,6 +1427,8 @@ static u64 kvm_get_arch_capabilities(voi data |= ARCH_CAP_SSB_NO; if (!boot_cpu_has_bug(X86_BUG_MDS)) data |= ARCH_CAP_MDS_NO; + if (!boot_cpu_has_bug(X86_BUG_RFDS)) + data |= ARCH_CAP_RFDS_NO;
if (!boot_cpu_has(X86_FEATURE_RTM)) { /*
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yang Jihong yangjihong1@huawei.com
commit 6b959ba22d34ca793ffdb15b5715457c78e38b1a upstream.
perf_output_read_group may respond to IPI request of other cores and invoke __perf_install_in_context function. As a result, hwc configuration is modified. causing inconsistency and unexpected consequences.
Interrupts are not disabled when perf_output_read_group reads PMU counter. In this case, IPI request may be received from other cores. As a result, PMU configuration is modified and an error occurs when reading PMU counter:
CPU0 CPU1 __se_sys_perf_event_open perf_install_in_context perf_output_read_group smp_call_function_single for_each_sibling_event(sub, leader) { generic_exec_single if ((sub != event) && remote_function (sub->state == PERF_EVENT_STATE_ACTIVE)) | <enter IPI handler: __perf_install_in_context> <----RAISE IPI-----+ __perf_install_in_context ctx_resched event_sched_out armpmu_del ... hwc->idx = -1; // event->hwc.idx is set to -1 ... <exit IPI> sub->pmu->read(sub); armpmu_read armv8pmu_read_counter armv8pmu_read_hw_counter int idx = event->hw.idx; // idx = -1 u64 val = armv8pmu_read_evcntr(idx); u32 counter = ARMV8_IDX_TO_COUNTER(idx); // invalid counter = 30 read_pmevcntrn(counter) // undefined instruction
Signed-off-by: Yang Jihong yangjihong1@huawei.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20220902082918.179248-1-yangjihong1@huawei.com Signed-off-by: Thadeu Lima de Souza Cascardo cascardo@igalia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/events/core.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -6890,9 +6890,16 @@ static void perf_output_read_group(struc { struct perf_event *leader = event->group_leader, *sub; u64 read_format = event->attr.read_format; + unsigned long flags; u64 values[6]; int n = 0;
+ /* + * Disabling interrupts avoids all counter scheduling + * (context switches, timer based rotation and IPIs). + */ + local_irq_save(flags); + values[n++] = 1 + leader->nr_siblings;
if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED) @@ -6928,6 +6935,8 @@ static void perf_output_read_group(struc
__output_copy(handle, values, n * sizeof(u64)); } + + local_irq_restore(flags); }
#define PERF_FORMAT_TOTAL_TIMES (PERF_FORMAT_TOTAL_TIME_ENABLED|\
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tim Schumacher timschumi@gmx.de
commit f45812cc23fb74bef62d4eb8a69fe7218f4b9f2a upstream.
Work around a quirk in a few old (2011-ish) UEFI implementations, where a call to `GetNextVariableName` with a buffer size larger than 512 bytes will always return EFI_INVALID_PARAMETER.
There is some lore around EFI variable names being up to 1024 bytes in size, but this has no basis in the UEFI specification, and the upper bounds are typically platform specific, and apply to the entire variable (name plus payload).
Given that Linux does not permit creating files with names longer than NAME_MAX (255) bytes, 512 bytes (== 256 UTF-16 characters) is a reasonable limit.
Cc: stable@vger.kernel.org # 6.1+ Signed-off-by: Tim Schumacher timschumi@gmx.de Signed-off-by: Ard Biesheuvel ardb@kernel.org [timschumi@gmx.de: adjusted diff for changed context and code move] Signed-off-by: Tim Schumacher timschumi@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/firmware/efi/vars.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-)
--- a/drivers/firmware/efi/vars.c +++ b/drivers/firmware/efi/vars.c @@ -415,7 +415,7 @@ int efivar_init(int (*func)(efi_char16_t void *data, bool duplicates, struct list_head *head) { const struct efivar_operations *ops; - unsigned long variable_name_size = 1024; + unsigned long variable_name_size = 512; efi_char16_t *variable_name; efi_status_t status; efi_guid_t vendor_guid; @@ -438,12 +438,13 @@ int efivar_init(int (*func)(efi_char16_t }
/* - * Per EFI spec, the maximum storage allocated for both - * the variable name and variable data is 1024 bytes. + * A small set of old UEFI implementations reject sizes + * above a certain threshold, the lowest seen in the wild + * is 512. */
do { - variable_name_size = 1024; + variable_name_size = 512;
status = ops->get_next_variable(&variable_name_size, variable_name, @@ -491,9 +492,13 @@ int efivar_init(int (*func)(efi_char16_t break; case EFI_NOT_FOUND: break; + case EFI_BUFFER_TOO_SMALL: + pr_warn("efivars: Variable name size exceeds maximum (%lu > 512)\n", + variable_name_size); + status = EFI_NOT_FOUND; + break; default: - printk(KERN_WARNING "efivars: get_next_variable: status=%lx\n", - status); + pr_warn("efivars: get_next_variable: status=%lx\n", status); status = EFI_NOT_FOUND; break; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Chancellor nathan@kernel.org
commit 35f20786c481d5ced9283ff42de5c69b65e5ed13 upstream.
arch/powerpc/lib/xor_vmx.o is built with '-msoft-float' (from the main powerpc Makefile) and '-maltivec' (from its CFLAGS), which causes an error when building with clang after a recent change in main:
error: option '-msoft-float' cannot be specified with '-maltivec' make[6]: *** [scripts/Makefile.build:243: arch/powerpc/lib/xor_vmx.o] Error 1
Explicitly add '-mhard-float' before '-maltivec' in xor_vmx.o's CFLAGS to override the previous inclusion of '-msoft-float' (as the last option wins), which matches how other areas of the kernel use '-maltivec', such as AMDGPU.
Cc: stable@vger.kernel.org Closes: https://github.com/ClangBuiltLinux/linux/issues/1986 Link: https://github.com/llvm/llvm-project/commit/4792f912b232141ecba4cbae538873be... Signed-off-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/20240127-ppc-xor_vmx-drop-msoft-float-v1-1-f24140e81376@k... [nathan: Fixed conflicts due to lack of 04e85bbf71c9 in older trees] Signed-off-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/lib/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/lib/Makefile +++ b/arch/powerpc/lib/Makefile @@ -67,6 +67,6 @@ obj-$(CONFIG_PPC_LIB_RHEAP) += rheap.o obj-$(CONFIG_FTR_FIXUP_SELFTEST) += feature-fixups-test.o
obj-$(CONFIG_ALTIVEC) += xor_vmx.o xor_vmx_glue.o -CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec) +CFLAGS_xor_vmx.o += -mhard-float -maltivec $(call cc-option,-mabi=altivec)
obj-$(CONFIG_PPC64) += $(obj64-y)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hugo Villeneuve hvilleneuve@dimonoff.com
commit dbf4ab821804df071c8b566d9813083125e6d97b upstream.
The SC16IS7XX IC supports a burst mode to access the FIFOs where the initial register address is sent ($00), followed by all the FIFO data without having to resend the register address each time. In this mode, the IC doesn't increment the register address for each R/W byte.
The regmap_raw_read() and regmap_raw_write() are functions which can perform IO over multiple registers. They are currently used to read/write from/to the FIFO, and although they operate correctly in this burst mode on the SPI bus, they would corrupt the regmap cache if it was not disabled manually. The reason is that when the R/W size is more than 1 byte, these functions assume that the register address is incremented and handle the cache accordingly.
Convert FIFO R/W functions to use the regmap _noinc_ versions in order to remove the manual cache control which was a workaround when using the _raw_ versions. FIFO registers are properly declared as volatile so cache will not be used/updated for FIFO accesses.
Fixes: dfeae619d781 ("serial: sc16is7xx") Cc: stable@vger.kernel.org Signed-off-by: Hugo Villeneuve hvilleneuve@dimonoff.com Link: https://lore.kernel.org/r/20231211171353.2901416-6-hugo@hugovil.com Cc: Hugo Villeneuve hvilleneuve@dimonoff.com Signed-off-by: GONG, Ruiqi gongruiqi1@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/sc16is7xx.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
--- a/drivers/tty/serial/sc16is7xx.c +++ b/drivers/tty/serial/sc16is7xx.c @@ -376,9 +376,7 @@ static void sc16is7xx_fifo_read(struct u const u8 line = sc16is7xx_line(port); u8 addr = (SC16IS7XX_RHR_REG << SC16IS7XX_REG_SHIFT) | line;
- regcache_cache_bypass(s->regmap, true); - regmap_raw_read(s->regmap, addr, s->buf, rxlen); - regcache_cache_bypass(s->regmap, false); + regmap_noinc_read(s->regmap, addr, s->buf, rxlen); }
static void sc16is7xx_fifo_write(struct uart_port *port, u8 to_send) @@ -394,9 +392,7 @@ static void sc16is7xx_fifo_write(struct if (unlikely(!to_send)) return;
- regcache_cache_bypass(s->regmap, true); - regmap_raw_write(s->regmap, addr, s->buf, to_send); - regcache_cache_bypass(s->regmap, false); + regmap_noinc_write(s->regmap, addr, s->buf, to_send); }
static void sc16is7xx_port_update(struct uart_port *port, u8 reg, @@ -489,6 +485,11 @@ static bool sc16is7xx_regmap_precious(st return false; }
+static bool sc16is7xx_regmap_noinc(struct device *dev, unsigned int reg) +{ + return reg == SC16IS7XX_RHR_REG; +} + static int sc16is7xx_set_baud(struct uart_port *port, int baud) { struct sc16is7xx_port *s = dev_get_drvdata(port->dev); @@ -1439,6 +1440,8 @@ static struct regmap_config regcfg = { .cache_type = REGCACHE_RBTREE, .volatile_reg = sc16is7xx_regmap_volatile, .precious_reg = sc16is7xx_regmap_precious, + .writeable_noinc_reg = sc16is7xx_regmap_noinc, + .readable_noinc_reg = sc16is7xx_regmap_noinc, };
#ifdef CONFIG_SERIAL_SC16IS7XX_SPI
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liu Shixin liushixin2@huawei.com
When backport commit c79c5a0a00a9 to 5.10-stable, there is a mistake change. The head page instead of tail page should be passed to try_to_unmap(), otherwise unmap will failed as follows.
Memory failure: 0x121c10: failed to unmap page (mapcount=1) Memory failure: 0x121c10: recovery action for unmapping failed page: Ignored
Fixes: 70168fdc743b ("mm/memory-failure: check the mapcount of the precise page") Signed-off-by: Liu Shixin liushixin2@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memory-failure.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1075,7 +1075,7 @@ static bool hwpoison_user_mappings(struc unmap_success = false; } } else { - unmap_success = try_to_unmap(p, ttu); + unmap_success = try_to_unmap(hpage, ttu); } } if (!unmap_success)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zi Yan ziy@nvidia.com
The tail pages in a THP can have swap entry information stored in their private field. When migrating to a new page, all tail pages of the new page need to update ->private to avoid future data corruption.
This fix is stable-only, since after commit 07e09c483cbe ("mm/huge_memory: work on folio->swap instead of page->private when splitting folio"), subpages of a swapcached THP no longer requires the maintenance.
Adding THPs to the swapcache was introduced in commit 38d8b4e6bdc87 ("mm, THP, swap: delay splitting THP during swap out"), where each subpage of a THP added to the swapcache had its own swapcache entry and required the ->private field to point to the correct swapcache entry. Later, when THP migration functionality was implemented in commit 616b8371539a6 ("mm: thp: enable thp migration in generic path"), it initially did not handle the subpages of swapcached THPs, failing to update their ->private fields or replace the subpage pointers in the swapcache. Subsequently, commit e71769ae5260 ("mm: enable thp migration for shmem thp") addressed the swapcache update aspect. This patch fixes the update of subpage ->private fields.
Closes: https://lore.kernel.org/linux-mm/1707814102-22682-1-git-send-email-quic_char... Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path") Signed-off-by: Zi Yan ziy@nvidia.com Acked-by: David Hildenbrand david@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/migrate.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/mm/migrate.c +++ b/mm/migrate.c @@ -447,8 +447,12 @@ int migrate_page_move_mapping(struct add if (PageSwapBacked(page)) { __SetPageSwapBacked(newpage); if (PageSwapCache(page)) { + int i; + SetPageSwapCache(newpage); - set_page_private(newpage, page_private(page)); + for (i = 0; i < (1 << compound_order(page)); i++) + set_page_private(newpage + i, + page_private(page + i)); } } else { VM_BUG_ON_PAGE(PageSwapCache(page), page);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Sperbeck jsperbeck@google.com
commit 4624b346cf67400ef46a31771011fb798dd2f999 upstream.
If initrd data is larger than 2Gb, we'll eventually fail to write to the /initrd.image file when we hit that limit, unless O_LARGEFILE is set.
Link: https://lkml.kernel.org/r/20240317221522.896040-1-jsperbeck@google.com Signed-off-by: John Sperbeck jsperbeck@google.com Cc: Jens Axboe axboe@kernel.dk Cc: Nick Desaulniers ndesaulniers@google.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- init/initramfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/init/initramfs.c +++ b/init/initramfs.c @@ -589,7 +589,7 @@ static void __init populate_initrd_image
printk(KERN_INFO "rootfs image is not initramfs (%s); looks like an initrd\n", err); - file = filp_open("/initrd.image", O_WRONLY | O_CREAT, 0700); + file = filp_open("/initrd.image", O_WRONLY|O_CREAT|O_LARGEFILE, 0700); if (IS_ERR(file)) return;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Felix Fietkau nbd@nbd.name
commit 4f2bdb3c5e3189297e156b3ff84b140423d64685 upstream.
When moving a station out of a VLAN and deleting the VLAN afterwards, the fast_rx entry still holds a pointer to the VLAN's netdev, which can cause use-after-free bugs. Fix this by immediately calling ieee80211_check_fast_rx after the VLAN change.
Cc: stable@vger.kernel.org Reported-by: ranygh@riseup.net Signed-off-by: Felix Fietkau nbd@nbd.name Link: https://msgid.link/20240316074336.40442-1-nbd@nbd.name Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mac80211/cfg.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/net/mac80211/cfg.c +++ b/net/mac80211/cfg.c @@ -1811,15 +1811,14 @@ static int ieee80211_change_station(stru }
if (sta->sdata->vif.type == NL80211_IFTYPE_AP_VLAN && - sta->sdata->u.vlan.sta) { - ieee80211_clear_fast_rx(sta); + sta->sdata->u.vlan.sta) RCU_INIT_POINTER(sta->sdata->u.vlan.sta, NULL); - }
if (test_sta_flag(sta, WLAN_STA_AUTHORIZED)) ieee80211_vif_dec_num_mcast(sta->sdata);
sta->sdata = vlansdata; + ieee80211_check_fast_rx(sta); ieee80211_check_fast_xmit(sta);
if (test_sta_flag(sta, WLAN_STA_AUTHORIZED)) {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Max Filippov jcmvbkbc@gmail.com
commit 2aea94ac14d1e0a8ae9e34febebe208213ba72f7 upstream.
In NOMMU kernel the value of linux_binprm::p is the offset inside the temporary program arguments array maintained in separate pages in the linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p thus must be adjusted when that array is copied to the user stack. Without that adjustment the value passed by the NOMMU kernel to the ELF program in the AT_EXECFN entry of the aux array doesn't make any sense and it may break programs that try to access memory pointed to by that entry.
Adjust linux_binprm::exec before the successful return from the transfer_args_to_stack().
Cc: stable@vger.kernel.org Fixes: b6a2fea39318 ("mm: variable length argument support") Fixes: 5edc2a5123a7 ("binfmt_elf_fdpic: wire up AT_EXECFD, AT_EXECFN, AT_SECURE") Signed-off-by: Max Filippov jcmvbkbc@gmail.com Link: https://lore.kernel.org/r/20240320182607.1472887-1-jcmvbkbc@gmail.com Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/exec.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/exec.c +++ b/fs/exec.c @@ -888,6 +888,7 @@ int transfer_args_to_stack(struct linux_ goto out; }
+ bprm->exec += *sp_location - MAX_ARG_PAGES * PAGE_SIZE; *sp_location = sp;
out:
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Chancellor nathan@kernel.org
commit 549aa9678a0b3981d4821bf244579d9937650562 upstream.
After the linked LLVM change, the build fails with CONFIG_LD_ORPHAN_WARN_LEVEL="error", which happens with allmodconfig:
ld.lld: error: vmlinux.a(init/main.o):(.hexagon.attributes) is being placed in '.hexagon.attributes'
Handle the attributes section in a similar manner as arm and riscv by adding it after the primary ELF_DETAILS grouping in vmlinux.lds.S, which fixes the error.
Link: https://lkml.kernel.org/r/20240319-hexagon-handle-attributes-section-vmlinux... Fixes: 113616ec5b64 ("hexagon: select ARCH_WANT_LD_ORPHAN_WARN") Link: https://github.com/llvm/llvm-project/commit/31f4b329c8234fab9afa59494d7f8bda... Signed-off-by: Nathan Chancellor nathan@kernel.org Reviewed-by: Brian Cain bcain@quicinc.com Cc: Bill Wendling morbo@google.com Cc: Justin Stitt justinstitt@google.com Cc: Nick Desaulniers ndesaulniers@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/hexagon/kernel/vmlinux.lds.S | 1 + 1 file changed, 1 insertion(+)
--- a/arch/hexagon/kernel/vmlinux.lds.S +++ b/arch/hexagon/kernel/vmlinux.lds.S @@ -64,6 +64,7 @@ SECTIONS STABS_DEBUG DWARF_DEBUG ELF_DETAILS + .hexagon.attributes 0 : { *(.hexagon.attributes) }
DISCARDS }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikko Rapeli mikko.rapeli@linaro.org
commit 0cdfe5b0bf295c0dee97436a8ed13336933a0211 upstream.
Commit 4d0c8d0aef63 ("mmc: core: Use mrq.sbc in close-ended ffu") adds flags uint to struct mmc_blk_ioc_data, but it does not get initialized for RPMB ioctls which now fails.
Let's fix this by always initializing the struct and flags to zero.
Fixes: 4d0c8d0aef63 ("mmc: core: Use mrq.sbc in close-ended ffu") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218587 Link: https://lore.kernel.org/all/20231129092535.3278-1-avri.altman@wdc.com/ Cc: stable@vger.kernel.org Signed-off-by: Mikko Rapeli mikko.rapeli@linaro.org Reviewed-by: Avri Altman avri.altman@wdc.com Acked-by: Adrian Hunter adrian.hunter@intel.com Tested-by: Francesco Dolcini francesco.dolcini@toradex.com Link: https://lore.kernel.org/r/20240313133744.2405325-1-mikko.rapeli@linaro.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/block.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mmc/core/block.c +++ b/drivers/mmc/core/block.c @@ -359,7 +359,7 @@ static struct mmc_blk_ioc_data *mmc_blk_ struct mmc_blk_ioc_data *idata; int err;
- idata = kmalloc(sizeof(*idata), GFP_KERNEL); + idata = kzalloc(sizeof(*idata), GFP_KERNEL); if (!idata) { err = -ENOMEM; goto out;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikko Rapeli mikko.rapeli@linaro.org
commit cf55a7acd1ed38afe43bba1c8a0935b51d1dc014 upstream.
Commit 4d0c8d0aef63 ("mmc: core: Use mrq.sbc in close-ended ffu") assigns prev_idata = idatas[i - 1], but doesn't check that the iterator i is greater than zero. Let's fix this by adding a check.
Fixes: 4d0c8d0aef63 ("mmc: core: Use mrq.sbc in close-ended ffu") Link: https://lore.kernel.org/all/20231129092535.3278-1-avri.altman@wdc.com/ Cc: stable@vger.kernel.org Signed-off-by: Mikko Rapeli mikko.rapeli@linaro.org Reviewed-by: Avri Altman avri.altman@wdc.com Tested-by: Francesco Dolcini francesco.dolcini@toradex.com Link: https://lore.kernel.org/r/20240313133744.2405325-2-mikko.rapeli@linaro.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/block.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mmc/core/block.c +++ b/drivers/mmc/core/block.c @@ -468,7 +468,7 @@ static int __mmc_blk_ioctl_cmd(struct mm if (idata->flags & MMC_BLK_IOC_DROP) return 0;
- if (idata->flags & MMC_BLK_IOC_SBC) + if (idata->flags & MMC_BLK_IOC_SBC && i > 0) prev_idata = idatas[i - 1];
/*
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Claus Hansen Ries chr@terma.com
commit 3a38a829c8bc27d78552c28e582eb1d885d07d11 upstream.
The function platform_get_resource was replaced with devm_platform_ioremap_resource_byname and is called using 0 as name.
This eventually ends up in platform_get_resource_byname in the call stack, where it causes a null pointer in strcmp.
if (type == resource_type(r) && !strcmp(r->name, name))
It should have been replaced with devm_platform_ioremap_resource.
Fixes: bd69058f50d5 ("net: ll_temac: Use devm_platform_ioremap_resource_byname()") Signed-off-by: Claus Hansen Ries chr@terma.com Cc: stable@vger.kernel.org Reviewed-by: Simon Horman horms@kernel.org Link: https://lore.kernel.org/r/cca18f9c630a41c18487729770b492bb@terma.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/xilinx/ll_temac_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/xilinx/ll_temac_main.c +++ b/drivers/net/ethernet/xilinx/ll_temac_main.c @@ -1427,7 +1427,7 @@ static int temac_probe(struct platform_d }
/* map device registers */ - lp->regs = devm_platform_ioremap_resource_byname(pdev, 0); + lp->regs = devm_platform_ioremap_resource(pdev, 0); if (IS_ERR(lp->regs)) { dev_err(&pdev->dev, "could not map TEMAC registers\n"); return -ENOMEM;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oliver Neukum oneukum@suse.com
commit 339f83612f3a569b194680768b22bf113c26a29d upstream.
wdm_read() cannot race with itself. However, in service_outstanding_interrupt() it can race with the workqueue, which can be triggered by error handling.
Hence we need to make sure that the WDM_RESPONDING flag is not just only set but tested.
Fixes: afba937e540c9 ("USB: CDC WDM driver") Cc: stable stable@kernel.org Signed-off-by: Oliver Neukum oneukum@suse.com Link: https://lore.kernel.org/r/20240314115132.3907-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/class/cdc-wdm.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/usb/class/cdc-wdm.c +++ b/drivers/usb/class/cdc-wdm.c @@ -471,6 +471,7 @@ out_free_mem: static int service_outstanding_interrupt(struct wdm_device *desc) { int rv = 0; + int used;
/* submit read urb only if the device is waiting for it */ if (!desc->resp_count || !--desc->resp_count) @@ -485,7 +486,10 @@ static int service_outstanding_interrupt goto out; }
- set_bit(WDM_RESPONDING, &desc->flags); + used = test_and_set_bit(WDM_RESPONDING, &desc->flags); + if (used) + goto out; + spin_unlock_irq(&desc->iuspin); rv = usb_submit_urb(desc->response, GFP_KERNEL); spin_lock_irq(&desc->iuspin);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Duoming Zhou duoming@zju.edu.cn
commit 051e0840ffa8ab25554d6b14b62c9ab9e4901457 upstream.
The dreamcastcard->timer could schedule the spu_dma_work and the spu_dma_work could also arm the dreamcastcard->timer.
When the snd_pcm_substream is closing, the aica_channel will be deallocated. But it could still be dereferenced in the worker thread. The reason is that del_timer() will return directly regardless of whether the timer handler is running or not and the worker could be rescheduled in the timer handler. As a result, the UAF bug will happen. The racy situation is shown below:
(Thread 1) | (Thread 2) snd_aicapcm_pcm_close() | ... | run_spu_dma() //worker | mod_timer() flush_work() | del_timer() | aica_period_elapsed() //timer kfree(dreamcastcard->channel) | schedule_work() | run_spu_dma() //worker ... | dreamcastcard->channel-> //USE
In order to mitigate this bug and other possible corner cases, call mod_timer() conditionally in run_spu_dma(), then implement PCM sync_stop op to cancel both the timer and worker. The sync_stop op will be called from PCM core appropriately when needed.
Fixes: 198de43d758c ("[ALSA] Add ALSA support for the SEGA Dreamcast PCM device") Suggested-by: Takashi Iwai tiwai@suse.de Signed-off-by: Duoming Zhou duoming@zju.edu.cn Message-ID: 20240326094238.95442-1-duoming@zju.edu.cn Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/sh/aica.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-)
--- a/sound/sh/aica.c +++ b/sound/sh/aica.c @@ -279,7 +279,8 @@ static void run_spu_dma(struct work_stru dreamcastcard->clicks++; if (unlikely(dreamcastcard->clicks >= AICA_PERIOD_NUMBER)) dreamcastcard->clicks %= AICA_PERIOD_NUMBER; - mod_timer(&dreamcastcard->timer, jiffies + 1); + if (snd_pcm_running(dreamcastcard->substream)) + mod_timer(&dreamcastcard->timer, jiffies + 1); } }
@@ -291,6 +292,8 @@ static void aica_period_elapsed(struct t /*timer function - so cannot sleep */ int play_period; struct snd_pcm_runtime *runtime; + if (!snd_pcm_running(substream)) + return; runtime = substream->runtime; dreamcastcard = substream->pcm->private_data; /* Have we played out an additional period? */ @@ -351,12 +354,19 @@ static int snd_aicapcm_pcm_open(struct s return 0; }
+static int snd_aicapcm_pcm_sync_stop(struct snd_pcm_substream *substream) +{ + struct snd_card_aica *dreamcastcard = substream->pcm->private_data; + + del_timer_sync(&dreamcastcard->timer); + cancel_work_sync(&dreamcastcard->spu_dma_work); + return 0; +} + static int snd_aicapcm_pcm_close(struct snd_pcm_substream *substream) { struct snd_card_aica *dreamcastcard = substream->pcm->private_data; - flush_work(&(dreamcastcard->spu_dma_work)); - del_timer(&dreamcastcard->timer); dreamcastcard->substream = NULL; kfree(dreamcastcard->channel); spu_disable(); @@ -402,6 +412,7 @@ static const struct snd_pcm_ops snd_aica .prepare = snd_aicapcm_pcm_prepare, .trigger = snd_aicapcm_pcm_trigger, .pointer = snd_aicapcm_pcm_pointer, + .sync_stop = snd_aicapcm_pcm_sync_stop, };
/* TO DO: set up to handle more than one pcm instance */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guilherme G. Piccoli gpiccoli@igalia.com
commit f23a4d6e07570826fe95023ca1aa96a011fa9f84 upstream.
Commit fc663711b944 ("scsi: core: Remove the /proc/scsi/${proc_name} directory earlier") fixed a bug related to modules loading/unloading, by adding a call to scsi_proc_hostdir_rm() on scsi_remove_host(). But that led to a potential duplicate call to the hostdir_rm() routine, since it's also called from scsi_host_dev_release(). That triggered a regression report, which was then fixed by commit be03df3d4bfe ("scsi: core: Fix a procfs host directory removal regression"). The fix just dropped the hostdir_rm() call from dev_release().
But it happens that this proc directory is created on scsi_host_alloc(), and that function "pairs" with scsi_host_dev_release(), while scsi_remove_host() pairs with scsi_add_host(). In other words, it seems the reason for removing the proc directory on dev_release() was meant to cover cases in which a SCSI host structure was allocated, but the call to scsi_add_host() didn't happen. And that pattern happens to exist in some error paths, for example.
Syzkaller causes that by using USB raw gadget device, error'ing on usb-storage driver, at usb_stor_probe2(). By checking that path, we can see that the BadDevice label leads to a scsi_host_put() after a SCSI host allocation, but there's no call to scsi_add_host() in such path. That leads to messages like this in dmesg (and a leak of the SCSI host proc structure):
usb-storage 4-1:87.51: USB Mass Storage device detected proc_dir_entry 'scsi/usb-storage' already registered WARNING: CPU: 1 PID: 3519 at fs/proc/generic.c:377 proc_register+0x347/0x4e0 fs/proc/generic.c:376
The proper fix seems to still call scsi_proc_hostdir_rm() on dev_release(), but guard that with the state check for SHOST_CREATED; there is even a comment in scsi_host_dev_release() detailing that: such conditional is meant for cases where the SCSI host was allocated but there was no calls to {add,remove}_host(), like the usb-storage case.
This is what we propose here and with that, the error path of usb-storage does not trigger the warning anymore.
Reported-by: syzbot+c645abf505ed21f931b5@syzkaller.appspotmail.com Fixes: be03df3d4bfe ("scsi: core: Fix a procfs host directory removal regression") Cc: stable@vger.kernel.org Cc: Bart Van Assche bvanassche@acm.org Cc: John Garry john.g.garry@oracle.com Cc: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Signed-off-by: Guilherme G. Piccoli gpiccoli@igalia.com Link: https://lore.kernel.org/r/20240313113006.2834799-1-gpiccoli@igalia.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/hosts.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -334,12 +334,13 @@ static void scsi_host_dev_release(struct
if (shost->shost_state == SHOST_CREATED) { /* - * Free the shost_dev device name here if scsi_host_alloc() - * and scsi_host_put() have been called but neither + * Free the shost_dev device name and remove the proc host dir + * here if scsi_host_{alloc,put}() have been called but neither * scsi_host_add() nor scsi_host_remove() has been called. * This avoids that the memory allocated for the shost_dev - * name is leaked. + * name as well as the proc dir structure are leaked. */ + scsi_proc_hostdir_rm(shost->hostt); kfree(dev_name(&shost->shost_dev)); }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
commit ef25725b7f8aaffd7756974d3246ec44fae0a5cf upstream.
gcc-14 warns about this strncpy() that results in a non-terminated string for an overflow:
In file included from include/linux/string.h:369, from drivers/staging/vc04_services/vchiq-mmal/mmal-vchiq.c:20: In function 'strncpy', inlined from 'create_component' at drivers/staging/vc04_services/vchiq-mmal/mmal-vchiq.c:940:2: include/linux/fortify-string.h:108:33: error: '__builtin_strncpy' specified bound 128 equals destination size [-Werror=stringop-truncation]
Change it to strscpy_pad(), which produces a properly terminated and zero-padded string.
Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Dan Carpenter dan.carpenter@linaro.org Link: https://lore.kernel.org/r/20240313163712.224585-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/vc04_services/vchiq-mmal/mmal-vchiq.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/staging/vc04_services/vchiq-mmal/mmal-vchiq.c +++ b/drivers/staging/vc04_services/vchiq-mmal/mmal-vchiq.c @@ -940,8 +940,8 @@ static int create_component(struct vchiq /* build component create message */ m.h.type = MMAL_MSG_TYPE_COMPONENT_CREATE; m.u.component_create.client_component = component->client_component; - strncpy(m.u.component_create.name, name, - sizeof(m.u.component_create.name)); + strscpy_pad(m.u.component_create.name, name, + sizeof(m.u.component_create.name));
ret = send_synchronous_mmal_msg(instance, &m, sizeof(m.u.component_create),
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
commit f37e76abd614b68987abc8e5c22d986013349771 upstream.
The m.u.component_create.pid field is for debugging and in the mainline kernel it's not used anything. However, it still needs to be set to something to prevent disclosing uninitialized stack data. Set it to zero.
Fixes: 7b3ad5abf027 ("staging: Import the BCM2835 MMAL-based V4L2 camera driver.") Cc: stable stable@kernel.org Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Link: https://lore.kernel.org/r/2d972847-9ebd-481b-b6f9-af390f5aabd3@moroto.mounta... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/vc04_services/vchiq-mmal/mmal-vchiq.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/staging/vc04_services/vchiq-mmal/mmal-vchiq.c +++ b/drivers/staging/vc04_services/vchiq-mmal/mmal-vchiq.c @@ -942,6 +942,7 @@ static int create_component(struct vchiq m.u.component_create.client_component = component->client_component; strscpy_pad(m.u.component_create.name, name, sizeof(m.u.component_create.name)); + m.u.component_create.pid = 0;
ret = send_synchronous_mmal_msg(instance, &m, sizeof(m.u.component_create),
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alan Stern stern@rowland.harvard.edu
commit ee113b860aa169e9a4d2c167c95d0f1961c6e1b8 upstream.
Create hub_get() and hub_put() routines to encapsulate the kref_get() and kref_put() calls in hub.c. The new routines will be used by the next patch in this series.
Signed-off-by: Alan Stern stern@rowland.harvard.edu Link: https://lore.kernel.org/r/604da420-ae8a-4a9e-91a4-2d511ff404fb@rowland.harva... Cc: stable stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/core/hub.c | 23 ++++++++++++++++------- drivers/usb/core/hub.h | 2 ++ 2 files changed, 18 insertions(+), 7 deletions(-)
--- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -116,7 +116,6 @@ EXPORT_SYMBOL_GPL(ehci_cf_port_reset_rws #define HUB_DEBOUNCE_STEP 25 #define HUB_DEBOUNCE_STABLE 100
-static void hub_release(struct kref *kref); static int usb_reset_and_verify_device(struct usb_device *udev); static int hub_port_disable(struct usb_hub *hub, int port1, int set_state); static bool hub_port_warm_reset_required(struct usb_hub *hub, int port1, @@ -678,14 +677,14 @@ static void kick_hub_wq(struct usb_hub * */ intf = to_usb_interface(hub->intfdev); usb_autopm_get_interface_no_resume(intf); - kref_get(&hub->kref); + hub_get(hub);
if (queue_work(hub_wq, &hub->events)) return;
/* the work has already been scheduled */ usb_autopm_put_interface_async(intf); - kref_put(&hub->kref, hub_release); + hub_put(hub); }
void usb_kick_hub_wq(struct usb_device *hdev) @@ -1053,7 +1052,7 @@ static void hub_activate(struct usb_hub goto init2; goto init3; } - kref_get(&hub->kref); + hub_get(hub);
/* The superspeed hub except for root hub has to use Hub Depth * value as an offset into the route string to locate the bits @@ -1301,7 +1300,7 @@ static void hub_activate(struct usb_hub device_unlock(&hdev->dev); }
- kref_put(&hub->kref, hub_release); + hub_put(hub); }
/* Implement the continuations for the delays above */ @@ -1717,6 +1716,16 @@ static void hub_release(struct kref *kre kfree(hub); }
+void hub_get(struct usb_hub *hub) +{ + kref_get(&hub->kref); +} + +void hub_put(struct usb_hub *hub) +{ + kref_put(&hub->kref, hub_release); +} + static unsigned highspeed_hubs;
static void hub_disconnect(struct usb_interface *intf) @@ -1763,7 +1772,7 @@ static void hub_disconnect(struct usb_in if (hub->quirk_disable_autosuspend) usb_autopm_put_interface(intf);
- kref_put(&hub->kref, hub_release); + hub_put(hub); }
static bool hub_descriptor_is_sane(struct usb_host_interface *desc) @@ -5857,7 +5866,7 @@ out_hdev_lock:
/* Balance the stuff in kick_hub_wq() and allow autosuspend */ usb_autopm_put_interface(intf); - kref_put(&hub->kref, hub_release); + hub_put(hub);
kcov_remote_stop(); } --- a/drivers/usb/core/hub.h +++ b/drivers/usb/core/hub.h @@ -117,6 +117,8 @@ extern void usb_hub_remove_port_device(s extern int usb_hub_set_port_power(struct usb_device *hdev, struct usb_hub *hub, int port1, bool set); extern struct usb_hub *usb_hub_to_struct_hub(struct usb_device *hdev); +extern void hub_get(struct usb_hub *hub); +extern void hub_put(struct usb_hub *hub); extern int hub_port_debounce(struct usb_hub *hub, int port1, bool must_be_connected); extern int usb_clear_port_feature(struct usb_device *hdev,
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Minas Harutyunyan Minas.Harutyunyan@synopsys.com
commit bae2bc73a59c200db53b6c15fb26bb758e2c6108 upstream.
Starting from core v4.30a changed order of programming GPWRDN_PMUACTV to 0 in case of exit from hibernation on remote wakeup signaling from device.
Fixes: c5c403dc4336 ("usb: dwc2: Add host/device hibernation functions") CC: stable@vger.kernel.org Signed-off-by: Minas Harutyunyan Minas.Harutyunyan@synopsys.com Link: https://lore.kernel.org/r/99385ec55ce73445b6fbd0f471c9bd40eb1c9b9e.170893979... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc2/core.h | 1 + drivers/usb/dwc2/hcd.c | 17 +++++++++++++---- 2 files changed, 14 insertions(+), 4 deletions(-)
--- a/drivers/usb/dwc2/core.h +++ b/drivers/usb/dwc2/core.h @@ -1097,6 +1097,7 @@ struct dwc2_hsotg { bool needs_byte_swap;
/* DWC OTG HW Release versions */ +#define DWC2_CORE_REV_4_30a 0x4f54430a #define DWC2_CORE_REV_2_71a 0x4f54271a #define DWC2_CORE_REV_2_72a 0x4f54272a #define DWC2_CORE_REV_2_80a 0x4f54280a --- a/drivers/usb/dwc2/hcd.c +++ b/drivers/usb/dwc2/hcd.c @@ -5523,10 +5523,12 @@ int dwc2_host_exit_hibernation(struct dw dwc2_writel(hsotg, hr->hcfg, HCFG);
/* De-assert Wakeup Logic */ - gpwrdn = dwc2_readl(hsotg, GPWRDN); - gpwrdn &= ~GPWRDN_PMUACTV; - dwc2_writel(hsotg, gpwrdn, GPWRDN); - udelay(10); + if (!(rem_wakeup && hsotg->hw_params.snpsid >= DWC2_CORE_REV_4_30a)) { + gpwrdn = dwc2_readl(hsotg, GPWRDN); + gpwrdn &= ~GPWRDN_PMUACTV; + dwc2_writel(hsotg, gpwrdn, GPWRDN); + udelay(10); + }
hprt0 = hr->hprt0; hprt0 |= HPRT0_PWR; @@ -5551,6 +5553,13 @@ int dwc2_host_exit_hibernation(struct dw hprt0 |= HPRT0_RES; dwc2_writel(hsotg, hprt0, HPRT0);
+ /* De-assert Wakeup Logic */ + if ((rem_wakeup && hsotg->hw_params.snpsid >= DWC2_CORE_REV_4_30a)) { + gpwrdn = dwc2_readl(hsotg, GPWRDN); + gpwrdn &= ~GPWRDN_PMUACTV; + dwc2_writel(hsotg, gpwrdn, GPWRDN); + udelay(10); + } /* Wait for Resume time and then program HPRT again */ mdelay(100); hprt0 &= ~HPRT0_RES;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Minas Harutyunyan Minas.Harutyunyan@synopsys.com
commit 3c7b9856a82227db01a20171d2e24c7ce305d59b upstream.
Added to backup/restore registers HFLBADDR, HCCHARi, HCSPLTi, HCTSIZi, HCDMAi and HCDMABi.
Fixes: 58e52ff6a6c3 ("usb: dwc2: Move register save and restore functions") Fixes: d17ee77b3044 ("usb: dwc2: add controller hibernation support") CC: stable@vger.kernel.org Signed-off-by: Minas Harutyunyan Minas.Harutyunyan@synopsys.com Link: https://lore.kernel.org/r/c2d10ee6098b9b009a8e94191e046004747d3bdd.170894544... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc2/core.h | 12 ++++++++++++ drivers/usb/dwc2/hcd.c | 18 ++++++++++++++++-- 2 files changed, 28 insertions(+), 2 deletions(-)
--- a/drivers/usb/dwc2/core.h +++ b/drivers/usb/dwc2/core.h @@ -748,8 +748,14 @@ struct dwc2_dregs_backup { * struct dwc2_hregs_backup - Holds host registers state before * entering partial power down * @hcfg: Backup of HCFG register + * @hflbaddr: Backup of HFLBADDR register * @haintmsk: Backup of HAINTMSK register + * @hcchar: Backup of HCCHAR register + * @hcsplt: Backup of HCSPLT register * @hcintmsk: Backup of HCINTMSK register + * @hctsiz: Backup of HCTSIZ register + * @hdma: Backup of HCDMA register + * @hcdmab: Backup of HCDMAB register * @hprt0: Backup of HPTR0 register * @hfir: Backup of HFIR register * @hptxfsiz: Backup of HPTXFSIZ register @@ -757,8 +763,14 @@ struct dwc2_dregs_backup { */ struct dwc2_hregs_backup { u32 hcfg; + u32 hflbaddr; u32 haintmsk; + u32 hcchar[MAX_EPS_CHANNELS]; + u32 hcsplt[MAX_EPS_CHANNELS]; u32 hcintmsk[MAX_EPS_CHANNELS]; + u32 hctsiz[MAX_EPS_CHANNELS]; + u32 hcidma[MAX_EPS_CHANNELS]; + u32 hcidmab[MAX_EPS_CHANNELS]; u32 hprt0; u32 hfir; u32 hptxfsiz; --- a/drivers/usb/dwc2/hcd.c +++ b/drivers/usb/dwc2/hcd.c @@ -5319,9 +5319,16 @@ int dwc2_backup_host_registers(struct dw /* Backup Host regs */ hr = &hsotg->hr_backup; hr->hcfg = dwc2_readl(hsotg, HCFG); + hr->hflbaddr = dwc2_readl(hsotg, HFLBADDR); hr->haintmsk = dwc2_readl(hsotg, HAINTMSK); - for (i = 0; i < hsotg->params.host_channels; ++i) + for (i = 0; i < hsotg->params.host_channels; ++i) { + hr->hcchar[i] = dwc2_readl(hsotg, HCCHAR(i)); + hr->hcsplt[i] = dwc2_readl(hsotg, HCSPLT(i)); hr->hcintmsk[i] = dwc2_readl(hsotg, HCINTMSK(i)); + hr->hctsiz[i] = dwc2_readl(hsotg, HCTSIZ(i)); + hr->hcidma[i] = dwc2_readl(hsotg, HCDMA(i)); + hr->hcidmab[i] = dwc2_readl(hsotg, HCDMAB(i)); + }
hr->hprt0 = dwc2_read_hprt0(hsotg); hr->hfir = dwc2_readl(hsotg, HFIR); @@ -5355,10 +5362,17 @@ int dwc2_restore_host_registers(struct d hr->valid = false;
dwc2_writel(hsotg, hr->hcfg, HCFG); + dwc2_writel(hsotg, hr->hflbaddr, HFLBADDR); dwc2_writel(hsotg, hr->haintmsk, HAINTMSK);
- for (i = 0; i < hsotg->params.host_channels; ++i) + for (i = 0; i < hsotg->params.host_channels; ++i) { + dwc2_writel(hsotg, hr->hcchar[i], HCCHAR(i)); + dwc2_writel(hsotg, hr->hcsplt[i], HCSPLT(i)); dwc2_writel(hsotg, hr->hcintmsk[i], HCINTMSK(i)); + dwc2_writel(hsotg, hr->hctsiz[i], HCTSIZ(i)); + dwc2_writel(hsotg, hr->hcidma[i], HCDMA(i)); + dwc2_writel(hsotg, hr->hcidmab[i], HCDMAB(i)); + }
dwc2_writel(hsotg, hr->hprt0, HPRT0); dwc2_writel(hsotg, hr->hfir, HFIR);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Minas Harutyunyan Minas.Harutyunyan@synopsys.com
commit b258e42688501cadb1a6dd658d6f015df9f32d8f upstream.
Fixed ISOC completion flow in DDMA mode. Added isoc descriptor actual length value and update urb's start_frame value. Fixed initialization of ISOC DMA descriptors flow.
Fixes: 56f5b1cff22a ("staging: Core files for the DWC2 driver") Fixes: 20f2eb9c4cf8 ("staging: dwc2: add microframe scheduler from downstream Pi kernel") Fixes: c17b337c1ea4 ("usb: dwc2: host: program descriptor for next frame") Fixes: dc4c76e7b22c ("staging: HCD descriptor DMA support for the DWC2 driver") Fixes: 762d3a1a9cd7 ("usb: dwc2: host: process all completed urbs") CC: stable@vger.kernel.org Signed-off-by: Minas Harutyunyan Minas.Harutyunyan@synopsys.com Link: https://lore.kernel.org/r/a8b1e1711cc6cabfb45d92ede12e35445c66f06c.170894469... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc2/hcd.c | 12 ++++++++++-- drivers/usb/dwc2/hcd_ddma.c | 17 +++++++++++------ drivers/usb/dwc2/hw.h | 2 +- 3 files changed, 22 insertions(+), 9 deletions(-)
--- a/drivers/usb/dwc2/hcd.c +++ b/drivers/usb/dwc2/hcd.c @@ -2736,8 +2736,11 @@ enum dwc2_transaction_type dwc2_hcd_sele hsotg->available_host_channels--; } qh = list_entry(qh_ptr, struct dwc2_qh, qh_list_entry); - if (dwc2_assign_and_init_hc(hsotg, qh)) + if (dwc2_assign_and_init_hc(hsotg, qh)) { + if (hsotg->params.uframe_sched) + hsotg->available_host_channels++; break; + }
/* * Move the QH from the periodic ready schedule to the @@ -2770,8 +2773,11 @@ enum dwc2_transaction_type dwc2_hcd_sele hsotg->available_host_channels--; }
- if (dwc2_assign_and_init_hc(hsotg, qh)) + if (dwc2_assign_and_init_hc(hsotg, qh)) { + if (hsotg->params.uframe_sched) + hsotg->available_host_channels++; break; + }
/* * Move the QH from the non-periodic inactive schedule to the @@ -4125,6 +4131,8 @@ void dwc2_host_complete(struct dwc2_hsot urb->actual_length);
if (usb_pipetype(urb->pipe) == PIPE_ISOCHRONOUS) { + if (!hsotg->params.dma_desc_enable) + urb->start_frame = qtd->qh->start_active_frame; urb->error_count = dwc2_hcd_urb_get_error_count(qtd->urb); for (i = 0; i < urb->number_of_packets; ++i) { urb->iso_frame_desc[i].actual_length = --- a/drivers/usb/dwc2/hcd_ddma.c +++ b/drivers/usb/dwc2/hcd_ddma.c @@ -589,7 +589,7 @@ static void dwc2_init_isoc_dma_desc(stru idx = qh->td_last; inc = qh->host_interval; hsotg->frame_number = dwc2_hcd_get_frame_number(hsotg); - cur_idx = dwc2_frame_list_idx(hsotg->frame_number); + cur_idx = idx; next_idx = dwc2_desclist_idx_inc(qh->td_last, inc, qh->dev_speed);
/* @@ -896,6 +896,8 @@ static int dwc2_cmpl_host_isoc_dma_desc( { struct dwc2_dma_desc *dma_desc; struct dwc2_hcd_iso_packet_desc *frame_desc; + u16 frame_desc_idx; + struct urb *usb_urb = qtd->urb->priv; u16 remain = 0; int rc = 0;
@@ -908,8 +910,11 @@ static int dwc2_cmpl_host_isoc_dma_desc( DMA_FROM_DEVICE);
dma_desc = &qh->desc_list[idx]; + frame_desc_idx = (idx - qtd->isoc_td_first) & (usb_urb->number_of_packets - 1);
- frame_desc = &qtd->urb->iso_descs[qtd->isoc_frame_index_last]; + frame_desc = &qtd->urb->iso_descs[frame_desc_idx]; + if (idx == qtd->isoc_td_first) + usb_urb->start_frame = dwc2_hcd_get_frame_number(hsotg); dma_desc->buf = (u32)(qtd->urb->dma + frame_desc->offset); if (chan->ep_is_in) remain = (dma_desc->status & HOST_DMA_ISOC_NBYTES_MASK) >> @@ -930,7 +935,7 @@ static int dwc2_cmpl_host_isoc_dma_desc( frame_desc->status = 0; }
- if (++qtd->isoc_frame_index == qtd->urb->packet_count) { + if (++qtd->isoc_frame_index == usb_urb->number_of_packets) { /* * urb->status is not used for isoc transfers here. The * individual frame_desc status are used instead. @@ -1035,11 +1040,11 @@ static void dwc2_complete_isoc_xfer_ddma return; idx = dwc2_desclist_idx_inc(idx, qh->host_interval, chan->speed); - if (!rc) + if (rc == 0) continue;
- if (rc == DWC2_CMPL_DONE) - break; + if (rc == DWC2_CMPL_DONE || rc == DWC2_CMPL_STOP) + goto stop_scan;
/* rc == DWC2_CMPL_STOP */
--- a/drivers/usb/dwc2/hw.h +++ b/drivers/usb/dwc2/hw.h @@ -727,7 +727,7 @@ #define TXSTS_QTOP_TOKEN_MASK (0x3 << 25) #define TXSTS_QTOP_TOKEN_SHIFT 25 #define TXSTS_QTOP_TERMINATE BIT(24) -#define TXSTS_QSPCAVAIL_MASK (0xff << 16) +#define TXSTS_QSPCAVAIL_MASK (0x7f << 16) #define TXSTS_QSPCAVAIL_SHIFT 16 #define TXSTS_FSPCAVAIL_MASK (0xffff << 0) #define TXSTS_FSPCAVAIL_SHIFT 0
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Minas Harutyunyan Minas.Harutyunyan@synopsys.com
commit 5d69a3b54e5a630c90d82a4c2bdce3d53dc78710 upstream.
Added functionality to exit from L1 state by device initiation using remote wakeup signaling, in case when function driver queuing request while core in L1 state.
Fixes: 273d576c4d41 ("usb: dwc2: gadget: Add functionality to exit from LPM L1 state") Fixes: 88b02f2cb1e1 ("usb: dwc2: Add core state checking") CC: stable@vger.kernel.org Signed-off-by: Minas Harutyunyan Minas.Harutyunyan@synopsys.com Link: https://lore.kernel.org/r/b4d9de5382375dddbf7ef6049d9a82066ad87d5d.171016639... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc2/core.h | 1 drivers/usb/dwc2/core_intr.c | 65 ++++++++++++++++++++++++++++--------------- drivers/usb/dwc2/gadget.c | 4 ++ 3 files changed, 48 insertions(+), 22 deletions(-)
--- a/drivers/usb/dwc2/core.h +++ b/drivers/usb/dwc2/core.h @@ -1348,6 +1348,7 @@ int dwc2_backup_global_registers(struct int dwc2_restore_global_registers(struct dwc2_hsotg *hsotg);
void dwc2_enable_acg(struct dwc2_hsotg *hsotg); +void dwc2_wakeup_from_lpm_l1(struct dwc2_hsotg *hsotg, bool remotewakeup);
/* This function should be called on every hardware interrupt. */ irqreturn_t dwc2_handle_common_intr(int irq, void *dev); --- a/drivers/usb/dwc2/core_intr.c +++ b/drivers/usb/dwc2/core_intr.c @@ -344,10 +344,11 @@ static void dwc2_handle_session_req_intr * @hsotg: Programming view of DWC_otg controller * */ -static void dwc2_wakeup_from_lpm_l1(struct dwc2_hsotg *hsotg) +void dwc2_wakeup_from_lpm_l1(struct dwc2_hsotg *hsotg, bool remotewakeup) { u32 glpmcfg; - u32 i = 0; + u32 pcgctl; + u32 dctl;
if (hsotg->lx_state != DWC2_L1) { dev_err(hsotg->dev, "Core isn't in DWC2_L1 state\n"); @@ -356,37 +357,57 @@ static void dwc2_wakeup_from_lpm_l1(stru
glpmcfg = dwc2_readl(hsotg, GLPMCFG); if (dwc2_is_device_mode(hsotg)) { - dev_dbg(hsotg->dev, "Exit from L1 state\n"); + dev_dbg(hsotg->dev, "Exit from L1 state, remotewakeup=%d\n", remotewakeup); glpmcfg &= ~GLPMCFG_ENBLSLPM; - glpmcfg &= ~GLPMCFG_HIRD_THRES_EN; + glpmcfg &= ~GLPMCFG_HIRD_THRES_MASK; dwc2_writel(hsotg, glpmcfg, GLPMCFG);
- do { - glpmcfg = dwc2_readl(hsotg, GLPMCFG); - - if (!(glpmcfg & (GLPMCFG_COREL1RES_MASK | - GLPMCFG_L1RESUMEOK | GLPMCFG_SLPSTS))) - break; + pcgctl = dwc2_readl(hsotg, PCGCTL); + pcgctl &= ~PCGCTL_ENBL_SLEEP_GATING; + dwc2_writel(hsotg, pcgctl, PCGCTL); + + glpmcfg = dwc2_readl(hsotg, GLPMCFG); + if (glpmcfg & GLPMCFG_ENBESL) { + glpmcfg |= GLPMCFG_RSTRSLPSTS; + dwc2_writel(hsotg, glpmcfg, GLPMCFG); + }
- udelay(1); - } while (++i < 200); + if (remotewakeup) { + if (dwc2_hsotg_wait_bit_set(hsotg, GLPMCFG, GLPMCFG_L1RESUMEOK, 1000)) { + dev_warn(hsotg->dev, "%s: timeout GLPMCFG_L1RESUMEOK\n", __func__); + goto fail; + return; + } + + dctl = dwc2_readl(hsotg, DCTL); + dctl |= DCTL_RMTWKUPSIG; + dwc2_writel(hsotg, dctl, DCTL); + + if (dwc2_hsotg_wait_bit_set(hsotg, GINTSTS, GINTSTS_WKUPINT, 1000)) { + dev_warn(hsotg->dev, "%s: timeout GINTSTS_WKUPINT\n", __func__); + goto fail; + return; + } + }
- if (i == 200) { - dev_err(hsotg->dev, "Failed to exit L1 sleep state in 200us.\n"); + glpmcfg = dwc2_readl(hsotg, GLPMCFG); + if (glpmcfg & GLPMCFG_COREL1RES_MASK || glpmcfg & GLPMCFG_SLPSTS || + glpmcfg & GLPMCFG_L1RESUMEOK) { + goto fail; return; } - dwc2_gadget_init_lpm(hsotg); + + /* Inform gadget to exit from L1 */ + call_gadget(hsotg, resume); + /* Change to L0 state */ + hsotg->lx_state = DWC2_L0; + hsotg->bus_suspended = false; +fail: dwc2_gadget_init_lpm(hsotg); } else { /* TODO */ dev_err(hsotg->dev, "Host side LPM is not supported.\n"); return; } - - /* Change to L0 state */ - hsotg->lx_state = DWC2_L0; - - /* Inform gadget to exit from L1 */ - call_gadget(hsotg, resume); }
/* @@ -407,7 +428,7 @@ static void dwc2_handle_wakeup_detected_ dev_dbg(hsotg->dev, "%s lxstate = %d\n", __func__, hsotg->lx_state);
if (hsotg->lx_state == DWC2_L1) { - dwc2_wakeup_from_lpm_l1(hsotg); + dwc2_wakeup_from_lpm_l1(hsotg, false); return; }
--- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -1416,6 +1416,10 @@ static int dwc2_hsotg_ep_queue(struct us ep->name, req, req->length, req->buf, req->no_interrupt, req->zero, req->short_not_ok);
+ if (hs->lx_state == DWC2_L1) { + dwc2_wakeup_from_lpm_l1(hs, true); + } + /* Prevent new request submission when controller is suspended */ if (hs->lx_state != DWC2_L0) { dev_dbg(hs->dev, "%s: submit request only in active state\n",
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: yuan linyu yuanlinyu@hihonor.com
commit 2a587a035214fa1b5ef598aea0b81848c5b72e5e upstream.
It is possible trigger below warning message from mass storage function,
WARNING: CPU: 6 PID: 3839 at drivers/usb/gadget/udc/core.c:294 usb_ep_queue+0x7c/0x104 pc : usb_ep_queue+0x7c/0x104 lr : fsg_main_thread+0x494/0x1b3c
Root cause is mass storage function try to queue request from main thread, but other thread may already disable ep when function disable.
As there is no function failure in the driver, in order to avoid effort to fix warning, change WARN_ON_ONCE() in usb_ep_queue() to pr_debug().
Suggested-by: Alan Stern stern@rowland.harvard.edu Cc: stable@vger.kernel.org Signed-off-by: yuan linyu yuanlinyu@hihonor.com Reviewed-by: Alan Stern stern@rowland.harvard.edu Link: https://lore.kernel.org/r/20240315020144.2715575-1-yuanlinyu@hihonor.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/udc/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/gadget/udc/core.c +++ b/drivers/usb/gadget/udc/core.c @@ -273,7 +273,9 @@ int usb_ep_queue(struct usb_ep *ep, { int ret = 0;
- if (WARN_ON_ONCE(!ep->enabled && ep->address)) { + if (!ep->enabled && ep->address) { + pr_debug("USB gadget: queue request to disabled ep 0x%x (%s)\n", + ep->address, ep->name); ret = -ESHUTDOWN; goto out; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian A. Ehrhardt lk@c--e.de
commit 6b5c85ddeea77d18c4b69e3bda60e9374a20c304 upstream.
If a command completes the OPM must send an ack. This applies to unsupported commands, too.
Send the required ACK for unsupported commands.
Signed-off-by: Christian A. Ehrhardt lk@c--e.de Cc: stable stable@kernel.org Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Tested-by: Neil Armstrong neil.armstrong@linaro.org # on SM8550-QRD Link: https://lore.kernel.org/r/20240320073927.1641788-4-lk@c--e.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/ucsi/ucsi.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -138,8 +138,12 @@ static int ucsi_exec_command(struct ucsi if (!(cci & UCSI_CCI_COMMAND_COMPLETE)) return -EIO;
- if (cci & UCSI_CCI_NOT_SUPPORTED) + if (cci & UCSI_CCI_NOT_SUPPORTED) { + if (ucsi_acknowledge_command(ucsi) < 0) + dev_err(ucsi->dev, + "ACK of unsupported command failed\n"); return -EOPNOTSUPP; + }
if (cci & UCSI_CCI_ERROR) { if (cmd == UCSI_GET_ERROR_STATUS)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian A. Ehrhardt lk@c--e.de
commit 3de4f996a0b5412aa451729008130a488f71563e upstream.
Check the UCSI_CCI_RESET_COMPLETE complete flag before starting another reset. Use a UCSI_SET_NOTIFICATION_ENABLE command to clear the flag if it is set.
Signed-off-by: Christian A. Ehrhardt lk@c--e.de Cc: stable stable@kernel.org Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Tested-by: Neil Armstrong neil.armstrong@linaro.org # on SM8550-QRD Link: https://lore.kernel.org/r/20240320073927.1641788-6-lk@c--e.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/ucsi/ucsi.c | 36 +++++++++++++++++++++++++++++++++++- 1 file changed, 35 insertions(+), 1 deletion(-)
--- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -852,13 +852,47 @@ static int ucsi_reset_connector(struct u
static int ucsi_reset_ppm(struct ucsi *ucsi) { - u64 command = UCSI_PPM_RESET; + u64 command; unsigned long tmo; u32 cci; int ret;
mutex_lock(&ucsi->ppm_lock);
+ ret = ucsi->ops->read(ucsi, UCSI_CCI, &cci, sizeof(cci)); + if (ret < 0) + goto out; + + /* + * If UCSI_CCI_RESET_COMPLETE is already set we must clear + * the flag before we start another reset. Send a + * UCSI_SET_NOTIFICATION_ENABLE command to achieve this. + * Ignore a timeout and try the reset anyway if this fails. + */ + if (cci & UCSI_CCI_RESET_COMPLETE) { + command = UCSI_SET_NOTIFICATION_ENABLE; + ret = ucsi->ops->async_write(ucsi, UCSI_CONTROL, &command, + sizeof(command)); + if (ret < 0) + goto out; + + tmo = jiffies + msecs_to_jiffies(UCSI_TIMEOUT_MS); + do { + ret = ucsi->ops->read(ucsi, UCSI_CCI, + &cci, sizeof(cci)); + if (ret < 0) + goto out; + if (cci & UCSI_CCI_COMMAND_COMPLETE) + break; + if (time_is_before_jiffies(tmo)) + break; + msleep(20); + } while (1); + + WARN_ON(cci & UCSI_CCI_RESET_COMPLETE); + } + + command = UCSI_PPM_RESET; ret = ucsi->ops->async_write(ucsi, UCSI_CONTROL, &command, sizeof(command)); if (ret < 0)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Quinn Tran qutran@marvell.com
commit 76a192e1a566e15365704b9f8fb3b70825f85064 upstream.
Current code combines the allocation of FCE|EFT trace buffers and enables the features all in 1 step.
Split this step into separate steps in preparation for follow-on patch to allow user to have a choice to enable / disable FCE trace feature.
Cc: stable@vger.kernel.org Reported-by: kernel test robot lkp@intel.com Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Link: https://lore.kernel.org/r/20240227164127.36465-4-njavali@marvell.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/qla2xxx/qla_init.c | 102 ++++++++++++++++------------------------ 1 file changed, 41 insertions(+), 61 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -2280,6 +2280,40 @@ exit: return rval; }
+static void qla_enable_fce_trace(scsi_qla_host_t *vha) +{ + int rval; + struct qla_hw_data *ha = vha->hw; + + if (ha->fce) { + ha->flags.fce_enabled = 1; + memset(ha->fce, 0, fce_calc_size(ha->fce_bufs)); + rval = qla2x00_enable_fce_trace(vha, + ha->fce_dma, ha->fce_bufs, ha->fce_mb, &ha->fce_bufs); + + if (rval) { + ql_log(ql_log_warn, vha, 0x8033, + "Unable to reinitialize FCE (%d).\n", rval); + ha->flags.fce_enabled = 0; + } + } +} + +static void qla_enable_eft_trace(scsi_qla_host_t *vha) +{ + int rval; + struct qla_hw_data *ha = vha->hw; + + if (ha->eft) { + memset(ha->eft, 0, EFT_SIZE); + rval = qla2x00_enable_eft_trace(vha, ha->eft_dma, EFT_NUM_BUFFERS); + + if (rval) { + ql_log(ql_log_warn, vha, 0x8034, + "Unable to reinitialize EFT (%d).\n", rval); + } + } +} /* * qla2x00_initialize_adapter * Initialize board. @@ -3230,9 +3264,8 @@ qla24xx_chip_diag(scsi_qla_host_t *vha) }
static void -qla2x00_init_fce_trace(scsi_qla_host_t *vha) +qla2x00_alloc_fce_trace(scsi_qla_host_t *vha) { - int rval; dma_addr_t tc_dma; void *tc; struct qla_hw_data *ha = vha->hw; @@ -3261,27 +3294,17 @@ qla2x00_init_fce_trace(scsi_qla_host_t * return; }
- rval = qla2x00_enable_fce_trace(vha, tc_dma, FCE_NUM_BUFFERS, - ha->fce_mb, &ha->fce_bufs); - if (rval) { - ql_log(ql_log_warn, vha, 0x00bf, - "Unable to initialize FCE (%d).\n", rval); - dma_free_coherent(&ha->pdev->dev, FCE_SIZE, tc, tc_dma); - return; - } - ql_dbg(ql_dbg_init, vha, 0x00c0, "Allocated (%d KB) for FCE...\n", FCE_SIZE / 1024);
- ha->flags.fce_enabled = 1; ha->fce_dma = tc_dma; ha->fce = tc; + ha->fce_bufs = FCE_NUM_BUFFERS; }
static void -qla2x00_init_eft_trace(scsi_qla_host_t *vha) +qla2x00_alloc_eft_trace(scsi_qla_host_t *vha) { - int rval; dma_addr_t tc_dma; void *tc; struct qla_hw_data *ha = vha->hw; @@ -3306,14 +3329,6 @@ qla2x00_init_eft_trace(scsi_qla_host_t * return; }
- rval = qla2x00_enable_eft_trace(vha, tc_dma, EFT_NUM_BUFFERS); - if (rval) { - ql_log(ql_log_warn, vha, 0x00c2, - "Unable to initialize EFT (%d).\n", rval); - dma_free_coherent(&ha->pdev->dev, EFT_SIZE, tc, tc_dma); - return; - } - ql_dbg(ql_dbg_init, vha, 0x00c3, "Allocated (%d KB) EFT ...\n", EFT_SIZE / 1024);
@@ -3321,13 +3336,6 @@ qla2x00_init_eft_trace(scsi_qla_host_t * ha->eft = tc; }
-static void -qla2x00_alloc_offload_mem(scsi_qla_host_t *vha) -{ - qla2x00_init_fce_trace(vha); - qla2x00_init_eft_trace(vha); -} - void qla2x00_alloc_fw_dump(scsi_qla_host_t *vha) { @@ -3382,10 +3390,10 @@ qla2x00_alloc_fw_dump(scsi_qla_host_t *v if (ha->tgt.atio_ring) mq_size += ha->tgt.atio_q_length * sizeof(request_t);
- qla2x00_init_fce_trace(vha); + qla2x00_alloc_fce_trace(vha); if (ha->fce) fce_size = sizeof(struct qla2xxx_fce_chain) + FCE_SIZE; - qla2x00_init_eft_trace(vha); + qla2x00_alloc_eft_trace(vha); if (ha->eft) eft_size = EFT_SIZE; } @@ -3784,7 +3792,6 @@ qla2x00_setup_chip(scsi_qla_host_t *vha) struct qla_hw_data *ha = vha->hw; struct device_reg_2xxx __iomem *reg = &ha->iobase->isp; unsigned long flags; - uint16_t fw_major_version; int done_once = 0;
if (IS_P3P_TYPE(ha)) { @@ -3851,7 +3858,6 @@ execute_fw_with_lr: goto failed;
enable_82xx_npiv: - fw_major_version = ha->fw_major_version; if (IS_P3P_TYPE(ha)) qla82xx_check_md_needed(vha); else @@ -3880,12 +3886,11 @@ enable_82xx_npiv: if (rval != QLA_SUCCESS) goto failed;
- if (!fw_major_version && !(IS_P3P_TYPE(ha))) - qla2x00_alloc_offload_mem(vha); - if (ql2xallocfwdump && !(IS_P3P_TYPE(ha))) qla2x00_alloc_fw_dump(vha);
+ qla_enable_fce_trace(vha); + qla_enable_eft_trace(vha); } else { goto failed; } @@ -7012,7 +7017,6 @@ qla2x00_abort_isp_cleanup(scsi_qla_host_ int qla2x00_abort_isp(scsi_qla_host_t *vha) { - int rval; uint8_t status = 0; struct qla_hw_data *ha = vha->hw; struct scsi_qla_host *vp; @@ -7100,31 +7104,7 @@ qla2x00_abort_isp(scsi_qla_host_t *vha)
if (IS_QLA81XX(ha) || IS_QLA8031(ha)) qla2x00_get_fw_version(vha); - if (ha->fce) { - ha->flags.fce_enabled = 1; - memset(ha->fce, 0, - fce_calc_size(ha->fce_bufs)); - rval = qla2x00_enable_fce_trace(vha, - ha->fce_dma, ha->fce_bufs, ha->fce_mb, - &ha->fce_bufs); - if (rval) { - ql_log(ql_log_warn, vha, 0x8033, - "Unable to reinitialize FCE " - "(%d).\n", rval); - ha->flags.fce_enabled = 0; - } - }
- if (ha->eft) { - memset(ha->eft, 0, EFT_SIZE); - rval = qla2x00_enable_eft_trace(vha, - ha->eft_dma, EFT_NUM_BUFFERS); - if (rval) { - ql_log(ql_log_warn, vha, 0x8034, - "Unable to reinitialize EFT " - "(%d).\n", rval); - } - } } else { /* failed the ISP abort */ vha->flags.online = 1; if (test_bit(ISP_ABORT_RETRY, &vha->dpc_flags)) {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Quinn Tran qutran@marvell.com
commit a27d4d0e7de305def8a5098a614053be208d1aa1 upstream.
System crash due to command failed to flush back to SCSI layer.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 27 PID: 793455 Comm: kworker/u130:6 Kdump: loaded Tainted: G OE --------- - - 4.18.0-372.9.1.el8.x86_64 #1 Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 09/03/2021 Workqueue: nvme-wq nvme_fc_connect_ctrl_work [nvme_fc] RIP: 0010:__wake_up_common+0x4c/0x190 Code: 24 10 4d 85 c9 74 0a 41 f6 01 04 0f 85 9d 00 00 00 48 8b 43 08 48 83 c3 08 4c 8d 48 e8 49 8d 41 18 48 39 c3 0f 84 f0 00 00 00 <49> 8b 41 18 89 54 24 08 31 ed 4c 8d 70 e8 45 8b 29 41 f6 c5 04 75 RSP: 0018:ffff95f3e0cb7cd0 EFLAGS: 00010086 RAX: 0000000000000000 RBX: ffff8b08d3b26328 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8b08d3b26320 RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffffffffffe8 R10: 0000000000000000 R11: ffff95f3e0cb7a60 R12: ffff95f3e0cb7d20 R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8b2fdf6c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000002f1e410002 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: __wake_up_common_lock+0x7c/0xc0 qla_nvme_ls_req+0x355/0x4c0 [qla2xxx] qla2xxx [0000:12:00.1]-f084:3: qlt_free_session_done: se_sess 0000000000000000 / sess ffff8ae1407ca000 from port 21:32:00:02:ac:07:ee:b8 loop_id 0x02 s_id 01:02:00 logout 1 keep 0 els_logo 0 ? __nvme_fc_send_ls_req+0x260/0x380 [nvme_fc] qla2xxx [0000:12:00.1]-207d:3: FCPort 21:32:00:02:ac:07:ee:b8 state transitioned from ONLINE to LOST - portid=010200. ? nvme_fc_send_ls_req.constprop.42+0x1a/0x45 [nvme_fc] qla2xxx [0000:12:00.1]-2109:3: qla2x00_schedule_rport_del 21320002ac07eeb8. rport ffff8ae598122000 roles 1 ? nvme_fc_connect_ctrl_work.cold.63+0x1e3/0xa7d [nvme_fc] qla2xxx [0000:12:00.1]-f084:3: qlt_free_session_done: se_sess 0000000000000000 / sess ffff8ae14801e000 from port 21:32:01:02:ad:f7:ee:b8 loop_id 0x04 s_id 01:02:01 logout 1 keep 0 els_logo 0 ? __switch_to+0x10c/0x450 ? process_one_work+0x1a7/0x360 qla2xxx [0000:12:00.1]-207d:3: FCPort 21:32:01:02:ad:f7:ee:b8 state transitioned from ONLINE to LOST - portid=010201. ? worker_thread+0x1ce/0x390 ? create_worker+0x1a0/0x1a0 qla2xxx [0000:12:00.1]-2109:3: qla2x00_schedule_rport_del 21320102adf7eeb8. rport ffff8ae3b2312800 roles 70 ? kthread+0x10a/0x120 qla2xxx [0000:12:00.1]-2112:3: qla_nvme_unregister_remote_port: unregister remoteport on ffff8ae14801e000 21320102adf7eeb8 ? set_kthread_struct+0x40/0x40 qla2xxx [0000:12:00.1]-2110:3: remoteport_delete of ffff8ae14801e000 21320102adf7eeb8 completed. ? ret_from_fork+0x1f/0x40 qla2xxx [0000:12:00.1]-f086:3: qlt_free_session_done: waiting for sess ffff8ae14801e000 logout
The system was under memory stress where driver was not able to allocate an SRB to carry out error recovery of cable pull. The failure to flush causes upper layer to start modifying scsi_cmnd. When the system frees up some memory, the subsequent cable pull trigger another command flush. At this point the driver access a null pointer when attempting to DMA unmap the SGL.
Add a check to make sure commands are flush back on session tear down to prevent the null pointer access.
Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Link: https://lore.kernel.org/r/20240227164127.36465-7-njavali@marvell.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/qla2xxx/qla_target.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/drivers/scsi/qla2xxx/qla_target.c +++ b/drivers/scsi/qla2xxx/qla_target.c @@ -1038,6 +1038,16 @@ void qlt_free_session_done(struct work_s "%s: sess %p logout completed\n", __func__, sess); }
+ /* check for any straggling io left behind */ + if (!(sess->flags & FCF_FCP2_DEVICE) && + qla2x00_eh_wait_for_pending_commands(sess->vha, sess->d_id.b24, 0, WAIT_TARGET)) { + ql_log(ql_log_warn, vha, 0x3027, + "IO not return. Resetting.\n"); + set_bit(ISP_ABORT_NEEDED, &vha->dpc_flags); + qla2xxx_wake_dpc(vha); + qla2x00_wait_for_chip_reset(vha); + } + if (sess->logo_ack_needed) { sess->logo_ack_needed = 0; qla24xx_async_notify_ack(vha, sess,
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Quinn Tran qutran@marvell.com
commit 591c1fdf2016d118b8fbde427b796fac13f3f070 upstream.
Currently when PCI error is detected, I/O is aborted manually through the ABORT IOCB mechanism which is not guaranteed to succeed.
Instead, wait for the OS or system to notify driver to wind down I/O through the pci_error_handlers api. Set eeh_busy flag to pause all traffic and wait for I/O to drain.
Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Link: https://lore.kernel.org/r/20240227164127.36465-11-njavali@marvell.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/qla2xxx/qla_attr.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_attr.c +++ b/drivers/scsi/qla2xxx/qla_attr.c @@ -2689,7 +2689,13 @@ qla2x00_dev_loss_tmo_callbk(struct fc_rp return;
if (unlikely(pci_channel_offline(fcport->vha->hw->pdev))) { - qla2x00_abort_all_cmds(fcport->vha, DID_NO_CONNECT << 16); + /* Will wait for wind down of adapter */ + ql_dbg(ql_dbg_aer, fcport->vha, 0x900c, + "%s pci offline detected (id %06x)\n", __func__, + fcport->d_id.b24); + qla_pci_set_eeh_busy(fcport->vha); + qla2x00_eh_wait_for_pending_commands(fcport->vha, fcport->d_id.b24, + 0, WAIT_TARGET); return; } } @@ -2711,7 +2717,11 @@ qla2x00_terminate_rport_io(struct fc_rpo vha = fcport->vha;
if (unlikely(pci_channel_offline(fcport->vha->hw->pdev))) { - qla2x00_abort_all_cmds(fcport->vha, DID_NO_CONNECT << 16); + /* Will wait for wind down of adapter */ + ql_dbg(ql_dbg_aer, fcport->vha, 0x900b, + "%s pci offline detected (id %06x)\n", __func__, + fcport->d_id.b24); + qla_pci_set_eeh_busy(vha); qla2x00_eh_wait_for_pending_commands(fcport->vha, fcport->d_id.b24, 0, WAIT_TARGET); return;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kim Phillips kim.phillips@amd.com
commit fd470a8beed88440b160d690344fbae05a0b9b1b upstream.
Unlike Intel's Enhanced IBRS feature, AMD's Automatic IBRS does not provide protection to processes running at CPL3/user mode, see section "Extended Feature Enable Register (EFER)" in the APM v2 at https://bugzilla.kernel.org/attachment.cgi?id=304652
Explicitly enable STIBP to protect against cross-thread CPL3 branch target injections on systems with Automatic IBRS enabled.
Also update the relevant documentation.
Fixes: e7862eda309e ("x86/cpu: Support AMD Automatic IBRS") Reported-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Kim Phillips kim.phillips@amd.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230720194727.67022-1-kim.phillips@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/admin-guide/hw-vuln/spectre.rst | 11 +++++++---- arch/x86/kernel/cpu/bugs.c | 15 +++++++++------ 2 files changed, 16 insertions(+), 10 deletions(-)
--- a/Documentation/admin-guide/hw-vuln/spectre.rst +++ b/Documentation/admin-guide/hw-vuln/spectre.rst @@ -484,11 +484,14 @@ Spectre variant 2
Systems which support enhanced IBRS (eIBRS) enable IBRS protection once at boot, by setting the IBRS bit, and they're automatically protected against - Spectre v2 variant attacks, including cross-thread branch target injections - on SMT systems (STIBP). In other words, eIBRS enables STIBP too. + Spectre v2 variant attacks.
- Legacy IBRS systems clear the IBRS bit on exit to userspace and - therefore explicitly enable STIBP for that + On Intel's enhanced IBRS systems, this includes cross-thread branch target + injections on SMT systems (STIBP). In other words, Intel eIBRS enables + STIBP, too. + + AMD Automatic IBRS does not protect userspace, and Legacy IBRS systems clear + the IBRS bit on exit to userspace, therefore both explicitly enable STIBP.
The retpoline mitigation is turned on by default on vulnerable CPUs. It can be forced on or off by the administrator --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -1317,19 +1317,21 @@ spectre_v2_user_select_mitigation(void) }
/* - * If no STIBP, enhanced IBRS is enabled, or SMT impossible, STIBP + * If no STIBP, Intel enhanced IBRS is enabled, or SMT impossible, STIBP * is not required. * - * Enhanced IBRS also protects against cross-thread branch target + * Intel's Enhanced IBRS also protects against cross-thread branch target * injection in user-mode as the IBRS bit remains always set which * implicitly enables cross-thread protections. However, in legacy IBRS * mode, the IBRS bit is set only on kernel entry and cleared on return - * to userspace. This disables the implicit cross-thread protection, - * so allow for STIBP to be selected in that case. + * to userspace. AMD Automatic IBRS also does not protect userspace. + * These modes therefore disable the implicit cross-thread protection, + * so allow for STIBP to be selected in those cases. */ if (!boot_cpu_has(X86_FEATURE_STIBP) || !smt_possible || - spectre_v2_in_eibrs_mode(spectre_v2_enabled)) + (spectre_v2_in_eibrs_mode(spectre_v2_enabled) && + !boot_cpu_has(X86_FEATURE_AUTOIBRS))) return;
/* @@ -2596,7 +2598,8 @@ static ssize_t rfds_show_state(char *buf
static char *stibp_state(void) { - if (spectre_v2_in_eibrs_mode(spectre_v2_enabled)) + if (spectre_v2_in_eibrs_mode(spectre_v2_enabled) && + !boot_cpu_has(X86_FEATURE_AUTOIBRS)) return "";
switch (spectre_v2_user_stibp) {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mika Westerberg mika.westerberg@linux.intel.com
commit 3b8803494a0612acdeee714cb72aa142b1e05ce5 upstream.
Commit 5459c0b70467 ("PCI/DPC: Quirk PIO log size for certain Intel Root Ports") added quirks for Tiger and Alder Lake Root Ports but missed that the same issue exists also in the previous generation, Ice Lake.
Apply the quirk for Ice Lake Root Ports as well. This prevents kernel complaints like:
DPC: RP PIO log size 0 is invalid
and also enables the DPC driver to dump the RP PIO Log registers when DPC is triggered.
[bhelgaas: add dmesg warning and RP PIO Log dump info] Closes: https://bugzilla.kernel.org/show_bug.cgi?id=209943 Link: https://lore.kernel.org/r/20230511121905.73949-1-mika.westerberg@linux.intel... Reported-by: Mark Blakeney mark.blakeney@bullet-systems.net Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/quirks.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5912,8 +5912,9 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_I
#ifdef CONFIG_PCIE_DPC /* - * Intel Tiger Lake and Alder Lake BIOS has a bug that clears the DPC - * RP PIO Log Size of the integrated Thunderbolt PCIe Root Ports. + * Intel Ice Lake, Tiger Lake and Alder Lake BIOS has a bug that clears + * the DPC RP PIO Log Size of the integrated Thunderbolt PCIe Root + * Ports. */ static void dpc_log_size(struct pci_dev *dev) { @@ -5936,6 +5937,10 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_I DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x462f, dpc_log_size); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x463f, dpc_log_size); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x466e, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8a1d, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8a1f, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8a21, dpc_log_size); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8a23, dpc_log_size); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a23, dpc_log_size); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a25, dpc_log_size); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a27, dpc_log_size);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Muhammad Usama Anjum usama.anjum@collabora.com
commit 28d41991182c210ec1654f8af2e140ef4cc73f20 upstream.
The wqe is of type lpfc_wqe128. It should be memset with the same type.
Fixes: 6c621a2229b0 ("scsi: lpfc: Separate NVMET RQ buffer posting from IO resources SGL/iocbq/context") Signed-off-by: Muhammad Usama Anjum usama.anjum@collabora.com Link: https://lore.kernel.org/r/20240304090649.833953-1-usama.anjum@collabora.com Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Reviewed-by: Justin Tee justintee8345@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/lpfc/lpfc_nvmet.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/scsi/lpfc/lpfc_nvmet.c +++ b/drivers/scsi/lpfc/lpfc_nvmet.c @@ -1579,7 +1579,7 @@ lpfc_nvmet_setup_io_context(struct lpfc_ wqe = &nvmewqe->wqe;
/* Initialize WQE */ - memset(wqe, 0, sizeof(union lpfc_wqe)); + memset(wqe, 0, sizeof(*wqe));
ctx_buf->iocbq->context1 = NULL; spin_lock(&phba->sli4_hba.sgl_list_lock);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alan Stern stern@rowland.harvard.edu
commit 80ba43e9f799cbdd83842fc27db667289b3150f5 upstream.
Among the attribute file callback routines in drivers/usb/core/sysfs.c, the interface_authorized_store() function is the only one which acquires a device lock on an ancestor device: It calls usb_deauthorize_interface(), which locks the interface's parent USB device.
The will lead to deadlock if another process already owns that lock and tries to remove the interface, whether through a configuration change or because the device has been disconnected. As part of the removal procedure, device_del() waits for all ongoing sysfs attribute callbacks to complete. But usb_deauthorize_interface() can't complete until the device lock has been released, and the lock won't be released until the removal has finished.
The mechanism provided by sysfs to prevent this kind of deadlock is to use the sysfs_break_active_protection() function, which tells sysfs not to wait for the attribute callback.
Reported-and-tested by: Yue Sun samsun1006219@gmail.com Reported by: xingwei lee xrivendell7@gmail.com
Signed-off-by: Alan Stern stern@rowland.harvard.edu Link: https://lore.kernel.org/linux-usb/CAEkJfYO6jRVC8Tfrd_R=cjO0hguhrV31fDPrLrNOO... Fixes: 310d2b4124c0 ("usb: interface authorization: SysFS part of USB interface authorization") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1c37eea1-9f56-4534-b9d8-b443438dc869@rowland.harva... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/core/sysfs.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
--- a/drivers/usb/core/sysfs.c +++ b/drivers/usb/core/sysfs.c @@ -1166,14 +1166,24 @@ static ssize_t interface_authorized_stor { struct usb_interface *intf = to_usb_interface(dev); bool val; + struct kernfs_node *kn;
if (strtobool(buf, &val) != 0) return -EINVAL;
- if (val) + if (val) { usb_authorize_interface(intf); - else - usb_deauthorize_interface(intf); + } else { + /* + * Prevent deadlock if another process is concurrently + * trying to unregister intf. + */ + kn = sysfs_break_active_protection(&dev->kobj, &attr->attr); + if (kn) { + usb_deauthorize_interface(intf); + sysfs_unbreak_active_protection(kn); + } + }
return count; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryosuke Yasuoka ryasuoka@redhat.com
[ Upstream commit d24b03535e5eb82e025219c2f632b485409c898f ]
syzbot reported the following uninit-value access issue [1][2]:
nci_rx_work() parses and processes received packet. When the payload length is zero, each message type handler reads uninitialized payload and KMSAN detects this issue. The receipt of a packet with a zero-size payload is considered unexpected, and therefore, such packets should be silently discarded.
This patch resolved this issue by checking payload size before calling each message type handler codes.
Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Reported-and-tested-by: syzbot+7ea9413ea6749baf5574@syzkaller.appspotmail.com Reported-and-tested-by: syzbot+29b5ca705d2e0f4a44d2@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7ea9413ea6749baf5574 [1] Closes: https://syzkaller.appspot.com/bug?extid=29b5ca705d2e0f4a44d2 [2] Signed-off-by: Ryosuke Yasuoka ryasuoka@redhat.com Reviewed-by: Jeremy Cline jeremy@jcline.org Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/nfc/nci/core.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/net/nfc/nci/core.c b/net/nfc/nci/core.c index 5bfaf06f7be7f..d8002065baaef 100644 --- a/net/nfc/nci/core.c +++ b/net/nfc/nci/core.c @@ -1502,6 +1502,11 @@ static void nci_rx_work(struct work_struct *work) nfc_send_to_raw_sock(ndev->nfc_dev, skb, RAW_PAYLOAD_NCI, NFC_DIRECTION_RX);
+ if (!nci_plen(skb->data)) { + kfree_skb(skb); + break; + } + /* Process frame */ switch (nci_mt(skb->data)) { case NCI_MT_RSP_PKT:
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Przemek Kitszel przemyslaw.kitszel@intel.com
[ Upstream commit aec806fb4afba5fe80b09e29351379a4292baa43 ]
Change kzalloc() flags used in ixgbe_ipsec_vf_add_sa() to GFP_ATOMIC, to avoid sleeping in IRQ context.
Dan Carpenter, with the help of Smatch, has found following issue: The patch eda0333ac293: "ixgbe: add VF IPsec management" from Aug 13, 2018 (linux-next), leads to the following Smatch static checker warning: drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c:917 ixgbe_ipsec_vf_add_sa() warn: sleeping in IRQ context
The call tree that Smatch is worried about is: ixgbe_msix_other() <- IRQ handler -> ixgbe_msg_task() -> ixgbe_rcv_msg_from_vf() -> ixgbe_ipsec_vf_add_sa()
Fixes: eda0333ac293 ("ixgbe: add VF IPsec management") Reported-by: Dan Carpenter dan.carpenter@linaro.org Link: https://lore.kernel.org/intel-wired-lan/db31a0b0-4d9f-4e6b-aed8-88266eb5665c... Reviewed-by: Michal Kubiak michal.kubiak@intel.com Signed-off-by: Przemek Kitszel przemyslaw.kitszel@intel.com Reviewed-by: Shannon Nelson shannon.nelson@amd.com Tested-by: Pucha Himasekhar Reddy himasekharx.reddy.pucha@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c index 319620856cba1..512da34e70a35 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c @@ -909,7 +909,13 @@ int ixgbe_ipsec_vf_add_sa(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf) goto err_out; }
- xs = kzalloc(sizeof(*xs), GFP_KERNEL); + algo = xfrm_aead_get_byname(aes_gcm_name, IXGBE_IPSEC_AUTH_BITS, 1); + if (unlikely(!algo)) { + err = -ENOENT; + goto err_out; + } + + xs = kzalloc(sizeof(*xs), GFP_ATOMIC); if (unlikely(!xs)) { err = -ENOMEM; goto err_out; @@ -925,14 +931,8 @@ int ixgbe_ipsec_vf_add_sa(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf) memcpy(&xs->id.daddr.a4, sam->addr, sizeof(xs->id.daddr.a4)); xs->xso.dev = adapter->netdev;
- algo = xfrm_aead_get_byname(aes_gcm_name, IXGBE_IPSEC_AUTH_BITS, 1); - if (unlikely(!algo)) { - err = -ENOENT; - goto err_xs; - } - aead_len = sizeof(*xs->aead) + IXGBE_IPSEC_KEY_BITS / 8; - xs->aead = kzalloc(aead_len, GFP_KERNEL); + xs->aead = kzalloc(aead_len, GFP_ATOMIC); if (unlikely(!xs->aead)) { err = -ENOMEM; goto err_xs;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 151c9c724d05d5b0dd8acd3e11cb69ef1f2dbada ]
We had various syzbot reports about tcp timers firing after the corresponding netns has been dismantled.
Fortunately Josef Bacik could trigger the issue more often, and could test a patch I wrote two years ago.
When TCP sockets are closed, we call inet_csk_clear_xmit_timers() to 'stop' the timers.
inet_csk_clear_xmit_timers() can be called from any context, including when socket lock is held. This is the reason it uses sk_stop_timer(), aka del_timer(). This means that ongoing timers might finish much later.
For user sockets, this is fine because each running timer holds a reference on the socket, and the user socket holds a reference on the netns.
For kernel sockets, we risk that the netns is freed before timer can complete, because kernel sockets do not hold reference on the netns.
This patch adds inet_csk_clear_xmit_timers_sync() function that using sk_stop_timer_sync() to make sure all timers are terminated before the kernel socket is released. Modules using kernel sockets close them in their netns exit() handler.
Also add sock_not_owned_by_me() helper to get LOCKDEP support : inet_csk_clear_xmit_timers_sync() must not be called while socket lock is held.
It is very possible we can revert in the future commit 3a58f13a881e ("net: rds: acquire refcount on TCP sockets") which attempted to solve the issue in rds only. (net/smc/af_smc.c and net/mptcp/subflow.c have similar code)
We probably can remove the check_net() tests from tcp_out_of_resources() and __tcp_close() in the future.
Reported-by: Josef Bacik josef@toxicpanda.com Closes: https://lore.kernel.org/netdev/20240314210740.GA2823176@perftesting/ Fixes: 26abe14379f8 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.") Fixes: 8a68173691f0 ("net: sk_clone_lock() should only do get_net() if the parent is not a kernel socket") Link: https://lore.kernel.org/bpf/CANn89i+484ffqb93aQm1N-tjxxvb3WDKX0EbD7318RwRgsa... Signed-off-by: Eric Dumazet edumazet@google.com Tested-by: Josef Bacik josef@toxicpanda.com Cc: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Link: https://lore.kernel.org/r/20240322135732.1535772-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/inet_connection_sock.h | 1 + include/net/sock.h | 7 +++++++ net/ipv4/inet_connection_sock.c | 14 ++++++++++++++ net/ipv4/tcp.c | 2 ++ 4 files changed, 24 insertions(+)
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h index 568121fa0965c..f5967805c33fd 100644 --- a/include/net/inet_connection_sock.h +++ b/include/net/inet_connection_sock.h @@ -172,6 +172,7 @@ void inet_csk_init_xmit_timers(struct sock *sk, void (*delack_handler)(struct timer_list *), void (*keepalive_handler)(struct timer_list *)); void inet_csk_clear_xmit_timers(struct sock *sk); +void inet_csk_clear_xmit_timers_sync(struct sock *sk);
static inline void inet_csk_schedule_ack(struct sock *sk) { diff --git a/include/net/sock.h b/include/net/sock.h index 87ee284ea9cb3..8bcc96bf291c3 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1681,6 +1681,13 @@ static inline void sock_owned_by_me(const struct sock *sk) #endif }
+static inline void sock_not_owned_by_me(const struct sock *sk) +{ +#ifdef CONFIG_LOCKDEP + WARN_ON_ONCE(lockdep_sock_is_held(sk) && debug_locks); +#endif +} + static inline bool sock_owned_by_user(const struct sock *sk) { sock_owned_by_me(sk); diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index b15c9ad0095a2..6ebe43b4d28f7 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -580,6 +580,20 @@ void inet_csk_clear_xmit_timers(struct sock *sk) } EXPORT_SYMBOL(inet_csk_clear_xmit_timers);
+void inet_csk_clear_xmit_timers_sync(struct sock *sk) +{ + struct inet_connection_sock *icsk = inet_csk(sk); + + /* ongoing timer handlers need to acquire socket lock. */ + sock_not_owned_by_me(sk); + + icsk->icsk_pending = icsk->icsk_ack.pending = 0; + + sk_stop_timer_sync(sk, &icsk->icsk_retransmit_timer); + sk_stop_timer_sync(sk, &icsk->icsk_delack_timer); + sk_stop_timer_sync(sk, &sk->sk_timer); +} + void inet_csk_delete_keepalive_timer(struct sock *sk) { sk_stop_timer(sk, &sk->sk_timer); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 2e874ec859715..ac6cb2dc60380 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2717,6 +2717,8 @@ void tcp_close(struct sock *sk, long timeout) lock_sock(sk); __tcp_close(sk, timeout); release_sock(sk); + if (!sk->sk_net_refcnt) + inet_csk_clear_xmit_timers_sync(sk); sock_put(sk); } EXPORT_SYMBOL(tcp_close);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikita Kiryushin kiryushin@ancud.ru
[ Upstream commit 40e2710860e57411ab57a1529c5a2748abbe8a19 ]
ACPICA commit 9061cd9aa131205657c811a52a9f8325a040c6c9
Errors in acpi_evaluate_object() can lead to incorrect state of buffer.
This can lead to access to data in previously ACPI_FREEd buffer and secondary ACPI_FREE to the same buffer later.
Handle errors in acpi_evaluate_object the same way it is done earlier with acpi_ns_handle_to_pathname.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Link: https://github.com/acpica/acpica/commit/9061cd9a Fixes: 5fd033288a86 ("ACPICA: debugger: add command to dump all fields of particular subtype") Signed-off-by: Nikita Kiryushin kiryushin@ancud.ru Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/acpica/dbnames.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/acpica/dbnames.c b/drivers/acpi/acpica/dbnames.c index b91155ea9c343..c9131259f717b 100644 --- a/drivers/acpi/acpica/dbnames.c +++ b/drivers/acpi/acpica/dbnames.c @@ -550,8 +550,12 @@ acpi_db_walk_for_fields(acpi_handle obj_handle, ACPI_FREE(buffer.pointer);
buffer.length = ACPI_ALLOCATE_LOCAL_BUFFER; - acpi_evaluate_object(obj_handle, NULL, NULL, &buffer); - + status = acpi_evaluate_object(obj_handle, NULL, NULL, &buffer); + if (ACPI_FAILURE(status)) { + acpi_os_printf("Could Not evaluate object %p\n", + obj_handle); + return (AE_OK); + } /* * Since this is a field unit, surround the output in braces */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrei Matei andreimatei1@gmail.com
[ Upstream commit ecc6a2101840177e57c925c102d2d29f260d37c8 ]
This patch re-introduces protection against the size of access to stack memory being negative; the access size can appear negative as a result of overflowing its signed int representation. This should not actually happen, as there are other protections along the way, but we should protect against it anyway. One code path was missing such protections (fixed in the previous patch in the series), causing out-of-bounds array accesses in check_stack_range_initialized(). This patch causes the verification of a program with such a non-sensical access size to fail.
This check used to exist in a more indirect way, but was inadvertendly removed in a833a17aeac7.
Fixes: a833a17aeac7 ("bpf: Fix verification of indirect var-off stack access") Reported-by: syzbot+33f4297b5f927648741a@syzkaller.appspotmail.com Reported-by: syzbot+aafd0513053a1cbf52ef@syzkaller.appspotmail.com Closes: https://lore.kernel.org/bpf/CAADnVQLORV5PT0iTAhRER+iLBTkByCYNBYyvBSgjN1T31K+... Acked-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Andrei Matei andreimatei1@gmail.com Link: https://lore.kernel.org/r/20240327024245.318299-3-andreimatei1@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/verifier.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index fce2345f600f2..25f8a8716e88d 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3941,6 +3941,11 @@ static int check_stack_access_within_bounds( err = check_stack_slot_within_bounds(min_off, state, type); if (!err && max_off > 0) err = -EINVAL; /* out of stack access into non-negative offsets */ + if (!err && access_size < 0) + /* access_size should not be negative (or overflow an int); others checks + * along the way should have prevented such an access. + */ + err = -EFAULT; /* invalid negative access size; integer overflow? */
if (err) { if (tnum_is_const(reg->var_off)) {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hariprasad Kelam hkelam@marvell.com
[ Upstream commit 40d4b4807cadd83fb3f46cc8cd67a945b5b25461 ]
The Octeontx2 MAC block (CGX) has separate data paths (SMU and GMP) for different speeds, allowing for efficient data transfer.
The previous patch which added pause frame configuration has a bug due to which pause frame feature is not working in GMP mode.
This patch fixes the issue by configurating appropriate registers.
Fixes: f7e086e754fe ("octeontx2-af: Pause frame configuration at cgx") Signed-off-by: Hariprasad Kelam hkelam@marvell.com Reviewed-by: Simon Horman horms@kernel.org Link: https://lore.kernel.org/r/20240326052720.4441-1-hkelam@marvell.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/octeontx2/af/cgx.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cgx.c b/drivers/net/ethernet/marvell/octeontx2/af/cgx.c index 55dfe1a20bc99..7f82baf8e7403 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/cgx.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/cgx.c @@ -403,6 +403,11 @@ int cgx_lmac_set_pause_frm(void *cgxd, int lmac_id, if (!cgx || lmac_id >= cgx->lmac_count) return -ENODEV;
+ cfg = cgx_read(cgx, lmac_id, CGXX_GMP_GMI_RXX_FRM_CTL); + cfg &= ~CGX_GMP_GMI_RXX_FRM_CTL_CTL_BCK; + cfg |= rx_pause ? CGX_GMP_GMI_RXX_FRM_CTL_CTL_BCK : 0x0; + cgx_write(cgx, lmac_id, CGXX_GMP_GMI_RXX_FRM_CTL, cfg); + cfg = cgx_read(cgx, lmac_id, CGXX_SMUX_RX_FRM_CTL); cfg &= ~CGX_SMUX_RX_FRM_CTL_CTL_BCK; cfg |= rx_pause ? CGX_SMUX_RX_FRM_CTL_CTL_BCK : 0x0;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 8e91c2342351e0f5ef6c0a704384a7f6fc70c3b2 ]
Depending on the value of CONFIG_HZ, clang complains about a pointless comparison:
drivers/md/dm-integrity.c:4085:12: error: result of comparison of constant 42949672950 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (val >= (uint64_t)UINT_MAX * 1000 / HZ) {
As the check remains useful for other configurations, shut up the warning by adding a second type cast to uint64_t.
Fixes: 468dfca38b1a ("dm integrity: add a bitmap mode") Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Mikulas Patocka mpatocka@redhat.com Reviewed-by: Justin Stitt justinstitt@google.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-integrity.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c index 62cae34ca3b43..067be1d9f51cb 100644 --- a/drivers/md/dm-integrity.c +++ b/drivers/md/dm-integrity.c @@ -3937,7 +3937,7 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv) } else if (sscanf(opt_string, "sectors_per_bit:%llu%c", &llval, &dummy) == 1) { log2_sectors_per_bitmap_bit = !llval ? 0 : __ilog2_u64(llval); } else if (sscanf(opt_string, "bitmap_flush_interval:%u%c", &val, &dummy) == 1) { - if (val >= (uint64_t)UINT_MAX * 1000 / HZ) { + if ((uint64_t)val >= (uint64_t)UINT_MAX * 1000 / HZ) { r = -EINVAL; ti->error = "Invalid bitmap_flush_interval argument"; goto bad;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiner Kallweit hkallweit1@gmail.com
commit 5d872c9f46bd2ea3524af3c2420a364a13667135 upstream.
On some boards with this chip version the BIOS is buggy and misses to reset the PHY page selector. This results in the PHY ID read accessing registers on a different page, returning a more or less random value. Fix this by resetting the page selector first.
Fixes: f1e911d5d0df ("r8169: add basic phylib support") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Reviewed-by: Simon Horman horms@kernel.org Link: https://lore.kernel.org/r/64f2055e-98b8-45ec-8568-665e3d54d4e6@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169_main.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -5181,6 +5181,15 @@ static int r8169_mdio_register(struct rt struct mii_bus *new_bus; int ret;
+ /* On some boards with this chip version the BIOS is buggy and misses + * to reset the PHY page selector. This results in the PHY ID read + * accessing registers on a different page, returning a more or + * less random value. Fix this by resetting the page selector first. + */ + if (tp->mac_version == RTL_GIGA_MAC_VER_25 || + tp->mac_version == RTL_GIGA_MAC_VER_26) + r8169_mdio_write(tp, 0x1f, 0); + new_bus = devm_mdiobus_alloc(&pdev->dev); if (!new_bus) return -ENOMEM;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sandipan Das sandipan.das@amd.com
commit 7f274e609f3d5f45c22b1dd59053f6764458b492 upstream.
Add a new word for scattered features because all free bits among the existing Linux-defined auxiliary flags have been exhausted.
Signed-off-by: Sandipan Das sandipan.das@amd.com Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/8380d2a0da469a1f0ad75b8954a79fb689599ff6.171109158... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/cpufeature.h | 6 ++++-- arch/x86/include/asm/cpufeatures.h | 2 +- arch/x86/include/asm/disabled-features.h | 3 ++- arch/x86/include/asm/required-features.h | 3 ++- 4 files changed, 9 insertions(+), 5 deletions(-)
--- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -93,8 +93,9 @@ extern const char * const x86_bug_flags[ CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 19, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 20, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 21, feature_bit) || \ REQUIRED_MASK_CHECK || \ - BUILD_BUG_ON_ZERO(NCAPINTS != 21)) + BUILD_BUG_ON_ZERO(NCAPINTS != 22))
#define DISABLED_MASK_BIT_SET(feature_bit) \ ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 0, feature_bit) || \ @@ -118,8 +119,9 @@ extern const char * const x86_bug_flags[ CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 19, feature_bit) || \ CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 20, feature_bit) || \ + CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 21, feature_bit) || \ DISABLED_MASK_CHECK || \ - BUILD_BUG_ON_ZERO(NCAPINTS != 21)) + BUILD_BUG_ON_ZERO(NCAPINTS != 22))
#define cpu_has(c, bit) \ (__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \ --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -13,7 +13,7 @@ /* * Defines x86 CPU feature bits */ -#define NCAPINTS 21 /* N 32-bit words worth of info */ +#define NCAPINTS 22 /* N 32-bit words worth of info */ #define NBUGINTS 2 /* N 32-bit bug flags */
/* --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -103,6 +103,7 @@ #define DISABLED_MASK18 0 #define DISABLED_MASK19 0 #define DISABLED_MASK20 0 -#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 21) +#define DISABLED_MASK21 0 +#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 22)
#endif /* _ASM_X86_DISABLED_FEATURES_H */ --- a/arch/x86/include/asm/required-features.h +++ b/arch/x86/include/asm/required-features.h @@ -103,6 +103,7 @@ #define REQUIRED_MASK18 0 #define REQUIRED_MASK19 0 #define REQUIRED_MASK20 0 -#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 21) +#define REQUIRED_MASK21 0 +#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 22)
#endif /* _ASM_X86_REQUIRED_FEATURES_H */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hui Wang hui.wang@canonical.com
commit c569242cd49287d53b73a94233db40097d838535 upstream.
We have a BT headset (Lenovo Thinkplus XT99), the pairing and connecting has no problem, once this headset is paired, bluez will remember this device and will auto re-connect it whenever the device is powered on. The auto re-connecting works well with Windows and Android, but with Linux, it always fails. Through debugging, we found at the rfcomm connection stage, the bluetooth stack reports "Connection refused - security block (0x0003)".
For this device, the re-connecting negotiation process is different from other BT headsets, it sends the Link_KEY_REQUEST command before the CONNECT_REQUEST completes, and it doesn't send ENCRYPT_CHANGE command during the negotiation. When the device sends the "connect complete" to hci, the ev->encr_mode is 1.
So here in the conn_complete_evt(), if ev->encr_mode is 1, link type is ACL and HCI_CONN_ENCRYPT is not set, we set HCI_CONN_ENCRYPT to this conn, and update conn->enc_key_size accordingly.
After this change, this BT headset could re-connect with Linux successfully. This is the btmon log after applying the patch, after receiving the "Connect Complete" with "Encryption: Enabled", will send the command to read encryption key size:
HCI Event: Connect Request (0x04) plen 10
Address: 8C:3C:AA:D8:11:67 (OUI 8C-3C-AA) Class: 0x240404 Major class: Audio/Video (headset, speaker, stereo, video, vcr) Minor class: Wearable Headset Device Rendering (Printing, Speaker) Audio (Speaker, Microphone, Headset) Link type: ACL (0x01) ...
HCI Event: Link Key Request (0x17) plen 6
Address: 8C:3C:AA:D8:11:67 (OUI 8C-3C-AA) < HCI Command: Link Key Request Reply (0x01|0x000b) plen 22 Address: 8C:3C:AA:D8:11:67 (OUI 8C-3C-AA) Link key: ${32-hex-digits-key} ...
HCI Event: Connect Complete (0x03) plen 11
Status: Success (0x00) Handle: 256 Address: 8C:3C:AA:D8:11:67 (OUI 8C-3C-AA) Link type: ACL (0x01) Encryption: Enabled (0x01) < HCI Command: Read Encryption Key... (0x05|0x0008) plen 2 Handle: 256 < ACL Data TX: Handle 256 flags 0x00 dlen 10 L2CAP: Information Request (0x0a) ident 1 len 2 Type: Extended features supported (0x0002)
HCI Event: Command Complete (0x0e) plen 7
Read Encryption Key Size (0x05|0x0008) ncmd 1 Status: Success (0x00) Handle: 256 Key size: 16
Cc: stable@vger.kernel.org Link: https://github.com/bluez/bluez/issues/704 Reviewed-by: Paul Menzel pmenzel@molgen.mpg.de Reviewed-by: Luiz Augusto von Dentz luiz.dentz@gmail.com Signed-off-by: Hui Wang hui.wang@canonical.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bluetooth/hci_event.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)
--- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -2636,6 +2636,31 @@ static void hci_conn_complete_evt(struct if (test_bit(HCI_ENCRYPT, &hdev->flags)) set_bit(HCI_CONN_ENCRYPT, &conn->flags);
+ /* "Link key request" completed ahead of "connect request" completes */ + if (ev->encr_mode == 1 && !test_bit(HCI_CONN_ENCRYPT, &conn->flags) && + ev->link_type == ACL_LINK) { + struct link_key *key; + struct hci_cp_read_enc_key_size cp; + + key = hci_find_link_key(hdev, &ev->bdaddr); + if (key) { + set_bit(HCI_CONN_ENCRYPT, &conn->flags); + + if (!(hdev->commands[20] & 0x10)) { + conn->enc_key_size = HCI_LINK_KEY_SIZE; + } else { + cp.handle = cpu_to_le16(conn->handle); + if (hci_send_cmd(hdev, HCI_OP_READ_ENC_KEY_SIZE, + sizeof(cp), &cp)) { + bt_dev_err(hdev, "sending read key size failed"); + conn->enc_key_size = HCI_LINK_KEY_SIZE; + } + } + + hci_encrypt_cfm(conn, ev->status); + } + } + /* Get remote features */ if (conn->type == ACL_LINK) { struct hci_cp_read_remote_features cp;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bastien Nocera hadess@hadess.net
commit 7835fcfd132eb88b87e8eb901f88436f63ab60f7 upstream.
struct hci_dev members conn_info_max_age, conn_info_min_age, le_conn_max_interval, le_conn_min_interval, le_adv_max_interval, and le_adv_min_interval can be modified from the HCI core code, as well through debugfs.
The debugfs implementation, that's only available to privileged users, will check for boundaries, making sure that the minimum value being set is strictly above the maximum value that already exists, and vice-versa.
However, as both minimum and maximum values can be changed concurrently to us modifying them, we need to make sure that the value we check is the value we end up using.
For example, with ->conn_info_max_age set to 10, conn_info_min_age_set() gets called from vfs handlers to set conn_info_min_age to 8.
In conn_info_min_age_set(), this goes through: if (val == 0 || val > hdev->conn_info_max_age) return -EINVAL;
Concurrently, conn_info_max_age_set() gets called to set to set the conn_info_max_age to 7: if (val == 0 || val > hdev->conn_info_max_age) return -EINVAL; That check will also pass because we used the old value (10) for conn_info_max_age.
After those checks that both passed, the struct hci_dev access is mutex-locked, disabling concurrent access, but that does not matter because the invalid value checks both passed, and we'll end up with conn_info_min_age = 8 and conn_info_max_age = 7
To fix this problem, we need to lock the structure access before so the check and assignment are not interrupted.
This fix was originally devised by the BassCheck[1] team, and considered the problem to be an atomicity one. This isn't the case as there aren't any concerns about the variable changing while we check it, but rather after we check it parallel to another change.
This patch fixes CVE-2024-24858 and CVE-2024-24857.
[1] https://sites.google.com/view/basscheck/
Co-developed-by: Gui-Dong Han 2045gemini@gmail.com Signed-off-by: Gui-Dong Han 2045gemini@gmail.com Link: https://lore.kernel.org/linux-bluetooth/20231222161317.6255-1-2045gemini@gma... Link: https://nvd.nist.gov/vuln/detail/CVE-2024-24858 Link: https://lore.kernel.org/linux-bluetooth/20231222162931.6553-1-2045gemini@gma... Link: https://lore.kernel.org/linux-bluetooth/20231222162310.6461-1-2045gemini@gma... Link: https://nvd.nist.gov/vuln/detail/CVE-2024-24857 Fixes: 31ad169148df ("Bluetooth: Add conn info lifetime parameters to debugfs") Fixes: 729a1051da6f ("Bluetooth: Expose default LE advertising interval via debugfs") Fixes: 71c3b60ec6d2 ("Bluetooth: Move BR/EDR debugfs file creation into hci_debugfs.c") Signed-off-by: Bastien Nocera hadess@hadess.net Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bluetooth/hci_debugfs.c | 48 +++++++++++++++++++++++++++++--------------- 1 file changed, 32 insertions(+), 16 deletions(-)
--- a/net/bluetooth/hci_debugfs.c +++ b/net/bluetooth/hci_debugfs.c @@ -216,10 +216,12 @@ static int conn_info_min_age_set(void *d { struct hci_dev *hdev = data;
- if (val == 0 || val > hdev->conn_info_max_age) + hci_dev_lock(hdev); + if (val == 0 || val > hdev->conn_info_max_age) { + hci_dev_unlock(hdev); return -EINVAL; + }
- hci_dev_lock(hdev); hdev->conn_info_min_age = val; hci_dev_unlock(hdev);
@@ -244,10 +246,12 @@ static int conn_info_max_age_set(void *d { struct hci_dev *hdev = data;
- if (val == 0 || val < hdev->conn_info_min_age) + hci_dev_lock(hdev); + if (val == 0 || val < hdev->conn_info_min_age) { + hci_dev_unlock(hdev); return -EINVAL; + }
- hci_dev_lock(hdev); hdev->conn_info_max_age = val; hci_dev_unlock(hdev);
@@ -526,10 +530,12 @@ static int sniff_min_interval_set(void * { struct hci_dev *hdev = data;
- if (val == 0 || val % 2 || val > hdev->sniff_max_interval) + hci_dev_lock(hdev); + if (val == 0 || val % 2 || val > hdev->sniff_max_interval) { + hci_dev_unlock(hdev); return -EINVAL; + }
- hci_dev_lock(hdev); hdev->sniff_min_interval = val; hci_dev_unlock(hdev);
@@ -554,10 +560,12 @@ static int sniff_max_interval_set(void * { struct hci_dev *hdev = data;
- if (val == 0 || val % 2 || val < hdev->sniff_min_interval) + hci_dev_lock(hdev); + if (val == 0 || val % 2 || val < hdev->sniff_min_interval) { + hci_dev_unlock(hdev); return -EINVAL; + }
- hci_dev_lock(hdev); hdev->sniff_max_interval = val; hci_dev_unlock(hdev);
@@ -798,10 +806,12 @@ static int conn_min_interval_set(void *d { struct hci_dev *hdev = data;
- if (val < 0x0006 || val > 0x0c80 || val > hdev->le_conn_max_interval) + hci_dev_lock(hdev); + if (val < 0x0006 || val > 0x0c80 || val > hdev->le_conn_max_interval) { + hci_dev_unlock(hdev); return -EINVAL; + }
- hci_dev_lock(hdev); hdev->le_conn_min_interval = val; hci_dev_unlock(hdev);
@@ -826,10 +836,12 @@ static int conn_max_interval_set(void *d { struct hci_dev *hdev = data;
- if (val < 0x0006 || val > 0x0c80 || val < hdev->le_conn_min_interval) + hci_dev_lock(hdev); + if (val < 0x0006 || val > 0x0c80 || val < hdev->le_conn_min_interval) { + hci_dev_unlock(hdev); return -EINVAL; + }
- hci_dev_lock(hdev); hdev->le_conn_max_interval = val; hci_dev_unlock(hdev);
@@ -938,10 +950,12 @@ static int adv_min_interval_set(void *da { struct hci_dev *hdev = data;
- if (val < 0x0020 || val > 0x4000 || val > hdev->le_adv_max_interval) + hci_dev_lock(hdev); + if (val < 0x0020 || val > 0x4000 || val > hdev->le_adv_max_interval) { + hci_dev_unlock(hdev); return -EINVAL; + }
- hci_dev_lock(hdev); hdev->le_adv_min_interval = val; hci_dev_unlock(hdev);
@@ -966,10 +980,12 @@ static int adv_max_interval_set(void *da { struct hci_dev *hdev = data;
- if (val < 0x0020 || val > 0x4000 || val < hdev->le_adv_min_interval) + hci_dev_lock(hdev); + if (val < 0x0020 || val > 0x4000 || val < hdev->le_adv_min_interval) { + hci_dev_unlock(hdev); return -EINVAL; + }
- hci_dev_lock(hdev); hdev->le_adv_max_interval = val; hci_dev_unlock(hdev);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
commit e26d3009efda338f19016df4175f354a9bd0a4ab upstream.
Never used from userspace, disallow these parameters.
Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org [Keerthana: code surrounding the patch is different because nft_set_desc is not present in v4.19-v5.10] Signed-off-by: Keerthana K keerthana.kalyanasundaram@broadcom.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nf_tables_api.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -4457,6 +4457,9 @@ static int nf_tables_newset(struct net * if (!(flags & NFT_SET_TIMEOUT)) return -EINVAL;
+ if (flags & NFT_SET_ANONYMOUS) + return -EOPNOTSUPP; + err = nf_msecs_to_jiffies64(nla[NFTA_SET_TIMEOUT], &timeout); if (err) return err; @@ -4465,6 +4468,10 @@ static int nf_tables_newset(struct net * if (nla[NFTA_SET_GC_INTERVAL] != NULL) { if (!(flags & NFT_SET_TIMEOUT)) return -EINVAL; + + if (flags & NFT_SET_ANONYMOUS) + return -EOPNOTSUPP; + gc_int = ntohl(nla_get_be32(nla[NFTA_SET_GC_INTERVAL])); }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mahmoud Adam mngyadam@amazon.com
commit 62fc3357e079a07a22465b9b6ef71bb6ea75ee4b upstream.
cp might be null, calling cp->cp_conn would produce null dereference
[Simon Horman adds:]
Analysis:
* cp is a parameter of __rds_rdma_map and is not reassigned.
* The following call-sites pass a NULL cp argument to __rds_rdma_map()
- rds_get_mr() - rds_get_mr_for_dest
* Prior to the code above, the following assumes that cp may be NULL (which is indicative, but could itself be unnecessary)
trans_private = rs->rs_transport->get_mr( sg, nents, rs, &mr->r_key, cp ? cp->cp_conn : NULL, args->vec.addr, args->vec.bytes, need_odp ? ODP_ZEROBASED : ODP_NOT_NEEDED);
* The code modified by this patch is guarded by IS_ERR(trans_private), where trans_private is assigned as per the previous point in this analysis.
The only implementation of get_mr that I could locate is rds_ib_get_mr() which can return an ERR_PTR if the conn (4th) argument is NULL.
* ret is set to PTR_ERR(trans_private). rds_ib_get_mr can return ERR_PTR(-ENODEV) if the conn (4th) argument is NULL. Thus ret may be -ENODEV in which case the code in question will execute.
Conclusion: * cp may be NULL at the point where this patch adds a check; this patch does seem to address a possible bug
Fixes: c055fc00c07b ("net/rds: fix WARNING in rds_conn_connect_if_down") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Mahmoud Adam mngyadam@amazon.com Reviewed-by: Simon Horman horms@kernel.org Link: https://lore.kernel.org/r/20240326153132.55580-1-mngyadam@amazon.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/rds/rdma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/rds/rdma.c +++ b/net/rds/rdma.c @@ -302,7 +302,7 @@ static int __rds_rdma_map(struct rds_soc } ret = PTR_ERR(trans_private); /* Trigger connection so that its ready for the next retry */ - if (ret == -ENODEV) + if (ret == -ENODEV && cp) rds_conn_connect_if_down(cp->cp_conn); goto out; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Williamson alex.williamson@redhat.com
[ Upstream commit fe9a7082684eb059b925c535682e68c34d487d43 ]
Currently for devices requiring masking at the irqchip for INTx, ie. devices without DisINTx support, the IRQ is enabled in request_irq() and subsequently disabled as necessary to align with the masked status flag. This presents a window where the interrupt could fire between these events, resulting in the IRQ incrementing the disable depth twice. This would be unrecoverable for a user since the masked flag prevents nested enables through vfio.
Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx is never auto-enabled, then unmask as required.
Cc: stable@vger.kernel.org Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver") Reviewed-by: Kevin Tian kevin.tian@intel.com Reviewed-by: Eric Auger eric.auger@redhat.com Link: https://lore.kernel.org/r/20240308230557.805580-2-alex.williamson@redhat.com Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vfio/pci/vfio_pci_intrs.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)
--- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -199,8 +199,15 @@ static int vfio_intx_set_signal(struct v
vdev->ctx[0].trigger = trigger;
+ /* + * Devices without DisINTx support require an exclusive interrupt, + * IRQ masking is performed at the IRQ chip. The masked status is + * protected by vdev->irqlock. Setup the IRQ without auto-enable and + * unmask as necessary below under lock. DisINTx is unmodified by + * the IRQ configuration and may therefore use auto-enable. + */ if (!vdev->pci_2_3) - irqflags = 0; + irqflags = IRQF_NO_AUTOEN;
ret = request_irq(pdev->irq, vfio_intx_handler, irqflags, vdev->ctx[0].name, vdev); @@ -211,13 +218,9 @@ static int vfio_intx_set_signal(struct v return ret; }
- /* - * INTx disable will stick across the new irq setup, - * disable_irq won't. - */ spin_lock_irqsave(&vdev->irqlock, flags); - if (!vdev->pci_2_3 && vdev->ctx[0].masked) - disable_irq_nosync(pdev->irq); + if (!vdev->pci_2_3 && !vdev->ctx[0].masked) + enable_irq(pdev->irq); spin_unlock_irqrestore(&vdev->irqlock, flags);
return 0;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Williamson alex.williamson@redhat.com
[ Upstream commit 810cd4bb53456d0503cc4e7934e063835152c1b7 ]
Mask operations through config space changes to DisINTx may race INTx configuration changes via ioctl. Create wrappers that add locking for paths outside of the core interrupt code.
In particular, irq_type is updated holding igate, therefore testing is_intx() requires holding igate. For example clearing DisINTx from config space can otherwise race changes of the interrupt configuration.
This aligns interfaces which may trigger the INTx eventfd into two camps, one side serialized by igate and the other only enabled while INTx is configured. A subsequent patch introduces synchronization for the latter flows.
Cc: stable@vger.kernel.org Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver") Reported-by: Reinette Chatre reinette.chatre@intel.com Reviewed-by: Kevin Tian kevin.tian@intel.com Reviewed-by: Reinette Chatre reinette.chatre@intel.com Reviewed-by: Eric Auger eric.auger@redhat.com Link: https://lore.kernel.org/r/20240308230557.805580-3-alex.williamson@redhat.com Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vfio/pci/vfio_pci_intrs.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-)
--- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -33,11 +33,13 @@ static void vfio_send_intx_eventfd(void eventfd_signal(vdev->ctx[0].trigger, 1); }
-void vfio_pci_intx_mask(struct vfio_pci_device *vdev) +static void __vfio_pci_intx_mask(struct vfio_pci_device *vdev) { struct pci_dev *pdev = vdev->pdev; unsigned long flags;
+ lockdep_assert_held(&vdev->igate); + spin_lock_irqsave(&vdev->irqlock, flags);
/* @@ -65,6 +67,13 @@ void vfio_pci_intx_mask(struct vfio_pci_ spin_unlock_irqrestore(&vdev->irqlock, flags); }
+void vfio_pci_intx_mask(struct vfio_pci_device *vdev) +{ + mutex_lock(&vdev->igate); + __vfio_pci_intx_mask(vdev); + mutex_unlock(&vdev->igate); +} + /* * If this is triggered by an eventfd, we can't call eventfd_signal * or else we'll deadlock on the eventfd wait queue. Return >0 when @@ -107,12 +116,21 @@ static int vfio_pci_intx_unmask_handler( return ret; }
-void vfio_pci_intx_unmask(struct vfio_pci_device *vdev) +static void __vfio_pci_intx_unmask(struct vfio_pci_device *vdev) { + lockdep_assert_held(&vdev->igate); + if (vfio_pci_intx_unmask_handler(vdev, NULL) > 0) vfio_send_intx_eventfd(vdev, NULL); }
+void vfio_pci_intx_unmask(struct vfio_pci_device *vdev) +{ + mutex_lock(&vdev->igate); + __vfio_pci_intx_unmask(vdev); + mutex_unlock(&vdev->igate); +} + static irqreturn_t vfio_intx_handler(int irq, void *dev_id) { struct vfio_pci_device *vdev = dev_id; @@ -428,11 +446,11 @@ static int vfio_pci_set_intx_unmask(stru return -EINVAL;
if (flags & VFIO_IRQ_SET_DATA_NONE) { - vfio_pci_intx_unmask(vdev); + __vfio_pci_intx_unmask(vdev); } else if (flags & VFIO_IRQ_SET_DATA_BOOL) { uint8_t unmask = *(uint8_t *)data; if (unmask) - vfio_pci_intx_unmask(vdev); + __vfio_pci_intx_unmask(vdev); } else if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { int32_t fd = *(int32_t *)data; if (fd >= 0) @@ -455,11 +473,11 @@ static int vfio_pci_set_intx_mask(struct return -EINVAL;
if (flags & VFIO_IRQ_SET_DATA_NONE) { - vfio_pci_intx_mask(vdev); + __vfio_pci_intx_mask(vdev); } else if (flags & VFIO_IRQ_SET_DATA_BOOL) { uint8_t mask = *(uint8_t *)data; if (mask) - vfio_pci_intx_mask(vdev); + __vfio_pci_intx_mask(vdev); } else if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { return -ENOTTY; /* XXX implement me */ }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Williamson alex.williamson@redhat.com
[ Upstream commit b620ecbd17a03cacd06f014a5d3f3a11285ce053 ]
In order to synchronize changes that can affect the thread callback, introduce an interface to force a flush of the inject workqueue. The irqfd pointer is only valid under spinlock, but the workqueue cannot be flushed under spinlock. Therefore the flush work for the irqfd is queued under spinlock. The vfio_irqfd_cleanup_wq workqueue is re-used for queuing this work such that flushing the workqueue is also ordered relative to shutdown.
Reviewed-by: Kevin Tian kevin.tian@intel.com Reviewed-by: Reinette Chatre reinette.chatre@intel.com Reviewed-by: Eric Auger eric.auger@redhat.com Link: https://lore.kernel.org/r/20240308230557.805580-4-alex.williamson@redhat.com Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vfio/virqfd.c | 21 +++++++++++++++++++++ include/linux/vfio.h | 2 ++ 2 files changed, 23 insertions(+)
--- a/drivers/vfio/virqfd.c +++ b/drivers/vfio/virqfd.c @@ -101,6 +101,13 @@ static void virqfd_inject(struct work_st virqfd->thread(virqfd->opaque, virqfd->data); }
+static void virqfd_flush_inject(struct work_struct *work) +{ + struct virqfd *virqfd = container_of(work, struct virqfd, flush_inject); + + flush_work(&virqfd->inject); +} + int vfio_virqfd_enable(void *opaque, int (*handler)(void *, void *), void (*thread)(void *, void *), @@ -124,6 +131,7 @@ int vfio_virqfd_enable(void *opaque,
INIT_WORK(&virqfd->shutdown, virqfd_shutdown); INIT_WORK(&virqfd->inject, virqfd_inject); + INIT_WORK(&virqfd->flush_inject, virqfd_flush_inject);
irqfd = fdget(fd); if (!irqfd.file) { @@ -214,6 +222,19 @@ void vfio_virqfd_disable(struct virqfd * } EXPORT_SYMBOL_GPL(vfio_virqfd_disable);
+void vfio_virqfd_flush_thread(struct virqfd **pvirqfd) +{ + unsigned long flags; + + spin_lock_irqsave(&virqfd_lock, flags); + if (*pvirqfd && (*pvirqfd)->thread) + queue_work(vfio_irqfd_cleanup_wq, &(*pvirqfd)->flush_inject); + spin_unlock_irqrestore(&virqfd_lock, flags); + + flush_workqueue(vfio_irqfd_cleanup_wq); +} +EXPORT_SYMBOL_GPL(vfio_virqfd_flush_thread); + module_init(vfio_virqfd_init); module_exit(vfio_virqfd_exit);
--- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -221,6 +221,7 @@ struct virqfd { wait_queue_entry_t wait; poll_table pt; struct work_struct shutdown; + struct work_struct flush_inject; struct virqfd **pvirqfd; };
@@ -229,5 +230,6 @@ extern int vfio_virqfd_enable(void *opaq void (*thread)(void *, void *), void *data, struct virqfd **pvirqfd, int fd); extern void vfio_virqfd_disable(struct virqfd **pvirqfd); +void vfio_virqfd_flush_thread(struct virqfd **pvirqfd);
#endif /* VFIO_H */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Williamson alex.williamson@redhat.com
[ Upstream commit 18c198c96a815c962adc2b9b77909eec0be7df4d ]
A vulnerability exists where the eventfd for INTx signaling can be deconfigured, which unregisters the IRQ handler but still allows eventfds to be signaled with a NULL context through the SET_IRQS ioctl or through unmask irqfd if the device interrupt is pending.
Ideally this could be solved with some additional locking; the igate mutex serializes the ioctl and config space accesses, and the interrupt handler is unregistered relative to the trigger, but the irqfd path runs asynchronous to those. The igate mutex cannot be acquired from the atomic context of the eventfd wake function. Disabling the irqfd relative to the eventfd registration is potentially incompatible with existing userspace.
As a result, the solution implemented here moves configuration of the INTx interrupt handler to track the lifetime of the INTx context object and irq_type configuration, rather than registration of a particular trigger eventfd. Synchronization is added between the ioctl path and eventfd_signal() wrapper such that the eventfd trigger can be dynamically updated relative to in-flight interrupts or irqfd callbacks.
Cc: stable@vger.kernel.org Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver") Reported-by: Reinette Chatre reinette.chatre@intel.com Reviewed-by: Kevin Tian kevin.tian@intel.com Reviewed-by: Reinette Chatre reinette.chatre@intel.com Reviewed-by: Eric Auger eric.auger@redhat.com Link: https://lore.kernel.org/r/20240308230557.805580-5-alex.williamson@redhat.com Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vfio/pci/vfio_pci_intrs.c | 149 ++++++++++++++++++++------------------ 1 file changed, 82 insertions(+), 67 deletions(-)
--- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -29,8 +29,13 @@ static void vfio_send_intx_eventfd(void { struct vfio_pci_device *vdev = opaque;
- if (likely(is_intx(vdev) && !vdev->virq_disabled)) - eventfd_signal(vdev->ctx[0].trigger, 1); + if (likely(is_intx(vdev) && !vdev->virq_disabled)) { + struct eventfd_ctx *trigger; + + trigger = READ_ONCE(vdev->ctx[0].trigger); + if (likely(trigger)) + eventfd_signal(trigger, 1); + } }
static void __vfio_pci_intx_mask(struct vfio_pci_device *vdev) @@ -157,98 +162,104 @@ static irqreturn_t vfio_intx_handler(int return ret; }
-static int vfio_intx_enable(struct vfio_pci_device *vdev) +static int vfio_intx_enable(struct vfio_pci_device *vdev, + struct eventfd_ctx *trigger) { + struct pci_dev *pdev = vdev->pdev; + unsigned long irqflags; + char *name; + int ret; + if (!is_irq_none(vdev)) return -EINVAL;
- if (!vdev->pdev->irq) + if (!pdev->irq) return -ENODEV;
+ name = kasprintf(GFP_KERNEL, "vfio-intx(%s)", pci_name(pdev)); + if (!name) + return -ENOMEM; + vdev->ctx = kzalloc(sizeof(struct vfio_pci_irq_ctx), GFP_KERNEL); if (!vdev->ctx) return -ENOMEM;
vdev->num_ctx = 1;
+ vdev->ctx[0].name = name; + vdev->ctx[0].trigger = trigger; + /* - * If the virtual interrupt is masked, restore it. Devices - * supporting DisINTx can be masked at the hardware level - * here, non-PCI-2.3 devices will have to wait until the - * interrupt is enabled. + * Fill the initial masked state based on virq_disabled. After + * enable, changing the DisINTx bit in vconfig directly changes INTx + * masking. igate prevents races during setup, once running masked + * is protected via irqlock. + * + * Devices supporting DisINTx also reflect the current mask state in + * the physical DisINTx bit, which is not affected during IRQ setup. + * + * Devices without DisINTx support require an exclusive interrupt. + * IRQ masking is performed at the IRQ chip. Again, igate protects + * against races during setup and IRQ handlers and irqfds are not + * yet active, therefore masked is stable and can be used to + * conditionally auto-enable the IRQ. + * + * irq_type must be stable while the IRQ handler is registered, + * therefore it must be set before request_irq(). */ vdev->ctx[0].masked = vdev->virq_disabled; - if (vdev->pci_2_3) - pci_intx(vdev->pdev, !vdev->ctx[0].masked); + if (vdev->pci_2_3) { + pci_intx(pdev, !vdev->ctx[0].masked); + irqflags = IRQF_SHARED; + } else { + irqflags = vdev->ctx[0].masked ? IRQF_NO_AUTOEN : 0; + }
vdev->irq_type = VFIO_PCI_INTX_IRQ_INDEX;
+ ret = request_irq(pdev->irq, vfio_intx_handler, + irqflags, vdev->ctx[0].name, vdev); + if (ret) { + vdev->irq_type = VFIO_PCI_NUM_IRQS; + kfree(name); + vdev->num_ctx = 0; + kfree(vdev->ctx); + return ret; + } + return 0; }
-static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd) +static int vfio_intx_set_signal(struct vfio_pci_device *vdev, + struct eventfd_ctx *trigger) { struct pci_dev *pdev = vdev->pdev; - unsigned long irqflags = IRQF_SHARED; - struct eventfd_ctx *trigger; - unsigned long flags; - int ret; - - if (vdev->ctx[0].trigger) { - free_irq(pdev->irq, vdev); - kfree(vdev->ctx[0].name); - eventfd_ctx_put(vdev->ctx[0].trigger); - vdev->ctx[0].trigger = NULL; - } - - if (fd < 0) /* Disable only */ - return 0; - - vdev->ctx[0].name = kasprintf(GFP_KERNEL, "vfio-intx(%s)", - pci_name(pdev)); - if (!vdev->ctx[0].name) - return -ENOMEM; - - trigger = eventfd_ctx_fdget(fd); - if (IS_ERR(trigger)) { - kfree(vdev->ctx[0].name); - return PTR_ERR(trigger); - } + struct eventfd_ctx *old;
- vdev->ctx[0].trigger = trigger; + old = vdev->ctx[0].trigger;
- /* - * Devices without DisINTx support require an exclusive interrupt, - * IRQ masking is performed at the IRQ chip. The masked status is - * protected by vdev->irqlock. Setup the IRQ without auto-enable and - * unmask as necessary below under lock. DisINTx is unmodified by - * the IRQ configuration and may therefore use auto-enable. - */ - if (!vdev->pci_2_3) - irqflags = IRQF_NO_AUTOEN; + WRITE_ONCE(vdev->ctx[0].trigger, trigger);
- ret = request_irq(pdev->irq, vfio_intx_handler, - irqflags, vdev->ctx[0].name, vdev); - if (ret) { - vdev->ctx[0].trigger = NULL; - kfree(vdev->ctx[0].name); - eventfd_ctx_put(trigger); - return ret; + /* Releasing an old ctx requires synchronizing in-flight users */ + if (old) { + synchronize_irq(pdev->irq); + vfio_virqfd_flush_thread(&vdev->ctx[0].unmask); + eventfd_ctx_put(old); }
- spin_lock_irqsave(&vdev->irqlock, flags); - if (!vdev->pci_2_3 && !vdev->ctx[0].masked) - enable_irq(pdev->irq); - spin_unlock_irqrestore(&vdev->irqlock, flags); - return 0; }
static void vfio_intx_disable(struct vfio_pci_device *vdev) { + struct pci_dev *pdev = vdev->pdev; + vfio_virqfd_disable(&vdev->ctx[0].unmask); vfio_virqfd_disable(&vdev->ctx[0].mask); - vfio_intx_set_signal(vdev, -1); + free_irq(pdev->irq, vdev); + if (vdev->ctx[0].trigger) + eventfd_ctx_put(vdev->ctx[0].trigger); + kfree(vdev->ctx[0].name); vdev->irq_type = VFIO_PCI_NUM_IRQS; vdev->num_ctx = 0; kfree(vdev->ctx); @@ -498,19 +509,23 @@ static int vfio_pci_set_intx_trigger(str return -EINVAL;
if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { + struct eventfd_ctx *trigger = NULL; int32_t fd = *(int32_t *)data; int ret;
+ if (fd >= 0) { + trigger = eventfd_ctx_fdget(fd); + if (IS_ERR(trigger)) + return PTR_ERR(trigger); + } + if (is_intx(vdev)) - return vfio_intx_set_signal(vdev, fd); + ret = vfio_intx_set_signal(vdev, trigger); + else + ret = vfio_intx_enable(vdev, trigger);
- ret = vfio_intx_enable(vdev); - if (ret) - return ret; - - ret = vfio_intx_set_signal(vdev, fd); - if (ret) - vfio_intx_disable(vdev); + if (ret && trigger) + eventfd_ctx_put(trigger);
return ret; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Williamson alex.williamson@redhat.com
[ Upstream commit 675daf435e9f8e5a5eab140a9864dfad6668b375 ]
The vfio-platform SET_IRQS ioctl currently allows loopback triggering of an interrupt before a signaling eventfd has been configured by the user, which thereby allows a NULL pointer dereference.
Rather than register the IRQ relative to a valid trigger, register all IRQs in a disabled state in the device open path. This allows mask operations on the IRQ to nest within the overall enable state governed by a valid eventfd signal. This decouples @masked, protected by the @locked spinlock from @trigger, protected via the @igate mutex.
In doing so, it's guaranteed that changes to @trigger cannot race the IRQ handlers because the IRQ handler is synchronously disabled before modifying the trigger, and loopback triggering of the IRQ via ioctl is safe due to serialization with trigger changes via igate.
For compatibility, request_irq() failures are maintained to be local to the SET_IRQS ioctl rather than a fatal error in the open device path. This allows, for example, a userspace driver with polling mode support to continue to work regardless of moving the request_irq() call site. This necessarily blocks all SET_IRQS access to the failed index.
Cc: Eric Auger eric.auger@redhat.com Cc: stable@vger.kernel.org Fixes: 57f972e2b341 ("vfio/platform: trigger an interrupt via eventfd") Reviewed-by: Kevin Tian kevin.tian@intel.com Reviewed-by: Eric Auger eric.auger@redhat.com Link: https://lore.kernel.org/r/20240308230557.805580-7-alex.williamson@redhat.com Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vfio/platform/vfio_platform_irq.c | 101 ++++++++++++++++++++---------- 1 file changed, 68 insertions(+), 33 deletions(-)
--- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -136,6 +136,16 @@ static int vfio_platform_set_irq_unmask( return 0; }
+/* + * The trigger eventfd is guaranteed valid in the interrupt path + * and protected by the igate mutex when triggered via ioctl. + */ +static void vfio_send_eventfd(struct vfio_platform_irq *irq_ctx) +{ + if (likely(irq_ctx->trigger)) + eventfd_signal(irq_ctx->trigger, 1); +} + static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id) { struct vfio_platform_irq *irq_ctx = dev_id; @@ -155,7 +165,7 @@ static irqreturn_t vfio_automasked_irq_h spin_unlock_irqrestore(&irq_ctx->lock, flags);
if (ret == IRQ_HANDLED) - eventfd_signal(irq_ctx->trigger, 1); + vfio_send_eventfd(irq_ctx);
return ret; } @@ -164,22 +174,19 @@ static irqreturn_t vfio_irq_handler(int { struct vfio_platform_irq *irq_ctx = dev_id;
- eventfd_signal(irq_ctx->trigger, 1); + vfio_send_eventfd(irq_ctx);
return IRQ_HANDLED; }
static int vfio_set_trigger(struct vfio_platform_device *vdev, int index, - int fd, irq_handler_t handler) + int fd) { struct vfio_platform_irq *irq = &vdev->irqs[index]; struct eventfd_ctx *trigger; - int ret;
if (irq->trigger) { - irq_clear_status_flags(irq->hwirq, IRQ_NOAUTOEN); - free_irq(irq->hwirq, irq); - kfree(irq->name); + disable_irq(irq->hwirq); eventfd_ctx_put(irq->trigger); irq->trigger = NULL; } @@ -187,30 +194,20 @@ static int vfio_set_trigger(struct vfio_ if (fd < 0) /* Disable only */ return 0;
- irq->name = kasprintf(GFP_KERNEL, "vfio-irq[%d](%s)", - irq->hwirq, vdev->name); - if (!irq->name) - return -ENOMEM; - trigger = eventfd_ctx_fdget(fd); - if (IS_ERR(trigger)) { - kfree(irq->name); + if (IS_ERR(trigger)) return PTR_ERR(trigger); - }
irq->trigger = trigger;
- irq_set_status_flags(irq->hwirq, IRQ_NOAUTOEN); - ret = request_irq(irq->hwirq, handler, 0, irq->name, irq); - if (ret) { - kfree(irq->name); - eventfd_ctx_put(trigger); - irq->trigger = NULL; - return ret; - } - - if (!irq->masked) - enable_irq(irq->hwirq); + /* + * irq->masked effectively provides nested disables within the overall + * enable relative to trigger. Specifically request_irq() is called + * with NO_AUTOEN, therefore the IRQ is initially disabled. The user + * may only further disable the IRQ with a MASK operations because + * irq->masked is initially false. + */ + enable_irq(irq->hwirq);
return 0; } @@ -229,7 +226,7 @@ static int vfio_platform_set_irq_trigger handler = vfio_irq_handler;
if (!count && (flags & VFIO_IRQ_SET_DATA_NONE)) - return vfio_set_trigger(vdev, index, -1, handler); + return vfio_set_trigger(vdev, index, -1);
if (start != 0 || count != 1) return -EINVAL; @@ -237,7 +234,7 @@ static int vfio_platform_set_irq_trigger if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { int32_t fd = *(int32_t *)data;
- return vfio_set_trigger(vdev, index, fd, handler); + return vfio_set_trigger(vdev, index, fd); }
if (flags & VFIO_IRQ_SET_DATA_NONE) { @@ -261,6 +258,14 @@ int vfio_platform_set_irqs_ioctl(struct unsigned start, unsigned count, uint32_t flags, void *data) = NULL;
+ /* + * For compatibility, errors from request_irq() are local to the + * SET_IRQS path and reflected in the name pointer. This allows, + * for example, polling mode fallback for an exclusive IRQ failure. + */ + if (IS_ERR(vdev->irqs[index].name)) + return PTR_ERR(vdev->irqs[index].name); + switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) { case VFIO_IRQ_SET_ACTION_MASK: func = vfio_platform_set_irq_mask; @@ -281,7 +286,7 @@ int vfio_platform_set_irqs_ioctl(struct
int vfio_platform_irq_init(struct vfio_platform_device *vdev) { - int cnt = 0, i; + int cnt = 0, i, ret = 0;
while (vdev->get_irq(vdev, cnt) >= 0) cnt++; @@ -292,29 +297,54 @@ int vfio_platform_irq_init(struct vfio_p
for (i = 0; i < cnt; i++) { int hwirq = vdev->get_irq(vdev, i); + irq_handler_t handler = vfio_irq_handler;
- if (hwirq < 0) + if (hwirq < 0) { + ret = -EINVAL; goto err; + }
spin_lock_init(&vdev->irqs[i].lock);
vdev->irqs[i].flags = VFIO_IRQ_INFO_EVENTFD;
- if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK) + if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK) { vdev->irqs[i].flags |= VFIO_IRQ_INFO_MASKABLE | VFIO_IRQ_INFO_AUTOMASKED; + handler = vfio_automasked_irq_handler; + }
vdev->irqs[i].count = 1; vdev->irqs[i].hwirq = hwirq; vdev->irqs[i].masked = false; + vdev->irqs[i].name = kasprintf(GFP_KERNEL, + "vfio-irq[%d](%s)", hwirq, + vdev->name); + if (!vdev->irqs[i].name) { + ret = -ENOMEM; + goto err; + } + + ret = request_irq(hwirq, handler, IRQF_NO_AUTOEN, + vdev->irqs[i].name, &vdev->irqs[i]); + if (ret) { + kfree(vdev->irqs[i].name); + vdev->irqs[i].name = ERR_PTR(ret); + } }
vdev->num_irqs = cnt;
return 0; err: + for (--i; i >= 0; i--) { + if (!IS_ERR(vdev->irqs[i].name)) { + free_irq(vdev->irqs[i].hwirq, &vdev->irqs[i]); + kfree(vdev->irqs[i].name); + } + } kfree(vdev->irqs); - return -EINVAL; + return ret; }
void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev) @@ -324,7 +354,12 @@ void vfio_platform_irq_cleanup(struct vf for (i = 0; i < vdev->num_irqs; i++) { vfio_virqfd_disable(&vdev->irqs[i].mask); vfio_virqfd_disable(&vdev->irqs[i].unmask); - vfio_set_trigger(vdev, i, -1, NULL); + if (!IS_ERR(vdev->irqs[i].name)) { + free_irq(vdev->irqs[i].hwirq, &vdev->irqs[i]); + if (vdev->irqs[i].trigger) + eventfd_ctx_put(vdev->irqs[i].trigger); + kfree(vdev->irqs[i].name); + } }
vdev->num_irqs = 0;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Williamson alex.williamson@redhat.com
[ Upstream commit 7447d911af699a15f8d050dfcb7c680a86f87012 ]
The eventfd_ctx trigger pointer of the vfio_fsl_mc_irq object is initially NULL and may become NULL if the user sets the trigger eventfd to -1. The interrupt handler itself is guaranteed that trigger is always valid between request_irq() and free_irq(), but the loopback testing mechanisms to invoke the handler function need to test the trigger. The triggering and setting ioctl paths both make use of igate and are therefore mutually exclusive.
The vfio-fsl-mc driver does not make use of irqfds, nor does it support any sort of masking operations, therefore unlike vfio-pci and vfio-platform, the flow can remain essentially unchanged.
Cc: Diana Craciun diana.craciun@oss.nxp.com Cc: stable@vger.kernel.org Fixes: cc0ee20bd969 ("vfio/fsl-mc: trigger an interrupt via eventfd") Reviewed-by: Kevin Tian kevin.tian@intel.com Reviewed-by: Eric Auger eric.auger@redhat.com Link: https://lore.kernel.org/r/20240308230557.805580-8-alex.williamson@redhat.com Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c +++ b/drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c @@ -142,13 +142,14 @@ static int vfio_fsl_mc_set_irq_trigger(s irq = &vdev->mc_irqs[index];
if (flags & VFIO_IRQ_SET_DATA_NONE) { - vfio_fsl_mc_irq_handler(hwirq, irq); + if (irq->trigger) + eventfd_signal(irq->trigger, 1);
} else if (flags & VFIO_IRQ_SET_DATA_BOOL) { u8 trigger = *(u8 *)data;
- if (trigger) - vfio_fsl_mc_irq_handler(hwirq, irq); + if (trigger && irq->trigger) + eventfd_signal(irq->trigger, 1); }
return 0;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
A previous backport mistakenly removed code that cleared 'ret' to zero, as the SCM logging was performed. Fix up the return value so we don't return an errant error on fixed file registration.
Fixes: a6771f343af9 ("io_uring: drop any code related to SCM_RIGHTS") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -8247,7 +8247,7 @@ static int io_sqe_files_register(struct }
io_rsrc_node_switch(ctx, NULL); - return ret; + return 0; out_fput: for (i = 0; i < ctx->nr_user_files; i++) { file = io_file_from_index(ctx, i);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ingo Molnar mingo@kernel.org
commit c567f2948f57bdc03ed03403ae0234085f376b7d upstream.
This reverts commit d794734c9bbfe22f86686dc2909c25f5ffe1a572.
While the original change tries to fix a bug, it also unintentionally broke existing systems, see the regressions reported at:
https://lore.kernel.org/all/3a1b9909-45ac-4f97-ad68-d16ef1ce99db@pavinjoseph...
Since d794734c9bbf was also marked for -stable, let's back it out before causing more damage.
Note that due to another upstream change the revert was not 100% automatic:
0a845e0f6348 mm/treewide: replace pud_large() with pud_leaf()
Signed-off-by: Ingo Molnar mingo@kernel.org Cc: stable@vger.kernel.org Cc: Russ Anderson rja@hpe.com Cc: Steve Wahl steve.wahl@hpe.com Cc: Dave Hansen dave.hansen@linux.intel.com Link: https://lore.kernel.org/all/3a1b9909-45ac-4f97-ad68-d16ef1ce99db@pavinjoseph... Fixes: d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.") Signed-off-by: Steve Wahl steve.wahl@hpe.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/mm/ident_map.c | 23 +++++------------------ 1 file changed, 5 insertions(+), 18 deletions(-)
--- a/arch/x86/mm/ident_map.c +++ b/arch/x86/mm/ident_map.c @@ -26,31 +26,18 @@ static int ident_pud_init(struct x86_map for (; addr < end; addr = next) { pud_t *pud = pud_page + pud_index(addr); pmd_t *pmd; - bool use_gbpage;
next = (addr & PUD_MASK) + PUD_SIZE; if (next > end) next = end;
- /* if this is already a gbpage, this portion is already mapped */ - if (pud_large(*pud)) - continue; - - /* Is using a gbpage allowed? */ - use_gbpage = info->direct_gbpages; - - /* Don't use gbpage if it maps more than the requested region. */ - /* at the begining: */ - use_gbpage &= ((addr & ~PUD_MASK) == 0); - /* ... or at the end: */ - use_gbpage &= ((next & ~PUD_MASK) == 0); - - /* Never overwrite existing mappings */ - use_gbpage &= !pud_present(*pud); - - if (use_gbpage) { + if (info->direct_gbpages) { pud_t pudval;
+ if (pud_present(*pud)) + continue; + + addr &= PUD_MASK; pudval = __pud((addr - info->offset) | info->page_flag); set_pud(pud, pudval); continue;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vlastimil Babka vbabka@suse.cz
commit 803de9000f334b771afacb6ff3e78622916668b0 upstream.
Sven reports an infinite loop in __alloc_pages_slowpath() for costly order __GFP_RETRY_MAYFAIL allocations that are also GFP_NOIO. Such combination can happen in a suspend/resume context where a GFP_KERNEL allocation can have __GFP_IO masked out via gfp_allowed_mask.
Quoting Sven:
1. try to do a "costly" allocation (order > PAGE_ALLOC_COSTLY_ORDER) with __GFP_RETRY_MAYFAIL set.
2. page alloc's __alloc_pages_slowpath tries to get a page from the freelist. This fails because there is nothing free of that costly order.
3. page alloc tries to reclaim by calling __alloc_pages_direct_reclaim, which bails out because a zone is ready to be compacted; it pretends to have made a single page of progress.
4. page alloc tries to compact, but this always bails out early because __GFP_IO is not set (it's not passed by the snd allocator, and even if it were, we are suspending so the __GFP_IO flag would be cleared anyway).
5. page alloc believes reclaim progress was made (because of the pretense in item 3) and so it checks whether it should retry compaction. The compaction retry logic thinks it should try again, because: a) reclaim is needed because of the early bail-out in item 4 b) a zonelist is suitable for compaction
6. goto 2. indefinite stall.
(end quote)
The immediate root cause is confusing the COMPACT_SKIPPED returned from __alloc_pages_direct_compact() (step 4) due to lack of __GFP_IO to be indicating a lack of order-0 pages, and in step 5 evaluating that in should_compact_retry() as a reason to retry, before incrementing and limiting the number of retries. There are however other places that wrongly assume that compaction can happen while we lack __GFP_IO.
To fix this, introduce gfp_compaction_allowed() to abstract the __GFP_IO evaluation and switch the open-coded test in try_to_compact_pages() to use it.
Also use the new helper in: - compaction_ready(), which will make reclaim not bail out in step 3, so there's at least one attempt to actually reclaim, even if chances are small for a costly order - in_reclaim_compaction() which will make should_continue_reclaim() return false and we don't over-reclaim unnecessarily - in __alloc_pages_slowpath() to set a local variable can_compact, which is then used to avoid retrying reclaim/compaction for costly allocations (step 5) if we can't compact and also to skip the early compaction attempt that we do in some cases
Link: https://lkml.kernel.org/r/20240221114357.13655-2-vbabka@suse.cz Fixes: 3250845d0526 ("Revert "mm, oom: prevent premature OOM killer invocation for high order request"") Signed-off-by: Vlastimil Babka vbabka@suse.cz Reported-by: Sven van Ashbrook svenva@chromium.org Closes: https://lore.kernel.org/all/CAG-rBihs_xMKb3wrMO1%2B-%2Bp4fowP9oy1pa_OTkfxBzP... Tested-by: Karthikeyan Ramasubramanian kramasub@chromium.org Cc: Brian Geffon bgeffon@google.com Cc: Curtis Malainey cujomalainey@chromium.org Cc: Jaroslav Kysela perex@perex.cz Cc: Mel Gorman mgorman@techsingularity.net Cc: Michal Hocko mhocko@kernel.org Cc: Takashi Iwai tiwai@suse.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Vlastimil Babka vbabka@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/gfp.h | 9 +++++++++ mm/compaction.c | 7 +------ mm/page_alloc.c | 10 ++++++---- mm/vmscan.c | 5 ++++- 4 files changed, 20 insertions(+), 11 deletions(-)
--- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -623,6 +623,15 @@ static inline bool pm_suspended_storage( } #endif /* CONFIG_PM_SLEEP */
+/* + * Check if the gfp flags allow compaction - GFP_NOIO is a really + * tricky context because the migration might require IO. + */ +static inline bool gfp_compaction_allowed(gfp_t gfp_mask) +{ + return IS_ENABLED(CONFIG_COMPACTION) && (gfp_mask & __GFP_IO); +} + #ifdef CONFIG_CONTIG_ALLOC /* The below functions must be run on a range from a single zone. */ extern int alloc_contig_range(unsigned long start, unsigned long end, --- a/mm/compaction.c +++ b/mm/compaction.c @@ -2466,16 +2466,11 @@ enum compact_result try_to_compact_pages unsigned int alloc_flags, const struct alloc_context *ac, enum compact_priority prio, struct page **capture) { - int may_perform_io = gfp_mask & __GFP_IO; struct zoneref *z; struct zone *zone; enum compact_result rc = COMPACT_SKIPPED;
- /* - * Check if the GFP flags allow compaction - GFP_NOIO is really - * tricky context because the migration might require IO - */ - if (!may_perform_io) + if (!gfp_compaction_allowed(gfp_mask)) return COMPACT_SKIPPED;
trace_mm_compaction_try_to_compact_pages(order, gfp_mask, prio); --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4644,6 +4644,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, u struct alloc_context *ac) { bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM; + bool can_compact = gfp_compaction_allowed(gfp_mask); const bool costly_order = order > PAGE_ALLOC_COSTLY_ORDER; struct page *page = NULL; unsigned int alloc_flags; @@ -4709,7 +4710,7 @@ restart: * Don't try this for allocations that are allowed to ignore * watermarks, as the ALLOC_NO_WATERMARKS attempt didn't yet happen. */ - if (can_direct_reclaim && + if (can_direct_reclaim && can_compact && (costly_order || (order > 0 && ac->migratetype != MIGRATE_MOVABLE)) && !gfp_pfmemalloc_allowed(gfp_mask)) { @@ -4806,9 +4807,10 @@ retry:
/* * Do not retry costly high order allocations unless they are - * __GFP_RETRY_MAYFAIL + * __GFP_RETRY_MAYFAIL and we can compact */ - if (costly_order && !(gfp_mask & __GFP_RETRY_MAYFAIL)) + if (costly_order && (!can_compact || + !(gfp_mask & __GFP_RETRY_MAYFAIL))) goto nopage;
if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags, @@ -4821,7 +4823,7 @@ retry: * implementation of the compaction depends on the sufficient amount * of free memory (see __compaction_suitable) */ - if (did_some_progress > 0 && + if (did_some_progress > 0 && can_compact && should_compact_retry(ac, order, alloc_flags, compact_result, &compact_priority, &compaction_retries)) --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -2546,7 +2546,7 @@ static void shrink_lruvec(struct lruvec /* Use reclaim/compaction for costly allocs or under memory pressure */ static bool in_reclaim_compaction(struct scan_control *sc) { - if (IS_ENABLED(CONFIG_COMPACTION) && sc->order && + if (gfp_compaction_allowed(sc->gfp_mask) && sc->order && (sc->order > PAGE_ALLOC_COSTLY_ORDER || sc->priority < DEF_PRIORITY - 2)) return true; @@ -2873,6 +2873,9 @@ static inline bool compaction_ready(stru unsigned long watermark; enum compact_result suitable;
+ if (!gfp_compaction_allowed(sc->gfp_mask)) + return false; + suitable = compaction_suitable(zone, sc->order, 0, sc->reclaim_idx); if (suitable == COMPACT_SUCCESS) /* Allocation should succeed already. Don't reclaim. */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pu Wen puwen@hygon.cn
commit a5ef7d68cea1344cf524f04981c2b3f80bedbb0d upstream.
Add mitigation for the speculative return stack overflow vulnerability which exists on Hygon processors too.
Signed-off-by: Pu Wen puwen@hygon.cn Signed-off-by: Ingo Molnar mingo@kernel.org Acked-by: Borislav Petkov (AMD) bp@alien8.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/tencent_4A14812842F104E93AA722EC939483CEFF05@qq.co... Signed-off-by: Ashwin Dayanand Kamat ashwin.kamat@broadcom.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1177,7 +1177,7 @@ static const struct x86_cpu_id cpu_vuln_ VULNBL_AMD(0x15, RETBLEED), VULNBL_AMD(0x16, RETBLEED), VULNBL_AMD(0x17, RETBLEED | SRSO), - VULNBL_HYGON(0x18, RETBLEED), + VULNBL_HYGON(0x18, RETBLEED | SRSO), VULNBL_AMD(0x19, SRSO), {} };
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Min Li min15.li@samsung.com
commit 6f64f866aa1ae6975c95d805ed51d7e9433a0016 upstream.
Before calling add partition or resize partition, there is no check on whether the length is aligned with the logical block size. If the logical block size of the disk is larger than 512 bytes, then the partition size maybe not the multiple of the logical block size, and when the last sector is read, bio_truncate() will adjust the bio size, resulting in an IO error if the size of the read command is smaller than the logical block size.If integrity data is supported, this will also result in a null pointer dereference when calling bio_integrity_free.
Cc: stable@vger.kernel.org Signed-off-by: Min Li min15.li@samsung.com Reviewed-by: Damien Le Moal dlemoal@kernel.org Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Reviewed-by: Christoph Hellwig hch@lst.de Link: https://lore.kernel.org/r/20230629142517.121241-1-min15.li@samsung.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Ashwin Dayanand Kamat ashwin.kamat@broadcom.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/ioctl.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
--- a/block/ioctl.c +++ b/block/ioctl.c @@ -17,7 +17,7 @@ static int blkpg_do_ioctl(struct block_d struct blkpg_partition __user *upart, int op) { struct blkpg_partition p; - long long start, length; + sector_t start, length;
if (!capable(CAP_SYS_ADMIN)) return -EACCES; @@ -32,6 +32,12 @@ static int blkpg_do_ioctl(struct block_d if (op == BLKPG_DEL_PARTITION) return bdev_del_partition(bdev, p.pno);
+ if (p.start < 0 || p.length <= 0 || p.start + p.length < 0) + return -EINVAL; + /* Check that the partition is aligned to the block size */ + if (!IS_ALIGNED(p.start | p.length, bdev_logical_block_size(bdev))) + return -EINVAL; + start = p.start >> SECTOR_SHIFT; length = p.length >> SECTOR_SHIFT;
@@ -46,9 +52,6 @@ static int blkpg_do_ioctl(struct block_d
switch (op) { case BLKPG_ADD_PARTITION: - /* check if partition is aligned to blocksize */ - if (p.start & (bdev_logical_block_size(bdev) - 1)) - return -EINVAL; return bdev_add_partition(bdev, p.pno, start, length); case BLKPG_RESIZE_PARTITION: return bdev_resize_partition(bdev, p.pno, start, length);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
commit 994209ddf4f430946f6247616b2e33d179243769 upstream.
When dormant flag is toggled, hooks are disabled in the commit phase by iterating over current chains in table (existing and new).
The following configuration allows for an inconsistent state:
add table x add chain x y { type filter hook input priority 0; } add table x { flags dormant; } add chain x w { type filter hook input priority 1; }
which triggers the following warning when trying to unregister chain w which is already unregistered.
[ 127.322252] WARNING: CPU: 7 PID: 1211 at net/netfilter/core.c:50 1 __nf_unregister_net_hook+0x21a/0x260 [...] [ 127.322519] Call Trace: [ 127.322521] <TASK> [ 127.322524] ? __warn+0x9f/0x1a0 [ 127.322531] ? __nf_unregister_net_hook+0x21a/0x260 [ 127.322537] ? report_bug+0x1b1/0x1e0 [ 127.322545] ? handle_bug+0x3c/0x70 [ 127.322552] ? exc_invalid_op+0x17/0x40 [ 127.322556] ? asm_exc_invalid_op+0x1a/0x20 [ 127.322563] ? kasan_save_free_info+0x3b/0x60 [ 127.322570] ? __nf_unregister_net_hook+0x6a/0x260 [ 127.322577] ? __nf_unregister_net_hook+0x21a/0x260 [ 127.322583] ? __nf_unregister_net_hook+0x6a/0x260 [ 127.322590] ? __nf_tables_unregister_hook+0x8a/0xe0 [nf_tables] [ 127.322655] nft_table_disable+0x75/0xf0 [nf_tables] [ 127.322717] nf_tables_commit+0x2571/0x2620 [nf_tables]
Fixes: 179d9ba5559a ("netfilter: nf_tables: fix table flag updates") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nf_tables_api.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -2225,6 +2225,9 @@ static int nf_tables_addchain(struct nft struct nft_stats __percpu *stats = NULL; struct nft_chain_hook hook;
+ if (table->flags & __NFT_TABLE_F_UPDATE) + return -EINVAL; + if (flags & NFT_CHAIN_BINDING) return -EOPNOTSUPP;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
commit 24cea9677025e0de419989ecb692acd4bb34cac2 upstream.
Similar to 2c9f0293280e ("netfilter: nf_tables: flush pending destroy work before netlink notifier") to address a race between exit_net and the destroy workqueue.
The trace below shows an element to be released via destroy workqueue while exit_net path (triggered via module removal) has already released the set that is used in such transaction.
[ 1360.547789] BUG: KASAN: slab-use-after-free in nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables] [ 1360.547861] Read of size 8 at addr ffff888140500cc0 by task kworker/4:1/152465 [ 1360.547870] CPU: 4 PID: 152465 Comm: kworker/4:1 Not tainted 6.8.0+ #359 [ 1360.547882] Workqueue: events nf_tables_trans_destroy_work [nf_tables] [ 1360.547984] Call Trace: [ 1360.547991] <TASK> [ 1360.547998] dump_stack_lvl+0x53/0x70 [ 1360.548014] print_report+0xc4/0x610 [ 1360.548026] ? __virt_addr_valid+0xba/0x160 [ 1360.548040] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 1360.548054] ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables] [ 1360.548176] kasan_report+0xae/0xe0 [ 1360.548189] ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables] [ 1360.548312] nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables] [ 1360.548447] ? __pfx_nf_tables_trans_destroy_work+0x10/0x10 [nf_tables] [ 1360.548577] ? _raw_spin_unlock_irq+0x18/0x30 [ 1360.548591] process_one_work+0x2f1/0x670 [ 1360.548610] worker_thread+0x4d3/0x760 [ 1360.548627] ? __pfx_worker_thread+0x10/0x10 [ 1360.548640] kthread+0x16b/0x1b0 [ 1360.548653] ? __pfx_kthread+0x10/0x10 [ 1360.548665] ret_from_fork+0x2f/0x50 [ 1360.548679] ? __pfx_kthread+0x10/0x10 [ 1360.548690] ret_from_fork_asm+0x1a/0x30 [ 1360.548707] </TASK>
[ 1360.548719] Allocated by task 192061: [ 1360.548726] kasan_save_stack+0x20/0x40 [ 1360.548739] kasan_save_track+0x14/0x30 [ 1360.548750] __kasan_kmalloc+0x8f/0xa0 [ 1360.548760] __kmalloc_node+0x1f1/0x450 [ 1360.548771] nf_tables_newset+0x10c7/0x1b50 [nf_tables] [ 1360.548883] nfnetlink_rcv_batch+0xbc4/0xdc0 [nfnetlink] [ 1360.548909] nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink] [ 1360.548927] netlink_unicast+0x367/0x4f0 [ 1360.548935] netlink_sendmsg+0x34b/0x610 [ 1360.548944] ____sys_sendmsg+0x4d4/0x510 [ 1360.548953] ___sys_sendmsg+0xc9/0x120 [ 1360.548961] __sys_sendmsg+0xbe/0x140 [ 1360.548971] do_syscall_64+0x55/0x120 [ 1360.548982] entry_SYSCALL_64_after_hwframe+0x55/0x5d
[ 1360.548994] Freed by task 192222: [ 1360.548999] kasan_save_stack+0x20/0x40 [ 1360.549009] kasan_save_track+0x14/0x30 [ 1360.549019] kasan_save_free_info+0x3b/0x60 [ 1360.549028] poison_slab_object+0x100/0x180 [ 1360.549036] __kasan_slab_free+0x14/0x30 [ 1360.549042] kfree+0xb6/0x260 [ 1360.549049] __nft_release_table+0x473/0x6a0 [nf_tables] [ 1360.549131] nf_tables_exit_net+0x170/0x240 [nf_tables] [ 1360.549221] ops_exit_list+0x50/0xa0 [ 1360.549229] free_exit_list+0x101/0x140 [ 1360.549236] unregister_pernet_operations+0x107/0x160 [ 1360.549245] unregister_pernet_subsys+0x1c/0x30 [ 1360.549254] nf_tables_module_exit+0x43/0x80 [nf_tables] [ 1360.549345] __do_sys_delete_module+0x253/0x370 [ 1360.549352] do_syscall_64+0x55/0x120 [ 1360.549360] entry_SYSCALL_64_after_hwframe+0x55/0x5d
(gdb) list *__nft_release_table+0x473 0x1e033 is in __nft_release_table (net/netfilter/nf_tables_api.c:11354). 11349 list_for_each_entry_safe(flowtable, nf, &table->flowtables, list) { 11350 list_del(&flowtable->list); 11351 nft_use_dec(&table->use); 11352 nf_tables_flowtable_destroy(flowtable); 11353 } 11354 list_for_each_entry_safe(set, ns, &table->sets, list) { 11355 list_del(&set->list); 11356 nft_use_dec(&table->use); 11357 if (set->flags & (NFT_SET_MAP | NFT_SET_OBJECT)) 11358 nft_map_deactivate(&ctx, set); (gdb)
[ 1360.549372] Last potentially related work creation: [ 1360.549376] kasan_save_stack+0x20/0x40 [ 1360.549384] __kasan_record_aux_stack+0x9b/0xb0 [ 1360.549392] __queue_work+0x3fb/0x780 [ 1360.549399] queue_work_on+0x4f/0x60 [ 1360.549407] nft_rhash_remove+0x33b/0x340 [nf_tables] [ 1360.549516] nf_tables_commit+0x1c6a/0x2620 [nf_tables] [ 1360.549625] nfnetlink_rcv_batch+0x728/0xdc0 [nfnetlink] [ 1360.549647] nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink] [ 1360.549671] netlink_unicast+0x367/0x4f0 [ 1360.549680] netlink_sendmsg+0x34b/0x610 [ 1360.549690] ____sys_sendmsg+0x4d4/0x510 [ 1360.549697] ___sys_sendmsg+0xc9/0x120 [ 1360.549706] __sys_sendmsg+0xbe/0x140 [ 1360.549715] do_syscall_64+0x55/0x120 [ 1360.549725] entry_SYSCALL_64_after_hwframe+0x55/0x5d
Fixes: 0935d5588400 ("netfilter: nf_tables: asynchronous release") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nf_tables_api.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -9796,6 +9796,7 @@ static void __exit nf_tables_module_exit unregister_netdevice_notifier(&nf_tables_flowtable_notifier); nft_chain_filter_fini(); nft_chain_route_fini(); + nf_tables_trans_destroy_flush_work(); unregister_pernet_subsys(&nf_tables_net_ops); cancel_work_sync(&trans_gc_work); cancel_work_sync(&trans_destroy_work);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ziyang Xuan william.xuanziyang@huawei.com
commit 24225011d81b471acc0e1e315b7d9905459a6304 upstream.
nft_unregister_flowtable_type() within nf_flow_inet_module_exit() can concurrent with __nft_flowtable_type_get() within nf_tables_newflowtable(). And thhere is not any protection when iterate over nf_tables_flowtables list in __nft_flowtable_type_get(). Therefore, there is pertential data-race of nf_tables_flowtables list entry.
Use list_for_each_entry_rcu() to iterate over nf_tables_flowtables list in __nft_flowtable_type_get(), and use rcu_read_lock() in the caller nft_flowtable_type_get() to protect the entire type query process.
Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend") Signed-off-by: Ziyang Xuan william.xuanziyang@huawei.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nf_tables_api.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -6879,11 +6879,12 @@ static int nft_flowtable_parse_hook(cons return err; }
+/* call under rcu_read_lock */ static const struct nf_flowtable_type *__nft_flowtable_type_get(u8 family) { const struct nf_flowtable_type *type;
- list_for_each_entry(type, &nf_tables_flowtables, list) { + list_for_each_entry_rcu(type, &nf_tables_flowtables, list) { if (family == type->family) return type; } @@ -6895,9 +6896,13 @@ nft_flowtable_type_get(struct net *net, { const struct nf_flowtable_type *type;
+ rcu_read_lock(); type = __nft_flowtable_type_get(family); - if (type != NULL && try_module_get(type->owner)) + if (type != NULL && try_module_get(type->owner)) { + rcu_read_unlock(); return type; + } + rcu_read_unlock();
lockdep_nfnl_nft_mutex_not_held(); #ifdef CONFIG_MODULES
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
commit 0c83842df40f86e529db6842231154772c20edcc upstream.
I got multiple syzbot reports showing old bugs exposed by BPF after commit 20f2505fb436 ("bpf: Try to avoid kzalloc in cgroup/{s,g}etsockopt")
setsockopt() @optlen argument should be taken into account before copying data.
BUG: KASAN: slab-out-of-bounds in copy_from_sockptr_offset include/linux/sockptr.h:49 [inline] BUG: KASAN: slab-out-of-bounds in copy_from_sockptr include/linux/sockptr.h:55 [inline] BUG: KASAN: slab-out-of-bounds in do_replace net/ipv4/netfilter/ip_tables.c:1111 [inline] BUG: KASAN: slab-out-of-bounds in do_ipt_set_ctl+0x902/0x3dd0 net/ipv4/netfilter/ip_tables.c:1627 Read of size 96 at addr ffff88802cd73da0 by task syz-executor.4/7238
CPU: 1 PID: 7238 Comm: syz-executor.4 Not tainted 6.9.0-rc2-next-20240403-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189 __asan_memcpy+0x29/0x70 mm/kasan/shadow.c:105 copy_from_sockptr_offset include/linux/sockptr.h:49 [inline] copy_from_sockptr include/linux/sockptr.h:55 [inline] do_replace net/ipv4/netfilter/ip_tables.c:1111 [inline] do_ipt_set_ctl+0x902/0x3dd0 net/ipv4/netfilter/ip_tables.c:1627 nf_setsockopt+0x295/0x2c0 net/netfilter/nf_sockopt.c:101 do_sock_setsockopt+0x3af/0x720 net/socket.c:2311 __sys_setsockopt+0x1ae/0x250 net/socket.c:2334 __do_sys_setsockopt net/socket.c:2343 [inline] __se_sys_setsockopt net/socket.c:2340 [inline] __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340 do_syscall_64+0xfb/0x240 entry_SYSCALL_64_after_hwframe+0x72/0x7a RIP: 0033:0x7fd22067dde9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fd21f9ff0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: ffffffffffffffda RBX: 00007fd2207abf80 RCX: 00007fd22067dde9 RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 00007fd2206ca47a R08: 0000000000000001 R09: 0000000000000000 R10: 0000000020000880 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000000b R14: 00007fd2207abf80 R15: 00007ffd2d0170d8 </TASK>
Allocated by task 7238: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:370 [inline] __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387 kasan_kmalloc include/linux/kasan.h:211 [inline] __do_kmalloc_node mm/slub.c:4069 [inline] __kmalloc_noprof+0x200/0x410 mm/slub.c:4082 kmalloc_noprof include/linux/slab.h:664 [inline] __cgroup_bpf_run_filter_setsockopt+0xd47/0x1050 kernel/bpf/cgroup.c:1869 do_sock_setsockopt+0x6b4/0x720 net/socket.c:2293 __sys_setsockopt+0x1ae/0x250 net/socket.c:2334 __do_sys_setsockopt net/socket.c:2343 [inline] __se_sys_setsockopt net/socket.c:2340 [inline] __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340 do_syscall_64+0xfb/0x240 entry_SYSCALL_64_after_hwframe+0x72/0x7a
The buggy address belongs to the object at ffff88802cd73da0 which belongs to the cache kmalloc-8 of size 8 The buggy address is located 0 bytes inside of allocated 1-byte region [ffff88802cd73da0, ffff88802cd73da1)
The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88802cd73020 pfn:0x2cd73 flags: 0xfff80000000000(node=0|zone=1|lastcpupid=0xfff) page_type: 0xffffefff(slab) raw: 00fff80000000000 ffff888015041280 dead000000000100 dead000000000122 raw: ffff88802cd73020 000000008080007f 00000001ffffefff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 5103, tgid 2119833701 (syz-executor.4), ts 5103, free_ts 70804600828 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1490 prep_new_page mm/page_alloc.c:1498 [inline] get_page_from_freelist+0x2e7e/0x2f40 mm/page_alloc.c:3454 __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4712 __alloc_pages_node_noprof include/linux/gfp.h:244 [inline] alloc_pages_node_noprof include/linux/gfp.h:271 [inline] alloc_slab_page+0x5f/0x120 mm/slub.c:2249 allocate_slab+0x5a/0x2e0 mm/slub.c:2412 new_slab mm/slub.c:2465 [inline] ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3615 __slab_alloc+0x58/0xa0 mm/slub.c:3705 __slab_alloc_node mm/slub.c:3758 [inline] slab_alloc_node mm/slub.c:3936 [inline] __do_kmalloc_node mm/slub.c:4068 [inline] kmalloc_node_track_caller_noprof+0x286/0x450 mm/slub.c:4089 kstrdup+0x3a/0x80 mm/util.c:62 device_rename+0xb5/0x1b0 drivers/base/core.c:4558 dev_change_name+0x275/0x860 net/core/dev.c:1232 do_setlink+0xa4b/0x41f0 net/core/rtnetlink.c:2864 __rtnl_newlink net/core/rtnetlink.c:3680 [inline] rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3727 rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6594 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline] netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361 page last free pid 5146 tgid 5146 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1110 [inline] free_unref_page+0xd3c/0xec0 mm/page_alloc.c:2617 discard_slab mm/slub.c:2511 [inline] __put_partials+0xeb/0x130 mm/slub.c:2980 put_cpu_partial+0x17c/0x250 mm/slub.c:3055 __slab_free+0x2ea/0x3d0 mm/slub.c:4254 qlink_free mm/kasan/quarantine.c:163 [inline] qlist_free_all+0x9e/0x140 mm/kasan/quarantine.c:179 kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286 __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:322 kasan_slab_alloc include/linux/kasan.h:201 [inline] slab_post_alloc_hook mm/slub.c:3888 [inline] slab_alloc_node mm/slub.c:3948 [inline] __do_kmalloc_node mm/slub.c:4068 [inline] __kmalloc_node_noprof+0x1d7/0x450 mm/slub.c:4076 kmalloc_node_noprof include/linux/slab.h:681 [inline] kvmalloc_node_noprof+0x72/0x190 mm/util.c:634 bucket_table_alloc lib/rhashtable.c:186 [inline] rhashtable_rehash_alloc+0x9e/0x290 lib/rhashtable.c:367 rht_deferred_worker+0x4e1/0x2440 lib/rhashtable.c:427 process_one_work kernel/workqueue.c:3218 [inline] process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299 worker_thread+0x86d/0xd70 kernel/workqueue.c:3380 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
Memory state around the buggy address: ffff88802cd73c80: 07 fc fc fc 05 fc fc fc 05 fc fc fc fa fc fc fc ffff88802cd73d00: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc
ffff88802cd73d80: fa fc fc fc 01 fc fc fc fa fc fc fc fa fc fc fc
^ ffff88802cd73e00: fa fc fc fc fa fc fc fc 05 fc fc fc 07 fc fc fc ffff88802cd73e80: 07 fc fc fc 07 fc fc fc 07 fc fc fc 07 fc fc fc
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Eric Dumazet edumazet@google.com Reviewed-by: Pablo Neira Ayuso pablo@netfilter.org Link: https://lore.kernel.org/r/20240404122051.2303764-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bridge/netfilter/ebtables.c | 6 ++++++ net/ipv4/netfilter/arp_tables.c | 4 ++++ net/ipv4/netfilter/ip_tables.c | 4 ++++ net/ipv6/netfilter/ip6_tables.c | 4 ++++ 4 files changed, 18 insertions(+)
--- a/net/bridge/netfilter/ebtables.c +++ b/net/bridge/netfilter/ebtables.c @@ -1070,6 +1070,8 @@ static int do_replace(struct net *net, s struct ebt_table_info *newinfo; struct ebt_replace tmp;
+ if (len < sizeof(tmp)) + return -EINVAL; if (copy_from_sockptr(&tmp, arg, sizeof(tmp)) != 0) return -EFAULT;
@@ -1309,6 +1311,8 @@ static int update_counters(struct net *n { struct ebt_replace hlp;
+ if (len < sizeof(hlp)) + return -EINVAL; if (copy_from_sockptr(&hlp, arg, sizeof(hlp))) return -EFAULT;
@@ -2238,6 +2242,8 @@ static int compat_update_counters(struct { struct compat_ebt_replace hlp;
+ if (len < sizeof(hlp)) + return -EINVAL; if (copy_from_sockptr(&hlp, arg, sizeof(hlp))) return -EFAULT;
--- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -955,6 +955,8 @@ static int do_replace(struct net *net, s void *loc_cpu_entry; struct arpt_entry *iter;
+ if (len < sizeof(tmp)) + return -EINVAL; if (copy_from_sockptr(&tmp, arg, sizeof(tmp)) != 0) return -EFAULT;
@@ -1253,6 +1255,8 @@ static int compat_do_replace(struct net void *loc_cpu_entry; struct arpt_entry *iter;
+ if (len < sizeof(tmp)) + return -EINVAL; if (copy_from_sockptr(&tmp, arg, sizeof(tmp)) != 0) return -EFAULT;
--- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -1109,6 +1109,8 @@ do_replace(struct net *net, sockptr_t ar void *loc_cpu_entry; struct ipt_entry *iter;
+ if (len < sizeof(tmp)) + return -EINVAL; if (copy_from_sockptr(&tmp, arg, sizeof(tmp)) != 0) return -EFAULT;
@@ -1493,6 +1495,8 @@ compat_do_replace(struct net *net, sockp void *loc_cpu_entry; struct ipt_entry *iter;
+ if (len < sizeof(tmp)) + return -EINVAL; if (copy_from_sockptr(&tmp, arg, sizeof(tmp)) != 0) return -EFAULT;
--- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -1127,6 +1127,8 @@ do_replace(struct net *net, sockptr_t ar void *loc_cpu_entry; struct ip6t_entry *iter;
+ if (len < sizeof(tmp)) + return -EINVAL; if (copy_from_sockptr(&tmp, arg, sizeof(tmp)) != 0) return -EFAULT;
@@ -1503,6 +1505,8 @@ compat_do_replace(struct net *net, sockp void *loc_cpu_entry; struct ip6t_entry *iter;
+ if (len < sizeof(tmp)) + return -EINVAL; if (copy_from_sockptr(&tmp, arg, sizeof(tmp)) != 0) return -EFAULT;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
commit de3f64b738af57e2732b91a0774facc675b75b54 upstream.
If an load_nls_xxx() function fails a few lines above, the 'sbi->bdi_id' is still 0. So, in the error handling path, we will call ida_simple_remove(..., 0) which is not allocated yet.
In order to prevent a spurious "ida_free called for id=0 which is not allocated." message, tweak the error handling path and add a new label.
Fixes: 0fd169576648 ("fs: Add VirtualBox guest shared folder (vboxsf) support") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Link: https://lore.kernel.org/r/d09eaaa4e2e08206c58a1a27ca9b3e81dc168773.169883573... Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/vboxsf/super.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/vboxsf/super.c +++ b/fs/vboxsf/super.c @@ -151,7 +151,7 @@ static int vboxsf_fill_super(struct supe if (!sbi->nls) { vbg_err("vboxsf: Count not load '%s' nls\n", nls_name); err = -EINVAL; - goto fail_free; + goto fail_destroy_idr; } }
@@ -224,6 +224,7 @@ fail_free: ida_simple_remove(&vboxsf_bdi_ida, sbi->bdi_id); if (sbi->nls) unload_nls(sbi->nls); +fail_destroy_idr: idr_destroy(&sbi->ino_idr); kfree(sbi); return err;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Sitnicki jakub@cloudflare.com
commit ff91059932401894e6c86341915615c5eb0eca48 upstream.
syzkaller started using corpuses where a BPF tracing program deletes elements from a sockmap/sockhash map. Because BPF tracing programs can be invoked from any interrupt context, locks taken during a map_delete_elem operation must be hardirq-safe. Otherwise a deadlock due to lock inversion is possible, as reported by lockdep:
CPU0 CPU1 ---- ---- lock(&htab->buckets[i].lock); local_irq_disable(); lock(&host->lock); lock(&htab->buckets[i].lock); <Interrupt> lock(&host->lock);
Locks in sockmap are hardirq-unsafe by design. We expects elements to be deleted from sockmap/sockhash only in task (normal) context with interrupts enabled, or in softirq context.
Detect when map_delete_elem operation is invoked from a context which is _not_ hardirq-unsafe, that is interrupts are disabled, and bail out with an error.
Note that map updates are not affected by this issue. BPF verifier does not allow updating sockmap/sockhash from a BPF tracing program today.
Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Reported-by: xingwei lee xrivendell7@gmail.com Reported-by: yue sun samsun1006219@gmail.com Reported-by: syzbot+bc922f476bd65abbd466@syzkaller.appspotmail.com Reported-by: syzbot+d4066896495db380182e@syzkaller.appspotmail.com Signed-off-by: Jakub Sitnicki jakub@cloudflare.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Tested-by: syzbot+d4066896495db380182e@syzkaller.appspotmail.com Acked-by: John Fastabend john.fastabend@gmail.com Closes: https://syzkaller.appspot.com/bug?extid=d4066896495db380182e Closes: https://syzkaller.appspot.com/bug?extid=bc922f476bd65abbd466 Link: https://lore.kernel.org/bpf/20240402104621.1050319-1-jakub@cloudflare.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/sock_map.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -422,6 +422,9 @@ static int __sock_map_delete(struct bpf_ struct sock *sk; int err = 0;
+ if (irqs_disabled()) + return -EOPNOTSUPP; /* locks here are hardirq-unsafe */ + raw_spin_lock_bh(&stab->lock); sk = *psk; if (!sk_test || sk_test == sk) @@ -955,6 +958,9 @@ static int sock_hash_delete_elem(struct struct bpf_shtab_elem *elem; int ret = -ENOENT;
+ if (irqs_disabled()) + return -EOPNOTSUPP; /* locks here are hardirq-unsafe */ + hash = sock_hash_bucket_hash(key, key_size); bucket = sock_hash_select_bucket(htab, hash);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
commit d313eb8b77557a6d5855f42d2234bd592c7b50dd upstream.
syzbot found that tcf_skbmod_dump() was copying four bytes from kernel stack to user space [1].
The issue here is that 'struct tc_skbmod' has a four bytes hole.
We need to clear the structure before filling fields.
[1] BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline] BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline] BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline] BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x366/0x2520 lib/iov_iter.c:185 instrument_copy_to_user include/linux/instrumented.h:114 [inline] copy_to_user_iter lib/iov_iter.c:24 [inline] iterate_ubuf include/linux/iov_iter.h:29 [inline] iterate_and_advance2 include/linux/iov_iter.h:245 [inline] iterate_and_advance include/linux/iov_iter.h:271 [inline] _copy_to_iter+0x366/0x2520 lib/iov_iter.c:185 copy_to_iter include/linux/uio.h:196 [inline] simple_copy_to_iter net/core/datagram.c:532 [inline] __skb_datagram_iter+0x185/0x1000 net/core/datagram.c:420 skb_copy_datagram_iter+0x5c/0x200 net/core/datagram.c:546 skb_copy_datagram_msg include/linux/skbuff.h:4050 [inline] netlink_recvmsg+0x432/0x1610 net/netlink/af_netlink.c:1962 sock_recvmsg_nosec net/socket.c:1046 [inline] sock_recvmsg+0x2c4/0x340 net/socket.c:1068 __sys_recvfrom+0x35a/0x5f0 net/socket.c:2242 __do_sys_recvfrom net/socket.c:2260 [inline] __se_sys_recvfrom net/socket.c:2256 [inline] __x64_sys_recvfrom+0x126/0x1d0 net/socket.c:2256 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75
Uninit was stored to memory at: pskb_expand_head+0x30f/0x19d0 net/core/skbuff.c:2253 netlink_trim+0x2c2/0x330 net/netlink/af_netlink.c:1317 netlink_unicast+0x9f/0x1260 net/netlink/af_netlink.c:1351 nlmsg_unicast include/net/netlink.h:1144 [inline] nlmsg_notify+0x21d/0x2f0 net/netlink/af_netlink.c:2610 rtnetlink_send+0x73/0x90 net/core/rtnetlink.c:741 rtnetlink_maybe_send include/linux/rtnetlink.h:17 [inline] tcf_add_notify net/sched/act_api.c:2048 [inline] tcf_action_add net/sched/act_api.c:2071 [inline] tc_ctl_action+0x146e/0x19d0 net/sched/act_api.c:2119 rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595 netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559 rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline] netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361 netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:745 ____sys_sendmsg+0x877/0xb60 net/socket.c:2584 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638 __sys_sendmsg net/socket.c:2667 [inline] __do_sys_sendmsg net/socket.c:2676 [inline] __se_sys_sendmsg net/socket.c:2674 [inline] __x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75
Uninit was stored to memory at: __nla_put lib/nlattr.c:1041 [inline] nla_put+0x1c6/0x230 lib/nlattr.c:1099 tcf_skbmod_dump+0x23f/0xc20 net/sched/act_skbmod.c:256 tcf_action_dump_old net/sched/act_api.c:1191 [inline] tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227 tcf_action_dump+0x1fd/0x460 net/sched/act_api.c:1251 tca_get_fill+0x519/0x7a0 net/sched/act_api.c:1628 tcf_add_notify_msg net/sched/act_api.c:2023 [inline] tcf_add_notify net/sched/act_api.c:2042 [inline] tcf_action_add net/sched/act_api.c:2071 [inline] tc_ctl_action+0x1365/0x19d0 net/sched/act_api.c:2119 rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595 netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559 rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline] netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361 netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:745 ____sys_sendmsg+0x877/0xb60 net/socket.c:2584 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638 __sys_sendmsg net/socket.c:2667 [inline] __do_sys_sendmsg net/socket.c:2676 [inline] __se_sys_sendmsg net/socket.c:2674 [inline] __x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75
Local variable opt created at: tcf_skbmod_dump+0x9d/0xc20 net/sched/act_skbmod.c:244 tcf_action_dump_old net/sched/act_api.c:1191 [inline] tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227
Bytes 188-191 of 248 are uninitialized Memory access of size 248 starts at ffff888117697680 Data copied to user address 00007ffe56d855f0
Fixes: 86da71b57383 ("net_sched: Introduce skbmod action") Signed-off-by: Eric Dumazet edumazet@google.com Acked-by: Jamal Hadi Salim jhs@mojatatu.com Link: https://lore.kernel.org/r/20240403130908.93421-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/act_skbmod.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/net/sched/act_skbmod.c +++ b/net/sched/act_skbmod.c @@ -219,13 +219,13 @@ static int tcf_skbmod_dump(struct sk_buf struct tcf_skbmod *d = to_skbmod(a); unsigned char *b = skb_tail_pointer(skb); struct tcf_skbmod_params *p; - struct tc_skbmod opt = { - .index = d->tcf_index, - .refcnt = refcount_read(&d->tcf_refcnt) - ref, - .bindcnt = atomic_read(&d->tcf_bindcnt) - bind, - }; + struct tc_skbmod opt; struct tcf_t t;
+ memset(&opt, 0, sizeof(opt)); + opt.index = d->tcf_index; + opt.refcnt = refcount_read(&d->tcf_refcnt) - ref, + opt.bindcnt = atomic_read(&d->tcf_bindcnt) - bind; spin_lock_bh(&d->tcf_lock); opt.action = d->tcf_action; p = rcu_dereference_protected(d->skbmod_p,
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Piotr Wejman piotrwejman90@gmail.com
commit b3da86d432b7cd65b025a11f68613e333d2483db upstream.
The driver should ensure that same priority is not mapped to multiple rx queues. From DesignWare Cores Ethernet Quality-of-Service Databook, section 17.1.29 MAC_RxQ_Ctrl2: "[...]The software must ensure that the content of this field is mutually exclusive to the PSRQ fields for other queues, that is, the same priority is not mapped to multiple Rx queues[...]"
Previously rx_queue_priority() function was: - clearing all priorities from a queue - adding new priorities to that queue After this patch it will: - first assign new priorities to a queue - then remove those priorities from all other queues - keep other priorities previously assigned to that queue
Fixes: a8f5102af2a7 ("net: stmmac: TX and RX queue priority configuration") Fixes: 2142754f8b9c ("net: stmmac: Add MAC related callbacks for XGMAC2") Signed-off-by: Piotr Wejman piotrwejman90@gmail.com Link: https://lore.kernel.org/r/20240401192239.33942-1-piotrwejman90@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 40 +++++++++++++++----- drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 38 +++++++++++++++---- 2 files changed, 62 insertions(+), 16 deletions(-)
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c @@ -75,19 +75,41 @@ static void dwmac4_rx_queue_priority(str u32 prio, u32 queue) { void __iomem *ioaddr = hw->pcsr; - u32 base_register; - u32 value; + u32 clear_mask = 0; + u32 ctrl2, ctrl3; + int i;
- base_register = (queue < 4) ? GMAC_RXQ_CTRL2 : GMAC_RXQ_CTRL3; - if (queue >= 4) - queue -= 4; + ctrl2 = readl(ioaddr + GMAC_RXQ_CTRL2); + ctrl3 = readl(ioaddr + GMAC_RXQ_CTRL3); + + /* The software must ensure that the same priority + * is not mapped to multiple Rx queues + */ + for (i = 0; i < 4; i++) + clear_mask |= ((prio << GMAC_RXQCTRL_PSRQX_SHIFT(i)) & + GMAC_RXQCTRL_PSRQX_MASK(i));
- value = readl(ioaddr + base_register); + ctrl2 &= ~clear_mask; + ctrl3 &= ~clear_mask;
- value &= ~GMAC_RXQCTRL_PSRQX_MASK(queue); - value |= (prio << GMAC_RXQCTRL_PSRQX_SHIFT(queue)) & + /* First assign new priorities to a queue, then + * clear them from others queues + */ + if (queue < 4) { + ctrl2 |= (prio << GMAC_RXQCTRL_PSRQX_SHIFT(queue)) & GMAC_RXQCTRL_PSRQX_MASK(queue); - writel(value, ioaddr + base_register); + + writel(ctrl2, ioaddr + GMAC_RXQ_CTRL2); + writel(ctrl3, ioaddr + GMAC_RXQ_CTRL3); + } else { + queue -= 4; + + ctrl3 |= (prio << GMAC_RXQCTRL_PSRQX_SHIFT(queue)) & + GMAC_RXQCTRL_PSRQX_MASK(queue); + + writel(ctrl3, ioaddr + GMAC_RXQ_CTRL3); + writel(ctrl2, ioaddr + GMAC_RXQ_CTRL2); + } }
static void dwmac4_tx_queue_priority(struct mac_device_info *hw, --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c @@ -97,17 +97,41 @@ static void dwxgmac2_rx_queue_prio(struc u32 queue) { void __iomem *ioaddr = hw->pcsr; - u32 value, reg; + u32 clear_mask = 0; + u32 ctrl2, ctrl3; + int i;
- reg = (queue < 4) ? XGMAC_RXQ_CTRL2 : XGMAC_RXQ_CTRL3; - if (queue >= 4) + ctrl2 = readl(ioaddr + XGMAC_RXQ_CTRL2); + ctrl3 = readl(ioaddr + XGMAC_RXQ_CTRL3); + + /* The software must ensure that the same priority + * is not mapped to multiple Rx queues + */ + for (i = 0; i < 4; i++) + clear_mask |= ((prio << XGMAC_PSRQ_SHIFT(i)) & + XGMAC_PSRQ(i)); + + ctrl2 &= ~clear_mask; + ctrl3 &= ~clear_mask; + + /* First assign new priorities to a queue, then + * clear them from others queues + */ + if (queue < 4) { + ctrl2 |= (prio << XGMAC_PSRQ_SHIFT(queue)) & + XGMAC_PSRQ(queue); + + writel(ctrl2, ioaddr + XGMAC_RXQ_CTRL2); + writel(ctrl3, ioaddr + XGMAC_RXQ_CTRL3); + } else { queue -= 4;
- value = readl(ioaddr + reg); - value &= ~XGMAC_PSRQ(queue); - value |= (prio << XGMAC_PSRQ_SHIFT(queue)) & XGMAC_PSRQ(queue); + ctrl3 |= (prio << XGMAC_PSRQ_SHIFT(queue)) & + XGMAC_PSRQ(queue);
- writel(value, ioaddr + reg); + writel(ctrl3, ioaddr + XGMAC_RXQ_CTRL3); + writel(ctrl2, ioaddr + XGMAC_RXQ_CTRL2); + } }
static void dwxgmac2_tx_queue_prio(struct mac_device_info *hw, u32 prio,
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
commit 17af420545a750f763025149fa7b833a4fc8b8f0 upstream.
syzbot reported a problem in ip6erspan_rcv() [1]
Issue is that ip6erspan_rcv() (and erspan_rcv()) no longer make sure erspan_base_hdr is present in skb linear part (skb->head) before getting @ver field from it.
Add the missing pskb_may_pull() calls.
v2: Reload iph pointer in erspan_rcv() after pskb_may_pull() because skb->head might have changed.
[1]
BUG: KMSAN: uninit-value in pskb_may_pull_reason include/linux/skbuff.h:2742 [inline] BUG: KMSAN: uninit-value in pskb_may_pull include/linux/skbuff.h:2756 [inline] BUG: KMSAN: uninit-value in ip6erspan_rcv net/ipv6/ip6_gre.c:541 [inline] BUG: KMSAN: uninit-value in gre_rcv+0x11f8/0x1930 net/ipv6/ip6_gre.c:610 pskb_may_pull_reason include/linux/skbuff.h:2742 [inline] pskb_may_pull include/linux/skbuff.h:2756 [inline] ip6erspan_rcv net/ipv6/ip6_gre.c:541 [inline] gre_rcv+0x11f8/0x1930 net/ipv6/ip6_gre.c:610 ip6_protocol_deliver_rcu+0x1d4c/0x2ca0 net/ipv6/ip6_input.c:438 ip6_input_finish net/ipv6/ip6_input.c:483 [inline] NF_HOOK include/linux/netfilter.h:314 [inline] ip6_input+0x15d/0x430 net/ipv6/ip6_input.c:492 ip6_mc_input+0xa7e/0xc80 net/ipv6/ip6_input.c:586 dst_input include/net/dst.h:460 [inline] ip6_rcv_finish+0x955/0x970 net/ipv6/ip6_input.c:79 NF_HOOK include/linux/netfilter.h:314 [inline] ipv6_rcv+0xde/0x390 net/ipv6/ip6_input.c:310 __netif_receive_skb_one_core net/core/dev.c:5538 [inline] __netif_receive_skb+0x1da/0xa00 net/core/dev.c:5652 netif_receive_skb_internal net/core/dev.c:5738 [inline] netif_receive_skb+0x58/0x660 net/core/dev.c:5798 tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1549 tun_get_user+0x5566/0x69e0 drivers/net/tun.c:2002 tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048 call_write_iter include/linux/fs.h:2108 [inline] new_sync_write fs/read_write.c:497 [inline] vfs_write+0xb63/0x1520 fs/read_write.c:590 ksys_write+0x20f/0x4c0 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __x64_sys_write+0x93/0xe0 fs/read_write.c:652 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75
Uninit was created at: slab_post_alloc_hook mm/slub.c:3804 [inline] slab_alloc_node mm/slub.c:3845 [inline] kmem_cache_alloc_node+0x613/0xc50 mm/slub.c:3888 kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:577 __alloc_skb+0x35b/0x7a0 net/core/skbuff.c:668 alloc_skb include/linux/skbuff.h:1318 [inline] alloc_skb_with_frags+0xc8/0xbf0 net/core/skbuff.c:6504 sock_alloc_send_pskb+0xa81/0xbf0 net/core/sock.c:2795 tun_alloc_skb drivers/net/tun.c:1525 [inline] tun_get_user+0x209a/0x69e0 drivers/net/tun.c:1846 tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048 call_write_iter include/linux/fs.h:2108 [inline] new_sync_write fs/read_write.c:497 [inline] vfs_write+0xb63/0x1520 fs/read_write.c:590 ksys_write+0x20f/0x4c0 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __x64_sys_write+0x93/0xe0 fs/read_write.c:652 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75
CPU: 1 PID: 5045 Comm: syz-executor114 Not tainted 6.9.0-rc1-syzkaller-00021-g962490525cff #0
Fixes: cb73ee40b1b3 ("net: ip_gre: use erspan key field for tunnel lookup") Reported-by: syzbot+1c1cf138518bf0c53d68@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/000000000000772f2c0614b66ef7@google.com/ Signed-off-by: Eric Dumazet edumazet@google.com Cc: Lorenzo Bianconi lorenzo@kernel.org Link: https://lore.kernel.org/r/20240328112248.1101491-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_gre.c | 5 +++++ net/ipv6/ip6_gre.c | 3 +++ 2 files changed, 8 insertions(+)
--- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -278,8 +278,13 @@ static int erspan_rcv(struct sk_buff *sk tpi->flags | TUNNEL_NO_KEY, iph->saddr, iph->daddr, 0); } else { + if (unlikely(!pskb_may_pull(skb, + gre_hdr_len + sizeof(*ershdr)))) + return PACKET_REJECT; + ershdr = (struct erspan_base_hdr *)(skb->data + gre_hdr_len); ver = ershdr->ver; + iph = ip_hdr(skb); tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, tpi->flags | TUNNEL_KEY, iph->saddr, iph->daddr, tpi->key); --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -533,6 +533,9 @@ static int ip6erspan_rcv(struct sk_buff struct ip6_tnl *tunnel; u8 ver;
+ if (unlikely(!pskb_may_pull(skb, sizeof(*ershdr)))) + return PACKET_REJECT; + ipv6h = ipv6_hdr(skb); ershdr = (struct erspan_base_hdr *)skb->data; ver = ershdr->ver;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski kuba@kernel.org
commit 31974122cfdeaf56abc18d8ab740d580d9833e90 upstream.
The netdev CI runs in a VM and captures serial, so stdout and stderr get combined. Because there's a missing new line in stderr the test ends up corrupting KTAP:
# Successok 1 selftests: net: reuseaddr_conflict
which should have been:
# Success ok 1 selftests: net: reuseaddr_conflict
Fixes: 422d8dc6fd3a ("selftest: add a reuseaddr test") Reviewed-by: Muhammad Usama Anjum usama.anjum@collabora.com Link: https://lore.kernel.org/r/20240329160559.249476-1-kuba@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/reuseaddr_conflict.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/net/reuseaddr_conflict.c +++ b/tools/testing/selftests/net/reuseaddr_conflict.c @@ -109,6 +109,6 @@ int main(void) fd1 = open_port(0, 1); if (fd1 >= 0) error(1, 0, "Was allowed to create an ipv4 reuseport on an already bound non-reuseport socket with no ipv6"); - fprintf(stderr, "Success"); + fprintf(stderr, "Success\n"); return 0; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
commit d21d40605bca7bd5fc23ef03d4c1ca1f48bc2cae upstream.
syzkaller reported infinite recursive calls of fib6_dump_done() during netlink socket destruction. [1]
From the log, syzkaller sent an AF_UNSPEC RTM_GETROUTE message, and then
the response was generated. The following recvmmsg() resumed the dump for IPv6, but the first call of inet6_dump_fib() failed at kzalloc() due to the fault injection. [0]
12:01:34 executing program 3: r0 = socket$nl_route(0x10, 0x3, 0x0) sendmsg$nl_route(r0, ... snip ...) recvmmsg(r0, ... snip ...) (fail_nth: 8)
Here, fib6_dump_done() was set to nlk_sk(sk)->cb.done, and the next call of inet6_dump_fib() set it to nlk_sk(sk)->cb.args[3]. syzkaller stopped receiving the response halfway through, and finally netlink_sock_destruct() called nlk_sk(sk)->cb.done().
fib6_dump_done() calls fib6_dump_end() and nlk_sk(sk)->cb.done() if it is still not NULL. fib6_dump_end() rewrites nlk_sk(sk)->cb.done() by nlk_sk(sk)->cb.args[3], but it has the same function, not NULL, calling itself recursively and hitting the stack guard page.
To avoid the issue, let's set the destructor after kzalloc().
[0]: FAULT_INJECTION: forcing a failure. name failslab, interval 1, probability 0, space 0, times 0 CPU: 1 PID: 432110 Comm: syz-executor.3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:117) should_fail_ex (lib/fault-inject.c:52 lib/fault-inject.c:153) should_failslab (mm/slub.c:3733) kmalloc_trace (mm/slub.c:3748 mm/slub.c:3827 mm/slub.c:3992) inet6_dump_fib (./include/linux/slab.h:628 ./include/linux/slab.h:749 net/ipv6/ip6_fib.c:662) rtnl_dump_all (net/core/rtnetlink.c:4029) netlink_dump (net/netlink/af_netlink.c:2269) netlink_recvmsg (net/netlink/af_netlink.c:1988) ____sys_recvmsg (net/socket.c:1046 net/socket.c:2801) ___sys_recvmsg (net/socket.c:2846) do_recvmmsg (net/socket.c:2943) __x64_sys_recvmmsg (net/socket.c:3041 net/socket.c:3034 net/socket.c:3034)
[1]: BUG: TASK stack guard page was hit at 00000000f2fa9af1 (stack is 00000000b7912430..000000009a436beb) stack guard page: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 223719 Comm: kworker/1:3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Workqueue: events netlink_sock_destruct_work RIP: 0010:fib6_dump_done (net/ipv6/ip6_fib.c:570) Code: 3c 24 e8 f3 e9 51 fd e9 28 fd ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 41 57 41 56 41 55 41 54 55 48 89 fd <53> 48 8d 5d 60 e8 b6 4d 07 fd 48 89 da 48 b8 00 00 00 00 00 fc ff RSP: 0018:ffffc9000d980000 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffffffff84405990 RCX: ffffffff844059d3 RDX: ffff8881028e0000 RSI: ffffffff84405ac2 RDI: ffff88810c02f358 RBP: ffff88810c02f358 R08: 0000000000000007 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000224 R12: 0000000000000000 R13: ffff888007c82c78 R14: ffff888007c82c68 R15: ffff888007c82c68 FS: 0000000000000000(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffc9000d97fff8 CR3: 0000000102309002 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <#DF> </#DF> <TASK> fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1)) fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1)) ... fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1)) fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1)) netlink_sock_destruct (net/netlink/af_netlink.c:401) __sk_destruct (net/core/sock.c:2177 (discriminator 2)) sk_destruct (net/core/sock.c:2224) __sk_free (net/core/sock.c:2235) sk_free (net/core/sock.c:2246) process_one_work (kernel/workqueue.c:3259) worker_thread (kernel/workqueue.c:3329 kernel/workqueue.c:3416) kthread (kernel/kthread.c:388) ret_from_fork (arch/x86/kernel/process.c:153) ret_from_fork_asm (arch/x86/entry/entry_64.S:256) Modules linked in:
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzkaller syzkaller@googlegroups.com Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Reviewed-by: Eric Dumazet edumazet@google.com Reviewed-by: David Ahern dsahern@kernel.org Link: https://lore.kernel.org/r/20240401211003.25274-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/ip6_fib.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
--- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -643,19 +643,19 @@ static int inet6_dump_fib(struct sk_buff if (!w) { /* New dump: * - * 1. hook callback destructor. - */ - cb->args[3] = (long)cb->done; - cb->done = fib6_dump_done; - - /* - * 2. allocate and initialize walker. + * 1. allocate and initialize walker. */ w = kzalloc(sizeof(*w), GFP_ATOMIC); if (!w) return -ENOMEM; w->func = fib6_dump_node; cb->args[2] = (long)w; + + /* 2. hook callback destructor. + */ + cb->args[3] = (long)cb->done; + cb->done = fib6_dump_done; + }
arg.skb = skb;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Antoine Tenart atenart@kernel.org
commit f0b8c30345565344df2e33a8417a27503589247d upstream.
UDP GRO validates checksums and in udp4/6_gro_complete fraglist packets are converted to CHECKSUM_UNNECESSARY to avoid later checks. However this is an issue for CHECKSUM_PARTIAL packets as they can be looped in an egress path and then their partial checksums are not fixed.
Different issues can be observed, from invalid checksum on packets to traces like:
gen01: hw csum failure skb len=3008 headroom=160 headlen=1376 tailroom=0 mac=(106,14) net=(120,40) trans=160 shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0) hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12 ...
Fix this by only converting CHECKSUM_NONE packets to CHECKSUM_UNNECESSARY by reusing __skb_incr_checksum_unnecessary. All other checksum types are kept as-is, including CHECKSUM_COMPLETE as fraglist packets being segmented back would have their skb->csum valid.
Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.") Signed-off-by: Antoine Tenart atenart@kernel.org Reviewed-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/udp_offload.c | 8 +------- net/ipv6/udp_offload.c | 8 +------- 2 files changed, 2 insertions(+), 14 deletions(-)
--- a/net/ipv4/udp_offload.c +++ b/net/ipv4/udp_offload.c @@ -668,13 +668,7 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_com skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4); skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
- if (skb->ip_summed == CHECKSUM_UNNECESSARY) { - if (skb->csum_level < SKB_MAX_CSUM_LEVEL) - skb->csum_level++; - } else { - skb->ip_summed = CHECKSUM_UNNECESSARY; - skb->csum_level = 0; - } + __skb_incr_checksum_unnecessary(skb);
return 0; } --- a/net/ipv6/udp_offload.c +++ b/net/ipv6/udp_offload.c @@ -169,13 +169,7 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_com skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4); skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
- if (skb->ip_summed == CHECKSUM_UNNECESSARY) { - if (skb->csum_level < SKB_MAX_CSUM_LEVEL) - skb->csum_level++; - } else { - skb->ip_summed = CHECKSUM_UNNECESSARY; - skb->csum_level = 0; - } + __skb_incr_checksum_unnecessary(skb);
return 0; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Su Hui suhui@nfschina.com
commit e709acbd84fb6ef32736331b0147f027a3ef4c20 upstream.
otx2_rxtx_enable() return negative error code such as -EIO, check -EIO rather than EIO to fix this problem.
Fixes: c926252205c4 ("octeontx2-pf: Disable packet I/O for graceful exit") Signed-off-by: Su Hui suhui@nfschina.com Reviewed-by: Subbaraya Sundeep sbhatta@marvell.com Reviewed-by: Simon Horman horms@kernel.org Reviewed-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Link: https://lore.kernel.org/r/20240328020620.4054692-1-suhui@nfschina.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c @@ -1598,7 +1598,7 @@ int otx2_open(struct net_device *netdev) * mcam entries are enabled to receive the packets. Hence disable the * packet I/O. */ - if (err == EIO) + if (err == -EIO) goto err_disable_rxtx; else if (err) goto err_tx_stop_queues;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandr Loktionov aleksandr.loktionov@intel.com
commit eb58c598ce45b7e787568fe27016260417c3d807 upstream.
The bug usually affects untrusted VFs, because they are limited to 18 MACs, it affects them badly, not letting to create MAC all filters. Not stable to reproduce, it happens when VF user creates MAC filters when other MACVLAN operations are happened in parallel. But consequence is that VF can't receive desired traffic.
Fix counter to be bumped only for new or active filters.
Fixes: 621650cabee5 ("i40e: Refactoring VF MAC filters counting to make more reliable") Signed-off-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Reviewed-by: Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com Reviewed-by: Paul Menzel pmenzel@molgen.mpg.de Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -1230,8 +1230,11 @@ int i40e_count_filters(struct i40e_vsi * int bkt; int cnt = 0;
- hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) - ++cnt; + hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) { + if (f->state == I40E_FILTER_NEW || + f->state == I40E_FILTER_ACTIVE) + ++cnt; + }
return cnt; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandr Loktionov aleksandr.loktionov@intel.com
commit f37c4eac99c258111d414d31b740437e1925b8e8 upstream.
To fix the regression introduced by commit 52424f974bc5, which causes servers hang in very hard to reproduce conditions with resets races. Using two sources for the information is the root cause. In this function before the fix bumping v didn't mean bumping vf pointer. But the code used this variables interchangeably, so stale vf could point to different/not intended vf.
Remove redundant "v" variable and iterate via single VF pointer across whole function instead to guarantee VF pointer validity.
Fixes: 52424f974bc5 ("i40e: Fix VF hang when reset is triggered on another VF") Signed-off-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Reviewed-by: Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com Reviewed-by: Paul Menzel pmenzel@molgen.mpg.de Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 34 +++++++++------------ 1 file changed, 16 insertions(+), 18 deletions(-)
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c @@ -1573,8 +1573,8 @@ bool i40e_reset_all_vfs(struct i40e_pf * { struct i40e_hw *hw = &pf->hw; struct i40e_vf *vf; - int i, v; u32 reg; + int i;
/* If we don't have any VFs, then there is nothing to reset */ if (!pf->num_alloc_vfs) @@ -1585,11 +1585,10 @@ bool i40e_reset_all_vfs(struct i40e_pf * return false;
/* Begin reset on all VFs at once */ - for (v = 0; v < pf->num_alloc_vfs; v++) { - vf = &pf->vf[v]; + for (vf = &pf->vf[0]; vf < &pf->vf[pf->num_alloc_vfs]; ++vf) { /* If VF is being reset no need to trigger reset again */ if (!test_bit(I40E_VF_STATE_RESETTING, &vf->vf_states)) - i40e_trigger_vf_reset(&pf->vf[v], flr); + i40e_trigger_vf_reset(vf, flr); }
/* HW requires some time to make sure it can flush the FIFO for a VF @@ -1598,14 +1597,13 @@ bool i40e_reset_all_vfs(struct i40e_pf * * the VFs using a simple iterator that increments once that VF has * finished resetting. */ - for (i = 0, v = 0; i < 10 && v < pf->num_alloc_vfs; i++) { + for (i = 0, vf = &pf->vf[0]; i < 10 && vf < &pf->vf[pf->num_alloc_vfs]; ++i) { usleep_range(10000, 20000);
/* Check each VF in sequence, beginning with the VF to fail * the previous check. */ - while (v < pf->num_alloc_vfs) { - vf = &pf->vf[v]; + while (vf < &pf->vf[pf->num_alloc_vfs]) { if (!test_bit(I40E_VF_STATE_RESETTING, &vf->vf_states)) { reg = rd32(hw, I40E_VPGEN_VFRSTAT(vf->vf_id)); if (!(reg & I40E_VPGEN_VFRSTAT_VFRD_MASK)) @@ -1615,7 +1613,7 @@ bool i40e_reset_all_vfs(struct i40e_pf * /* If the current VF has finished resetting, move on * to the next VF in sequence. */ - v++; + ++vf; } }
@@ -1625,39 +1623,39 @@ bool i40e_reset_all_vfs(struct i40e_pf * /* Display a warning if at least one VF didn't manage to reset in * time, but continue on with the operation. */ - if (v < pf->num_alloc_vfs) + if (vf < &pf->vf[pf->num_alloc_vfs]) dev_err(&pf->pdev->dev, "VF reset check timeout on VF %d\n", - pf->vf[v].vf_id); + vf->vf_id); usleep_range(10000, 20000);
/* Begin disabling all the rings associated with VFs, but do not wait * between each VF. */ - for (v = 0; v < pf->num_alloc_vfs; v++) { + for (vf = &pf->vf[0]; vf < &pf->vf[pf->num_alloc_vfs]; ++vf) { /* On initial reset, we don't have any queues to disable */ - if (pf->vf[v].lan_vsi_idx == 0) + if (vf->lan_vsi_idx == 0) continue;
/* If VF is reset in another thread just continue */ if (test_bit(I40E_VF_STATE_RESETTING, &vf->vf_states)) continue;
- i40e_vsi_stop_rings_no_wait(pf->vsi[pf->vf[v].lan_vsi_idx]); + i40e_vsi_stop_rings_no_wait(pf->vsi[vf->lan_vsi_idx]); }
/* Now that we've notified HW to disable all of the VF rings, wait * until they finish. */ - for (v = 0; v < pf->num_alloc_vfs; v++) { + for (vf = &pf->vf[0]; vf < &pf->vf[pf->num_alloc_vfs]; ++vf) { /* On initial reset, we don't have any queues to disable */ - if (pf->vf[v].lan_vsi_idx == 0) + if (vf->lan_vsi_idx == 0) continue;
/* If VF is reset in another thread just continue */ if (test_bit(I40E_VF_STATE_RESETTING, &vf->vf_states)) continue;
- i40e_vsi_wait_queues_disabled(pf->vsi[pf->vf[v].lan_vsi_idx]); + i40e_vsi_wait_queues_disabled(pf->vsi[vf->lan_vsi_idx]); }
/* Hw may need up to 50ms to finish disabling the RX queues. We @@ -1666,12 +1664,12 @@ bool i40e_reset_all_vfs(struct i40e_pf * mdelay(50);
/* Finish the reset on each VF */ - for (v = 0; v < pf->num_alloc_vfs; v++) { + for (vf = &pf->vf[0]; vf < &pf->vf[pf->num_alloc_vfs]; ++vf) { /* If VF is reset in another thread just continue */ if (test_bit(I40E_VF_STATE_RESETTING, &vf->vf_states)) continue;
- i40e_cleanup_reset_vf(&pf->vf[v]); + i40e_cleanup_reset_vf(vf); }
i40e_flush(hw);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bikash Hazarika bhazarika@marvell.com
[ Upstream commit 1ccad27716ecad1fd58c35e579bedb81fa5e1ad5 ]
Update manufacturer details to indicate Marvell Semiconductors.
Link: https://lore.kernel.org/r/20220713052045.10683-10-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Bikash Hazarika bhazarika@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Stable-dep-of: 688fa069fda6 ("scsi: qla2xxx: Update manufacturer detail") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_def.h | 2 +- drivers/scsi/qla2xxx/qla_gs.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index 6645b69fc2a0f..7dfd93cb4674d 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -56,7 +56,7 @@ typedef struct { #include "qla_nvme.h" #define QLA2XXX_DRIVER_NAME "qla2xxx" #define QLA2XXX_APIDEV "ql2xapidev" -#define QLA2XXX_MANUFACTURER "QLogic Corporation" +#define QLA2XXX_MANUFACTURER "Marvell Semiconductor, Inc."
/* * We have MAILBOX_REGISTER_COUNT sized arrays in a few places, diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c index 20bbd69e35e51..d9ac17dbad789 100644 --- a/drivers/scsi/qla2xxx/qla_gs.c +++ b/drivers/scsi/qla2xxx/qla_gs.c @@ -1614,7 +1614,7 @@ qla2x00_hba_attributes(scsi_qla_host_t *vha, void *entries, eiter->type = cpu_to_be16(FDMI_HBA_MANUFACTURER); alen = scnprintf( eiter->a.manufacturer, sizeof(eiter->a.manufacturer), - "%s", "QLogic Corporation"); + "%s", QLA2XXX_MANUFACTURER); alen += FDMI_ATTR_ALIGNMENT(alen); alen += FDMI_ATTR_TYPELEN(eiter); eiter->len = cpu_to_be16(alen);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bikash Hazarika bhazarika@marvell.com
[ Upstream commit 688fa069fda6fce24d243cddfe0c7024428acb74 ]
Update manufacturer detail from "Marvell Semiconductor, Inc." to "Marvell".
Cc: stable@vger.kernel.org Signed-off-by: Bikash Hazarika bhazarika@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Link: https://lore.kernel.org/r/20240227164127.36465-5-njavali@marvell.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_def.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index 7dfd93cb4674d..b8628bceb3aeb 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -56,7 +56,7 @@ typedef struct { #include "qla_nvme.h" #define QLA2XXX_DRIVER_NAME "qla2xxx" #define QLA2XXX_APIDEV "ql2xapidev" -#define QLA2XXX_MANUFACTURER "Marvell Semiconductor, Inc." +#define QLA2XXX_MANUFACTURER "Marvell"
/* * We have MAILBOX_REGISTER_COUNT sized arrays in a few places,
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Stein alexander.stein@ew.tq-group.com
[ Upstream commit fdada0db0b2ae2addef4ccafe50937874dbeeebe ]
This reverts commit 75fd6485cccef269ac9eb3b71cf56753341195ef. This patch was applied twice by accident, causing probe failures. Revert the accident.
Signed-off-by: Alexander Stein alexander.stein@ew.tq-group.com Fixes: 75fd6485ccce ("usb: phy: generic: Get the vbus supply") Cc: stable stable@kernel.org Reviewed-by: Sean Anderson sean.anderson@seco.com Link: https://lore.kernel.org/r/20240314092628.1869414-1-alexander.stein@ew.tq-gro... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/phy/phy-generic.c | 7 ------- 1 file changed, 7 deletions(-)
diff --git a/drivers/usb/phy/phy-generic.c b/drivers/usb/phy/phy-generic.c index 34b9f81401871..661a229c105dd 100644 --- a/drivers/usb/phy/phy-generic.c +++ b/drivers/usb/phy/phy-generic.c @@ -268,13 +268,6 @@ int usb_phy_gen_create_phy(struct device *dev, struct usb_phy_generic *nop) return -EPROBE_DEFER; }
- nop->vbus_draw = devm_regulator_get_exclusive(dev, "vbus"); - if (PTR_ERR(nop->vbus_draw) == -ENODEV) - nop->vbus_draw = NULL; - if (IS_ERR(nop->vbus_draw)) - return dev_err_probe(dev, PTR_ERR(nop->vbus_draw), - "could not get vbus regulator\n"); - nop->dev = dev; nop->phy.dev = nop->dev; nop->phy.label = "nop-xceiv";
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Antoine Tenart atenart@kernel.org
[ Upstream commit 3d010c8031e39f5fa1e8b13ada77e0321091011f ]
When rx-udp-gro-forwarding is enabled UDP packets might be GROed when being forwarded. If such packets might land in a tunnel this can cause various issues and udp_gro_receive makes sure this isn't the case by looking for a matching socket. This is performed in udp4/6_gro_lookup_skb but only in the current netns. This is an issue with tunneled packets when the endpoint is in another netns. In such cases the packets will be GROed at the UDP level, which leads to various issues later on. The same thing can happen with rx-gro-list.
We saw this with geneve packets being GROed at the UDP level. In such case gso_size is set; later the packet goes through the geneve rx path, the geneve header is pulled, the offset are adjusted and frag_list skbs are not adjusted with regard to geneve. When those skbs hit skb_fragment, it will misbehave. Different outcomes are possible depending on what the GROed skbs look like; from corrupted packets to kernel crashes.
One example is a BUG_ON[1] triggered in skb_segment while processing the frag_list. Because gso_size is wrong (geneve header was pulled) skb_segment thinks there is "geneve header size" of data in frag_list, although it's in fact the next packet. The BUG_ON itself has nothing to do with the issue. This is only one of the potential issues.
Looking up for a matching socket in udp_gro_receive is fragile: the lookup could be extended to all netns (not speaking about performances) but nothing prevents those packets from being modified in between and we could still not find a matching socket. It's OK to keep the current logic there as it should cover most cases but we also need to make sure we handle tunnel packets being GROed too early.
This is done by extending the checks in udp_unexpected_gso: GSO packets lacking the SKB_GSO_UDP_TUNNEL/_CSUM bits and landing in a tunnel must be segmented.
[1] kernel BUG at net/core/skbuff.c:4408! RIP: 0010:skb_segment+0xd2a/0xf70 __udp_gso_segment+0xaa/0x560
Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.") Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets") Signed-off-by: Antoine Tenart atenart@kernel.org Reviewed-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/udp.h | 28 ++++++++++++++++++++++++++++ net/ipv4/udp.c | 7 +++++++ net/ipv4/udp_offload.c | 5 +++++ net/ipv6/udp.c | 2 +- 4 files changed, 41 insertions(+), 1 deletion(-)
diff --git a/include/linux/udp.h b/include/linux/udp.h index ae58ff3b6b5b8..a220880019d6b 100644 --- a/include/linux/udp.h +++ b/include/linux/udp.h @@ -131,6 +131,24 @@ static inline void udp_cmsg_recv(struct msghdr *msg, struct sock *sk, } }
+DECLARE_STATIC_KEY_FALSE(udp_encap_needed_key); +#if IS_ENABLED(CONFIG_IPV6) +DECLARE_STATIC_KEY_FALSE(udpv6_encap_needed_key); +#endif + +static inline bool udp_encap_needed(void) +{ + if (static_branch_unlikely(&udp_encap_needed_key)) + return true; + +#if IS_ENABLED(CONFIG_IPV6) + if (static_branch_unlikely(&udpv6_encap_needed_key)) + return true; +#endif + + return false; +} + static inline bool udp_unexpected_gso(struct sock *sk, struct sk_buff *skb) { if (!skb_is_gso(skb)) @@ -142,6 +160,16 @@ static inline bool udp_unexpected_gso(struct sock *sk, struct sk_buff *skb) if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST && !udp_sk(sk)->accept_udp_fraglist) return true;
+ /* GSO packets lacking the SKB_GSO_UDP_TUNNEL/_CSUM bits might still + * land in a tunnel as the socket check in udp_gro_receive cannot be + * foolproof. + */ + if (udp_encap_needed() && + READ_ONCE(udp_sk(sk)->encap_rcv) && + !(skb_shinfo(skb)->gso_type & + (SKB_GSO_UDP_TUNNEL | SKB_GSO_UDP_TUNNEL_CSUM))) + return true; + return false; }
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index b2541c7d7c87f..0b7e76e6f2028 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -602,6 +602,13 @@ static inline bool __udp_is_mcast_sock(struct net *net, struct sock *sk, }
DEFINE_STATIC_KEY_FALSE(udp_encap_needed_key); +EXPORT_SYMBOL(udp_encap_needed_key); + +#if IS_ENABLED(CONFIG_IPV6) +DEFINE_STATIC_KEY_FALSE(udpv6_encap_needed_key); +EXPORT_SYMBOL(udpv6_encap_needed_key); +#endif + void udp_encap_enable(void) { static_branch_inc(&udp_encap_needed_key); diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c index 9f6dcdaf86a4d..445d8bc30fdd1 100644 --- a/net/ipv4/udp_offload.c +++ b/net/ipv4/udp_offload.c @@ -512,6 +512,11 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb, unsigned int off = skb_gro_offset(skb); int flush = 1;
+ /* We can do L4 aggregation only if the packet can't land in a tunnel + * otherwise we could corrupt the inner stream. Detecting such packets + * cannot be foolproof and the aggregation might still happen in some + * cases. Such packets should be caught in udp_unexpected_gso later. + */ NAPI_GRO_CB(skb)->is_flist = 0; if (skb->dev->features & NETIF_F_GRO_FRAGLIST) NAPI_GRO_CB(skb)->is_flist = sk ? !udp_sk(sk)->gro_enabled: 1; diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 5385037209a6b..b5d879f2501da 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -474,7 +474,7 @@ int udpv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, goto try_again; }
-DEFINE_STATIC_KEY_FALSE(udpv6_encap_needed_key); +DECLARE_STATIC_KEY_FALSE(udpv6_encap_needed_key); void udpv6_encap_enable(void) { static_branch_inc(&udpv6_encap_needed_key);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Barker paul.barker.ct@bp.renesas.com
[ Upstream commit 596a4254915f94c927217fe09c33a6828f33fb25 ]
The TX queue should be serviced each time the poll function is called, even if the full RX work budget has been consumed. This prevents starvation of the TX queue when RX bandwidth usage is high.
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Signed-off-by: Paul Barker paul.barker.ct@bp.renesas.com Reviewed-by: Sergey Shtylyov s.shtylyov@omp.ru Link: https://lore.kernel.org/r/20240402145305.82148-1-paul.barker.ct@bp.renesas.c... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/renesas/ravb_main.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c index 8a4dff0566f7d..b08478aabc6e6 100644 --- a/drivers/net/ethernet/renesas/ravb_main.c +++ b/drivers/net/ethernet/renesas/ravb_main.c @@ -911,12 +911,12 @@ static int ravb_poll(struct napi_struct *napi, int budget) int q = napi - priv->napi; int mask = BIT(q); int quota = budget; + bool unmask;
/* Processing RX Descriptor Ring */ /* Clear RX interrupt */ ravb_write(ndev, ~(mask | RIS0_RESERVED), RIS0); - if (ravb_rx(ndev, "a, q)) - goto out; + unmask = !ravb_rx(ndev, "a, q);
/* Processing RX Descriptor Ring */ spin_lock_irqsave(&priv->lock, flags); @@ -926,6 +926,9 @@ static int ravb_poll(struct napi_struct *napi, int budget) netif_wake_subqueue(ndev, q); spin_unlock_irqrestore(&priv->lock, flags);
+ if (!unmask) + goto out; + napi_complete(napi);
/* Re-enable RX/TX interrupts */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Venkata Lakshmi Narayana Gubba gubbaven@codeaurora.org
[ Upstream commit a307a9773420dc7d385991f61fbede2fe100bd78 ]
Removed voting for RPMH_RF_CLK2 which is not required as it is getting managed by BT SoC through SW_CTRL line.
Cc: Matthias Kaehlcke mka@chromium.org Signed-off-by: Venkata Lakshmi Narayana Gubba gubbaven@codeaurora.org Signed-off-by: Douglas Anderson dianders@chromium.org Reviewed-by: Matthias Kaehlcke mka@chromium.org Reviewed-by: Stephen Boyd swboyd@chromium.org Link: https://lore.kernel.org/r/20210301133318.v2.8.I80c268f163e6d49a70af1238be442... Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Stable-dep-of: e12e28009e58 ("arm64: dts: qcom: sc7180-trogdor: mark bluetooth address as broken") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi | 1 - 1 file changed, 1 deletion(-)
diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi index cb2c47f13a8a4..3f5883e8bf319 100644 --- a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi @@ -810,7 +810,6 @@ ap_spi_fp: &spi10 { vddrf-supply = <&pp1300_l2c>; vddch0-supply = <&pp3300_l10c>; max-speed = <3200000>; - clocks = <&rpmhcc RPMH_RF_CLK2>; }; };
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
[ Upstream commit e12e28009e584c8f8363439f6a928ec86278a106 ]
Several Qualcomm Bluetooth controllers lack persistent storage for the device address and instead one can be provided by the boot firmware using the 'local-bd-address' devicetree property.
The Bluetooth bindings clearly states that the address should be specified in little-endian order, but due to a long-standing bug in the Qualcomm driver which reversed the address some boot firmware has been providing the address in big-endian order instead.
The boot firmware in SC7180 Trogdor Chromebooks is known to be affected so mark the 'local-bd-address' property as broken to maintain backwards compatibility with older firmware when fixing the underlying driver bug.
Note that ChromeOS always updates the kernel and devicetree in lockstep so that there is no need to handle backwards compatibility with older devicetrees.
Fixes: 7ec3e67307f8 ("arm64: dts: qcom: sc7180-trogdor: add initial trogdor and lazor dt") Cc: stable@vger.kernel.org # 5.10 Cc: Rob Clark robdclark@chromium.org Reviewed-by: Douglas Anderson dianders@chromium.org Signed-off-by: Johan Hovold johan+linaro@kernel.org Acked-by: Bjorn Andersson andersson@kernel.org Reviewed-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi index 3f5883e8bf319..9ce8bfbf7ea21 100644 --- a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi @@ -810,6 +810,8 @@ ap_spi_fp: &spi10 { vddrf-supply = <&pp1300_l2c>; vddch0-supply = <&pp3300_l10c>; max-speed = <3200000>; + + qcom,local-bd-address-broken; }; };
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephen Lee slee08177@gmail.com
[ Upstream commit fc563aa900659a850e2ada4af26b9d7a3de6c591 ]
In snd_soc_info_volsw(), mask is generated by figuring out the index of the most significant bit set in max and converting the index to a bitmask through bit shift 1. Unintended wraparound occurs when max is an integer value with msb bit set. Since the bit shift value 1 is treated as an integer type, the left shift operation will wraparound and set mask to 0 instead of all 1's. In order to fix this, we type cast 1 as `1ULL` to prevent the wraparound.
Fixes: 7077148fb50a ("ASoC: core: Split ops out of soc-core.c") Signed-off-by: Stephen Lee slee08177@gmail.com Link: https://msgid.link/r/20240326010131.6211-1-slee08177@gmail.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/soc-ops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c index daecd386d5ec8..a83cd8d8a9633 100644 --- a/sound/soc/soc-ops.c +++ b/sound/soc/soc-ops.c @@ -246,7 +246,7 @@ int snd_soc_get_volsw(struct snd_kcontrol *kcontrol, int max = mc->max; int min = mc->min; int sign_bit = mc->sign_bit; - unsigned int mask = (1 << fls(max)) - 1; + unsigned int mask = (1ULL << fls(max)) - 1; unsigned int invert = mc->invert; int val; int ret;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 52f80bb181a9a1530ade30bc18991900bbb9697f ]
gcc warns about a memcpy() with overlapping pointers because of an incorrect size calculation:
In file included from include/linux/string.h:369, from drivers/ata/sata_sx4.c:66: In function 'memcpy_fromio', inlined from 'pdc20621_get_from_dimm.constprop' at drivers/ata/sata_sx4.c:962:2: include/linux/fortify-string.h:97:33: error: '__builtin_memcpy' accessing 4294934464 bytes at offsets 0 and [16, 16400] overlaps 6442385281 bytes at offset -2147450817 [-Werror=restrict] 97 | #define __underlying_memcpy __builtin_memcpy | ^ include/linux/fortify-string.h:620:9: note: in expansion of macro '__underlying_memcpy' 620 | __underlying_##op(p, q, __fortify_size); \ | ^~~~~~~~~~~~~ include/linux/fortify-string.h:665:26: note: in expansion of macro '__fortify_memcpy_chk' 665 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \ | ^~~~~~~~~~~~~~~~~~~~ include/asm-generic/io.h:1184:9: note: in expansion of macro 'memcpy' 1184 | memcpy(buffer, __io_virt(addr), size); | ^~~~~~
The problem here is the overflow of an unsigned 32-bit number to a negative that gets converted into a signed 'long', keeping a large positive number.
Replace the complex calculation with a more readable min() variant that avoids the warning.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/sata_sx4.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/ata/sata_sx4.c b/drivers/ata/sata_sx4.c index 4c01190a5e370..c95685f693a68 100644 --- a/drivers/ata/sata_sx4.c +++ b/drivers/ata/sata_sx4.c @@ -1004,8 +1004,7 @@ static void pdc20621_get_from_dimm(struct ata_host *host, void *psource,
offset -= (idx * window_size); idx++; - dist = ((long) (window_size - (offset + size))) >= 0 ? size : - (long) (window_size - offset); + dist = min(size, window_size - offset); memcpy_fromio(psource, dimm_mmio + offset / 4, dist);
psource += dist; @@ -1053,8 +1052,7 @@ static void pdc20621_put_to_dimm(struct ata_host *host, void *psource, readl(mmio + PDC_DIMM_WINDOW_CTLR); offset -= (idx * window_size); idx++; - dist = ((long)(s32)(window_size - (offset + size))) >= 0 ? size : - (long) (window_size - offset); + dist = min(size, window_size - offset); memcpy_toio(dimm_mmio + offset / 4, psource, dist); writel(0x01, mmio + PDC_GENERAL_CTLR); readl(mmio + PDC_GENERAL_CTLR);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 1197c5b2099f716b3de327437fb50900a0b936c9 ]
The myrb and myrs drivers use an odd way of implementing their sysfs files, calling snprintf() with a fixed length of 32 bytes to print into a page sized buffer. One of the strings is actually longer than 32 bytes, which clang can warn about:
drivers/scsi/myrb.c:1906:10: error: 'snprintf' will always be truncated; specified size is 32, but format string expands to at least 34 [-Werror,-Wformat-truncation] drivers/scsi/myrs.c:1089:10: error: 'snprintf' will always be truncated; specified size is 32, but format string expands to at least 34 [-Werror,-Wformat-truncation]
These could all be plain sprintf() without a length as the buffer is always long enough. On the other hand, sysfs files should not be overly long either, so just double the length to make sure the longest strings don't get truncated here.
Fixes: 77266186397c ("scsi: myrs: Add Mylex RAID controller (SCSI interface)") Fixes: 081ff398c56c ("scsi: myrb: Add Mylex RAID controller (block interface)") Signed-off-by: Arnd Bergmann arnd@arndb.de Link: https://lore.kernel.org/r/20240326223825.4084412-8-arnd@kernel.org Reviewed-by: Hannes Reinecke hare@suse.de Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/myrb.c | 20 ++++++++++---------- drivers/scsi/myrs.c | 24 ++++++++++++------------ 2 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/drivers/scsi/myrb.c b/drivers/scsi/myrb.c index ad17c2beaacad..d4c3f00faf1ba 100644 --- a/drivers/scsi/myrb.c +++ b/drivers/scsi/myrb.c @@ -1803,9 +1803,9 @@ static ssize_t raid_state_show(struct device *dev,
name = myrb_devstate_name(ldev_info->state); if (name) - ret = snprintf(buf, 32, "%s\n", name); + ret = snprintf(buf, 64, "%s\n", name); else - ret = snprintf(buf, 32, "Invalid (%02X)\n", + ret = snprintf(buf, 64, "Invalid (%02X)\n", ldev_info->state); } else { struct myrb_pdev_state *pdev_info = sdev->hostdata; @@ -1824,9 +1824,9 @@ static ssize_t raid_state_show(struct device *dev, else name = myrb_devstate_name(pdev_info->state); if (name) - ret = snprintf(buf, 32, "%s\n", name); + ret = snprintf(buf, 64, "%s\n", name); else - ret = snprintf(buf, 32, "Invalid (%02X)\n", + ret = snprintf(buf, 64, "Invalid (%02X)\n", pdev_info->state); } return ret; @@ -1914,11 +1914,11 @@ static ssize_t raid_level_show(struct device *dev,
name = myrb_raidlevel_name(ldev_info->raid_level); if (!name) - return snprintf(buf, 32, "Invalid (%02X)\n", + return snprintf(buf, 64, "Invalid (%02X)\n", ldev_info->state); - return snprintf(buf, 32, "%s\n", name); + return snprintf(buf, 64, "%s\n", name); } - return snprintf(buf, 32, "Physical Drive\n"); + return snprintf(buf, 64, "Physical Drive\n"); } static DEVICE_ATTR_RO(raid_level);
@@ -1931,15 +1931,15 @@ static ssize_t rebuild_show(struct device *dev, unsigned char status;
if (sdev->channel < myrb_logical_channel(sdev->host)) - return snprintf(buf, 32, "physical device - not rebuilding\n"); + return snprintf(buf, 64, "physical device - not rebuilding\n");
status = myrb_get_rbld_progress(cb, &rbld_buf);
if (rbld_buf.ldev_num != sdev->id || status != MYRB_STATUS_SUCCESS) - return snprintf(buf, 32, "not rebuilding\n"); + return snprintf(buf, 64, "not rebuilding\n");
- return snprintf(buf, 32, "rebuilding block %u of %u\n", + return snprintf(buf, 64, "rebuilding block %u of %u\n", rbld_buf.ldev_size - rbld_buf.blocks_left, rbld_buf.ldev_size); } diff --git a/drivers/scsi/myrs.c b/drivers/scsi/myrs.c index e6a6678967e52..857a73e856a14 100644 --- a/drivers/scsi/myrs.c +++ b/drivers/scsi/myrs.c @@ -950,9 +950,9 @@ static ssize_t raid_state_show(struct device *dev,
name = myrs_devstate_name(ldev_info->dev_state); if (name) - ret = snprintf(buf, 32, "%s\n", name); + ret = snprintf(buf, 64, "%s\n", name); else - ret = snprintf(buf, 32, "Invalid (%02X)\n", + ret = snprintf(buf, 64, "Invalid (%02X)\n", ldev_info->dev_state); } else { struct myrs_pdev_info *pdev_info; @@ -961,9 +961,9 @@ static ssize_t raid_state_show(struct device *dev, pdev_info = sdev->hostdata; name = myrs_devstate_name(pdev_info->dev_state); if (name) - ret = snprintf(buf, 32, "%s\n", name); + ret = snprintf(buf, 64, "%s\n", name); else - ret = snprintf(buf, 32, "Invalid (%02X)\n", + ret = snprintf(buf, 64, "Invalid (%02X)\n", pdev_info->dev_state); } return ret; @@ -1069,13 +1069,13 @@ static ssize_t raid_level_show(struct device *dev, ldev_info = sdev->hostdata; name = myrs_raid_level_name(ldev_info->raid_level); if (!name) - return snprintf(buf, 32, "Invalid (%02X)\n", + return snprintf(buf, 64, "Invalid (%02X)\n", ldev_info->dev_state);
} else name = myrs_raid_level_name(MYRS_RAID_PHYSICAL);
- return snprintf(buf, 32, "%s\n", name); + return snprintf(buf, 64, "%s\n", name); } static DEVICE_ATTR_RO(raid_level);
@@ -1089,7 +1089,7 @@ static ssize_t rebuild_show(struct device *dev, unsigned char status;
if (sdev->channel < cs->ctlr_info->physchan_present) - return snprintf(buf, 32, "physical device - not rebuilding\n"); + return snprintf(buf, 64, "physical device - not rebuilding\n");
ldev_info = sdev->hostdata; ldev_num = ldev_info->ldev_num; @@ -1101,11 +1101,11 @@ static ssize_t rebuild_show(struct device *dev, return -EIO; } if (ldev_info->rbld_active) { - return snprintf(buf, 32, "rebuilding block %zu of %zu\n", + return snprintf(buf, 64, "rebuilding block %zu of %zu\n", (size_t)ldev_info->rbld_lba, (size_t)ldev_info->cfg_devsize); } else - return snprintf(buf, 32, "not rebuilding\n"); + return snprintf(buf, 64, "not rebuilding\n"); }
static ssize_t rebuild_store(struct device *dev, @@ -1194,7 +1194,7 @@ static ssize_t consistency_check_show(struct device *dev, unsigned char status;
if (sdev->channel < cs->ctlr_info->physchan_present) - return snprintf(buf, 32, "physical device - not checking\n"); + return snprintf(buf, 64, "physical device - not checking\n");
ldev_info = sdev->hostdata; if (!ldev_info) @@ -1202,11 +1202,11 @@ static ssize_t consistency_check_show(struct device *dev, ldev_num = ldev_info->ldev_num; status = myrs_get_ldev_info(cs, ldev_num, ldev_info); if (ldev_info->cc_active) - return snprintf(buf, 32, "checking block %zu of %zu\n", + return snprintf(buf, 64, "checking block %zu of %zu\n", (size_t)ldev_info->cc_lba, (size_t)ldev_info->cfg_devsize); else - return snprintf(buf, 32, "not checking\n"); + return snprintf(buf, 64, "not checking\n"); }
static ssize_t consistency_check_store(struct device *dev,
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 3137b83a90646917c90951d66489db466b4ae106 ]
Building with W=1 shows a warning for an unused variable when CONFIG_PCI is diabled:
drivers/ata/sata_mv.c:790:35: error: unused variable 'mv_pci_tbl' [-Werror,-Wunused-const-variable] static const struct pci_device_id mv_pci_tbl[] = {
Move the table into the same block that containsn the pci_driver definition.
Fixes: 7bb3c5290ca0 ("sata_mv: Remove PCI dependency") Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/sata_mv.c | 63 +++++++++++++++++++++---------------------- 1 file changed, 31 insertions(+), 32 deletions(-)
diff --git a/drivers/ata/sata_mv.c b/drivers/ata/sata_mv.c index 11be88f70690e..cede4b517646f 100644 --- a/drivers/ata/sata_mv.c +++ b/drivers/ata/sata_mv.c @@ -783,37 +783,6 @@ static const struct ata_port_info mv_port_info[] = { }, };
-static const struct pci_device_id mv_pci_tbl[] = { - { PCI_VDEVICE(MARVELL, 0x5040), chip_504x }, - { PCI_VDEVICE(MARVELL, 0x5041), chip_504x }, - { PCI_VDEVICE(MARVELL, 0x5080), chip_5080 }, - { PCI_VDEVICE(MARVELL, 0x5081), chip_508x }, - /* RocketRAID 1720/174x have different identifiers */ - { PCI_VDEVICE(TTI, 0x1720), chip_6042 }, - { PCI_VDEVICE(TTI, 0x1740), chip_6042 }, - { PCI_VDEVICE(TTI, 0x1742), chip_6042 }, - - { PCI_VDEVICE(MARVELL, 0x6040), chip_604x }, - { PCI_VDEVICE(MARVELL, 0x6041), chip_604x }, - { PCI_VDEVICE(MARVELL, 0x6042), chip_6042 }, - { PCI_VDEVICE(MARVELL, 0x6080), chip_608x }, - { PCI_VDEVICE(MARVELL, 0x6081), chip_608x }, - - { PCI_VDEVICE(ADAPTEC2, 0x0241), chip_604x }, - - /* Adaptec 1430SA */ - { PCI_VDEVICE(ADAPTEC2, 0x0243), chip_7042 }, - - /* Marvell 7042 support */ - { PCI_VDEVICE(MARVELL, 0x7042), chip_7042 }, - - /* Highpoint RocketRAID PCIe series */ - { PCI_VDEVICE(TTI, 0x2300), chip_7042 }, - { PCI_VDEVICE(TTI, 0x2310), chip_7042 }, - - { } /* terminate list */ -}; - static const struct mv_hw_ops mv5xxx_ops = { .phy_errata = mv5_phy_errata, .enable_leds = mv5_enable_leds, @@ -4307,6 +4276,36 @@ static int mv_pci_init_one(struct pci_dev *pdev, static int mv_pci_device_resume(struct pci_dev *pdev); #endif
+static const struct pci_device_id mv_pci_tbl[] = { + { PCI_VDEVICE(MARVELL, 0x5040), chip_504x }, + { PCI_VDEVICE(MARVELL, 0x5041), chip_504x }, + { PCI_VDEVICE(MARVELL, 0x5080), chip_5080 }, + { PCI_VDEVICE(MARVELL, 0x5081), chip_508x }, + /* RocketRAID 1720/174x have different identifiers */ + { PCI_VDEVICE(TTI, 0x1720), chip_6042 }, + { PCI_VDEVICE(TTI, 0x1740), chip_6042 }, + { PCI_VDEVICE(TTI, 0x1742), chip_6042 }, + + { PCI_VDEVICE(MARVELL, 0x6040), chip_604x }, + { PCI_VDEVICE(MARVELL, 0x6041), chip_604x }, + { PCI_VDEVICE(MARVELL, 0x6042), chip_6042 }, + { PCI_VDEVICE(MARVELL, 0x6080), chip_608x }, + { PCI_VDEVICE(MARVELL, 0x6081), chip_608x }, + + { PCI_VDEVICE(ADAPTEC2, 0x0241), chip_604x }, + + /* Adaptec 1430SA */ + { PCI_VDEVICE(ADAPTEC2, 0x0243), chip_7042 }, + + /* Marvell 7042 support */ + { PCI_VDEVICE(MARVELL, 0x7042), chip_7042 }, + + /* Highpoint RocketRAID PCIe series */ + { PCI_VDEVICE(TTI, 0x2300), chip_7042 }, + { PCI_VDEVICE(TTI, 0x2310), chip_7042 }, + + { } /* terminate list */ +};
static struct pci_driver mv_pci_driver = { .name = DRV_NAME, @@ -4319,6 +4318,7 @@ static struct pci_driver mv_pci_driver = { #endif
}; +MODULE_DEVICE_TABLE(pci, mv_pci_tbl);
/** * mv_print_info - Dump key info to kernel log for perusal. @@ -4491,7 +4491,6 @@ static void __exit mv_exit(void) MODULE_AUTHOR("Brett Russ"); MODULE_DESCRIPTION("SCSI low-level driver for Marvell SATA controllers"); MODULE_LICENSE("GPL v2"); -MODULE_DEVICE_TABLE(pci, mv_pci_tbl); MODULE_VERSION(DRV_VERSION); MODULE_ALIAS("platform:" DRV_NAME);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: I Gede Agastya Darma Laksana gedeagas22@gmail.com
commit 1576f263ee2147dc395531476881058609ad3d38 upstream.
This patch addresses an issue with the Panasonic CF-SZ6's existing quirk, specifically its headset microphone functionality. Previously, the quirk used ALC269_FIXUP_HEADSET_MODE, which does not support the CF-SZ6's design of a single 3.5mm jack for both mic and audio output effectively. The device uses pin 0x19 for the headset mic without jack detection.
Following verification on the CF-SZ6 and discussions with the original patch author, i determined that the update to ALC269_FIXUP_ASPIRE_HEADSET_MIC is the appropriate solution. This change is custom-designed for the CF-SZ6's unique hardware setup, which includes a single 3.5mm jack for both mic and audio output, connecting the headset microphone to pin 0x19 without the use of jack detection.
Fixes: 0fca97a29b83 ("ALSA: hda/realtek - Add Panasonic CF-SZ6 headset jack quirk") Signed-off-by: I Gede Agastya Darma Laksana gedeagas22@gmail.com Cc: stable@vger.kernel.org Message-ID: 20240401174602.14133-1-gedeagas22@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9192,7 +9192,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x10ec, 0x124c, "Intel Reference board", ALC295_FIXUP_CHROME_BOOK), SND_PCI_QUIRK(0x10ec, 0x1252, "Intel Reference board", ALC295_FIXUP_CHROME_BOOK), SND_PCI_QUIRK(0x10ec, 0x1254, "Intel Reference board", ALC295_FIXUP_CHROME_BOOK), - SND_PCI_QUIRK(0x10f7, 0x8338, "Panasonic CF-SZ6", ALC269_FIXUP_HEADSET_MODE), + SND_PCI_QUIRK(0x10f7, 0x8338, "Panasonic CF-SZ6", ALC269_FIXUP_ASPIRE_HEADSET_MIC), SND_PCI_QUIRK(0x144d, 0xc109, "Samsung Ativ book 9 (NP900X3G)", ALC269_FIXUP_INV_DMIC), SND_PCI_QUIRK(0x144d, 0xc169, "Samsung Notebook 9 Pen (NP930SBE-K01US)", ALC298_FIXUP_SAMSUNG_AMP), SND_PCI_QUIRK(0x144d, 0xc176, "Samsung Notebook 9 Pro (NP930MBE-K04US)", ALC298_FIXUP_SAMSUNG_AMP),
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Herve Codina herve.codina@bootlin.com
commit 0462c56c290a99a7f03e817ae5b843116dfb575c upstream.
The commit 80dd33cf72d1 ("drivers: base: Fix device link removal") introduces a workqueue to release the consumer and supplier devices used in the devlink. In the job queued, devices are release and in turn, when all the references to these devices are dropped, the release function of the device itself is called.
Nothing is present to provide some synchronisation with this workqueue in order to ensure that all ongoing releasing operations are done and so, some other operations can be started safely.
For instance, in the following sequence: 1) of_platform_depopulate() 2) of_overlay_remove()
During the step 1, devices are released and related devlinks are removed (jobs pushed in the workqueue). During the step 2, OF nodes are destroyed but, without any synchronisation with devlink removal jobs, of_overlay_remove() can raise warnings related to missing of_node_put(): ERROR: memory leak, expected refcount 1 instead of 2
Indeed, the missing of_node_put() call is going to be done, too late, from the workqueue job execution.
Introduce device_link_wait_removal() to offer a way to synchronize operations waiting for the end of devlink removals (i.e. end of workqueue jobs). Also, as a flushing operation is done on the workqueue, the workqueue used is moved from a system-wide workqueue to a local one.
Cc: stable@vger.kernel.org Signed-off-by: Herve Codina herve.codina@bootlin.com Tested-by: Luca Ceresoli luca.ceresoli@bootlin.com Reviewed-by: Nuno Sa nuno.sa@analog.com Reviewed-by: Saravana Kannan saravanak@google.com Acked-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Link: https://lore.kernel.org/r/20240325152140.198219-2-herve.codina@bootlin.com Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/core.c | 26 +++++++++++++++++++++++--- include/linux/device.h | 1 + 2 files changed, 24 insertions(+), 3 deletions(-)
--- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -53,6 +53,7 @@ static unsigned int defer_sync_state_cou static unsigned int defer_fw_devlink_count; static LIST_HEAD(deferred_fw_devlink); static DEFINE_MUTEX(defer_fw_devlink_lock); +static struct workqueue_struct *device_link_wq; static bool fw_devlink_is_permissive(void);
#ifdef CONFIG_SRCU @@ -364,12 +365,26 @@ static void devlink_dev_release(struct d /* * It may take a while to complete this work because of the SRCU * synchronization in device_link_release_fn() and if the consumer or - * supplier devices get deleted when it runs, so put it into the "long" - * workqueue. + * supplier devices get deleted when it runs, so put it into the + * dedicated workqueue. */ - queue_work(system_long_wq, &link->rm_work); + queue_work(device_link_wq, &link->rm_work); }
+/** + * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate + */ +void device_link_wait_removal(void) +{ + /* + * devlink removal jobs are queued in the dedicated work queue. + * To be sure that all removal jobs are terminated, ensure that any + * scheduled work has run to completion. + */ + flush_workqueue(device_link_wq); +} +EXPORT_SYMBOL_GPL(device_link_wait_removal); + static struct class devlink_class = { .name = "devlink", .owner = THIS_MODULE, @@ -3415,9 +3430,14 @@ int __init devices_init(void) sysfs_dev_char_kobj = kobject_create_and_add("char", dev_kobj); if (!sysfs_dev_char_kobj) goto char_kobj_err; + device_link_wq = alloc_workqueue("device_link_wq", 0, 0); + if (!device_link_wq) + goto wq_err;
return 0;
+ wq_err: + kobject_put(sysfs_dev_char_kobj); char_kobj_err: kobject_put(sysfs_dev_block_kobj); block_kobj_err: --- a/include/linux/device.h +++ b/include/linux/device.h @@ -973,6 +973,7 @@ void device_link_del(struct device_link void device_link_remove(void *consumer, struct device *supplier); void device_links_supplier_sync_state_pause(void); void device_links_supplier_sync_state_resume(void); +void device_link_wait_removal(void);
extern __printf(3, 4) int dev_err_probe(const struct device *dev, int err, const char *fmt, ...);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Herve Codina herve.codina@bootlin.com
commit 8917e7385346bd6584890ed362985c219fe6ae84 upstream.
In the following sequence: 1) of_platform_depopulate() 2) of_overlay_remove()
During the step 1, devices are destroyed and devlinks are removed. During the step 2, OF nodes are destroyed but __of_changeset_entry_destroy() can raise warnings related to missing of_node_put(): ERROR: memory leak, expected refcount 1 instead of 2 ...
Indeed, during the devlink removals performed at step 1, the removal itself releasing the device (and the attached of_node) is done by a job queued in a workqueue and so, it is done asynchronously with respect to function calls. When the warning is present, of_node_put() will be called but wrongly too late from the workqueue job.
In order to be sure that any ongoing devlink removals are done before the of_node destruction, synchronize the of_changeset_destroy() with the devlink removals.
Fixes: 80dd33cf72d1 ("drivers: base: Fix device link removal") Cc: stable@vger.kernel.org Signed-off-by: Herve Codina herve.codina@bootlin.com Reviewed-by: Saravana Kannan saravanak@google.com Tested-by: Luca Ceresoli luca.ceresoli@bootlin.com Reviewed-by: Nuno Sa nuno.sa@analog.com Link: https://lore.kernel.org/r/20240325152140.198219-3-herve.codina@bootlin.com Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/dynamic.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/of/dynamic.c +++ b/drivers/of/dynamic.c @@ -9,6 +9,7 @@
#define pr_fmt(fmt) "OF: " fmt
+#include <linux/device.h> #include <linux/of.h> #include <linux/spinlock.h> #include <linux/slab.h> @@ -675,6 +676,17 @@ void of_changeset_destroy(struct of_chan { struct of_changeset_entry *ce, *cen;
+ /* + * When a device is deleted, the device links to/from it are also queued + * for deletion. Until these device links are freed, the devices + * themselves aren't freed. If the device being deleted is due to an + * overlay change, this device might be holding a reference to a device + * node that will be freed. So, wait until all already pending device + * links are deleted before freeing a device node. This ensures we don't + * free any device node that has a non-zero reference count. + */ + device_link_wait_removal(); + list_for_each_entry_safe_reverse(ce, cen, &ocs->entries, node) __of_changeset_entry_destroy(ce); }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov (AMD) bp@alien8.de
commit 3ddf944b32f88741c303f0b21459dbb3872b8bc5 upstream.
Modifying a MCA bank's MCA_CTL bits which control which error types to be reported is done over
/sys/devices/system/machinecheck/ ├── machinecheck0 │ ├── bank0 │ ├── bank1 │ ├── bank10 │ ├── bank11 ...
sysfs nodes by writing the new bit mask of events to enable.
When the write is accepted, the kernel deletes all current timers and reinits all banks.
Doing that in parallel can lead to initializing a timer which is already armed and in the timer wheel, i.e., in use already:
ODEBUG: init active (active state 0) object: ffff888063a28000 object type: timer_list hint: mce_timer_fn+0x0/0x240 arch/x86/kernel/cpu/mce/core.c:2642 WARNING: CPU: 0 PID: 8120 at lib/debugobjects.c:514 debug_print_object+0x1a0/0x2a0 lib/debugobjects.c:514
Fix that by grabbing the sysfs mutex as the rest of the MCA sysfs code does.
Reported by: Yue Sun samsun1006219@gmail.com Reported by: xingwei lee xrivendell7@gmail.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Cc: stable@kernel.org Link: https://lore.kernel.org/r/CAEkJfYNiENwQY8yV1LYJ9LjJs%2Bx_-PqMv98gKig55=2vbzf... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/mce/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -2389,12 +2389,14 @@ static ssize_t set_bank(struct device *s return -EINVAL;
b = &per_cpu(mce_banks_array, s->id)[bank]; - if (!b->init) return -ENODEV;
b->ctl = new; + + mutex_lock(&mce_sysfs_mutex); mce_restart(); + mutex_unlock(&mce_sysfs_mutex);
return size; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sumanth Korikkar sumanthk@linux.ibm.com
commit 378ca2d2ad410a1cd5690d06b46c5e2297f4c8c0 upstream.
Align system call table on 8 bytes. With sys_call_table entry size of 8 bytes that eliminates the possibility of a system call pointer crossing cache line boundary.
Cc: stable@kernel.org Suggested-by: Ulrich Weigand ulrich.weigand@de.ibm.com Reviewed-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Sumanth Korikkar sumanthk@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/kernel/entry.S | 1 + 1 file changed, 1 insertion(+)
--- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -1298,6 +1298,7 @@ ENDPROC(stack_overflow)
#endif .section .rodata, "a" + .balign 8 #define SYSCALL(esame,emu) .quad __s390x_ ## esame .globl sys_call_table sys_call_table:
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Samuel Holland samuel.holland@sifive.com
commit d080a08b06b6266cc3e0e86c5acfd80db937cb6b upstream.
These macros did not initialize __kr_err, so they could fail even if the access did not fault.
Cc: stable@vger.kernel.org Fixes: d464118cdc41 ("riscv: implement __get_kernel_nofault and __put_user_nofault") Signed-off-by: Samuel Holland samuel.holland@sifive.com Reviewed-by: Alexandre Ghiti alexghiti@rivosinc.com Reviewed-by: Charlie Jenkins charlie@rivosinc.com Link: https://lore.kernel.org/r/20240312022030.320789-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/include/asm/uaccess.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/riscv/include/asm/uaccess.h +++ b/arch/riscv/include/asm/uaccess.h @@ -468,7 +468,7 @@ unsigned long __must_check clear_user(vo
#define __get_kernel_nofault(dst, src, type, err_label) \ do { \ - long __kr_err; \ + long __kr_err = 0; \ \ __get_user_nocheck(*((type *)(dst)), (type *)(src), __kr_err); \ if (unlikely(__kr_err)) \ @@ -477,7 +477,7 @@ do { \
#define __put_kernel_nofault(dst, src, type, err_label) \ do { \ - long __kr_err; \ + long __kr_err = 0; \ \ __put_user_nocheck(*((type *)(src)), (type *)(dst), __kr_err); \ if (unlikely(__kr_err)) \
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Borislav Petkov (AMD)" bp@alien8.de
Commit 4535e1a4174c4111d92c5a9a21e542d232e0fcaa upstream.
The original version of the mitigation would patch in the calls to the untraining routines directly. That is, the alternative() in UNTRAIN_RET will patch in the CALL to srso_alias_untrain_ret() directly.
However, even if commit e7c25c441e9e ("x86/cpu: Cleanup the untrain mess") meant well in trying to clean up the situation, due to micro- architectural reasons, the untraining routine srso_alias_untrain_ret() must be the target of a CALL instruction and not of a JMP instruction as it is done now.
Reshuffle the alternative macros to accomplish that.
Fixes: e7c25c441e9e ("x86/cpu: Cleanup the untrain mess") Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Reviewed-by: Ingo Molnar mingo@kernel.org Cc: stable@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/asm-prototypes.h | 1 + arch/x86/include/asm/nospec-branch.h | 20 ++++++++++++++------ arch/x86/lib/retpoline.S | 4 +--- 3 files changed, 16 insertions(+), 9 deletions(-)
--- a/arch/x86/include/asm/asm-prototypes.h +++ b/arch/x86/include/asm/asm-prototypes.h @@ -12,6 +12,7 @@ #include <asm/special_insns.h> #include <asm/preempt.h> #include <asm/asm.h> +#include <asm/nospec-branch.h>
#ifndef CONFIG_X86_CMPXCHG64 extern void cmpxchg8b_emu(void); --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -155,11 +155,20 @@ .Lskip_rsb_@: .endm
+/* + * The CALL to srso_alias_untrain_ret() must be patched in directly at + * the spot where untraining must be done, ie., srso_alias_untrain_ret() + * must be the target of a CALL instruction instead of indirectly + * jumping to a wrapper which then calls it. Therefore, this macro is + * called outside of __UNTRAIN_RET below, for the time being, before the + * kernel can support nested alternatives with arbitrary nesting. + */ +.macro CALL_UNTRAIN_RET #ifdef CONFIG_CPU_UNRET_ENTRY -#define CALL_UNTRAIN_RET "call entry_untrain_ret" -#else -#define CALL_UNTRAIN_RET "" + ALTERNATIVE_2 "", "call entry_untrain_ret", X86_FEATURE_UNRET, \ + "call srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS #endif +.endm
/* * Mitigate RETBleed for AMD/Hygon Zen uarch. Requires KERNEL CR3 because the @@ -176,9 +185,8 @@ #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \ defined(CONFIG_CPU_SRSO) ANNOTATE_UNRET_END - ALTERNATIVE_2 "", \ - CALL_UNTRAIN_RET, X86_FEATURE_UNRET, \ - "call entry_ibpb", X86_FEATURE_ENTRY_IBPB + CALL_UNTRAIN_RET + ALTERNATIVE "", "call entry_ibpb", X86_FEATURE_ENTRY_IBPB #endif .endm
--- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -249,9 +249,7 @@ SYM_CODE_START(srso_return_thunk) SYM_CODE_END(srso_return_thunk)
SYM_FUNC_START(entry_untrain_ret) - ALTERNATIVE_2 "jmp retbleed_untrain_ret", \ - "jmp srso_untrain_ret", X86_FEATURE_SRSO, \ - "jmp srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS + ALTERNATIVE "jmp retbleed_untrain_ret", "jmp srso_untrain_ret", X86_FEATURE_SRSO SYM_FUNC_END(entry_untrain_ret) __EXPORT_THUNK(entry_untrain_ret)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Borislav Petkov (AMD)" bp@alien8.de
Commit 0e110732473e14d6520e49d75d2c88ef7d46fe67 upstream.
The srso_alias_untrain_ret() dummy thunk in the !CONFIG_MITIGATION_SRSO case is there only for the altenative in CALL_UNTRAIN_RET to have a symbol to resolve.
However, testing with kernels which don't have CONFIG_MITIGATION_SRSO enabled, leads to the warning in patch_return() to fire:
missing return thunk: srso_alias_untrain_ret+0x0/0x10-0x0: eb 0e 66 66 2e WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:826 apply_returns (arch/x86/kernel/alternative.c:826
Put in a plain "ret" there so that gcc doesn't put a return thunk in in its place which special and gets checked.
In addition:
ERROR: modpost: "srso_alias_untrain_ret" [arch/x86/kvm/kvm-amd.ko] undefined! make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Chyba 1 make[1]: *** [/usr/src/linux-6.8.3/Makefile:1873: modpost] Chyba 2 make: *** [Makefile:240: __sub-make] Chyba 2
since !SRSO builds would use the dummy return thunk as reported by petr.pisar@atlas.cz, https://bugzilla.kernel.org/show_bug.cgi?id=218679.
Reported-by: kernel test robot oliver.sang@intel.com Closes: https://lore.kernel.org/oe-lkp/202404020901.da75a60f-oliver.sang@intel.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Link: https://lore.kernel.org/all/202404020901.da75a60f-oliver.sang@intel.com/ Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/lib/retpoline.S | 1 + 1 file changed, 1 insertion(+)
--- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -108,6 +108,7 @@ SYM_START(srso_alias_untrain_ret, SYM_L_ ret int3 SYM_FUNC_END(srso_alias_untrain_ret) +__EXPORT_THUNK(srso_alias_untrain_ret) #endif
SYM_START(srso_alias_safe_ret, SYM_L_GLOBAL, SYM_A_NONE)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Davide Caratti dcaratti@redhat.com
commit 7a1b3490f47e88ec4cbde65f1a77a0f4bc972282 upstream.
Current MPTCP servers increment MPTcpExtMPCapableFallbackACK when they accept non-MPC connections. As reported by Christoph, this is "surprising" because the counter might become greater than MPTcpExtMPCapableSYNRX.
MPTcpExtMPCapableFallbackACK counter's name suggests it should only be incremented when a connection was seen using MPTCP options, then a fallback to TCP has been done. Let's do that by incrementing it when the subflow context of an inbound MPC connection attempt is dropped. Also, update mptcp_connect.sh kselftest, to ensure that the above MIB does not increment in case a pure TCP client connects to a MPTCP server.
Fixes: fc518953bc9c ("mptcp: add and use MIB counter infrastructure") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch cpaasch@apple.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/449 Signed-off-by: Davide Caratti dcaratti@redhat.com Reviewed-by: Mat Martineau martineau@kernel.org Reviewed-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://lore.kernel.org/r/20240329-upstream-net-20240329-fallback-mib-v1-1-3... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 3 --- net/mptcp/subflow.c | 3 +++ 2 files changed, 3 insertions(+), 3 deletions(-)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2218,9 +2218,6 @@ static struct sock *mptcp_accept(struct
__MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPCAPABLEPASSIVEACK); local_bh_enable(); - } else { - MPTCP_INC_STATS(sock_net(sk), - MPTCP_MIB_MPCAPABLEPASSIVEFALLBACK); }
out: --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -595,6 +595,9 @@ create_child: if (fallback_is_fatal) goto dispose_child;
+ if (fallback) + SUBFLOW_REQ_INC_STATS(req, MPTCP_MIB_MPCAPABLEPASSIVEFALLBACK); + subflow_drop_ctx(child); goto out; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 8cb4a9a82b21623dbb4b3051dd30d98356cf95bc upstream.
Add CPUID_LNX_5 to track cpufeatures' word 21, and add the appropriate compile-time assert in KVM to prevent direct lookups on the features in CPUID_LNX_5. KVM uses X86_FEATURE_* flags to manage guest CPUID, and so must translate features that are scattered by Linux from the Linux-defined bit to the hardware-defined bit, i.e. should never try to directly access scattered features in guest CPUID.
Opportunistically add NR_CPUID_WORDS to enum cpuid_leafs, along with a compile-time assert in KVM's CPUID infrastructure to ensure that future additions update cpuid_leafs along with NCAPINTS.
No functional change intended.
Fixes: 7f274e609f3d ("x86/cpufeatures: Add new word for scattered features") Cc: Sandipan Das sandipan.das@amd.com Signed-off-by: Sean Christopherson seanjc@google.com Acked-by: Dave Hansen dave.hansen@linux.intel.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Daniel Sneddon daniel.sneddon@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/cpufeature.h | 2 ++ arch/x86/kvm/cpuid.h | 2 ++ 2 files changed, 4 insertions(+)
--- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -33,6 +33,8 @@ enum cpuid_leafs CPUID_7_EDX, CPUID_8000_001F_EAX, CPUID_8000_0021_EAX, + CPUID_LNX_5, + NR_CPUID_WORDS, };
#ifdef CONFIG_X86_FEATURE_NAMES --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -76,10 +76,12 @@ static const struct cpuid_reg reverse_cp */ static __always_inline void reverse_cpuid_check(unsigned int x86_leaf) { + BUILD_BUG_ON(NR_CPUID_WORDS != NCAPINTS); BUILD_BUG_ON(x86_leaf == CPUID_LNX_1); BUILD_BUG_ON(x86_leaf == CPUID_LNX_2); BUILD_BUG_ON(x86_leaf == CPUID_LNX_3); BUILD_BUG_ON(x86_leaf == CPUID_LNX_4); + BUILD_BUG_ON(x86_leaf == CPUID_LNX_5); BUILD_BUG_ON(x86_leaf >= ARRAY_SIZE(reverse_cpuid)); BUILD_BUG_ON(reverse_cpuid[x86_leaf].function == 0); }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josh Poimboeuf jpoimboe@redhat.com
commit 081df94301e317e84c3413686043987da2c3e39d upstream.
To be used for adding asm functions to the ignore list. The "aw" is needed to help the ELF section metadata match GCC-created sections. Otherwise the linker creates duplicate sections instead of combining them.
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Link: https://lore.kernel.org/r/8faa476f9a5ac89af27944ec184c89f95f3c6c49.161126346... Signed-off-by: Daniel Sneddon daniel.sneddon@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/objtool.h | 8 ++++++++ tools/include/linux/objtool.h | 8 ++++++++ 2 files changed, 16 insertions(+)
--- a/include/linux/objtool.h +++ b/include/linux/objtool.h @@ -141,6 +141,12 @@ struct unwind_hint { .popsection .endm
+.macro STACK_FRAME_NON_STANDARD func:req + .pushsection .discard.func_stack_frame_non_standard, "aw" + .long \func - . + .popsection +.endm + #endif /* __ASSEMBLY__ */
#else /* !CONFIG_STACK_VALIDATION */ @@ -158,6 +164,8 @@ struct unwind_hint { .endm .macro ANNOTATE_NOENDBR .endm +.macro STACK_FRAME_NON_STANDARD func:req +.endm #endif
#endif /* CONFIG_STACK_VALIDATION */ --- a/tools/include/linux/objtool.h +++ b/tools/include/linux/objtool.h @@ -141,6 +141,12 @@ struct unwind_hint { .popsection .endm
+.macro STACK_FRAME_NON_STANDARD func:req + .pushsection .discard.func_stack_frame_non_standard, "aw" + .long \func - . + .popsection +.endm + #endif /* __ASSEMBLY__ */
#else /* !CONFIG_STACK_VALIDATION */ @@ -158,6 +164,8 @@ struct unwind_hint { .endm .macro ANNOTATE_NOENDBR .endm +.macro STACK_FRAME_NON_STANDARD func:req +.endm #endif
#endif /* CONFIG_STACK_VALIDATION */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Antipov dmantipov@yandex.ru
[ Upstream commit d6b27eb997ef9a2aa51633b3111bc4a04748e6d3 ]
In 'ath_ant_try_scan()', (most likely) the 2nd LNA's signal strength should be used in comparison against RSSI when selecting first LNA as the main one. Compile tested only.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Dmitry Antipov dmantipov@yandex.ru Acked-by: Toke Høiland-Jørgensen toke@toke.dk Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://msgid.link/20231211172502.25202-1-dmantipov@yandex.ru Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath9k/antenna.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/ath9k/antenna.c b/drivers/net/wireless/ath/ath9k/antenna.c index 988222cea9dfe..acc84e6711b0e 100644 --- a/drivers/net/wireless/ath/ath9k/antenna.c +++ b/drivers/net/wireless/ath/ath9k/antenna.c @@ -643,7 +643,7 @@ static void ath_ant_try_scan(struct ath_ant_comb *antcomb, conf->main_lna_conf = ATH_ANT_DIV_COMB_LNA1; conf->alt_lna_conf = ATH_ANT_DIV_COMB_LNA1_PLUS_LNA2; } else if (antcomb->rssi_sub > - antcomb->rssi_lna1) { + antcomb->rssi_lna2) { /* set to A-B */ conf->main_lna_conf = ATH_ANT_DIV_COMB_LNA1; conf->alt_lna_conf = ATH_ANT_DIV_COMB_LNA1_MINUS_LNA2;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Markus Elfring elfring@users.sourceforge.net
[ Upstream commit ffc15626c861f811f9778914be004fcf43810a91 ]
The kfree() function was called in one case by the batadv_dat_forward_data() function during error handling even if the passed variable contained a null pointer. This issue was detected by using the Coccinelle software.
* Thus return directly after a batadv_dat_select_candidates() call failed at the beginning.
* Delete the label “out” which became unnecessary with this refactoring.
Signed-off-by: Markus Elfring elfring@users.sourceforge.net Acked-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/batman-adv/distributed-arp-table.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/net/batman-adv/distributed-arp-table.c b/net/batman-adv/distributed-arp-table.c index ddd3b4c70a516..b1cb6ecffceb9 100644 --- a/net/batman-adv/distributed-arp-table.c +++ b/net/batman-adv/distributed-arp-table.c @@ -687,7 +687,7 @@ static bool batadv_dat_forward_data(struct batadv_priv *bat_priv,
cand = batadv_dat_select_candidates(bat_priv, ip, vid); if (!cand) - goto out; + return ret;
batadv_dbg(BATADV_DBG_DAT, bat_priv, "DHT_SEND for %pI4\n", &ip);
@@ -731,7 +731,6 @@ static bool batadv_dat_forward_data(struct batadv_priv *bat_priv, batadv_orig_node_put(cand[i].orig_node); }
-out: kfree(cand); return ret; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Markus Elfring elfring@users.sourceforge.net
[ Upstream commit 5593e9abf1cf2bf096366d8c7fd933bc69d561ce ]
The kfree() function was called in up to three cases by the batadv_throw_uevent() function during error handling even if the passed variable contained a null pointer. This issue was detected by using the Coccinelle software.
* Thus adjust jump targets.
* Reorder kfree() calls at the end.
Signed-off-by: Markus Elfring elfring@users.sourceforge.net Acked-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/batman-adv/main.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c index 9f267b190779f..ac3ebdba83040 100644 --- a/net/batman-adv/main.c +++ b/net/batman-adv/main.c @@ -732,29 +732,31 @@ int batadv_throw_uevent(struct batadv_priv *bat_priv, enum batadv_uev_type type, "%s%s", BATADV_UEV_TYPE_VAR, batadv_uev_type_str[type]); if (!uevent_env[0]) - goto out; + goto report_error;
uevent_env[1] = kasprintf(GFP_ATOMIC, "%s%s", BATADV_UEV_ACTION_VAR, batadv_uev_action_str[action]); if (!uevent_env[1]) - goto out; + goto free_first_env;
/* If the event is DEL, ignore the data field */ if (action != BATADV_UEV_DEL) { uevent_env[2] = kasprintf(GFP_ATOMIC, "%s%s", BATADV_UEV_DATA_VAR, data); if (!uevent_env[2]) - goto out; + goto free_second_env; }
ret = kobject_uevent_env(bat_kobj, KOBJ_CHANGE, uevent_env); -out: - kfree(uevent_env[0]); - kfree(uevent_env[1]); kfree(uevent_env[2]); +free_second_env: + kfree(uevent_env[1]); +free_first_env: + kfree(uevent_env[0]);
if (ret) +report_error: batadv_dbg(BATADV_DBG_BATMAN, bat_priv, "Impossible to send uevent for (%s,%s,%s) event (err: %d)\n", batadv_uev_type_str[type],
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com
[ Upstream commit 19b070fefd0d024af3daa7329cbc0d00de5302ec ]
Syzkaller hit 'WARNING in dg_dispatch_as_host' bug.
memcpy: detected field-spanning write (size 56) of single field "&dg_info->msg" at drivers/misc/vmw_vmci/vmci_datagram.c:237 (size 24)
WARNING: CPU: 0 PID: 1555 at drivers/misc/vmw_vmci/vmci_datagram.c:237 dg_dispatch_as_host+0x88e/0xa60 drivers/misc/vmw_vmci/vmci_datagram.c:237
Some code commentry, based on my understanding:
544 #define VMCI_DG_SIZE(_dg) (VMCI_DG_HEADERSIZE + (size_t)(_dg)->payload_size) /// This is 24 + payload_size
memcpy(&dg_info->msg, dg, dg_size); Destination = dg_info->msg ---> this is a 24 byte structure(struct vmci_datagram) Source = dg --> this is a 24 byte structure (struct vmci_datagram) Size = dg_size = 24 + payload_size
{payload_size = 56-24 =32} -- Syzkaller managed to set payload_size to 32.
35 struct delayed_datagram_info { 36 struct datagram_entry *entry; 37 struct work_struct work; 38 bool in_dg_host_queue; 39 /* msg and msg_payload must be together. */ 40 struct vmci_datagram msg; 41 u8 msg_payload[]; 42 };
So those extra bytes of payload are copied into msg_payload[], a run time warning is seen while fuzzing with Syzkaller.
One possible way to fix the warning is to split the memcpy() into two parts -- one -- direct assignment of msg and second taking care of payload.
Gustavo quoted: "Under FORTIFY_SOURCE we should not copy data across multiple members in a structure."
Reported-by: syzkaller syzkaller@googlegroups.com Suggested-by: Vegard Nossum vegard.nossum@oracle.com Suggested-by: Gustavo A. R. Silva gustavoars@kernel.org Signed-off-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Reviewed-by: Gustavo A. R. Silva gustavoars@kernel.org Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Dan Carpenter dan.carpenter@linaro.org Link: https://lore.kernel.org/r/20240105164001.2129796-2-harshit.m.mogalapalli@ora... Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/vmw_vmci/vmci_datagram.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/misc/vmw_vmci/vmci_datagram.c b/drivers/misc/vmw_vmci/vmci_datagram.c index f50d22882476f..d1d8224c8800c 100644 --- a/drivers/misc/vmw_vmci/vmci_datagram.c +++ b/drivers/misc/vmw_vmci/vmci_datagram.c @@ -234,7 +234,8 @@ static int dg_dispatch_as_host(u32 context_id, struct vmci_datagram *dg)
dg_info->in_dg_host_queue = true; dg_info->entry = dst_entry; - memcpy(&dg_info->msg, dg, dg_size); + dg_info->msg = *dg; + memcpy(&dg_info->msg_payload, dg + 1, dg->payload_size);
INIT_WORK(&dg_info->work, dg_delayed_dispatch); schedule_work(&dg_info->work);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Ogness john.ogness@linutronix.de
[ Upstream commit d988d9a9b9d180bfd5c1d353b3b176cb90d6861b ]
If the kernel crashes in a context where printk() calls always defer printing (such as in NMI or inside a printk_safe section) then the final panic messages will be deferred to irq_work. But if irq_work is not available, the messages will not get printed unless explicitly flushed. The result is that the final "end Kernel panic" banner does not get printed.
Add one final flush after the last printk() call to make sure the final panic messages make it out as well.
Signed-off-by: John Ogness john.ogness@linutronix.de Reviewed-by: Petr Mladek pmladek@suse.com Link: https://lore.kernel.org/r/20240207134103.1357162-14-john.ogness@linutronix.d... Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/panic.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/kernel/panic.c b/kernel/panic.c index bc39e2b27d315..30d8da0d43d8f 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -427,6 +427,14 @@ void panic(const char *fmt, ...)
/* Do not scroll important messages printed above */ suppress_printk = 1; + + /* + * The final messages may not have been printed if in a context that + * defers printing (such as NMI) and irq_work is not available. + * Explicitly flush the kernel log buffer one last time. + */ + console_flush_on_panic(CONSOLE_FLUSH_PENDING); + local_irq_enable(); for (i = 0; ; i += PANIC_TIMER_STEP) { touch_softlockup_watchdog();
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Jonker jbx6244@gmail.com
[ Upstream commit 1d00ba4700d1e0f88ae70d028d2e17e39078fa1c ]
Fix rk3328 hdmi ports node so that it matches the rockchip,dw-hdmi.yaml binding.
Signed-off-by: Johan Jonker jbx6244@gmail.com Link: https://lore.kernel.org/r/e5dea3b7-bf84-4474-9530-cc2da3c41104@gmail.com Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/rockchip/rk3328.dtsi | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3328.dtsi b/arch/arm64/boot/dts/rockchip/rk3328.dtsi index 72112fe05a5c4..10df6636a6b6c 100644 --- a/arch/arm64/boot/dts/rockchip/rk3328.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3328.dtsi @@ -732,11 +732,20 @@ hdmi: hdmi@ff3c0000 { status = "disabled";
ports { - hdmi_in: port { + #address-cells = <1>; + #size-cells = <0>; + + hdmi_in: port@0 { + reg = <0>; + hdmi_in_vop: endpoint { remote-endpoint = <&vop_out_hdmi>; }; }; + + hdmi_out: port@1 { + reg = <1>; + }; }; };
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Jonker jbx6244@gmail.com
[ Upstream commit f051b6ace7ffcc48d6d1017191f167c0a85799f6 ]
Fix rk3399 hdmi ports node so that it matches the rockchip,dw-hdmi.yaml binding.
Signed-off-by: Johan Jonker jbx6244@gmail.com Link: https://lore.kernel.org/r/a6ab6f75-3b80-40b1-bd30-3113e14becdd@gmail.com Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/rockchip/rk3399.dtsi | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi index 3180f576ed02e..e2515218ff734 100644 --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi @@ -1769,6 +1769,7 @@ simple-audio-card,codec { hdmi: hdmi@ff940000 { compatible = "rockchip,rk3399-dw-hdmi"; reg = <0x0 0xff940000 0x0 0x20000>; + reg-io-width = <4>; interrupts = <GIC_SPI 23 IRQ_TYPE_LEVEL_HIGH 0>; clocks = <&cru PCLK_HDMI_CTRL>, <&cru SCLK_HDMI_SFR>, @@ -1777,13 +1778,16 @@ hdmi: hdmi@ff940000 { <&cru PLL_VPLL>; clock-names = "iahb", "isfr", "cec", "grf", "vpll"; power-domains = <&power RK3399_PD_HDCP>; - reg-io-width = <4>; rockchip,grf = <&grf>; #sound-dai-cells = <0>; status = "disabled";
ports { - hdmi_in: port { + #address-cells = <1>; + #size-cells = <0>; + + hdmi_in: port@0 { + reg = <0>; #address-cells = <1>; #size-cells = <0>;
@@ -1796,6 +1800,10 @@ hdmi_in_vopl: endpoint@1 { remote-endpoint = <&vopl_out_hdmi>; }; }; + + hdmi_out: port@1 { + reg = <1>; + }; }; };
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shannon Nelson shannon.nelson@amd.com
[ Upstream commit c699f35d658f3c21b69ed24e64b2ea26381e941d ]
We claim to have the AdminQ on our irq0 and thus cpu id 0, but we need to be sure we set the affinity hint to try to keep it there.
Signed-off-by: Shannon Nelson shannon.nelson@amd.com Reviewed-by: Brett Creeley brett.creeley@amd.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/pensando/ionic/ionic_lif.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c index 49c28134ac2cc..a37ca4b1e5665 100644 --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c @@ -2708,9 +2708,12 @@ static int ionic_lif_adminq_init(struct ionic_lif *lif)
napi_enable(&qcq->napi);
- if (qcq->flags & IONIC_QCQ_F_INTR) + if (qcq->flags & IONIC_QCQ_F_INTR) { + irq_set_affinity_hint(qcq->intr.vector, + &qcq->intr.affinity_mask); ionic_intr_mask(idev->intr_ctrl, qcq->intr.index, IONIC_INTR_MASK_CLEAR); + }
qcq->flags |= IONIC_QCQ_F_INITED;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kunwu Chan chentao@kylinos.cn
[ Upstream commit 98bc7e26e14fbb26a6abf97603d59532475e97f8 ]
kasprintf() returns a pointer to dynamically allocated memory which can be NULL upon failure. Ensure the allocation was successful by checking the pointer validity.
Signed-off-by: Kunwu Chan chentao@kylinos.cn Link: https://lore.kernel.org/r/20240118100206.213928-1-chentao@kylinos.cn Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/pstore/zone.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/pstore/zone.c b/fs/pstore/zone.c index b50fc33f2ab29..2426fb6794fd3 100644 --- a/fs/pstore/zone.c +++ b/fs/pstore/zone.c @@ -973,6 +973,8 @@ static ssize_t psz_kmsg_read(struct pstore_zone *zone, char *buf = kasprintf(GFP_KERNEL, "%s: Total %d times\n", kmsg_dump_reason_str(record->reason), record->count); + if (!buf) + return -ENOMEM; hlen = strlen(buf); record->buf = krealloc(buf, hlen + size, GFP_KERNEL); if (!record->buf) {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Samasth Norway Ananda samasth.norway.ananda@oracle.com
[ Upstream commit f85450f134f0b4ca7e042dc3dc89155656a2299d ]
In function get_pkg_num() if fopen_or_die() succeeds it returns a file pointer to be used. But fclose() is never called before returning from the function.
Signed-off-by: Samasth Norway Ananda samasth.norway.ananda@oracle.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c b/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c index ff6c6661f075f..1c80aa498d543 100644 --- a/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c +++ b/tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c @@ -1152,6 +1152,7 @@ unsigned int get_pkg_num(int cpu) retval = fscanf(fp, "%d\n", &pkg); if (retval != 1) errx(1, "%s: failed to parse", pathname); + fclose(fp); return pkg; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Sterba dsterba@suse.com
[ Upstream commit 7411055db5ce64f836aaffd422396af0075fdc99 ]
The unhandled case in btrfs_relocate_sys_chunks() loop is a corruption, as it could be caused only by two impossible conditions:
- at first the search key is set up to look for a chunk tree item, with offset -1, this is an inexact search and the key->offset will contain the correct offset upon a successful search, a valid chunk tree item cannot have an offset -1
- after first successful search, the found_key corresponds to a chunk item, the offset is decremented by 1 before the next loop, it's impossible to find a chunk item there due to alignment and size constraints
Reviewed-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 9a05313c69f33..09c23626feba4 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -3178,7 +3178,17 @@ static int btrfs_relocate_sys_chunks(struct btrfs_fs_info *fs_info) mutex_unlock(&fs_info->delete_unused_bgs_mutex); goto error; } - BUG_ON(ret == 0); /* Corruption */ + if (ret == 0) { + /* + * On the first search we would find chunk tree with + * offset -1, which is not possible. On subsequent + * loops this would find an existing item on an invalid + * offset (one less than the previous one, wrong + * alignment and size). + */ + ret = -EUCLEAN; + goto error; + }
ret = btrfs_previous_item(chunk_root, path, key.objectid, key.type);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Sterba dsterba@suse.com
[ Upstream commit 26b66d1d366a375745755ca7365f67110bbf6bd5 ]
The get_parent handler looks up a parent of a given dentry, this can be either a subvolume or a directory. The search is set up with offset -1 but it's never expected to find such item, as it would break allowed range of inode number or a root id. This means it's a corruption (ext4 also returns this error code).
Reviewed-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/export.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c index bfa2bf44529c2..d908afa1f313c 100644 --- a/fs/btrfs/export.c +++ b/fs/btrfs/export.c @@ -161,8 +161,15 @@ struct dentry *btrfs_get_parent(struct dentry *child) ret = btrfs_search_slot(NULL, root, &key, path, 0, 0); if (ret < 0) goto fail; + if (ret == 0) { + /* + * Key with offset of -1 found, there would have to exist an + * inode with such number or a root with such id. + */ + ret = -EUCLEAN; + goto fail; + }
- BUG_ON(ret == 0); /* Key with offset of -1 found */ if (path->slots[0] == 0) { ret = -ENOENT; goto fail;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Sterba dsterba@suse.com
[ Upstream commit 3c6ee34c6f9cd12802326da26631232a61743501 ]
Change BUG_ON to proper error handling if building the path buffer fails. The pointers are not printed so we don't accidentally leak kernel addresses.
Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/send.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 0b04adfd4a4a4..0519a3557697a 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -966,7 +966,15 @@ static int iterate_inode_ref(struct btrfs_root *root, struct btrfs_path *path, ret = PTR_ERR(start); goto out; } - BUG_ON(start < p->buf); + if (unlikely(start < p->buf)) { + btrfs_err(root->fs_info, + "send: path ref buffer underflow for key (%llu %u %llu)", + found_key->objectid, + found_key->type, + found_key->offset); + ret = -EINVAL; + goto out; + } } p->start = start; } else {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 00af2aa93b76b1bade471ad0d0525d4d29ca5cc0 ]
Many syzbot reports show extreme rtnl pressure, and many of them hint that smc acquires rtnl in netns creation for no good reason [1]
This patch returns early from smc_pnet_net_init() if there is no netdevice yet.
I am not even sure why smc_pnet_create_pnetids_list() even exists, because smc_pnet_netdev_event() is also calling smc_pnet_add_base_pnetid() when handling NETDEV_UP event.
[1] extract of typical syzbot reports
2 locks held by syz-executor.3/12252: #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491 #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline] #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878 2 locks held by syz-executor.4/12253: #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491 #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline] #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878 2 locks held by syz-executor.1/12257: #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491 #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline] #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878 2 locks held by syz-executor.2/12261: #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491 #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline] #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878 2 locks held by syz-executor.0/12265: #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491 #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline] #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878 2 locks held by syz-executor.3/12268: #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491 #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline] #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878 2 locks held by syz-executor.4/12271: #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491 #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline] #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878 2 locks held by syz-executor.1/12274: #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491 #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline] #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878 2 locks held by syz-executor.2/12280: #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491 #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline] #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878
Signed-off-by: Eric Dumazet edumazet@google.com Cc: Wenjia Zhang wenjia@linux.ibm.com Cc: Jan Karcher jaka@linux.ibm.com Cc: "D. Wythe" alibuda@linux.alibaba.com Cc: Tony Lu tonylu@linux.alibaba.com Cc: Wen Gu guwen@linux.alibaba.com Reviewed-by: Wenjia Zhang wenjia@linux.ibm.com Link: https://lore.kernel.org/r/20240302100744.3868021-1-edumazet@google.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/smc_pnet.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c index 30bae60d626c6..ed9cfa11b589f 100644 --- a/net/smc/smc_pnet.c +++ b/net/smc/smc_pnet.c @@ -797,6 +797,16 @@ static void smc_pnet_create_pnetids_list(struct net *net) u8 ndev_pnetid[SMC_MAX_PNETID_LEN]; struct net_device *dev;
+ /* Newly created netns do not have devices. + * Do not even acquire rtnl. + */ + if (list_empty(&net->dev_base_head)) + return; + + /* Note: This might not be needed, because smc_pnet_netdev_event() + * is also calling smc_pnet_add_base_pnetid() when handling + * NETDEV_UP event. + */ rtnl_lock(); for_each_netdev(net, dev) smc_pnet_add_base_pnetid(net, dev, ndev_pnetid);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Edward Adam Davis eadavis@qq.com
[ Upstream commit b79e040910101b020931ba0c9a6b77e81ab7f645 ]
If hci_cmd_sync_complete() is triggered and skb is NULL, then hdev->req_skb is NULL, which will cause this issue.
Reported-and-tested-by: syzbot+830d9e3fa61968246abd@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis eadavis@qq.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btintel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/bluetooth/btintel.c b/drivers/bluetooth/btintel.c index 88ce5f0ffc4ba..e1daf6ebd3ada 100644 --- a/drivers/bluetooth/btintel.c +++ b/drivers/bluetooth/btintel.c @@ -344,7 +344,7 @@ int btintel_read_version(struct hci_dev *hdev, struct intel_version *ver) struct sk_buff *skb;
skb = __hci_cmd_sync(hdev, 0xfc05, 0, NULL, HCI_CMD_TIMEOUT); - if (IS_ERR(skb)) { + if (IS_ERR_OR_NULL(skb)) { bt_dev_err(hdev, "Reading Intel version information failed (%ld)", PTR_ERR(skb)); return PTR_ERR(skb);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kunwu Chan chentao@kylinos.cn
[ Upstream commit bc4996184d56cfaf56d3811ac2680c8a0e2af56e ]
While input core can work with input->phys set to NULL userspace might depend on it, so better fail probing if allocation fails. The system must be in a pretty bad shape for it to happen anyway.
Signed-off-by: Kunwu Chan chentao@kylinos.cn Link: https://lore.kernel.org/r/20240117073124.143636-1-chentao@kylinos.cn Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/rmi4/rmi_driver.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/input/rmi4/rmi_driver.c b/drivers/input/rmi4/rmi_driver.c index 258d5fe3d395c..aa32371f04af6 100644 --- a/drivers/input/rmi4/rmi_driver.c +++ b/drivers/input/rmi4/rmi_driver.c @@ -1196,7 +1196,11 @@ static int rmi_driver_probe(struct device *dev) } rmi_driver_set_input_params(rmi_dev, data->input); data->input->phys = devm_kasprintf(dev, GFP_KERNEL, - "%s/input0", dev_name(dev)); + "%s/input0", dev_name(dev)); + if (!data->input->phys) { + retval = -ENOMEM; + goto err; + } }
retval = rmi_init_functions(data);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Geert Uytterhoeven geert+renesas@glider.be
[ Upstream commit 3803584a4e9b65bb5b013f862f55c5055aa86c25 ]
If the number of provided enum IDs in a variable width config register description does not match the expected number, the checker uses the expected number for validating the individual enum IDs.
However, this may cause out-of-bounds accesses on the array holding the enum IDs, leading to bogus enum_id conflict warnings. Worse, if the bug is an incorrect bit field description (e.g. accidentally using "12" instead of "-12" for a reserved field), thousands of warnings may be printed, overflowing the kernel log buffer.
Fix this by limiting the enum ID check to the number of provided enum IDs.
Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Link: https://lore.kernel.org/r/c7385f44f2faebb8856bcbb4e908d846fc1531fb.170593080... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/renesas/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/pinctrl/renesas/core.c b/drivers/pinctrl/renesas/core.c index 54f1a7334027a..c390854483680 100644 --- a/drivers/pinctrl/renesas/core.c +++ b/drivers/pinctrl/renesas/core.c @@ -868,9 +868,11 @@ static void __init sh_pfc_check_cfg_reg(const char *drvname, sh_pfc_err("reg 0x%x: var_field_width declares %u instead of %u bits\n", cfg_reg->reg, rw, cfg_reg->reg_width);
- if (n != cfg_reg->nr_enum_ids) + if (n != cfg_reg->nr_enum_ids) { sh_pfc_err("reg 0x%x: enum_ids[] has %u instead of %u values\n", cfg_reg->reg, cfg_reg->nr_enum_ids, n); + n = cfg_reg->nr_enum_ids; + }
check_enum_ids: sh_pfc_check_reg_enums(drvname, cfg_reg->reg, cfg_reg->enum_ids, n);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
[ Upstream commit f123dc86388cb669c3d6322702dc441abc35c31e ]
syzbot is reporting sleep in atomic context in SysV filesystem [1], for sb_bread() is called with rw_spinlock held.
A "write_lock(&pointers_lock) => read_lock(&pointers_lock) deadlock" bug and a "sb_bread() with write_lock(&pointers_lock)" bug were introduced by "Replace BKL for chain locking with sysvfs-private rwlock" in Linux 2.5.12.
Then, "[PATCH] err1-40: sysvfs locking fix" in Linux 2.6.8 fixed the former bug by moving pointers_lock lock to the callers, but instead introduced a "sb_bread() with read_lock(&pointers_lock)" bug (which made this problem easier to hit).
Al Viro suggested that why not to do like get_branch()/get_block()/ find_shared() in Minix filesystem does. And doing like that is almost a revert of "[PATCH] err1-40: sysvfs locking fix" except that get_branch() from with find_shared() is called without write_lock(&pointers_lock).
Reported-by: syzbot syzbot+69b40dc5fd40f32c199f@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=69b40dc5fd40f32c199f Suggested-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Link: https://lore.kernel.org/r/0d195f93-a22a-49a2-0020-103534d6f7f6@I-love.SAKURA... Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/sysv/itree.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/fs/sysv/itree.c b/fs/sysv/itree.c index e3d1673b8ec97..ef9bcfeec21ad 100644 --- a/fs/sysv/itree.c +++ b/fs/sysv/itree.c @@ -82,9 +82,6 @@ static inline sysv_zone_t *block_end(struct buffer_head *bh) return (sysv_zone_t*)((char*)bh->b_data + bh->b_size); }
-/* - * Requires read_lock(&pointers_lock) or write_lock(&pointers_lock) - */ static Indirect *get_branch(struct inode *inode, int depth, int offsets[], @@ -104,15 +101,18 @@ static Indirect *get_branch(struct inode *inode, bh = sb_bread(sb, block); if (!bh) goto failure; + read_lock(&pointers_lock); if (!verify_chain(chain, p)) goto changed; add_chain(++p, bh, (sysv_zone_t*)bh->b_data + *++offsets); + read_unlock(&pointers_lock); if (!p->key) goto no_block; } return NULL;
changed: + read_unlock(&pointers_lock); brelse(bh); *err = -EAGAIN; goto no_block; @@ -218,9 +218,7 @@ static int get_block(struct inode *inode, sector_t iblock, struct buffer_head *b goto out;
reread: - read_lock(&pointers_lock); partial = get_branch(inode, depth, offsets, chain, &err); - read_unlock(&pointers_lock);
/* Simplest case - block found, no allocation needed */ if (!partial) { @@ -290,9 +288,9 @@ static Indirect *find_shared(struct inode *inode, *top = 0; for (k = depth; k > 1 && !offsets[k-1]; k--) ; + partial = get_branch(inode, k, offsets, chain, &err);
write_lock(&pointers_lock); - partial = get_branch(inode, k, offsets, chain, &err); if (!partial) partial = chain + k-1; /*
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Justin Tee justin.tee@broadcom.com
[ Upstream commit 2ae917d4bcab80ab304b774d492e2fcd6c52c06b ]
The call to lpfc_sli4_resume_rpi() in lpfc_rcv_padisc() may return an unsuccessful status. In such cases, the elsiocb is not issued, the completion is not called, and thus the elsiocb resource is leaked.
Check return value after calling lpfc_sli4_resume_rpi() and conditionally release the elsiocb resource.
Signed-off-by: Justin Tee justin.tee@broadcom.com Link: https://lore.kernel.org/r/20240131185112.149731-3-justintee8345@gmail.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_nportdisc.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_nportdisc.c b/drivers/scsi/lpfc/lpfc_nportdisc.c index 1e22364a31fcf..d6287c58d5045 100644 --- a/drivers/scsi/lpfc/lpfc_nportdisc.c +++ b/drivers/scsi/lpfc/lpfc_nportdisc.c @@ -784,8 +784,10 @@ lpfc_rcv_padisc(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp, /* Save the ELS cmd */ elsiocb->drvrTimeout = cmd;
- lpfc_sli4_resume_rpi(ndlp, - lpfc_mbx_cmpl_resume_rpi, elsiocb); + if (lpfc_sli4_resume_rpi(ndlp, + lpfc_mbx_cmpl_resume_rpi, + elsiocb)) + kfree(elsiocb); goto out; } }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Henrie alexhenrie24@gmail.com
[ Upstream commit 4243bf80c79211a8ca2795401add9c4a3b1d37ca ]
I have a CD copy of the original Tom Clancy's Ghost Recon game from 2001. The disc mounts without error on Windows, but on Linux mounting fails with the message "isofs_fill_super: get root inode failed". The error originates in isofs_read_inode, which returns -EIO because de_len is 0. The superblock on this disc appears to be intentionally corrupt as a form of copy protection.
When the root inode is unusable, instead of giving up immediately, try to continue with the Joliet file table. This fixes the Ghost Recon CD and probably other copy-protected CDs too.
Signed-off-by: Alex Henrie alexhenrie24@gmail.com Signed-off-by: Jan Kara jack@suse.cz Message-Id: 20240208022134.451490-1-alexhenrie24@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/isofs/inode.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c index f62b5a5015668..4c763f573faf3 100644 --- a/fs/isofs/inode.c +++ b/fs/isofs/inode.c @@ -907,8 +907,22 @@ static int isofs_fill_super(struct super_block *s, void *data, int silent) * we then decide whether to use the Joliet descriptor. */ inode = isofs_iget(s, sbi->s_firstdatazone, 0); - if (IS_ERR(inode)) - goto out_no_root; + + /* + * Fix for broken CDs with a corrupt root inode but a correct Joliet + * root directory. + */ + if (IS_ERR(inode)) { + if (joliet_level && sbi->s_firstdatazone != first_data_zone) { + printk(KERN_NOTICE + "ISOFS: root inode is unusable. " + "Disabling Rock Ridge and switching to Joliet."); + sbi->s_rock = 0; + inode = NULL; + } else { + goto out_no_root; + } + }
/* * Fix for broken CDs with Rock Ridge and empty ISO root directory but
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 3de49ae81c3a0f83a554ecbce4c08e019f30168e ]
clang-16 warns about casting incompatible function pointers:
drivers/media/pci/sta2x11/sta2x11_vip.c:1057:6: error: cast from 'irqreturn_t (*)(int, struct sta2x11_vip *)' (aka 'enum irqreturn (*)(int, struct sta2x11_vip *)') to 'irq_handler_t' (aka 'enum irqreturn (*)(int, void *)') converts to incompatible function type [-Werror,-Wcast-function-type-strict]
Change the prototype of the irq handler to the regular version with a local variable to adjust the argument type.
Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl [hverkuil: update argument documentation] Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/pci/sta2x11/sta2x11_vip.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/media/pci/sta2x11/sta2x11_vip.c b/drivers/media/pci/sta2x11/sta2x11_vip.c index 336df65c8af11..01ca940aecc2d 100644 --- a/drivers/media/pci/sta2x11/sta2x11_vip.c +++ b/drivers/media/pci/sta2x11/sta2x11_vip.c @@ -760,7 +760,7 @@ static const struct video_device video_dev_template = { /** * vip_irq - interrupt routine * @irq: Number of interrupt ( not used, correct number is assumed ) - * @vip: local data structure containing all information + * @data: local data structure containing all information * * check for both frame interrupts set ( top and bottom ). * check FIFO overflow, but limit number of log messages after open. @@ -770,8 +770,9 @@ static const struct video_device video_dev_template = { * * IRQ_HANDLED, interrupt done. */ -static irqreturn_t vip_irq(int irq, struct sta2x11_vip *vip) +static irqreturn_t vip_irq(int irq, void *data) { + struct sta2x11_vip *vip = data; unsigned int status;
status = reg_read(vip, DVP_ITS); @@ -1053,9 +1054,7 @@ static int sta2x11_vip_init_one(struct pci_dev *pdev,
spin_lock_init(&vip->slock);
- ret = request_irq(pdev->irq, - (irq_handler_t) vip_irq, - IRQF_SHARED, KBUILD_MODNAME, vip); + ret = request_irq(pdev->irq, vip_irq, IRQF_SHARED, KBUILD_MODNAME, vip); if (ret) { dev_err(&pdev->dev, "request_irq failed\n"); ret = -ENODEV;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit 68ee261fb15457ecb17e3683cb4e6a4792ca5b71 ]
If one group is marked as block bitmap corrupted, its free blocks cannot be used and its free count is also deducted from the global sbi->s_freeclusters_counter. User might be confused about the absent free space because we can't query the information about corrupted block groups except unreliable error messages in syslog. So add a hint to show block bitmap corrupted groups in mb_groups.
Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20240119061154.1525781-1-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/mballoc.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 6ce59e2ac2d47..c1d632725c2c0 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -2591,7 +2591,10 @@ static int ext4_mb_seq_groups_show(struct seq_file *seq, void *v) for (i = 0; i <= 13; i++) seq_printf(seq, " %-5u", i <= blocksize_bits + 1 ? sg.info.bb_counters[i] : 0); - seq_puts(seq, " ]\n"); + seq_puts(seq, " ]"); + if (EXT4_MB_GRP_BBITMAP_CORRUPT(&sg.info)) + seq_puts(seq, " Block bitmap corrupted!"); + seq_puts(seq, "\n");
return 0; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ye Bin yebin10@huawei.com
[ Upstream commit d8b945fa475f13d787df00c26a6dc45a3e2e1d1d ]
There's issue as follows When do IO fault injection test: Quota error (device dm-3): find_block_dqentry: Quota for id 101 referenced but not present Quota error (device dm-3): qtree_read_dquot: Can't read quota structure for id 101 Quota error (device dm-3): do_check_range: Getting block 2021161007 out of range 1-186 Quota error (device dm-3): qtree_read_dquot: Can't read quota structure for id 661
Now, ext4_write_dquot()/ext4_acquire_dquot()/ext4_release_dquot() may commit inconsistent quota data even if process failed. This may lead to filesystem corruption. To ensure filesystem consistent when errors=remount-ro there is need to call ext4_handle_error() to abort journal.
Signed-off-by: Ye Bin yebin10@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20240119062908.3598806-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/super.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c index e386d67cff9d1..0149d3c2cfd78 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -6205,6 +6205,10 @@ static int ext4_write_dquot(struct dquot *dquot) if (IS_ERR(handle)) return PTR_ERR(handle); ret = dquot_commit(dquot); + if (ret < 0) + ext4_error_err(dquot->dq_sb, -ret, + "Failed to commit dquot type %d", + dquot->dq_id.type); err = ext4_journal_stop(handle); if (!ret) ret = err; @@ -6221,6 +6225,10 @@ static int ext4_acquire_dquot(struct dquot *dquot) if (IS_ERR(handle)) return PTR_ERR(handle); ret = dquot_acquire(dquot); + if (ret < 0) + ext4_error_err(dquot->dq_sb, -ret, + "Failed to acquire dquot type %d", + dquot->dq_id.type); err = ext4_journal_stop(handle); if (!ret) ret = err; @@ -6240,6 +6248,10 @@ static int ext4_release_dquot(struct dquot *dquot) return PTR_ERR(handle); } ret = dquot_release(dquot); + if (ret < 0) + ext4_error_err(dquot->dq_sb, -ret, + "Failed to release dquot type %d", + dquot->dq_id.type); err = ext4_journal_stop(handle); if (!ret) ret = err;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aric Cyr aric.cyr@amd.com
[ Upstream commit 14d68acfd04b39f34eea7bea65dda652e6db5bf6 ]
[Why] Nanosec stats can overflow on long running systems potentially causing statistic logging issues.
[How] Use 64bit types for nanosec stats to ensure no overflow.
Reviewed-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Aric Cyr aric.cyr@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/modules/inc/mod_stats.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/modules/inc/mod_stats.h b/drivers/gpu/drm/amd/display/modules/inc/mod_stats.h index 4220fd8fdd60c..54cd86060f4d6 100644 --- a/drivers/gpu/drm/amd/display/modules/inc/mod_stats.h +++ b/drivers/gpu/drm/amd/display/modules/inc/mod_stats.h @@ -57,10 +57,10 @@ void mod_stats_update_event(struct mod_stats *mod_stats, unsigned int length);
void mod_stats_update_flip(struct mod_stats *mod_stats, - unsigned long timestamp_in_ns); + unsigned long long timestamp_in_ns);
void mod_stats_update_vupdate(struct mod_stats *mod_stats, - unsigned long timestamp_in_ns); + unsigned long long timestamp_in_ns);
void mod_stats_update_freesync(struct mod_stats *mod_stats, unsigned int v_total_min,
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dai Ngo dai.ngo@oracle.com
[ Upstream commit 2c35f43b5a4b9cdfaa6fdd946f5a212615dac8eb ]
When the NFS client is under extreme load the rpc_wait_queue.qlen counter can be overflowed. Here is an instant of the backlog queue overflow in a real world environment shown by drgn helper:
rpc_task_stats(rpc_clnt): ------------------------- rpc_clnt: 0xffff92b65d2bae00 rpc_xprt: 0xffff9275db64f000 Queue: sending[64887] pending[524] backlog[30441] binding[0] XMIT task: 0xffff925c6b1d8e98 WRITE: 750654 __dta_call_status_580: 65463 __dta_call_transmit_status_579: 1 call_reserveresult: 685189 nfs_client_init_is_complete: 1 COMMIT: 584 call_reserveresult: 573 __dta_call_status_580: 11 ACCESS: 1 __dta_call_status_580: 1 GETATTR: 10 __dta_call_status_580: 4 call_reserveresult: 6 751249 tasks for server 111.222.333.444 Total tasks: 751249
count_rpc_wait_queues(xprt): ---------------------------- **** rpc_xprt: 0xffff9275db64f000 num_reqs: 65511 wait_queue: xprt_binding[0] cnt: 0 wait_queue: xprt_binding[1] cnt: 0 wait_queue: xprt_binding[2] cnt: 0 wait_queue: xprt_binding[3] cnt: 0 rpc_wait_queue[xprt_binding].qlen: 0 maxpriority: 0 wait_queue: xprt_sending[0] cnt: 0 wait_queue: xprt_sending[1] cnt: 64887 wait_queue: xprt_sending[2] cnt: 0 wait_queue: xprt_sending[3] cnt: 0 rpc_wait_queue[xprt_sending].qlen: 64887 maxpriority: 3 wait_queue: xprt_pending[0] cnt: 524 wait_queue: xprt_pending[1] cnt: 0 wait_queue: xprt_pending[2] cnt: 0 wait_queue: xprt_pending[3] cnt: 0 rpc_wait_queue[xprt_pending].qlen: 524 maxpriority: 0 wait_queue: xprt_backlog[0] cnt: 0 wait_queue: xprt_backlog[1] cnt: 685801 wait_queue: xprt_backlog[2] cnt: 0 wait_queue: xprt_backlog[3] cnt: 0 rpc_wait_queue[xprt_backlog].qlen: 30441 maxpriority: 3 [task cnt mismatch]
There is no effect on operations when this overflow occurs. However it causes confusion when trying to diagnose the performance problem.
Signed-off-by: Dai Ngo dai.ngo@oracle.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/sunrpc/sched.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/sunrpc/sched.h b/include/linux/sunrpc/sched.h index 256dff36cf720..0527a4bc9a36f 100644 --- a/include/linux/sunrpc/sched.h +++ b/include/linux/sunrpc/sched.h @@ -197,7 +197,7 @@ struct rpc_wait_queue { unsigned char maxpriority; /* maximum priority (0 if queue is not a priority queue) */ unsigned char priority; /* current priority */ unsigned char nr; /* # tasks remaining for cookie */ - unsigned short qlen; /* total # tasks waiting in queue */ + unsigned int qlen; /* total # tasks waiting in queue */ struct rpc_timer timer_list; #if IS_ENABLED(CONFIG_SUNRPC_DEBUG) || IS_ENABLED(CONFIG_TRACEPOINTS) const char * name;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Drake drake@endlessos.org
[ Upstream commit cb98555fcd8eee98c30165537c7e394f3a66e809 ]
This reverts commit d52848620de00cde4a3a5df908e231b8c8868250, which was originally put in place to work around a s2idle failure on this platform where the NVMe device was inaccessible upon resume.
After extended testing, we found that the firmware's implementation of S3 is buggy and intermittently fails to wake up the system. We need to revert to s2idle mode.
The NVMe issue has now been solved more precisely in the commit titled "PCI: Disable D3cold on Asus B1400 PCI-NVMe bridge"
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215742 Link: https://lore.kernel.org/r/20240228075316.7404-2-drake@endlessos.org Signed-off-by: Daniel Drake drake@endlessos.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Acked-by: Jian-Hong Pan jhp@endlessos.org Acked-by: Rafael J. Wysocki rafael@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/sleep.c | 12 ------------ 1 file changed, 12 deletions(-)
diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c index 097a5b5f46ab0..e79c004ca0b24 100644 --- a/drivers/acpi/sleep.c +++ b/drivers/acpi/sleep.c @@ -385,18 +385,6 @@ static const struct dmi_system_id acpisleep_dmi_table[] __initconst = { DMI_MATCH(DMI_PRODUCT_NAME, "20GGA00L00"), }, }, - /* - * ASUS B1400CEAE hangs on resume from suspend (see - * https://bugzilla.kernel.org/show_bug.cgi?id=215742). - */ - { - .callback = init_default_s3, - .ident = "ASUS B1400CEAE", - .matches = { - DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."), - DMI_MATCH(DMI_PRODUCT_NAME, "ASUS EXPERTBOOK B1400CEAE"), - }, - }, {}, };
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Rogers irogers@google.com
[ Upstream commit 1947b92464c3268381604bbe2ac977a3fd78192f ]
Parallel testing appears to show a race between allocating and setting evsel ids. As there is a bounds check on the xyarray it yields a segv like:
``` AddressSanitizer:DEADLYSIGNAL
=================================================================
==484408==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010
==484408==The signal is caused by a WRITE memory access.
==484408==Hint: address points to the zero page.
#0 0x55cef5d4eff4 in perf_evlist__id_hash tools/lib/perf/evlist.c:256 #1 0x55cef5d4f132 in perf_evlist__id_add tools/lib/perf/evlist.c:274 #2 0x55cef5d4f545 in perf_evlist__id_add_fd tools/lib/perf/evlist.c:315 #3 0x55cef5a1923f in store_evsel_ids util/evsel.c:3130 #4 0x55cef5a19400 in evsel__store_ids util/evsel.c:3147 #5 0x55cef5888204 in __run_perf_stat tools/perf/builtin-stat.c:832 #6 0x55cef5888c06 in run_perf_stat tools/perf/builtin-stat.c:960 #7 0x55cef58932db in cmd_stat tools/perf/builtin-stat.c:2878 ... ```
Avoid this crash by early exiting the perf_evlist__id_add_fd and perf_evlist__id_add is the access is out-of-bounds.
Signed-off-by: Ian Rogers irogers@google.com Cc: Yang Jihong yangjihong1@huawei.com Signed-off-by: Namhyung Kim namhyung@kernel.org Link: https://lore.kernel.org/r/20240229070757.796244-1-irogers@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/perf/evlist.c | 18 ++++++++++++------ tools/lib/perf/include/internal/evlist.h | 4 ++-- 2 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c index f76b1a9d5a6e1..53cff32b2cb80 100644 --- a/tools/lib/perf/evlist.c +++ b/tools/lib/perf/evlist.c @@ -226,10 +226,10 @@ u64 perf_evlist__read_format(struct perf_evlist *evlist)
static void perf_evlist__id_hash(struct perf_evlist *evlist, struct perf_evsel *evsel, - int cpu, int thread, u64 id) + int cpu_map_idx, int thread, u64 id) { int hash; - struct perf_sample_id *sid = SID(evsel, cpu, thread); + struct perf_sample_id *sid = SID(evsel, cpu_map_idx, thread);
sid->id = id; sid->evsel = evsel; @@ -239,21 +239,27 @@ static void perf_evlist__id_hash(struct perf_evlist *evlist,
void perf_evlist__id_add(struct perf_evlist *evlist, struct perf_evsel *evsel, - int cpu, int thread, u64 id) + int cpu_map_idx, int thread, u64 id) { - perf_evlist__id_hash(evlist, evsel, cpu, thread, id); + if (!SID(evsel, cpu_map_idx, thread)) + return; + + perf_evlist__id_hash(evlist, evsel, cpu_map_idx, thread, id); evsel->id[evsel->ids++] = id; }
int perf_evlist__id_add_fd(struct perf_evlist *evlist, struct perf_evsel *evsel, - int cpu, int thread, int fd) + int cpu_map_idx, int thread, int fd) { u64 read_data[4] = { 0, }; int id_idx = 1; /* The first entry is the counter value */ u64 id; int ret;
+ if (!SID(evsel, cpu_map_idx, thread)) + return -1; + ret = ioctl(fd, PERF_EVENT_IOC_ID, &id); if (!ret) goto add; @@ -282,7 +288,7 @@ int perf_evlist__id_add_fd(struct perf_evlist *evlist, id = read_data[id_idx];
add: - perf_evlist__id_add(evlist, evsel, cpu, thread, id); + perf_evlist__id_add(evlist, evsel, cpu_map_idx, thread, id); return 0; }
diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h index 2d0fa02b036f6..8999f2cc8ee44 100644 --- a/tools/lib/perf/include/internal/evlist.h +++ b/tools/lib/perf/include/internal/evlist.h @@ -118,10 +118,10 @@ u64 perf_evlist__read_format(struct perf_evlist *evlist);
void perf_evlist__id_add(struct perf_evlist *evlist, struct perf_evsel *evsel, - int cpu, int thread, u64 id); + int cpu_map_idx, int thread, u64 id);
int perf_evlist__id_add_fd(struct perf_evlist *evlist, struct perf_evsel *evsel, - int cpu, int thread, int fd); + int cpu_map_idx, int thread, int fd);
#endif /* __LIBPERF_INTERNAL_EVLIST_H */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roman Smirnov r.smirnov@omp.ru
[ Upstream commit 93f52fbeaf4b676b21acfe42a5152620e6770d02 ]
The expression dst->nr_samples + src->nr_samples may have zero value on overflow. It is necessary to add a check to avoid division by zero.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Signed-off-by: Roman Smirnov r.smirnov@omp.ru Reviewed-by: Sergey Shtylyov s.shtylyov@omp.ru Link: https://lore.kernel.org/r/20240305134509.23108-1-r.smirnov@omp.ru Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-stat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-stat.c b/block/blk-stat.c index ae3dd1fb8e61d..6e602f9b966e4 100644 --- a/block/blk-stat.c +++ b/block/blk-stat.c @@ -28,7 +28,7 @@ void blk_rq_stat_init(struct blk_rq_stat *stat) /* src is a per-cpu stat, mean isn't initialized */ void blk_rq_stat_sum(struct blk_rq_stat *dst, struct blk_rq_stat *src) { - if (!src->nr_samples) + if (dst->nr_samples + src->nr_samples <= dst->nr_samples) return;
dst->min = min(dst->min, src->min);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Manjunath Patil manjunath.b.patil@oracle.com
[ Upstream commit 96d9cbe2f2ff7abde021bac75eafaceabe9a51fa ]
Add timeout to cm_destroy_id, so that userspace can trigger any data collection that would help in analyzing the cause of delay in destroying the cm_id.
New noinline function helps dtrace/ebpf programs to hook on to it. Existing functionality isn't changed except triggering a probe-able new function at every timeout interval.
We have seen cases where CM messages stuck with MAD layer (either due to software bug or faulty HCA), leading to cm_id getting stuck in the following call stack. This patch helps in resolving such issues faster.
kernel: ... INFO: task XXXX:56778 blocked for more than 120 seconds. ... Call Trace: __schedule+0x2bc/0x895 schedule+0x36/0x7c schedule_timeout+0x1f6/0x31f ? __slab_free+0x19c/0x2ba wait_for_completion+0x12b/0x18a ? wake_up_q+0x80/0x73 cm_destroy_id+0x345/0x610 [ib_cm] ib_destroy_cm_id+0x10/0x20 [ib_cm] rdma_destroy_id+0xa8/0x300 [rdma_cm] ucma_destroy_id+0x13e/0x190 [rdma_ucm] ucma_write+0xe0/0x160 [rdma_ucm] __vfs_write+0x3a/0x16d vfs_write+0xb2/0x1a1 ? syscall_trace_enter+0x1ce/0x2b8 SyS_write+0x5c/0xd3 do_syscall_64+0x79/0x1b9 entry_SYSCALL_64_after_hwframe+0x16d/0x0
Signed-off-by: Manjunath Patil manjunath.b.patil@oracle.com Link: https://lore.kernel.org/r/20240309063323.458102-1-manjunath.b.patil@oracle.c... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/cm.c | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c index db1a25fbe2fa9..2a30b25c5e7e5 100644 --- a/drivers/infiniband/core/cm.c +++ b/drivers/infiniband/core/cm.c @@ -33,6 +33,7 @@ MODULE_AUTHOR("Sean Hefty"); MODULE_DESCRIPTION("InfiniBand CM"); MODULE_LICENSE("Dual BSD/GPL");
+#define CM_DESTROY_ID_WAIT_TIMEOUT 10000 /* msecs */ static const char * const ibcm_rej_reason_strs[] = { [IB_CM_REJ_NO_QP] = "no QP", [IB_CM_REJ_NO_EEC] = "no EEC", @@ -1056,10 +1057,20 @@ static void cm_reset_to_idle(struct cm_id_private *cm_id_priv) } }
+static noinline void cm_destroy_id_wait_timeout(struct ib_cm_id *cm_id) +{ + struct cm_id_private *cm_id_priv; + + cm_id_priv = container_of(cm_id, struct cm_id_private, id); + pr_err("%s: cm_id=%p timed out. state=%d refcnt=%d\n", __func__, + cm_id, cm_id->state, refcount_read(&cm_id_priv->refcount)); +} + static void cm_destroy_id(struct ib_cm_id *cm_id, int err) { struct cm_id_private *cm_id_priv; struct cm_work *work; + int ret;
cm_id_priv = container_of(cm_id, struct cm_id_private, id); spin_lock_irq(&cm_id_priv->lock); @@ -1171,7 +1182,14 @@ static void cm_destroy_id(struct ib_cm_id *cm_id, int err)
xa_erase(&cm.local_id_table, cm_local_id(cm_id->local_id)); cm_deref_id(cm_id_priv); - wait_for_completion(&cm_id_priv->comp); + do { + ret = wait_for_completion_timeout(&cm_id_priv->comp, + msecs_to_jiffies( + CM_DESTROY_ID_WAIT_TIMEOUT)); + if (!ret) /* timeout happened */ + cm_destroy_id_wait_timeout(cm_id); + } while (!ret); + while ((work = cm_dequeue_work(cm_id_priv)) != NULL) cm_free_work(work);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gergo Koteles soyer@irl.hu
[ Upstream commit cfeb98b95fff25c442f78a6f616c627bc48a26b7 ]
Newer Lenovo Yogas and Legions with 60Hz/90Hz displays send a wmi event when Fn + R is pressed. This is intended for use to switch between the two refresh rates.
Allocate a new KEY_REFRESH_RATE_TOGGLE keycode for it.
Signed-off-by: Gergo Koteles soyer@irl.hu Acked-by: Dmitry Torokhov dmitry.torokhov@gmail.com Link: https://lore.kernel.org/r/15a5d08c84cf4d7b820de34ebbcf8ae2502fb3ca.171006575... Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/uapi/linux/input-event-codes.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/include/uapi/linux/input-event-codes.h b/include/uapi/linux/input-event-codes.h index 7989d9483ea75..bed20a89c14c1 100644 --- a/include/uapi/linux/input-event-codes.h +++ b/include/uapi/linux/input-event-codes.h @@ -602,6 +602,7 @@
#define KEY_ALS_TOGGLE 0x230 /* Ambient light sensor */ #define KEY_ROTATE_LOCK_TOGGLE 0x231 /* Display rotation lock */ +#define KEY_REFRESH_RATE_TOGGLE 0x232 /* Display refresh rate toggle */
#define KEY_BUTTONCONFIG 0x240 /* AL Button Configuration */ #define KEY_TASKMANAGER 0x241 /* AL Task/Project Manager */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alban Boyé alban.boye@protonmail.com
[ Upstream commit 1266e2efb7512dbf20eac820ca2ed34de6b1c3e7 ]
Signed-off-by: Alban Boyé alban.boye@protonmail.com Link: https://lore.kernel.org/r/20240227223919.11587-1-alban.boye@protonmail.com Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/touchscreen_dmi.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/platform/x86/touchscreen_dmi.c b/drivers/platform/x86/touchscreen_dmi.c index ebe959db1eeb9..fbaa618594628 100644 --- a/drivers/platform/x86/touchscreen_dmi.c +++ b/drivers/platform/x86/touchscreen_dmi.c @@ -1084,6 +1084,15 @@ const struct dmi_system_id touchscreen_dmi_table[] = { DMI_MATCH(DMI_BIOS_VERSION, "CHUWI.D86JLBNR"), }, }, + { + /* Chuwi Vi8 dual-boot (CWI506) */ + .driver_data = (void *)&chuwi_vi8_data, + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Insyde"), + DMI_MATCH(DMI_PRODUCT_NAME, "i86"), + DMI_MATCH(DMI_BIOS_VERSION, "CHUWI2.D86JHBNR02"), + }, + }, { /* Chuwi Vi8 Plus (CWI519) */ .driver_data = (void *)&chuwi_vi8_plus_data,
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ricardo B. Marliere ricardo@marliere.net
[ Upstream commit 07283c1873a4d0eaa0e822536881bfdaea853910 ]
The test type "make_warnings_file" should have no mandatory configuration parameters other than the ones required by the "build" test type, because its purpose is to create a file with build warnings that may or may not be used by other subsequent tests. Currently, the only way to use it as a stand-alone test is by setting POWER_CYCLE, CONSOLE, SSH_USER, BUILD_TARGET, TARGET_IMAGE, REBOOT_TYPE and GRUB_MENU.
Link: https://lkml.kernel.org/r/20240315-ktest-v2-1-c5c20a75f6a3@marliere.net
Cc: John Hawley warthog9@eaglescrag.net Signed-off-by: Ricardo B. Marliere ricardo@marliere.net Signed-off-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/ktest/ktest.pl | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/testing/ktest/ktest.pl b/tools/testing/ktest/ktest.pl index ea26f2b0c1bc2..f72da30795dd6 100755 --- a/tools/testing/ktest/ktest.pl +++ b/tools/testing/ktest/ktest.pl @@ -773,6 +773,7 @@ sub set_value { if ($lvalue =~ /^(TEST|BISECT|CONFIG_BISECT)_TYPE([.*])?$/ && $prvalue !~ /^(config_|)bisect$/ && $prvalue !~ /^build$/ && + $prvalue !~ /^make_warnings_file$/ && $buildonly) {
# Note if a test is something other than build, then we
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: linke li lilinke99@qq.com
[ Upstream commit f1e30cb6369251c03f63c564006f96a54197dcc4 ]
In function ring_buffer_iter_empty(), cpu_buffer->commit_page is read while other threads may change it. It may cause the time_stamp that read in the next line come from a different page. Use READ_ONCE() to avoid having to reason about compiler optimizations now and in future.
Link: https://lore.kernel.org/linux-trace-kernel/tencent_DFF7D3561A0686B5E8FC07915...
Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Signed-off-by: linke li lilinke99@qq.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/ring_buffer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index fb87def73a3a4..2df8e13a29e57 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -4211,7 +4211,7 @@ int ring_buffer_iter_empty(struct ring_buffer_iter *iter) cpu_buffer = iter->cpu_buffer; reader = cpu_buffer->reader_page; head_page = cpu_buffer->head_page; - commit_page = cpu_buffer->commit_page; + commit_page = READ_ONCE(cpu_buffer->commit_page); commit_ts = commit_page->page->time_stamp;
/*
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Petre Rodan petre.rodan@subdimension.ro
[ Upstream commit 4e6500bfa053dc133021f9c144261b77b0ba7dc8 ]
Replace seekdir() with rewinddir() in order to fix a localized glibc bug.
One of the glibc patches that stable Gentoo is using causes an improper directory stream positioning bug on 32bit arm. That in turn ends up as a floating point exception in iio_generic_buffer.
The attached patch provides a fix by using an equivalent function which should not cause trouble for other distros and is easier to reason about in general as it obviously always goes back to to the start.
https://sourceware.org/bugzilla/show_bug.cgi?id=31212
Signed-off-by: Petre Rodan petre.rodan@subdimension.ro Link: https://lore.kernel.org/r/20240108103224.3986-1-petre.rodan@subdimension.ro Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/iio/iio_utils.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/iio/iio_utils.c b/tools/iio/iio_utils.c index 48360994c2a13..b8745873928c5 100644 --- a/tools/iio/iio_utils.c +++ b/tools/iio/iio_utils.c @@ -373,7 +373,7 @@ int build_channel_array(const char *device_dir, goto error_close_dir; }
- seekdir(dp, 0); + rewinddir(dp); while (ent = readdir(dp), ent) { if (strcmp(ent->d_name + strlen(ent->d_name) - strlen("_en"), "_en") == 0) {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marco Felsch m.felsch@pengutronix.de
[ Upstream commit 8774ea7a553e2aec323170d49365b59af0a2b7e0 ]
The driver already support the tcpci binding for the i2c_device_id so add the support for the of_device_id too.
Signed-off-by: Marco Felsch m.felsch@pengutronix.de Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20240222210903.208901-3-m.felsch@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/typec/tcpm/tcpci.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/usb/typec/tcpm/tcpci.c b/drivers/usb/typec/tcpm/tcpci.c index e34e46df80243..33c67adf7c67a 100644 --- a/drivers/usb/typec/tcpm/tcpci.c +++ b/drivers/usb/typec/tcpm/tcpci.c @@ -732,6 +732,7 @@ MODULE_DEVICE_TABLE(i2c, tcpci_id); #ifdef CONFIG_OF static const struct of_device_id tcpci_of_match[] = { { .compatible = "nxp,ptn5110", }, + { .compatible = "tcpci", }, {}, }; MODULE_DEVICE_TABLE(of, tcpci_of_match);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Colin Ian King colin.i.king@gmail.com
[ Upstream commit 12f371e2b6cb4b79c788f1f073992e115f4ca918 ]
Function checkdone is only required if QUIRK2 is defined, so add appropriate #if / #endif around the function.
Cleans up clang scan build warning: drivers/usb/host/sl811-hcd.c:588:18: warning: unused function 'checkdone' [-Wunused-function]
Signed-off-by: Colin Ian King colin.i.king@gmail.com Link: https://lore.kernel.org/r/20240307111351.1982382-1-colin.i.king@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/host/sl811-hcd.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/usb/host/sl811-hcd.c b/drivers/usb/host/sl811-hcd.c index 9465fce99c822..f803079a9f263 100644 --- a/drivers/usb/host/sl811-hcd.c +++ b/drivers/usb/host/sl811-hcd.c @@ -585,6 +585,7 @@ done(struct sl811 *sl811, struct sl811h_ep *ep, u8 bank) finish_request(sl811, ep, urb, urbstat); }
+#ifdef QUIRK2 static inline u8 checkdone(struct sl811 *sl811) { u8 ctl; @@ -616,6 +617,7 @@ static inline u8 checkdone(struct sl811 *sl811) #endif return irqstat; } +#endif
static irqreturn_t sl811h_irq(struct usb_hcd *hcd) {
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandr Burakov a.burakov@rosalinux.ru
[ Upstream commit bc87bb342f106a0402186bcb588fcbe945dced4b ]
There are some actions with value 'tmp' but 'dst_addr' is checked instead. It is obvious that a copy-paste error was made here and the value of variable 'tmp' should be checked here.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Aleksandr Burakov a.burakov@rosalinux.ru Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/video/fbdev/via/accel.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/video/fbdev/via/accel.c b/drivers/video/fbdev/via/accel.c index 0a1bc7a4d7853..1e04026f08091 100644 --- a/drivers/video/fbdev/via/accel.c +++ b/drivers/video/fbdev/via/accel.c @@ -115,7 +115,7 @@ static int hw_bitblt_1(void __iomem *engine, u8 op, u32 width, u32 height,
if (op != VIA_BITBLT_FILL) { tmp = src_mem ? 0 : src_addr; - if (dst_addr & 0xE0000007) { + if (tmp & 0xE0000007) { printk(KERN_WARNING "hw_bitblt_1: Unsupported source " "address %X\n", tmp); return -EINVAL; @@ -260,7 +260,7 @@ static int hw_bitblt_2(void __iomem *engine, u8 op, u32 width, u32 height, writel(tmp, engine + 0x18);
tmp = src_mem ? 0 : src_addr; - if (dst_addr & 0xE0000007) { + if (tmp & 0xE0000007) { printk(KERN_WARNING "hw_bitblt_2: Unsupported source " "address %X\n", tmp); return -EINVAL;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiawei Fu (iBug) i@ibugone.com
[ Upstream commit e89086c43f0500bc7c4ce225495b73b8ce234c1f ]
This commit adds NVME_QUIRK_NO_DEEPEST_PS and NVME_QUIRK_BOGUS_NID for device [126f:2262], which appears to be a generic VID:PID pair used for many SSDs based on the Silicon Motion SM2262/SM2262EN controller.
Two of my SSDs with this VID:PID pair exhibit the same behavior:
* They frequently have trouble exiting the deepest power state (5), resulting in the entire disk unresponsive. Verified by setting nvme_core.default_ps_max_latency_us=10000 and observing them behaving normally. * They produce all-zero nguid and eui64 with `nvme id-ns` command.
The offending products are:
* HP SSD EX950 1TB * HIKVISION C2000Pro 2TB
Signed-off-by: Jiawei Fu i@ibugone.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/pci.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 970a1b374a669..5242feda5471a 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -3199,6 +3199,9 @@ static const struct pci_device_id nvme_id_table[] = { NVME_QUIRK_BOGUS_NID, }, { PCI_VDEVICE(REDHAT, 0x0010), /* Qemu emulated controller */ .driver_data = NVME_QUIRK_BOGUS_NID, }, + { PCI_DEVICE(0x126f, 0x2262), /* Silicon Motion generic */ + .driver_data = NVME_QUIRK_NO_DEEPEST_PS | + NVME_QUIRK_BOGUS_NID, }, { PCI_DEVICE(0x126f, 0x2263), /* Silicon Motion unidentified */ .driver_data = NVME_QUIRK_NO_NS_DESC_LIST, }, { PCI_DEVICE(0x1bb1, 0x0100), /* Seagate Nytro Flash Storage */
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roman Smirnov r.smirnov@omp.ru
[ Upstream commit c2d953276b8b27459baed1277a4fdd5dd9bd4126 ]
The expression htotal * vtotal can have a zero value on overflow. It is necessary to prevent division by zero like in fb_var_to_videomode().
Found by Linux Verification Center (linuxtesting.org) with Svace.
Signed-off-by: Roman Smirnov r.smirnov@omp.ru Reviewed-by: Sergey Shtylyov s.shtylyov@omp.ru Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/video/fbdev/core/fbmon.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/video/fbdev/core/fbmon.c b/drivers/video/fbdev/core/fbmon.c index 1bf82dbc9e3cf..3c29a5eb43805 100644 --- a/drivers/video/fbdev/core/fbmon.c +++ b/drivers/video/fbdev/core/fbmon.c @@ -1311,7 +1311,7 @@ int fb_get_mode(int flags, u32 val, struct fb_var_screeninfo *var, struct fb_inf int fb_videomode_from_videomode(const struct videomode *vm, struct fb_videomode *fbmode) { - unsigned int htotal, vtotal; + unsigned int htotal, vtotal, total;
fbmode->xres = vm->hactive; fbmode->left_margin = vm->hback_porch; @@ -1344,8 +1344,9 @@ int fb_videomode_from_videomode(const struct videomode *vm, vtotal = vm->vactive + vm->vfront_porch + vm->vback_porch + vm->vsync_len; /* prevent division by zero */ - if (htotal && vtotal) { - fbmode->refresh = vm->pixelclock / (htotal * vtotal); + total = htotal * vtotal; + if (total) { + fbmode->refresh = vm->pixelclock / total; /* a mode must have htotal and vtotal != 0 or it is invalid */ } else { fbmode->refresh = 0;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
commit a45e6889575c2067d3c0212b6bc1022891e65b91 upstream.
Unlike early commit path stage which triggers a call to abort, an explicit release of the batch is required on abort, otherwise mutex is released and commit_list remains in place.
Add WARN_ON_ONCE to ensure commit_list is empty from the abort path before releasing the mutex.
After this patch, commit_list is always assumed to be empty before grabbing the mutex, therefore
03c1f1ef1584 ("netfilter: Cleanup nft_net->module_list from nf_tables_exit_net()")
only needs to release the pending modules for registration.
Cc: stable@vger.kernel.org Fixes: c0391b6ab810 ("netfilter: nf_tables: missing validation from the abort path") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index dd440d7910a62..6c5de620da9f1 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -8800,10 +8800,11 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) struct nft_trans *trans, *next; LIST_HEAD(set_update_list); struct nft_trans_elem *te; + int err = 0;
if (action == NFNL_ABORT_VALIDATE && nf_tables_validate(net) < 0) - return -EAGAIN; + err = -EAGAIN;
list_for_each_entry_safe_reverse(trans, next, &nft_net->commit_list, list) { @@ -8968,7 +8969,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) else nf_tables_module_autoload_cleanup(net);
- return 0; + return err; }
static int nf_tables_abort(struct net *net, struct sk_buff *skb, @@ -8982,6 +8983,8 @@ static int nf_tables_abort(struct net *net, struct sk_buff *skb, ret = __nf_tables_abort(net, action); nft_gc_seq_end(nft_net, gc_seq);
+ WARN_ON_ONCE(!list_empty(&nft_net->commit_list)); + mutex_unlock(&nft_net->commit_mutex);
return ret; @@ -9716,8 +9719,11 @@ static void __net_exit nf_tables_exit_net(struct net *net)
gc_seq = nft_gc_seq_begin(nft_net);
- if (!list_empty(&nft_net->commit_list)) - __nf_tables_abort(net, NFNL_ABORT_NONE); + WARN_ON_ONCE(!list_empty(&nft_net->commit_list)); + + if (!list_empty(&nft_net->module_list)) + nf_tables_module_autoload_cleanup(net); + __nft_release_tables(net);
nft_gc_seq_end(nft_net, gc_seq);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
commit 0d459e2ffb541841714839e8228b845458ed3b27 upstream.
The commit mutex should not be released during the critical section between nft_gc_seq_begin() and nft_gc_seq_end(), otherwise, async GC worker could collect expired objects and get the released commit lock within the same GC sequence.
nf_tables_module_autoload() temporarily releases the mutex to load module dependencies, then it goes back to replay the transaction again. Move it at the end of the abort phase after nft_gc_seq_end() is called.
Cc: stable@vger.kernel.org Fixes: 720344340fb9 ("netfilter: nf_tables: GC transaction race with abort path") Reported-by: Kuan-Ting Chen hexrabbit@devco.re Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 6c5de620da9f1..ef271628975a9 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -8964,11 +8964,6 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) nf_tables_abort_release(trans); }
- if (action == NFNL_ABORT_AUTOLOAD) - nf_tables_module_autoload(net); - else - nf_tables_module_autoload_cleanup(net); - return err; }
@@ -8985,6 +8980,14 @@ static int nf_tables_abort(struct net *net, struct sk_buff *skb,
WARN_ON_ONCE(!list_empty(&nft_net->commit_list));
+ /* module autoload needs to happen after GC sequence update because it + * temporarily releases and grabs mutex again. + */ + if (action == NFNL_ABORT_AUTOLOAD) + nf_tables_module_autoload(net); + else + nf_tables_module_autoload_cleanup(net); + mutex_unlock(&nft_net->commit_mutex);
return ret;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
commit 1bc83a019bbe268be3526406245ec28c2458a518 upstream.
Hook unregistration is deferred to the commit phase, same occurs with hook updates triggered by the table dormant flag. When both commands are combined, this results in deleting a basechain while leaving its hook still registered in the core.
Fixes: 179d9ba5559a ("netfilter: nf_tables: fix table flag updates") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index ef271628975a9..ab7f7e45b9846 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -1084,6 +1084,24 @@ static void nf_tables_table_disable(struct net *net, struct nft_table *table) #define __NFT_TABLE_F_UPDATE (__NFT_TABLE_F_WAS_DORMANT | \ __NFT_TABLE_F_WAS_AWAKEN)
+static bool nft_table_pending_update(const struct nft_ctx *ctx) +{ + struct nftables_pernet *nft_net = net_generic(ctx->net, nf_tables_net_id); + struct nft_trans *trans; + + if (ctx->table->flags & __NFT_TABLE_F_UPDATE) + return true; + + list_for_each_entry(trans, &nft_net->commit_list, list) { + if (trans->ctx.table == ctx->table && + trans->msg_type == NFT_MSG_DELCHAIN && + nft_is_base_chain(trans->ctx.chain)) + return true; + } + + return false; +} + static int nf_tables_updtable(struct nft_ctx *ctx) { struct nft_trans *trans; @@ -1101,7 +1119,7 @@ static int nf_tables_updtable(struct nft_ctx *ctx) return 0;
/* No dormant off/on/off/on games in single transaction */ - if (ctx->table->flags & __NFT_TABLE_F_UPDATE) + if (nft_table_pending_update(ctx)) return -EINVAL;
trans = nft_trans_alloc(ctx, NFT_MSG_NEWTABLE,
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thadeu Lima de Souza Cascardo cascardo@canonical.com
commit 67c37756898a5a6b2941a13ae7260c89b54e0d88 upstream.
Any unprivileged user can attach N_GSM0710 ldisc, but it requires CAP_NET_ADMIN to create a GSM network anyway.
Require initial namespace CAP_NET_ADMIN to do that.
Signed-off-by: Thadeu Lima de Souza Cascardo cascardo@canonical.com Link: https://lore.kernel.org/r/20230731185942.279611-1-cascardo@canonical.com From: Salvatore Bonaccorso carnil@debian.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/n_gsm.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/tty/n_gsm.c +++ b/drivers/tty/n_gsm.c @@ -2661,6 +2661,9 @@ static int gsmld_open(struct tty_struct { struct gsm_mux *gsm;
+ if (!capable(CAP_NET_ADMIN)) + return -EPERM; + if (tty->ops->write == NULL) return -EINVAL;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Hildenbrand david@redhat.com
commit 310227f42882c52356b523e2f4e11690eebcd2ab upstream.
Currently, we don't reenable the config if freezing the device failed.
For example, virtio-mem currently doesn't support suspend+resume, and trying to freeze the device will always fail. Afterwards, the device will no longer respond to resize requests, because it won't get notified about config changes.
Let's fix this by re-enabling the config if freezing fails.
Fixes: 22b7050a024d ("virtio: defer config changed notifications") Cc: stable@kernel.org Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Jason Wang jasowang@redhat.com Cc: Xuan Zhuo xuanzhuo@linux.alibaba.com Signed-off-by: David Hildenbrand david@redhat.com Message-Id: 20240213135425.795001-1-david@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/virtio/virtio.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -403,13 +403,19 @@ EXPORT_SYMBOL_GPL(unregister_virtio_devi int virtio_device_freeze(struct virtio_device *dev) { struct virtio_driver *drv = drv_to_virtio(dev->dev.driver); + int ret;
virtio_config_disable(dev);
dev->failed = dev->config->get_status(dev) & VIRTIO_CONFIG_S_FAILED;
- if (drv && drv->freeze) - return drv->freeze(dev); + if (drv && drv->freeze) { + ret = drv->freeze(dev); + if (ret) { + virtio_config_enable(dev); + return ret; + } + }
return 0; }
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Hildenbrand david@redhat.com
commit 04c35ab3bdae7fefbd7c7a7355f29fa03a035221 upstream.
PAT handling won't do the right thing in COW mappings: the first PTE (or, in fact, all PTEs) can be replaced during write faults to point at anon folios. Reliably recovering the correct PFN and cachemode using follow_phys() from PTEs will not work in COW mappings.
Using follow_phys(), we might just get the address+protection of the anon folio (which is very wrong), or fail on swap/nonswap entries, failing follow_phys() and triggering a WARN_ON_ONCE() in untrack_pfn() and track_pfn_copy(), not properly calling free_pfn_range().
In free_pfn_range(), we either wouldn't call memtype_free() or would call it with the wrong range, possibly leaking memory.
To fix that, let's update follow_phys() to refuse returning anon folios, and fallback to using the stored PFN inside vma->vm_pgoff for COW mappings if we run into that.
We will now properly handle untrack_pfn() with COW mappings, where we don't need the cachemode. We'll have to fail fork()->track_pfn_copy() if the first page was replaced by an anon folio, though: we'd have to store the cachemode in the VMA to make this work, likely growing the VMA size.
For now, lets keep it simple and let track_pfn_copy() just fail in that case: it would have failed in the past with swap/nonswap entries already, and it would have done the wrong thing with anon folios.
Simple reproducer to trigger the WARN_ON_ONCE() in untrack_pfn():
<--- C reproducer ---> #include <stdio.h> #include <sys/mman.h> #include <unistd.h> #include <liburing.h>
int main(void) { struct io_uring_params p = {}; int ring_fd; size_t size; char *map;
ring_fd = io_uring_setup(1, &p); if (ring_fd < 0) { perror("io_uring_setup"); return 1; } size = p.sq_off.array + p.sq_entries * sizeof(unsigned);
/* Map the submission queue ring MAP_PRIVATE */ map = mmap(0, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, ring_fd, IORING_OFF_SQ_RING); if (map == MAP_FAILED) { perror("mmap"); return 1; }
/* We have at least one page. Let's COW it. */ *map = 0; pause(); return 0; } <--- C reproducer --->
On a system with 16 GiB RAM and swap configured: # ./iouring & # memhog 16G # killall iouring [ 301.552930] ------------[ cut here ]------------ [ 301.553285] WARNING: CPU: 7 PID: 1402 at arch/x86/mm/pat/memtype.c:1060 untrack_pfn+0xf4/0x100 [ 301.553989] Modules linked in: binfmt_misc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_g [ 301.558232] CPU: 7 PID: 1402 Comm: iouring Not tainted 6.7.5-100.fc38.x86_64 #1 [ 301.558772] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebu4 [ 301.559569] RIP: 0010:untrack_pfn+0xf4/0x100 [ 301.559893] Code: 75 c4 eb cf 48 8b 43 10 8b a8 e8 00 00 00 3b 6b 28 74 b8 48 8b 7b 30 e8 ea 1a f7 000 [ 301.561189] RSP: 0018:ffffba2c0377fab8 EFLAGS: 00010282 [ 301.561590] RAX: 00000000ffffffea RBX: ffff9208c8ce9cc0 RCX: 000000010455e047 [ 301.562105] RDX: 07fffffff0eb1e0a RSI: 0000000000000000 RDI: ffff9208c391d200 [ 301.562628] RBP: 0000000000000000 R08: ffffba2c0377fab8 R09: 0000000000000000 [ 301.563145] R10: ffff9208d2292d50 R11: 0000000000000002 R12: 00007fea890e0000 [ 301.563669] R13: 0000000000000000 R14: ffffba2c0377fc08 R15: 0000000000000000 [ 301.564186] FS: 0000000000000000(0000) GS:ffff920c2fbc0000(0000) knlGS:0000000000000000 [ 301.564773] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 301.565197] CR2: 00007fea88ee8a20 CR3: 00000001033a8000 CR4: 0000000000750ef0 [ 301.565725] PKRU: 55555554 [ 301.565944] Call Trace: [ 301.566148] <TASK> [ 301.566325] ? untrack_pfn+0xf4/0x100 [ 301.566618] ? __warn+0x81/0x130 [ 301.566876] ? untrack_pfn+0xf4/0x100 [ 301.567163] ? report_bug+0x171/0x1a0 [ 301.567466] ? handle_bug+0x3c/0x80 [ 301.567743] ? exc_invalid_op+0x17/0x70 [ 301.568038] ? asm_exc_invalid_op+0x1a/0x20 [ 301.568363] ? untrack_pfn+0xf4/0x100 [ 301.568660] ? untrack_pfn+0x65/0x100 [ 301.568947] unmap_single_vma+0xa6/0xe0 [ 301.569247] unmap_vmas+0xb5/0x190 [ 301.569532] exit_mmap+0xec/0x340 [ 301.569801] __mmput+0x3e/0x130 [ 301.570051] do_exit+0x305/0xaf0 ...
Link: https://lkml.kernel.org/r/20240403212131.929421-3-david@redhat.com Signed-off-by: David Hildenbrand david@redhat.com Reported-by: Wupeng Ma mawupeng1@huawei.com Closes: https://lkml.kernel.org/r/20240227122814.3781907-1-mawupeng1@huawei.com Fixes: b1a86e15dc03 ("x86, pat: remove the dependency on 'vm_pgoff' in track/untrack pfn vma routines") Fixes: 5899329b1910 ("x86: PAT: implement track/untrack of pfnmap regions for x86 - v3") Acked-by: Ingo Molnar mingo@kernel.org Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Borislav Petkov bp@alien8.de Cc: "H. Peter Anvin" hpa@zytor.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: David Hildenbrand david@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/mm/pat/memtype.c | 50 +++++++++++++++++++++++++++++++++------------- mm/memory.c | 4 +++ 2 files changed, 40 insertions(+), 14 deletions(-)
--- a/arch/x86/mm/pat/memtype.c +++ b/arch/x86/mm/pat/memtype.c @@ -56,6 +56,7 @@
#include "memtype.h" #include "../mm_internal.h" +#include "../../../mm/internal.h" /* is_cow_mapping() */
#undef pr_fmt #define pr_fmt(fmt) "" fmt @@ -987,6 +988,38 @@ static void free_pfn_range(u64 paddr, un memtype_free(paddr, paddr + size); }
+static int get_pat_info(struct vm_area_struct *vma, resource_size_t *paddr, + pgprot_t *pgprot) +{ + unsigned long prot; + + VM_WARN_ON_ONCE(!(vma->vm_flags & VM_PAT)); + + /* + * We need the starting PFN and cachemode used for track_pfn_remap() + * that covered the whole VMA. For most mappings, we can obtain that + * information from the page tables. For COW mappings, we might now + * suddenly have anon folios mapped and follow_phys() will fail. + * + * Fallback to using vma->vm_pgoff, see remap_pfn_range_notrack(), to + * detect the PFN. If we need the cachemode as well, we're out of luck + * for now and have to fail fork(). + */ + if (!follow_phys(vma, vma->vm_start, 0, &prot, paddr)) { + if (pgprot) + *pgprot = __pgprot(prot); + return 0; + } + if (is_cow_mapping(vma->vm_flags)) { + if (pgprot) + return -EINVAL; + *paddr = (resource_size_t)vma->vm_pgoff << PAGE_SHIFT; + return 0; + } + WARN_ON_ONCE(1); + return -EINVAL; +} + /* * track_pfn_copy is called when vma that is covering the pfnmap gets * copied through copy_page_range(). @@ -997,20 +1030,13 @@ static void free_pfn_range(u64 paddr, un int track_pfn_copy(struct vm_area_struct *vma) { resource_size_t paddr; - unsigned long prot; unsigned long vma_size = vma->vm_end - vma->vm_start; pgprot_t pgprot;
if (vma->vm_flags & VM_PAT) { - /* - * reserve the whole chunk covered by vma. We need the - * starting address and protection from pte. - */ - if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) { - WARN_ON_ONCE(1); + if (get_pat_info(vma, &paddr, &pgprot)) return -EINVAL; - } - pgprot = __pgprot(prot); + /* reserve the whole chunk covered by vma. */ return reserve_pfn_range(paddr, vma_size, &pgprot, 1); }
@@ -1085,7 +1111,6 @@ void untrack_pfn(struct vm_area_struct * unsigned long size) { resource_size_t paddr; - unsigned long prot;
if (vma && !(vma->vm_flags & VM_PAT)) return; @@ -1093,11 +1118,8 @@ void untrack_pfn(struct vm_area_struct * /* free the chunk starting from pfn or the whole chunk */ paddr = (resource_size_t)pfn << PAGE_SHIFT; if (!paddr && !size) { - if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) { - WARN_ON_ONCE(1); + if (get_pat_info(vma, &paddr, NULL)) return; - } - size = vma->vm_end - vma->vm_start; } free_pfn_range(paddr, size); --- a/mm/memory.c +++ b/mm/memory.c @@ -4934,6 +4934,10 @@ int follow_phys(struct vm_area_struct *v goto out; pte = *ptep;
+ /* Never return PFNs of anon folios in COW mappings. */ + if (vm_normal_page(vma, address, pte)) + goto unlock; + if ((flags & FOLL_WRITE) && !pte_write(pte)) goto unlock;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chris Wilson chris@chris-wilson.co.uk
commit 4a3859ea5240365d21f6053ee219bb240d520895 upstream.
Originally, with strict in order execution, we could complete execution only when the queue was empty. Preempt-to-busy allows replacement of an active request that may complete before the preemption is processed by HW. If that happens, the request is retired from the queue, but the queue_priority_hint remains set, preventing direct submission until after the next CS interrupt is processed.
This preempt-to-busy race can be triggered by the heartbeat, which will also act as the power-management barrier and upon completion allow us to idle the HW. We may process the completion of the heartbeat, and begin parking the engine before the CS event that restores the queue_priority_hint, causing us to fail the assertion that it is MIN.
<3>[ 166.210729] __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1)) <0>[ 166.210781] Dumping ftrace buffer: <0>[ 166.210795] --------------------------------- ... <0>[ 167.302811] drm_fdin-1097 2..s1. 165741070us : trace_ports: 0000:00:02.0 rcs0: promote { ccid:20 1217:2 prio 0 } <0>[ 167.302861] drm_fdin-1097 2d.s2. 165741072us : execlists_submission_tasklet: 0000:00:02.0 rcs0: preempting last=1217:2, prio=0, hint=2147483646 <0>[ 167.302928] drm_fdin-1097 2d.s2. 165741072us : __i915_request_unsubmit: 0000:00:02.0 rcs0: fence 1217:2, current 0 <0>[ 167.302992] drm_fdin-1097 2d.s2. 165741073us : __i915_request_submit: 0000:00:02.0 rcs0: fence 3:4660, current 4659 <0>[ 167.303044] drm_fdin-1097 2d.s1. 165741076us : execlists_submission_tasklet: 0000:00:02.0 rcs0: context:3 schedule-in, ccid:40 <0>[ 167.303095] drm_fdin-1097 2d.s1. 165741077us : trace_ports: 0000:00:02.0 rcs0: submit { ccid:40 3:4660* prio 2147483646 } <0>[ 167.303159] kworker/-89 11..... 165741139us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence c90:2, current 2 <0>[ 167.303208] kworker/-89 11..... 165741148us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:c90 unpin <0>[ 167.303272] kworker/-89 11..... 165741159us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 1217:2, current 2 <0>[ 167.303321] kworker/-89 11..... 165741166us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:1217 unpin <0>[ 167.303384] kworker/-89 11..... 165741170us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 3:4660, current 4660 <0>[ 167.303434] kworker/-89 11d..1. 165741172us : __intel_context_retire: 0000:00:02.0 rcs0: context:1216 retire runtime: { total:56028ns, avg:56028ns } <0>[ 167.303484] kworker/-89 11..... 165741198us : __engine_park: 0000:00:02.0 rcs0: parked <0>[ 167.303534] <idle>-0 5d.H3. 165741207us : execlists_irq_handler: 0000:00:02.0 rcs0: semaphore yield: 00000040 <0>[ 167.303583] kworker/-89 11..... 165741397us : __intel_context_retire: 0000:00:02.0 rcs0: context:1217 retire runtime: { total:325575ns, avg:0ns } <0>[ 167.303756] kworker/-89 11..... 165741777us : __intel_context_retire: 0000:00:02.0 rcs0: context:c90 retire runtime: { total:0ns, avg:0ns } <0>[ 167.303806] kworker/-89 11..... 165742017us : __engine_park: __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1)) <0>[ 167.303811] --------------------------------- <4>[ 167.304722] ------------[ cut here ]------------ <2>[ 167.304725] kernel BUG at drivers/gpu/drm/i915/gt/intel_engine_pm.c:283! <4>[ 167.304731] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI <4>[ 167.304734] CPU: 11 PID: 89 Comm: kworker/11:1 Tainted: G W 6.8.0-rc2-CI_DRM_14193-gc655e0fd2804+ #1 <4>[ 167.304736] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022 <4>[ 167.304738] Workqueue: i915-unordered retire_work_handler [i915] <4>[ 167.304839] RIP: 0010:__engine_park+0x3fd/0x680 [i915] <4>[ 167.304937] Code: 00 48 c7 c2 b0 e5 86 a0 48 8d 3d 00 00 00 00 e8 79 48 d4 e0 bf 01 00 00 00 e8 ef 0a d4 e0 31 f6 bf 09 00 00 00 e8 03 49 c0 e0 <0f> 0b 0f 0b be 01 00 00 00 e8 f5 61 fd ff 31 c0 e9 34 fd ff ff 48 <4>[ 167.304940] RSP: 0018:ffffc9000059fce0 EFLAGS: 00010246 <4>[ 167.304942] RAX: 0000000000000200 RBX: 0000000000000000 RCX: 0000000000000006 <4>[ 167.304944] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009 <4>[ 167.304946] RBP: ffff8881330ca1b0 R08: 0000000000000001 R09: 0000000000000001 <4>[ 167.304947] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8881330ca000 <4>[ 167.304948] R13: ffff888110f02aa0 R14: ffff88812d1d0205 R15: ffff88811277d4f0 <4>[ 167.304950] FS: 0000000000000000(0000) GS:ffff88844f780000(0000) knlGS:0000000000000000 <4>[ 167.304952] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 167.304953] CR2: 00007fc362200c40 CR3: 000000013306e003 CR4: 0000000000770ef0 <4>[ 167.304955] PKRU: 55555554 <4>[ 167.304957] Call Trace: <4>[ 167.304958] <TASK> <4>[ 167.305573] ____intel_wakeref_put_last+0x1d/0x80 [i915] <4>[ 167.305685] i915_request_retire.part.0+0x34f/0x600 [i915] <4>[ 167.305800] retire_requests+0x51/0x80 [i915] <4>[ 167.305892] intel_gt_retire_requests_timeout+0x27f/0x700 [i915] <4>[ 167.305985] process_scheduled_works+0x2db/0x530 <4>[ 167.305990] worker_thread+0x18c/0x350 <4>[ 167.305993] kthread+0xfe/0x130 <4>[ 167.305997] ret_from_fork+0x2c/0x50 <4>[ 167.306001] ret_from_fork_asm+0x1b/0x30 <4>[ 167.306004] </TASK>
It is necessary for the queue_priority_hint to be lower than the next request submission upon waking up, as we rely on the hint to decide when to kick the tasklet to submit that first request.
Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy") Closes: https://gitlab.freedesktop.org/drm/intel/issues/10154 Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Janusz Krzysztofik janusz.krzysztofik@linux.intel.com Cc: Mika Kuoppala mika.kuoppala@linux.intel.com Cc: stable@vger.kernel.org # v5.4+ Reviewed-by: Rodrigo Vivi rodrigo.vivi@intel.com Reviewed-by: Andi Shyti andi.shyti@linux.intel.com Signed-off-by: Andi Shyti andi.shyti@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240318135906.716055-2-janusz... (cherry picked from commit 98850e96cf811dc2d0a7d0af491caff9f5d49c1e) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Janusz Krzysztofik janusz.krzysztofik@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gt/intel_engine_pm.c | 3 --- drivers/gpu/drm/i915/gt/intel_lrc.c | 3 +++ 2 files changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -250,9 +250,6 @@ static int __engine_park(struct intel_wa intel_engine_park_heartbeat(engine); intel_breadcrumbs_park(engine->breadcrumbs);
- /* Must be reset upon idling, or we may miss the busy wakeup. */ - GEM_BUG_ON(engine->execlists.queue_priority_hint != INT_MIN); - if (engine->park) engine->park(engine);
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -5032,6 +5032,9 @@ static void execlists_park(struct intel_ { cancel_timer(&engine->execlists.timer); cancel_timer(&engine->execlists.preempt); + + /* Reset upon idling, or we may delay the busy wakeup. */ + WRITE_ONCE(engine->execlists.queue_priority_hint, INT_MIN); }
void intel_execlists_set_default_submission(struct intel_engine_cs *engine)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
commit 6e62ebfb49eb65bdcbfc5797db55e0ce7f79c3dd upstream.
This fixes the following build regression:
drivers-bluetooth-btintel.c-btintel_read_version()-warn: passing-zero-to-PTR_ERR
Fixes: b79e04091010 ("Bluetooth: btintel: Fix null ptr deref in btintel_read_version") Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/bluetooth/btintel.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/bluetooth/btintel.c +++ b/drivers/bluetooth/btintel.c @@ -344,13 +344,13 @@ int btintel_read_version(struct hci_dev struct sk_buff *skb;
skb = __hci_cmd_sync(hdev, 0xfc05, 0, NULL, HCI_CMD_TIMEOUT); - if (IS_ERR_OR_NULL(skb)) { + if (IS_ERR(skb)) { bt_dev_err(hdev, "Reading Intel version information failed (%ld)", PTR_ERR(skb)); return PTR_ERR(skb); }
- if (skb->len != sizeof(*ver)) { + if (!skb || skb->len != sizeof(*ver)) { bt_dev_err(hdev, "Intel version event size mismatch"); kfree_skb(skb); return -EILSEQ;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vasiliy Kovalev kovalev@altlinux.org
commit e606e4b71798cc1df20e987dde2468e9527bd376 upstream.
The changes are similar to those given in the commit 19b070fefd0d ("VMCI: Fix memcpy() run-time warning in dg_dispatch_as_host()").
Fix filling of the msg and msg_payload in dg_info struct, which prevents a possible "detected field-spanning write" of memcpy warning that is issued by the tracking mechanism __fortify_memcpy_chk.
Signed-off-by: Vasiliy Kovalev kovalev@altlinux.org Link: https://lore.kernel.org/r/20240219105315.76955-1-kovalev@altlinux.org Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/vmw_vmci/vmci_datagram.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/misc/vmw_vmci/vmci_datagram.c +++ b/drivers/misc/vmw_vmci/vmci_datagram.c @@ -378,7 +378,8 @@ int vmci_datagram_invoke_guest_handler(s
dg_info->in_dg_host_queue = false; dg_info->entry = dst_entry; - memcpy(&dg_info->msg, dg, VMCI_DG_SIZE(dg)); + dg_info->msg = *dg; + memcpy(&dg_info->msg_payload, dg + 1, dg->payload_size);
INIT_WORK(&dg_info->work, dg_delayed_dispatch); schedule_work(&dg_info->work);
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Kubecek mkubecek@suse.cz
commit c93db682cfb213501881072a9200a48ce1dc3c3f upstream.
Commit 3fb0fdb3bbe7 ("x86/stackprotector/32: Make the canary into a regular percpu variable") modified the stackprotector check on 32-bit x86 to check if gcc supports using %fs as canary. Adjust dummy-tools gcc script to pass this new test by returning "%fs" rather than "%gs" if it detects -mstack-protector-guard-reg=fs on command line.
Fixes: 3fb0fdb3bbe7 ("x86/stackprotector/32: Make the canary into a regular percpu variable") Signed-off-by: Michal Kubecek mkubecek@suse.cz Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- scripts/dummy-tools/gcc | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/scripts/dummy-tools/gcc +++ b/scripts/dummy-tools/gcc @@ -76,7 +76,11 @@ fi if arg_contain -S "$@"; then # For scripts/gcc-x86-*-has-stack-protector.sh if arg_contain -fstack-protector "$@"; then - echo "%gs" + if arg_contain -mstack-protector-guard-reg=fs "$@"; then + echo "%fs" + else + echo "%gs" + fi exit 0 fi fi
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com
commit 288b3271d920c9ba949c3bab0f749f4cecc70e09 upstream.
When the sd driver revalidates host-managed SMR disks, it calls disk_set_zoned() which changes the zone_write_granularity attribute value to the logical block size regardless of the device type. After that, the sd driver overwrites the value in sd_zbc_read_zone() with the physical block size, since ZBC/ZAC requires this for host-managed disks. Between the calls to disk_set_zoned() and sd_zbc_read_zone(), there exists a window where the attribute shows the logical block size as the zone_write_granularity value, which is wrong for host-managed disks. The duration of the window is from 20ms to 200ms, depending on report zone command execution time.
To avoid the wrong zone_write_granularity value between disk_set_zoned() and sd_zbc_read_zone(), modify the value not in sd_zbc_read_zone() but just after disk_set_zoned() call.
Fixes: a805a4fa4fa3 ("block: introduce zone_write_granularity limit") Signed-off-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Link: https://lore.kernel.org/r/20230306063024.3376959-1-shinichiro.kawasaki@wdc.c... Reviewed-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/sd.c | 7 ++++++- drivers/scsi/sd_zbc.c | 8 -------- 2 files changed, 6 insertions(+), 9 deletions(-)
--- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3026,8 +3026,13 @@ static void sd_read_block_characteristic }
if (sdkp->device->type == TYPE_ZBC) { - /* Host-managed */ + /* + * Host-managed: Per ZBC and ZAC specifications, writes in + * sequential write required zones of host-managed devices must + * be aligned to the device physical block size. + */ blk_queue_set_zoned(sdkp->disk, BLK_ZONED_HM); + blk_queue_zone_write_granularity(q, sdkp->physical_block_size); } else { sdkp->zoned = (buffer[8] >> 4) & 3; if (sdkp->zoned == 1) { --- a/drivers/scsi/sd_zbc.c +++ b/drivers/scsi/sd_zbc.c @@ -793,14 +793,6 @@ int sd_zbc_read_zones(struct scsi_disk * blk_queue_max_active_zones(q, 0); nr_zones = round_up(sdkp->capacity, zone_blocks) >> ilog2(zone_blocks);
- /* - * Per ZBC and ZAC specifications, writes in sequential write required - * zones of host-managed devices must be aligned to the device physical - * block size. - */ - if (blk_queue_zoned_model(q) == BLK_ZONED_HM) - blk_queue_zone_write_granularity(q, sdkp->physical_block_size); - /* READ16/WRITE16 is mandatory for ZBC disks */ sdkp->device->use_16_for_rw = 1; sdkp->device->use_10_for_rw = 0;
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov (AMD) bp@alien8.de
commit b377c66ae3509ccea596512d6afb4777711c4870 upstream.
srso_alias_untrain_ret() is special code, even if it is a dummy which is called in the !SRSO case, so annotate it like its real counterpart, to address the following objtool splat:
vmlinux.o: warning: objtool: .export_symbol+0x2b290: data relocation to !ENDBR: srso_alias_untrain_ret+0x0
Fixes: 4535e1a4174c ("x86/bugs: Fix the SRSO mitigation on Zen3/4") Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Signed-off-by: Ingo Molnar mingo@kernel.org Cc: Linus Torvalds torvalds@linux-foundation.org Link: https://lore.kernel.org/r/20240405144637.17908-1-bp@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/lib/retpoline.S | 1 + 1 file changed, 1 insertion(+)
--- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -258,6 +258,7 @@ SYM_CODE_START(__x86_return_thunk) UNWIND_HINT_FUNC ANNOTATE_NOENDBR ANNOTATE_UNRET_SAFE + ANNOTATE_NOENDBR ret int3 SYM_CODE_END(__x86_return_thunk)
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Roth michael.roth@amd.com
commit 469693d8f62299709e8ba56d8fb3da9ea990213c upstream.
Due to
103a4908ad4d ("x86/head/64: Disable stack protection for head$(BITS).o")
kernel/head{32,64}.c are compiled with -fno-stack-protector to allow a call to set_bringup_idt_handler(), which would otherwise have stack protection enabled with CONFIG_STACKPROTECTOR_STRONG.
While sufficient for that case, there may still be issues with calls to any external functions that were compiled with stack protection enabled that in-turn make stack-protected calls, or if the exception handlers set up by set_bringup_idt_handler() make calls to stack-protected functions.
Subsequent patches for SEV-SNP CPUID validation support will introduce both such cases. Attempting to disable stack protection for everything in scope to address that is prohibitive since much of the code, like the SEV-ES #VC handler, is shared code that remains in use after boot and could benefit from having stack protection enabled. Attempting to inline calls is brittle and can quickly balloon out to library/helper code where that's not really an option.
Instead, re-enable stack protection for head32.c/head64.c, and make the appropriate changes to ensure the segment used for the stack canary is initialized in advance of any stack-protected C calls.
For head64.c:
- The BSP will enter from startup_64() and call into C code (startup_64_setup_env()) shortly after setting up the stack, which may result in calls to stack-protected code. Set up %gs early to allow for this safely. - APs will enter from secondary_startup_64*(), and %gs will be set up soon after. There is one call to C code prior to %gs being setup (__startup_secondary_64()), but it is only to fetch 'sme_me_mask' global, so just load 'sme_me_mask' directly instead, and remove the now-unused __startup_secondary_64() function.
For head32.c:
- BSPs/APs will set %fs to __BOOT_DS prior to any C calls. In recent kernels, the compiler is configured to access the stack canary at %fs:__stack_chk_guard [1], which overlaps with the initial per-cpu '__stack_chk_guard' variable in the initial/"master" .data..percpu area. This is sufficient to allow access to the canary for use during initial startup, so no changes are needed there.
[1] 3fb0fdb3bbe7 ("x86/stackprotector/32: Make the canary into a regular percpu variable")
[ bp: Massage commit message. ]
Suggested-by: Joerg Roedel jroedel@suse.de #for 64-bit %gs set up Signed-off-by: Michael Roth michael.roth@amd.com Signed-off-by: Brijesh Singh brijesh.singh@amd.com Signed-off-by: Borislav Petkov bp@suse.de Link: https://lore.kernel.org/r/20220307213356.2797205-24-brijesh.singh@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/setup.h | 1 - arch/x86/kernel/Makefile | 1 - arch/x86/kernel/head64.c | 9 --------- arch/x86/kernel/head_64.S | 24 +++++++++++++++++++++--- 4 files changed, 21 insertions(+), 14 deletions(-)
--- a/arch/x86/include/asm/setup.h +++ b/arch/x86/include/asm/setup.h @@ -49,7 +49,6 @@ extern unsigned long saved_video_mode; extern void reserve_standard_io_resources(void); extern void i386_reserve_resources(void); extern unsigned long __startup_64(unsigned long physaddr, struct boot_params *bp); -extern unsigned long __startup_secondary_64(void); extern void startup_64_setup_env(unsigned long physbase); extern void early_setup_idt(void); extern void __init do_early_exception(struct pt_regs *regs, int trapnr); --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -49,7 +49,6 @@ endif # non-deterministic coverage. KCOV_INSTRUMENT := n
-CFLAGS_head$(BITS).o += -fno-stack-protector CFLAGS_cc_platform.o += -fno-stack-protector
CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -302,15 +302,6 @@ unsigned long __head __startup_64(unsign return sme_get_me_mask(); }
-unsigned long __startup_secondary_64(void) -{ - /* - * Return the SME encryption mask (if SME is active) to be used as a - * modifier for the initial pgdir entry programmed into CR3. - */ - return sme_get_me_mask(); -} - /* Wipe all early page tables except for the kernel symbol map */ static void __init reset_early_page_tables(void) { --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -74,6 +74,22 @@ SYM_CODE_START_NOALIGN(startup_64) leaq (__end_init_task - SIZEOF_PTREGS)(%rip), %rsp
leaq _text(%rip), %rdi + + /* + * initial_gs points to initial fixed_percpu_data struct with storage for + * the stack protector canary. Global pointer fixups are needed at this + * stage, so apply them as is done in fixup_pointer(), and initialize %gs + * such that the canary can be accessed at %gs:40 for subsequent C calls. + */ + movl $MSR_GS_BASE, %ecx + movq initial_gs(%rip), %rax + movq $_text, %rdx + subq %rdx, %rax + addq %rdi, %rax + movq %rax, %rdx + shrq $32, %rdx + wrmsr + pushq %rsi call startup_64_setup_env popq %rsi @@ -141,9 +157,11 @@ SYM_INNER_LABEL(secondary_startup_64_no_ * Retrieve the modifier (SME encryption mask if SME is active) to be * added to the initial pgdir entry that will be programmed into CR3. */ - pushq %rsi - call __startup_secondary_64 - popq %rsi +#ifdef CONFIG_AMD_MEM_ENCRYPT + movq sme_me_mask, %rax +#else + xorq %rax, %rax +#endif
/* Form the CR3 value being sure to include the CR3 modifier */ addq $(init_top_pgt - __START_KERNEL_map), %rax
Hi!
This is the start of the stable review cycle for the 5.10.215 release. There are 294 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
CIP testing did not find any problems here:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-5...
Tested-by: Pavel Machek (CIP) pavel@denx.de
Best regards, Pavel
On 4/11/24 02:52, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.215 release. There are 294 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 13 Apr 2024 09:53:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.215-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
Greg Kroah-Hartman wrote on Thu, Apr 11, 2024 at 11:52:43AM +0200:
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.215-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
Tested 244ca117cb3c ("Linux 5.10.215-rc1") on: - arm i.MX6ULL (Armadillo 640) - arm64 i.MX8MP (Armadillo G4)
No obvious regression in dmesg or basic tests: Tested-by: Dominique Martinet dominique.martinet@atmark-techno.com
On Thu, 11 Apr 2024 11:52:43 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.215 release. There are 294 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 13 Apr 2024 09:53:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.215-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v5.10: 10 builds: 10 pass, 0 fail 26 boots: 26 pass, 0 fail 68 tests: 68 pass, 0 fail
Linux version: 5.10.215-rc1-g244ca117cb3c Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On Thu, 11 Apr 2024 at 16:01, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.10.215 release. There are 294 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 13 Apr 2024 09:53:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.215-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 5.10.215-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-5.10.y * git commit: 244ca117cb3cec4bb8f936b136312bb8482eeae3 * git describe: v5.10.214-295-g244ca117cb3c * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.10.y/build/v5.10....
## Test Regressions (compared to v5.10.214)
## Metric Regressions (compared to v5.10.214)
## Test Fixes (compared to v5.10.214)
## Metric Fixes (compared to v5.10.214)
## Test result summary total: 94495, pass: 74097, fail: 3307, skip: 17023, xfail: 68
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 104 total, 104 passed, 0 failed * arm64: 31 total, 31 passed, 0 failed * i386: 25 total, 25 passed, 0 failed * mips: 22 total, 22 passed, 0 failed * parisc: 3 total, 0 passed, 3 failed * powerpc: 23 total, 23 passed, 0 failed * riscv: 9 total, 9 passed, 0 failed * s390: 9 total, 9 passed, 0 failed * sh: 10 total, 10 passed, 0 failed * sparc: 6 total, 6 passed, 0 failed * x86_64: 27 total, 27 passed, 0 failed
## Test suites summary * boot * kselftest-android * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-exec * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-filesystems-epoll * kselftest-firmware * kselftest-fpu * kselftest-ftrace * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mm * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-net-forwarding * kselftest-net-mptcp * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-user_events * kselftest-vDSO * kselftest-watchdog * kselftest-x86 * kselftest-zram * kunit * kvm-unit-tests * libgpiod * log-parser-boot * log-parser-test * ltp-cap_bounds * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-hugetlb * ltp-io * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-securebits * ltp-smoke * ltp-smoketest * ltp-syscalls * ltp-tracing * perf * rcutorture
-- Linaro LKFT https://lkft.linaro.org
Hi,
On Thu, Apr 11, 2024 at 11:52:43AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.215 release. There are 294 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 13 Apr 2024 09:53:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.215-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
[ ... ]
Oliver Neukum oneukum@suse.com usb: cdc-wdm: close race between read and workqueue
Just in case it has not been reported yet:
This patch is causing connection failures (timeouts) on all Chromebooks using the cdc-wdm driver for cellular modems, with all kernel branches where this patch has been applied. Reverting it fixes the problem.
I am copying some of the Google employees involved in identifying the regression in case additional feedback is needed.
Guenter
On Sat, Apr 13, 2024 at 07:11:57AM -0700, Guenter Roeck wrote:
Hi,
On Thu, Apr 11, 2024 at 11:52:43AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.215 release. There are 294 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 13 Apr 2024 09:53:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.215-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
[ ... ]
Oliver Neukum oneukum@suse.com usb: cdc-wdm: close race between read and workqueue
Just in case it has not been reported yet:
This patch is causing connection failures (timeouts) on all Chromebooks using the cdc-wdm driver for cellular modems, with all kernel branches where this patch has been applied. Reverting it fixes the problem.
I am copying some of the Google employees involved in identifying the regression in case additional feedback is needed.
Can you all respond to Oliver on the linux-usb list where this was originally submitted to work it out? This commit has been in the tree for almost a month now with no reported problems that I can see.
thanks,
greg k-h
Hi Greg,
On Sun, Apr 14, 2024 at 08:09:39AM +0200, Greg Kroah-Hartman wrote:
On Sat, Apr 13, 2024 at 07:11:57AM -0700, Guenter Roeck wrote:
Hi,
On Thu, Apr 11, 2024 at 11:52:43AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.215 release. There are 294 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 13 Apr 2024 09:53:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.215-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
[ ... ]
Oliver Neukum oneukum@suse.com usb: cdc-wdm: close race between read and workqueue
Just in case it has not been reported yet:
This patch is causing connection failures (timeouts) on all Chromebooks using the cdc-wdm driver for cellular modems, with all kernel branches where this patch has been applied. Reverting it fixes the problem.
I am copying some of the Google employees involved in identifying the regression in case additional feedback is needed.
Can you all respond to Oliver on the linux-usb list where this was originally submitted to work it out? This commit has been in the tree for almost a month now with no reported problems that I can see.
Who knows, maybe only a certain type of usb cellular modems using cdc-wdm is affected. Either case, the problem was found less than two days after the stable tree merges into ChromeOS, and it took only about a week from there to identify the offending patch. I think that was actually an amazing job, given the size of those merges and because the failure is not absolute but results in unreliable tests due to timeouts.
Anyway, sure, we'll get in touch with Oliver. Other than that, please take this report as a heads-up in case anyone else reports similar problems.
Thanks, Guenter
On Sun, Apr 14, 2024 at 02:18:03PM -0700, Guenter Roeck wrote:
Hi Greg,
On Sun, Apr 14, 2024 at 08:09:39AM +0200, Greg Kroah-Hartman wrote:
On Sat, Apr 13, 2024 at 07:11:57AM -0700, Guenter Roeck wrote:
Hi,
On Thu, Apr 11, 2024 at 11:52:43AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.215 release. There are 294 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 13 Apr 2024 09:53:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.215-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
[ ... ]
Oliver Neukum oneukum@suse.com usb: cdc-wdm: close race between read and workqueue
Just in case it has not been reported yet:
This patch is causing connection failures (timeouts) on all Chromebooks using the cdc-wdm driver for cellular modems, with all kernel branches where this patch has been applied. Reverting it fixes the problem.
I am copying some of the Google employees involved in identifying the regression in case additional feedback is needed.
Can you all respond to Oliver on the linux-usb list where this was originally submitted to work it out? This commit has been in the tree for almost a month now with no reported problems that I can see.
Who knows, maybe only a certain type of usb cellular modems using cdc-wdm is affected. Either case, the problem was found less than two days after the stable tree merges into ChromeOS, and it took only about a week from there to identify the offending patch. I think that was actually an amazing job, given the size of those merges and because the failure is not absolute but results in unreliable tests due to timeouts.
It is an amazing job, and I wasn't trying to be snarky, I was trying to say that "that's odd, normally usb problems are found very quickly by lots of people and you should let the author know about this as there's nothing I can do about it through this stable report".
I see the post on linux-usb now, thanks.
greg k-h
Hi!
This is the start of the stable review cycle for the 5.10.215 release. There are 294 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Alex Williamson alex.williamson@redhat.com vfio/pci: Create persistent INTx handler
This introduces memory leak in vfio_intx_enable() -- name is not freed in case vdev->ctx = kzalloc() fails, for example.
Sean Christopherson seanjc@google.com x86/cpufeatures: Add CPUID_LNX_5 to track recently added Linux-defined word
AFAICT this is not needed in 5.10.
Josh Poimboeuf jpoimboe@redhat.com objtool: Add asm version of STACK_FRAME_NON_STANDARD
Asm version of this macro is not used in 5.10.
Michael Roth michael.roth@amd.com x86/head/64: Re-enable stack protection
This is preparation for preparation for SEV-SNP CPUID patches, I don't believe we plan that for 6.1.
David Sterba dsterba@suse.com btrfs: handle chunk tree lookup error in btrfs_relocate_sys_chunks()
(This applies to 4.19, too). mutex_unlock() is needed before "goto error" here.
Aric Cyr aric.cyr@amd.com drm/amd/display: Fix nanosec stat overflow
(This applies to 4.19, too). This is wrong. It updates prototypes but not actual functions.
On Wed, Apr 17, 2024 at 02:59:16PM +0200, Pavel Machek wrote:
Hi!
This is the start of the stable review cycle for the 5.10.215 release. There are 294 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Alex Williamson alex.williamson@redhat.com vfio/pci: Create persistent INTx handler
This introduces memory leak in vfio_intx_enable() -- name is not freed in case vdev->ctx = kzalloc() fails, for example.
So is the upstream commit wrong, or the backport wrong?
Sean Christopherson seanjc@google.com x86/cpufeatures: Add CPUID_LNX_5 to track recently added Linux-defined word
AFAICT this is not needed in 5.10.
Why not?
Josh Poimboeuf jpoimboe@redhat.com objtool: Add asm version of STACK_FRAME_NON_STANDARD
Asm version of this macro is not used in 5.10.
It fixed an issue.
Michael Roth michael.roth@amd.com x86/head/64: Re-enable stack protection
This is preparation for preparation for SEV-SNP CPUID patches, I don't believe we plan that for 6.1.
This is 5.10, not 6.1.
And are you sure that this is not needed? Remember the x86 speculation mess that is happening here.
David Sterba dsterba@suse.com btrfs: handle chunk tree lookup error in btrfs_relocate_sys_chunks()
(This applies to 4.19, too). mutex_unlock() is needed before "goto error" here.
So can you provide that fix please?
Aric Cyr aric.cyr@amd.com drm/amd/display: Fix nanosec stat overflow
(This applies to 4.19, too). This is wrong. It updates prototypes but not actual functions.
So should it be dropped or added to 4.19?
confused,
greg k-h
Greg Kroah-Hartman wrote on Wed, Apr 17, 2024 at 03:28:04PM +0200:
David Sterba dsterba@suse.com btrfs: handle chunk tree lookup error in btrfs_relocate_sys_chunks()
(This applies to 4.19, too). mutex_unlock() is needed before "goto error" here.
ugh, I need to look more at the context when reviewing these, btrfs is part of what I'm more or less looking at when updating...
So can you provide that fix please?
This bug affects upstream as well and I don't see any upstream fix for it yet on linux-btrfs@vger, I'll send a patch with Pavel's reported at.
linux-stable-mirror@lists.linaro.org